Implementation of Genetic Algorithms in Prospect Theory-based Portfolio Optimization

by

Sahand Kashiri

Master of Science Quantitative Finance

Chair of Monetary Economics and International Finance

Faculty of Business, Economics and Social Sciences

Kiel University

Primary Review:

Secondary Review:

Supervision:

Winter 2018

Contents

List of Figures

List of Tables

List of Algorithms

1. Introduction

2. Preliminary Models and General Notation

2.1. General Portfolio Optimization
2.2. Adding Cardinality Constraints

3. Basic Index Tracking

3.1. Measuring the Tracking Quality
3.2. Formulation of the Optimization Problem

4. Prospect Theory-based Index Tracking

5. Adaptive Optimization Methods

5.1. Genetic Algorithms
5.2. Differential Evolution

6. Replication Methodology

6.1. Summary of the Optimization Problems
6.2. Data

7. Testing the Quality of the Algorithms

7.1. Basic Prospect Theory Results
7.2. Alternative Parameters

8. Replication of Results

8.1. CPLEX for Basic Index Tracking
8.2. Genetic Algorithm for Prospect Theory-based Index Tracking
8.3. Experimenting on Different Market Environments
8.4. Out-of-sample Performance

9. Discussions

9.1. On the Genetic Algorithm and alternative parameters
9.2. On the Population Initialization
9.3. On CPLEX Optimization and performance measurements
9.4. On the Objective Function

10. Conclusion

References

A. Appendix

A.1. Replicated Genetic Algorithm Pseudo-codes
A.2. Proofs
A.3. Basic PT model
A.4. System Specifications
A.5. Original Genetic Algorithm Pseudocode
A.6. Original Differential Evolution Algorithm

List of Figures

1. Prospect Theory evaluation stage functions
2. General Genetic Algorithm flowchart

List of Tables

1. Summary of data dimensions
2. Summary of algorithm parameters for heuristics
3. Original vs. Replication for PT model
4. Summary of altered algorithm parameters for $GA_{EX}$
5. Original vs. Replication EX for PT model
6. Results of the IT optimization problem
7. Results of the $IT_{CC}$ optimization problem
8. Original results of the $PT_{IT}$ optimization problem (without Cardinality Constraints)
9. Results of the $PT_{IT,CC}$ optimization problem
10. Summary of data dimensions for out-of-sample optimization
11. Results of the optimization in specific market environments
12. Results of the (out-of-sample) optimization
13. Basic PT model variables
14. Differences in Hardware and Software

List of Algorithms

1. Population Initialization for $GA_{Gr}$ and $DE_{Gr}$
2. Main Genetic Algorithm $GA_{Gr}$ for $PT_{IT}$ optimization
3. Main Differential Evolution Algorithm
4. Genetic Algorithm Operation: Crossover and Mutation
5. Genetic Algorithm Operation: Cardinality and buy-in constraints check
6. Original Differential Evolution algorithm pseudo-code for the $PT_{IT}$ (Grishina et al. 2016)

1. Introduction

This thesis aims to replicate the main results and methods of Grishina et al. (2016).¹ It elaborates on the considerations that arise when implementing a genetic algorithm for prospect theory-based index tracking as described in the original work, and presents alternative parameters to improve convergence properties.

It is motivated by a renewed focus on the importance of scientific replication in economics, as explained by Alm (2010). Chang et al. (2018) illustrated the difficulty of doing so: after analyzing 67 macroeconomic papers published in "well regarded economic journals", they were able to recreate the results in less than a third of the cases without contacting the respective authors. Mueller-Langer et al. (2017) show that less than 0.1% of the papers published in top-50 economic journals are replication papers. The repercussions are not limited to academia but also affect the image of the field held by the general population.²

Based on the suggestions made by Alm (2010) and King (2006), who set guidelines for the replication of academic papers, the main goal of this thesis is to fully describe its own methodology and computations so that any reader is able to replicate the results presented herein. Descriptive gaps in the work of Grishina et al. (2016) are closed mainly through two channels:

• A doctoral thesis that precedes the original paper and was written by one of its authors (Grishina 2014). Even though the dissertation is not explicitly named by the authors of the original paper, it shows a large descriptive overlap with the paper by Grishina et al. (2016). More importantly, since the reported results are exactly the same in both works, it can reasonably be concluded that any additional information found in the dissertation can be considered (with care) to be valid for the published paper as well, i.e. the same computational results are based on one and the same implementation.

• Controlled methodological experiments, i.e. tweaking of the replicated algorithms.

Both channels were used with care and with the intent to recreate the work of Grishina et al. (2016) as faithfully as possible.

The remainder of this thesis is structured as follows: First, the necessary economic theory is described, beginning with the notation of basic portfolio optimization (section 2), moving on to index tracking (section 3) and ending with prospect theory-based index tracking (section 4). In order to solve the models, heuristic algorithms are used, which are introduced in section 5. Section 6 elaborates on the replication methodology and the data used. Sections 7 and 8 report the quality of the replicated algorithm, present alternative parameters and compare the results of the complete models with the original ones. Section 9 elaborates on this comparison by giving further insight into the original descriptions and implementation of Grishina et al. (2016), discusses assumptions and attempts to better explain the discrepancies between them.

¹ N. Grishina et al. (2016). "Prospect theory–based portfolio optimization: an empirical study and analysis using intelligent algorithms". In: Quantitative Finance 17.3, pp. 353–367. doi: 10.1080/14697688.2016.1149611, https://doi.org/10.1080/14697688.2016.1149611

² E.g. a recent Bloomberg opinion article titled "Why Economics Is Having a Replication Crisis": https://www.bloomberg.com/opinion/articles/2018-09-17/economics-gets-it-wrong-because-research-is-hard-to-replicate

2. Preliminary Models and General Notation

The optimization problems that Grishina et al. (2016) attempt to solve in their paper can be separated into two broad categories:

1. Basic index tracking problems, i.e. attempts to minimize a predefined tracking error function, which represents the difference between (the returns of) an index and (the returns of) a portfolio consisting of the index's components over a given time period.

2. Prospect Theory-based problems, which attempt to maximize a non-linear utility function based on the well-known theory by Kahneman et al. (1979). Specifically, they enhance the basic index tracking problems with behavioral insights.

The original paper's presentation of these models as building upon each other is adopted in this thesis. Additional details and literature are introduced in order to build the theoretical context necessary to support the practical decisions made in the latter half of the thesis.

2.1. General Portfolio Optimization

The following notation and considerations are based on Markowitz (1952, 1959), which mark the beginning of Modern Portfolio Theory (MPT), a theory that attempts to quantify the general asset allocation decision that investors face when putting together a portfolio consisting of different assets.

Let a portfolio $x$ be defined as a set of $N$ asset weights $w_i$. Its elements sum to 1, so that they can be interpreted as actual weights, i.e. proportions of the whole portfolio:

$$x = (w_1, \dots, w_N) \tag{1}$$

$$\sum_{i=1}^{N} w_i = 1 \tag{2}$$

$$w_i \ge 0, \quad i = 1, \dots, N \tag{3}$$

Here, the additional non-negativity condition (eq. (3)) reflects the real-life restriction of short sales (Jarrow 1980). Furthermore, let the expected return $r_x$ of a given portfolio be defined as

$$r_x = w_1 \mu_1 + \dots + w_N \mu_N = \sum_{i=1}^{N} w_i \mu_i \tag{4}$$

where $\mu_i$, $i = 1, \dots, N$, is the assumed or expected return of asset $i$. There are different methods to form these expectations, e.g. taking the mean of historical (known) returns over a given time period. However, as Markowitz (1991) explains, MPT is based on the simple insight that, no matter which method is used, the real returns are inherently uncertain in an investment context. If they were not, any agent would maximize the expected value of their portfolio by simply maximizing the weight of the asset with the highest expected return.

Accordingly, agents have to incorporate this real uncertainty as a "risk", which Markowitz (1952) quantified by measuring the variance of the portfolio's return.

This subsequently led to the modern understanding of diversification, i.e. the combination of assets based on their correlations to minimize the portfolio's return variance. Modern portfolio theory is ultimately concerned with the rationality of risk-averse investors and shows that an optimal asset allocation can be calculated with only the individual assets' covariances and their expected returns (Markowitz 1952). The complete optimization problem of MPT, also called mean-variance optimization, can be formulated in the following way:

$$\min \sum_{i=1}^{N} \sum_{j=1}^{N} w_i w_j \sigma_{ij} \tag{5a}$$

$$\text{s.t.} \quad \sum_{i=1}^{N} w_i \mu_i = R^* \tag{5b}$$

$$\sum_{i=1}^{N} w_i = 1 \tag{5c}$$

$$w_i \ge 0, \quad i = 1, \dots, N \tag{5d}$$

The optimization aims at minimizing the portfolio's return variance (eq. (5a)) under the constraint of reaching a certain return goal $R^*$. Here $N$, $w_i$ and $\mu_i$ are defined as before, and $\sigma_{ij}$ is the covariance of the returns of assets $i$ and $j$ (for $i = j$ it becomes the variance).

It is equivalently possible to set a variance goal, i.e. $\sum_{i=1}^{N} \sum_{j=1}^{N} w_i w_j \sigma_{ij} = V^*$, and to then maximize the expected return as the objective function of the optimization. A mean-variance pair is called efficient when it is optimal in either of the aforementioned ways.
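The two quantities that the mean-variance problem trades off can be evaluated directly from eqs. (4) and (5a). The following is a minimal sketch, not part of the original work: the function names and the two-asset input values are made up for illustration, and no optimization is performed.

```python
from typing import Sequence


def expected_return(w: Sequence[float], mu: Sequence[float]) -> float:
    """Expected portfolio return, eq. (4): sum_i w_i * mu_i."""
    return sum(w_i * mu_i for w_i, mu_i in zip(w, mu))


def portfolio_variance(w: Sequence[float], cov: Sequence[Sequence[float]]) -> float:
    """Objective (5a): sum_i sum_j w_i * w_j * sigma_ij."""
    n = len(w)
    return sum(w[i] * w[j] * cov[i][j] for i in range(n) for j in range(n))


def is_valid_portfolio(w: Sequence[float], tol: float = 1e-9) -> bool:
    """Constraints (5c) and (5d): weights sum to one and are non-negative."""
    return abs(sum(w) - 1.0) <= tol and all(w_i >= 0.0 for w_i in w)


# Made-up inputs for N = 2 uncorrelated assets (illustration only).
mu = [0.05, 0.10]
cov = [[0.04, 0.00],
       [0.00, 0.09]]
w = [0.5, 0.5]

print(is_valid_portfolio(w))        # constraints (5c), (5d)
print(expected_return(w, mu))       # left-hand side of (5b)
print(portfolio_variance(w, cov))   # objective (5a)
```

A solver would search over `w`, either minimizing `portfolio_variance` for a fixed return goal or maximizing `expected_return` for a fixed variance goal.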

Elton et al. (1997) explain that, historically, index models (i.e. single-factor models) were used to reduce the computational complexity of the aforementioned mean-variance optimization. For instance, the diagonal model developed by Sharpe (1963) uses a market index or "any other factor thought to be the most important single influence on the returns from securities" (Sharpe 1963) in a linear regression model to derive the parameters from which the expected returns and covariance inputs are calculated indirectly.³

Equipped with the most basic form of the problem, two possible extensions to eq. (5), as described by Chang et al. (2000), will be addressed in the following sections:

• Inclusion of other constraints, e.g. transaction costs, to better reflect real-life considerations or limitations.

• Consideration of different measures of "risk", to better reflect the uncertainty considerations of agents.

³ For an analysis of 100 securities, the number of estimates reduces from 5,150 to 302, according to the author.

2.2. Adding Cardinality Constraints

Transaction costs affect the portfolio choice by making portfolios that trade less frequently more attractive. They can be included explicitly, e.g. by modifying the objective function of the problem, as illustrated by Adcock et al. (1994), or by including transaction costs as constraints, as shown by Beasley et al. (2003).

Alternatively, Chang et al. (2000) and Jobst et al. (2001) implement them indirectly, through a constraint on the number of assets that are to be held in the optimized portfolio. This naturally leads to a reduction in the transaction or maintenance costs of a portfolio, by assuming less frequent trading needs.

Such constraints on the cardinality of the portfolio can be supplemented with a minimum weight for each asset that is included in the portfolio, i.e. buy-in constraints, which can be interpreted as a limitation on additional costs that occur through very small positions (Jobst et al. 2001). Grishina et al. (2016) implement the following constraints:

$$\sum_{i=1}^{N} z_i \le K \tag{6}$$

$$\varepsilon_i z_i \le w_i \le \delta_i z_i, \quad i = 1, \dots, N \tag{7}$$

$$z_i \in \{0, 1\}, \quad i = 1, \dots, N \tag{8}$$

$$z_i = \begin{cases} 1 & \text{if } w_i \ne 0 \\ 0 & \text{otherwise} \end{cases} \tag{9}$$

where the cardinality limit $K$ constrains the number of positive asset weights⁴ and $\varepsilon_i$ and $\delta_i$ determine the minimum and maximum weight that an included asset can have, respectively. The constraints are encoded by the binary variable $z_i$, which indicates whether an asset is held, i.e. the bounds in eq. (7) are only active for positive asset weights $w_i$.

A problem including these additional constraints is classified as a (quadratic) mixed-integer programming problem (QMIP), as described by Bienstock (1996), or more generally as a nonlinear mixed-integer programming problem, as described by Borchers et al. (1994), who subsequently suggest a branch and bound algorithm to solve it. The details of such a solution approach are omitted here, since Grishina et al. (2016) use commercial software to solve their mixed-integer problems.

The inclusion of eqs. (6) to (8) is important for the models in the remainder of the thesis, which will subsequently be marked "CC" if they include them.
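As an illustration of how eqs. (6) to (9) interact, the following sketch checks a given weight vector against the cardinality and buy-in constraints. The function name, the numerical tolerance and the example values are assumptions made for this illustration; it is not the constraint check used in the replicated algorithms.

```python
from typing import Sequence


def satisfies_cardinality(
    w: Sequence[float],
    K: int,
    eps: Sequence[float],
    delta: Sequence[float],
    tol: float = 1e-12,
) -> bool:
    """Check eqs. (6)-(9) for a weight vector w."""
    # Eq. (9): z_i = 1 if the asset is held (w_i != 0), else 0.
    # The tolerance `tol` is an assumption to absorb rounding noise.
    z = [1 if abs(w_i) > tol else 0 for w_i in w]

    # Eq. (6): at most K assets may be held.
    if sum(z) > K:
        return False

    # Eq. (7): eps_i * z_i <= w_i <= delta_i * z_i for every asset.
    return all(
        eps[i] * z[i] - tol <= w[i] <= delta[i] * z[i] + tol
        for i in range(len(w))
    )


# Hypothetical example: N = 3 assets, bounds 10% and 60% for held assets.
eps = [0.1, 0.1, 0.1]
delta = [0.6, 0.6, 0.6]
print(satisfies_cardinality([0.5, 0.5, 0.0], K=2, eps=eps, delta=delta))
print(satisfies_cardinality([0.4, 0.3, 0.3], K=2, eps=eps, delta=delta))
```

Note that an asset with $w_i = 0$ automatically satisfies eq. (7), because both bounds collapse to zero when $z_i = 0$.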

The next chapters introduce an alternative approach to portfolio optimization, which takes a different perspective on agents' risk and return considerations than standard mean-variance optimization, by attempting to track a market index.

3. Basic Index Tracking

As previously mentioned, the first model that Grishina et al. (2016) implement is an index tracking model.

Such models develop naturally from passive portfolio strategies, i.e. strategies that are generally characterized by diversified, broad market portfolios. They aim to achieve the average market or segment return (benchmark), as opposed to active investing strategies, which select specific assets to buy and sell in order to outperform the market.⁵

⁴ The cardinality constraint has been formulated as an inequality instead of an equality here for increased clarity, i.e. to be in line with the constraints of the models by Grishina et al. (2016). For more details on the difference see Woodside-Oriakhi et al. (2011).

Passive portfolio managers generally want to keep the portfolio's performance as close as possible to that of their benchmark market. Consequently, the deviation from the benchmark, i.e. the tracking error, is set as the main objective function, i.e. the value that is minimized in order to find the optimal portfolio solution. The distinction can be traced back to Treynor et al. (1973).

As the name suggests, index tracking makes use of asset indices as proxies. It has rapidly gained in popularity, especially due to the emergence of exchange-traded funds (ETFs): according to a survey by Kealy et al. (2017), there has been a steady increase in the market share of ETFs compared to other open-end funds.

As Rudd (1980) describes, a passively managed fund that seeks to fully replicate an index could do so by purchasing all its components individually and rebalancing the weights to match the index whenever it changes. This would, however, come at higher administrative costs, in addition to costs from more frequent dividend reinvestment needs that arise over the holding time. In other words, sparse portfolios can limit costs by trading fewer assets to replicate the index, but they come with a generally unwanted increase in the difference between portfolio returns and index returns.

Balancing this trade-off forms the core of the index tracking optimization problem (Takeda et al. 2012). The following chapter focuses specifically on the mathematical formulation underlying the index tracking portfolio strategy.

3.1. Measuring the Tracking Quality

Grishina et al. (2016) define their tracking error measure in the following way:

$$TE_{Gr} = \sum_{t=1}^{T} TE_{Gr,t} = \sum_{t=1}^{T} |r_{x,t} - I_t| \tag{10}$$

Here, $TE_{Gr}$ is formulated as the sum, over all time periods $t = 1, \dots, T$, of the absolute difference between the portfolio return $r_{x,t}$ and the index return $I_t$. The previous portfolio return definition $r_x$ (eq. (4)) has been naturally extended to include a time dimension:

$$r_{x,t} = \sum_{i=1}^{N} w_{i,t} \mu_{i,t}, \quad t = 1, \dots, T \tag{11}$$

$w_{i,t}$ is each asset's portfolio weight in period $t$ and $\mu_{i,t}$ now specifically denotes each asset's (known) past return in period $t$. Because past returns are considered, the problem becomes retrospective and thus determines a portfolio that would have performed well during the observed time period.

⁵ For discussions on passive vs. active investing, the reader may refer to the works of Barber et al. (2000) for its effects on individual trading performance, French (2008) for a view on the implications for the whole economy, Braun (2016) for a political economics perspective and Deville (2008) for the history of ETFs.

To motivate their choice, the authors state that it is the "simplest possible way" and, further, that the tracking error can be "defined in different ways". Since it fulfills two crucial roles, one as the metric given in the majority of results tables in the original paper and one as the objective function of the index tracking optimizations, this chapter attempts to give further insight into these statements.

Following the explanations of Rudd (1980), a tracking error measure can generally be broken down into long-term deviations between the portfolio's and the index's returns, i.e. deviations of the means, and short-term fluctuations of the residuals. Within the well-known framework of the Capital Asset Pricing Model (CAPM) (Sharpe 1964), this corresponds to systematic and unsystematic risk, respectively. The latter is a direct consequence of the lower diversification of sparse portfolios, which also ties in to the variance considerations of the previously described mean-variance optimization.

Consequently, Rudd (1980) suggests an index tracking optimization that minimizes the portfolio's residual variance while simultaneously constraining the portfolio's (CAPM) beta to be 1, i.e. minimizing diversifiable risk. This is comparable to the "Tracking Error Variance" as defined by Roll (1992).

It is important to note, however, that using the variance of the difference between returns, $r_{x,t} - I_t$, ignores the role of a non-volatile bias, i.e. a constant shift goes undetected. As Beasley et al. (2003) explain, in the case of $r_{x,t} = I_t - M \;\forall t$, where $M > 0$ represents a constant underperformance of the portfolio, $\mathrm{Var}(r_{x,t} - I_t) = \mathrm{Var}((I_t - M) - I_t) = \mathrm{Var}(-M) = 0$. Since the measure would fail to show any tracking error in this hypothetical case, other definitions should be considered.
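The shift problem can be reproduced numerically. The sketch below uses made-up index returns and a hypothetical constant underperformance `M`; it compares the variance of the return differences with the sum of absolute differences from eq. (10).

```python
from statistics import pvariance

# Made-up index returns and a constant underperformance M (illustration only).
index_returns = [0.010, -0.020, 0.015, 0.005]
M = 0.002
portfolio_returns = [i_t - M for i_t in index_returns]

# Differences r_{x,t} - I_t are (up to rounding) equal to -M in every period.
diffs = [r_t - i_t for r_t, i_t in zip(portfolio_returns, index_returns)]

# Variance-based measure: numerically zero, the shift is invisible.
te_variance = pvariance(diffs)

# Absolute-difference measure as in eq. (10): detects the shift.
te_absolute = sum(abs(d) for d in diffs)

print(te_variance)
print(te_absolute)
```

The variance-based value vanishes up to floating-point rounding, whereas the absolute measure is approximately $T \cdot M$.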

Beasley et al. (2003) themselves define a more direct TE measure:

$$TE_{Be,\alpha} = \frac{1}{T} \left[ \sum_{t \in S} |r_{x,t} - I_t|^{\alpha} \right]^{1/\alpha} \tag{12}$$

where $r_{x,t}$ and $I_t$ are defined as above and $T$ is again the total number of observed time periods. Additionally, $\alpha > 0$ is a penalization coefficient for differences between $r_{x,t}$ and $I_t$. As an example, setting $\alpha = 2$ leads $TE_{Be,\alpha}$ to correspond to the root mean square error.⁶ $S$ is the set of time periods that are included in the calculation, i.e. $S = \{t \mid t = 1, \dots, T\}$ includes all available time periods, while for instance $S = \{t \mid r_{x,t} < I_t,\ t = 1, \dots, T\}$ only takes into account the periods in which the portfolio return is less than the index return. According to Beasley et al. (2003), the latter corresponds to the definition of downside risk.⁷ $TE_{Be,\alpha}$ allows for generalizations through the parameter $\alpha$ and the set $S$, which in turn allows for techniques such as cross-validation (Stone 1974) in order to potentially improve out-of-sample performance.

⁶ For exact equivalence, the factor $1/T$ would instead need to be $1/\sqrt{T}$. However, this is irrelevant for the optimization, as shown below.
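Eq. (12) and its two example choices of $S$ can be sketched as follows. The function name, the `downside_only` switch (which selects $S = \{t \mid r_{x,t} < I_t\}$) and the return series are assumptions made for this illustration, not code from Beasley et al. (2003).

```python
import math
from typing import Sequence


def te_beasley(
    r_x: Sequence[float],
    index: Sequence[float],
    alpha: float = 1.0,
    downside_only: bool = False,
) -> float:
    """Tracking error of eq. (12): (1/T) * [sum_{t in S} |r_xt - I_t|^alpha]^(1/alpha)."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    T = len(r_x)  # T always counts all observed periods, also for a subset S
    diffs = [
        r_t - i_t
        for r_t, i_t in zip(r_x, index)
        if not downside_only or r_t < i_t  # S: all periods or downside periods
    ]
    return sum(abs(d) ** alpha for d in diffs) ** (1.0 / alpha) / T


# Made-up return series for T = 4 periods (illustration only).
r_x = [0.01, 0.03, -0.01, 0.02]
index = [0.02, 0.01, 0.00, 0.02]

rmse = math.sqrt(sum((r - i) ** 2 for r, i in zip(r_x, index)) / len(r_x))

print(te_beasley(r_x, index, alpha=1.0))
print(te_beasley(r_x, index, alpha=1.0, downside_only=True))
# With alpha = 2 the measure equals the RMSE only after rescaling by sqrt(T).
print(math.sqrt(len(r_x)) * te_beasley(r_x, index, alpha=2.0), rmse)
```

The last line reflects the remark on the scaling factor: with $\alpha = 2$ the measure differs from the root mean square error by the factor $\sqrt{T}$.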

Takeda et al. (2012) use a very similar measure:⁸

$$TE_{Ta} = \frac{1}{T} \left( \| r_x - I \|_2 \right)^2 \tag{13}$$

where $r_x = (r_{x,1}, \dots, r_{x,T})^\top$ and $I = (I_1, \dots, I_T)^\top$ are $(T \times 1)$ vector representations of the portfolio return $r_{x,t}$ and the index return $I_t$ over the time periods $t = 1, \dots, T$, respectively. The $p$-norm⁹ of any given vector $z \in \mathbb{R}^n$ is defined as:

$$\| z \|_p := \left( |z_1|^p + \dots + |z_n|^p \right)^{1/p} = \left( \sum_{i=1}^{n} |z_i|^p \right)^{1/p}, \quad p \ge 1 \tag{14}$$

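Next, it will be shown that $TE_{Ta}$ can be reformulated into the mean of the squared deviations between the portfolio returns and the market index returns:

$$TE_{Ta} = \frac{1}{T} \| r_x - I \|_2^2 \tag{15}$$

$$= \frac{1}{T} \left\| \begin{pmatrix} r_{x,1} \\ \vdots \\ r_{x,T} \end{pmatrix} - \begin{pmatrix} I_1 \\ \vdots \\ I_T \end{pmatrix} \right\|_2^2 = \frac{1}{T} \left\| \begin{pmatrix} r_{x,1} - I_1 \\ \vdots \\ r_{x,T} - I_T \end{pmatrix} \right\|_2^2 \tag{16}$$

$$= \frac{1}{T} \left[ \left( \sum_{t=1}^{T} |r_{x,t} - I_t|^2 \right)^{1/2} \right]^2 \tag{17}$$

$$= \frac{1}{T} \sum_{t=1}^{T} |r_{x,t} - I_t|^2 \tag{18}$$

$$= \frac{1}{T} \sum_{t=1}^{T} (r_{x,t} - I_t)^2 \quad \forall\, r_{x,t}, I_t \in \mathbb{R} \tag{19}$$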

Evidently, $TE_{Ta}$ can also be explicitly formulated in relation to $TE_{Be,\alpha}$ by setting $\alpha = 2$. Then $TE_{Ta} = T \cdot TE_{Be,2}^2$.

It is now straightforward to return to $TE_{Gr}$ as defined by Grishina et al. (2016) and include it in the more general frameworks:

$$TE_{Gr} = \sum_{t=1}^{T} |r_{x,t} - I_t| \tag{20}$$

$$= \| r_x - I \|_1 = \frac{T}{T} \left[ \sum_{t=1}^{T} |r_{x,t} - I_t|^1 \right]^{1/1} \tag{21}$$

$$= T \cdot TE_{Be,1} \tag{22}$$

which allows for the use of common properties of the $p$-norm, if needed. For instance, $\| r_x - I \|_2 \le \| r_x - I \|_1 \le \sqrt{T}\, \| r_x - I \|_2$ (Golub et al. 2013, p. 69).

⁷ This is comparable to the definition of "Semi-Variance" in Markowitz (1959, pp. 188–204).

⁸ They incorrectly call this measure "Tracking Error Variance", explicitly citing Roll (1992). However, the norm notation does not lead to a centered variance measure (as seen in eq. (19)) and thus does not suffer from the shift problem described earlier (see Karlow (2012, p. 59)).

⁹ See Golub et al. (2013, p. 69) for a detailed introduction and properties.
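The relations between the three tracking error definitions and the norm inequality can be checked numerically. The sketch below uses a made-up vector of return differences `d` (standing in for $r_x - I$); `p_norm` is a hypothetical helper implementing eq. (14).

```python
import math
from typing import Sequence


def p_norm(z: Sequence[float], p: float) -> float:
    """p-norm of eq. (14), defined for p >= 1."""
    if p < 1:
        raise ValueError("p must be >= 1 for a norm")
    return sum(abs(z_i) ** p for z_i in z) ** (1.0 / p)


# Made-up differences r_{x,t} - I_t for T = 4 periods (illustration only).
d = [-0.01, 0.02, -0.01, 0.0]
T = len(d)

te_gr = sum(abs(d_t) for d_t in d)        # eq. (20)
te_be_1 = p_norm(d, 1) / T                # eq. (12) with alpha = 1, S = all periods
te_be_2 = p_norm(d, 2) / T                # eq. (12) with alpha = 2, S = all periods
te_ta = sum(d_t ** 2 for d_t in d) / T    # eq. (19)

print(te_gr, T * te_be_1)                 # eq. (22): TE_Gr = T * TE_Be,1
print(te_ta, T * te_be_2 ** 2)            # TE_Ta = T * TE_Be,2^2
print(p_norm(d, 2), p_norm(d, 1), math.sqrt(T) * p_norm(d, 2))  # norm inequality
```

Each printed line shows values that agree (first two lines) or are ordered as in the inequality (last line), up to floating-point rounding.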

Aggarwal et al. (2001) argue that the 2-norm and, in general, all higher $p$-norms are often less suitable than lower ones, such as the 1-norm, in optimization problems of higher dimensionality, because the lower $p$-norms show weaker distance concentration, i.e. they are better able to mitigate the "curse of dimensionality" (see Biau et al. (2015) for an in-depth analysis). Accordingly, possible extensions of the index tracking model could look at the effects of using fractional norms, which Aggarwal et al. (2001) define as $p$-norms with $0 < p < 1$. In practice, however, larger $p$-norms play a more important role, "especially when large errors are particularly undesirable" (Takeda et al. 2012).

Moreover, it should be noted that even though these traditional tracking error measures are very popular, they do not account for a potential negative serial autocorrelation of $(r_{x,t} - I_t)$. This estimation bias has been described by Pope et al. (1994) and can potentially be resolved by using compounded returns (Karlow 2012). A detailed discussion is omitted, however, because it is beyond the scope of this thesis: Grishina et al. (2016) do not use the tracking error for its properties as an actual measurement, i.e. as a performance metric to compare different index funds or to determine the compensation of an index fund manager, but rather as the objective function in the index tracking optimization problem.

In this case, it is possible to streamline the comparison of different tracking error definitions through the use of the following lemma:

Lemma 3.1. If $g : \mathbb{R} \to \mathbb{R}$ is a monotone increasing mapping, then the problem $\min_z f(z)$ is equivalent to $\min_z g(f(z))$.¹⁰

For example, the optimizations with objective functions $TE_{Gr}$ and $TE_{Be,1}$ result in the same optimal portfolio, but not in the same objective function value: $TE_{Gr} = T \cdot TE_{Be,1}$ and, since $T > 0$, $g(z) = T \cdot z$ is a monotone increasing transformation.
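The consequence of Lemma 3.1 for the scaling $g(z) = T \cdot z$ can be illustrated on a small candidate set. In this sketch, the two asset return series, the index series and the coarse grid of two-asset portfolios are all made up; the search is a plain enumeration, not one of the solution methods used in the thesis.

```python
# Made-up returns of two assets and an index over T = 4 periods.
asset_a = [0.03, 0.00, -0.01, 0.02]
asset_b = [0.01, 0.02, 0.01, 0.02]
index = [0.02, 0.01, 0.00, 0.02]
T = len(index)

# Candidate portfolios (w_a, w_b) on a coarse grid with w_a + w_b = 1.
candidates = [(k / 10, 1 - k / 10) for k in range(11)]


def te_gr(w):
    """Sum of absolute return differences, eq. (10), for constant weights."""
    return sum(abs(w[0] * a + w[1] * b - i) for a, b, i in zip(asset_a, asset_b, index))


def te_be_1(w):
    """Eq. (12) with alpha = 1 and all periods, i.e. TE_Gr / T."""
    return te_gr(w) / T


best_gr = min(candidates, key=te_gr)
best_be = min(candidates, key=te_be_1)

print(best_gr, te_gr(best_gr))
print(best_be, te_be_1(best_be))
```

Both objectives select the same portfolio, while the reported objective values differ by the factor $T$.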

3.2. Formulation of the Optimization Problem

In order to solve $\min_x TE_{Gr}$, Grishina et al. (2016) linearize the absolute value function of their tracking error. This is done by introducing the supplementary variables $o_t \equiv \max(r_{x,t} - I_t, 0)$ and $u_t \equiv \max(I_t - r_{x,t}, 0)$. $o_t$ and $u_t$ represent the portfolio's overperformance and underperformance compared to the index return in each time period $t$, respectively, and are always non-negative:

$$TE_{Gr,t} = |r_{x,t} - I_t| = o_t + u_t = \begin{cases} o_t & \text{if } r_{x,t} - I_t \ge 0 \\ u_t & \text{otherwise} \end{cases} \tag{23}$$

This implies that at most one of the two variables is positive in any period. Thus the absolute value function can be substituted by the supplementary variables (Mangasarian 2006), which leads to a linear model similar to the ones introduced by Konno et al. (1991) and Speranza (1996).

¹⁰ Proof in appendix A.2.
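The decomposition into over- and underperformance can be verified period by period. The following sketch uses made-up return series; `split_deviation` is a hypothetical helper name.

```python
from typing import Tuple


def split_deviation(r_xt: float, i_t: float) -> Tuple[float, float]:
    """Return (o_t, u_t) = (max(r_xt - I_t, 0), max(I_t - r_xt, 0))."""
    o_t = max(r_xt - i_t, 0.0)
    u_t = max(i_t - r_xt, 0.0)
    return o_t, u_t


# Made-up portfolio and index returns (illustration only).
r_x = [0.01, 0.03, -0.01, 0.02]
index = [0.02, 0.01, 0.00, 0.02]

parts = [split_deviation(r, i) for r, i in zip(r_x, index)]

for (o_t, u_t), r, i in zip(parts, r_x, index):
    # o_t + u_t reproduces |r_xt - I_t|, o_t - u_t reproduces r_xt - I_t,
    # and at most one of the two is positive.
    print(o_t, u_t, o_t + u_t == abs(r - i), o_t - u_t == r - i)

te_gr = sum(o_t + u_t for o_t, u_t in parts)
print(te_gr)
```

The identity $r_{x,t} - I_t = o_t - u_t$ checked here is what allows the absolute value to be replaced by linear constraints.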

It is now possible to formulate the complete index tracking optimization problem with additional constraints ($IT_{CC}$) used by Grishina et al. (2016), by combining the linearized tracking error as the objective function (eq. (23)), the constraints relating to the portfolio definition and weight restrictions (eqs. (1) to (3)), and the cardinality constraints (eqs. (6) to (8)), which limit the number of assets and introduce buy-in restrictions:

$$\min \; TE = \sum_{t=1}^{T} (o_t + u_t) \tag{24a}$$

$$\text{s.t.} \quad r_{x,t} - I_t = o_t - u_t, \quad t = 1, \dots, T \tag{24b}$$

$$\sum_{i=1}^{N} w_i = 1 \tag{24c}$$

$$\varepsilon_i z_i \le w_i \le \delta_i z_i, \quad i = 1, \dots, N \tag{24d}$$

$$\sum_{i=1}^{N} z_i \le K \tag{24e}$$

$$z_i \in \{0, 1\}, \quad i = 1, \dots, N \tag{24f}$$

$$w_i \ge 0, \quad i = 1, \dots, N \tag{24g}$$

$$o_t, u_t \ge 0, \quad t = 1, \dots, T \tag{24h}$$

The optimization problem (24) is a linear mixed-integer problem (LMIP). In total, it requires the optimization of $N$ binary variables ($z_i$) and $N + 2T$ continuous variables ($w_i$, $o_t$, $u_t$). Similar constraints have been implemented by Scozzari et al. (2012).
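To make the structure of problem (24) concrete, the sketch below enumerates all portfolios on a coarse weight grid for a tiny made-up instance and keeps the feasible one with the lowest tracking error. This brute-force search is only an illustration under stated assumptions (constant weights, common bounds `eps` and `delta` for all assets, grid step $1/\text{steps}$); it is not the mixed-integer solution approach of Grishina et al. (2016) and does not scale beyond a handful of assets.

```python
import math
from itertools import product


def tracking_error(w, asset_returns, index_returns):
    """Objective (24a) for constant weights: sum_t |sum_i w_i * mu_{i,t} - I_t|."""
    return sum(
        abs(sum(w_i * mu_t[i] for i, w_i in enumerate(w)) - i_t)
        for mu_t, i_t in zip(asset_returns, index_returns)
    )


def grid_search_it_cc(asset_returns, index_returns, K, eps, delta, steps=10):
    """Enumerate grid portfolios satisfying (24c)-(24g) and minimize (24a)."""
    n = len(asset_returns[0])
    lo = math.ceil(eps * steps - 1e-9)     # smallest admissible grid units
    hi = math.floor(delta * steps + 1e-9)  # largest admissible grid units
    best_w, best_te = None, float("inf")
    for units in product(range(steps + 1), repeat=n):
        if sum(units) != steps:            # (24c): weights sum to one
            continue
        held = [u for u in units if u > 0]
        if len(held) > K:                  # (24e): cardinality limit
            continue
        if any(u < lo or u > hi for u in held):  # (24d): buy-in bounds
            continue
        w = [u / steps for u in units]
        te = tracking_error(w, asset_returns, index_returns)
        if te < best_te:
            best_w, best_te = w, te
    return best_w, best_te


# Made-up data: T = 4 periods (rows), N = 3 assets (columns).
asset_returns = [
    [0.010, 0.020, -0.005],
    [-0.020, 0.000, 0.010],
    [0.015, 0.005, 0.000],
    [0.000, -0.010, 0.020],
]
index_returns = [0.012, -0.008, 0.009, -0.002]

best_w, best_te = grid_search_it_cc(asset_returns, index_returns, K=2, eps=0.1, delta=0.9)
print(best_w, best_te)
```

The variables $o_t$ and $u_t$ do not appear explicitly because the absolute value is evaluated directly; a linear solver would instead carry them as decision variables.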

Grishina et al. (2016) also perform an IT optimization without cardinality constraints, which they describe as the same optimization problem as above, except that $K = N$, i.e. the cardinality limit equals the total number of available stocks. It is assumed here that they actually mean the optimization problem without cardinality constraints and without buy-in constraints, i.e. the optimization omitting the binary variable $z_i$ and all its related constraints. This assumption is substantiated by the descriptions in Grishina (2014).

Lastly, Grishina et al. (2016) also report values for the total overperformance and underperformance of a portfolio over all time periods, which are defined as $o_{tot} \equiv \sum_{t=1}^{T} o_t$ and $u_{tot} \equiv \sum_{t=1}^{T} u_t$, respectively.

4. Prospect Theory-based Index Tracking

The second model implemented by Grishina et al. (2016) is based on a behavioral approach to portfolio optimization and index tracking.

Behavioral considerations can be subdivided into two generations according to Statman (2018):

• The first generation of behavioral finance acknowledges standard finance's definition of rational wants, i.e. risk-minimized returns, but it also identifies systematic irrational behavior, e.g. judgment biases. Based on this consideration, asset managers could offer products or portfolios that mitigate the influence of these irrationalities.

• The second generation of behavioral finance goes beyond standard finance's definition of rational wants and potentially includes goals such as hope for riches. Based on this consideration, asset managers could offer products that specifically target such wants in exchange for lower returns, in the hope of increasing the overall 'value' for the customer.

This breakdown invites thinking about asset allocation as a versatile tool which, depending on the specifics of the optimization model, can aim to achieve a number of very different goals. The behavioral elements included in the model of Grishina et al. (2016), which is analyzed in the following section, represent only a fraction of the possible considerations. For this reason, the following sections are purposefully more qualitative, to preserve this versatility. The authors present a simplified Prospect Theory model and combine it with the basic index tracking model.

In a series of papers, Kahneman et al. (1974, 1979, 1992) identify several decision biases to which economic agents are prone when faced with a decision under uncertainty. They determine these heuristics¹¹ through surveys which show that the agents' choices seemingly go against the standard economic definition of rationality, i.e. violate the expected utility hypothesis. Subsequently, they develop and refine a framework called Prospect Theory, which models the preferences of an agent by analyzing a given decision problem in two stages: editing and evaluation.

¹¹ I.e. shortcuts, not to be confused with heuristic algorithms.

Figure 1: Prospect Theory evaluation stage functions. (a) PT value function (horizontal axis: losses and gains relative to the reference point; vertical axis: value $v_{pt}$). (b) PT probability weighting function compared to $\pi(p) = p$ (dotted line) (horizontal axis: probability $p$; vertical axis: decision weight $\pi$).

Kahneman et al. (1979) explain that in the editing stage, the potential outcomes of the decision are redefined in relation to a reference point, i.e. transformed into gains and losses instead of final values. Other steps, e.g. pooling identical outcomes together, are taken, all of which are meant to address how the different outcomes are actually perceived by a decision maker. In the evaluation stage, the edited outcomes are then given a value and their respective probabilities are weighted, to incorporate several other mental adjustments, which lead to deviations from the predictions of standard expected utility theory.

In the following, the most important attributes of the value and decision weighting functions are explained on the basis of the equations and the corresponding plots. The final value of a lottery is defined by Kahneman et al. (1979) as:

$$V_{pt} = \sum_{j=1}^{J} \pi(p_j)\, v_{pt}(y_j) \tag{25}$$

where $y_j$ represents a potential gain or loss (i.e. the edited outcome in relation to a reference point) for a lottery's outcome $j$, which occurs with the corresponding probability $p_j$. $V_{pt}$ is the prospect (utility) of the whole lottery. It consists of $v_{pt}(y_j)$, the value of a given $y_j$, and $\pi(p_j)$, the decision weight of the given probability $p_j$, according to:

$$v_{pt}(y) = \begin{cases} y^{\alpha} & \text{for } y \ge 0 \\ -\lambda (-y)^{\beta} & \text{for } y < 0 \end{cases} \tag{26}$$

$$\pi(p) = \frac{p^{\gamma}}{\left( p^{\gamma} + (1 - p)^{\gamma} \right)^{1/\gamma}} \tag{27}$$

Here –, —, “ œ R are model parameters. The general features of these functions areillustrated in Figure fig. 1. Equation (eq. (26)) is non-linear, as it behaves very

12

di�erently depending on whether gains or losses are considered. Specifically, for values–, — Æ 1, it is concave for gains and convex for losses, representing risk-aversionand risk-seeking preferences, respectively (Kahneman et al. 1992). This is done toincorporate the underlying observed behavior, called reflection e�ect, by Kahnemanet al. (1979). Additionally, the function’s sensitivity diminishes as it gets further fromthe reference point. Kahneman et al. (1992) demonstrate that ⁄ > 1 implies the valueof any gain is smaller than the absolute value of the equally sized loss. This is calledloss aversion and represents a central observation of the authors, especially for thecontext of this thesis.

The decision weight function (eq. (27)) overvalues small probabilities and undervalues moderate and large ones, i.e. people show limited sensitivity to changes in the middle of the probability range.¹²

Kahneman et al. (1992) provide $\lambda = 2.25$ and $\alpha = \beta = 0.88$ as median values based on their experiments, which are adopted by Grishina et al. (2016). However, other values are possible, e.g. Rieger et al. (2017) suggest slightly lower values on average, in addition to country-specific values.
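The properties described above can be checked numerically. The following Python sketch evaluates eq. (26) and eq. (27) with the median values quoted in the text. The function names are hypothetical, and the value `GAMMA = 0.61` is an assumption made purely for illustration, since this section does not report a value for $\gamma$.

```python
import numpy as np

# Median estimates quoted in the text
LAMBDA, ALPHA, BETA = 2.25, 0.88, 0.88
# Assumption: illustrative value only, not taken from this section
GAMMA = 0.61


def value(y, alpha=ALPHA, beta=BETA, lam=LAMBDA):
    """Value function of eq. (26): y**alpha for gains, -lam*(-y)**beta for losses."""
    y = np.asarray(y, dtype=float)
    magnitude = np.abs(y)
    return np.where(y >= 0, magnitude ** alpha, -lam * magnitude ** beta)


def weight(p, gamma=GAMMA):
    """Decision weight function of eq. (27) for probabilities p in [0, 1]."""
    p = np.asarray(p, dtype=float)
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)
```

With $\alpha = \beta$, a loss of a given size is valued exactly $\lambda$ times as strongly as a gain of the same size, e.g. `abs(value(-0.05)) / value(0.05)` equals 2.25 up to rounding, which is the loss aversion property discussed above.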

Intuitively, the index is a suitable reference point to define gains and losses in the context of passive portfolio strategies, which, as described earlier, attempt to replicate the average performance of a market.

This results in what Grishina et al. (2016) simply call "Prospect Theory with Index Tracking" ($\mathrm{PT}_{IT,CC}$). Instead of minimizing the distance to an index as before, an optimal portfolio in this model maximizes the prospect utility based on eq. (25), with $y_j = r_{x,t} - I_t$, i.e. each time period $t$ takes the role of an outcome $j$. The authors set $\pi(p) = p$ for simplicity. The other constraints follow directly from the ITCC problem in eq. (24).

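$$\max\ \mathrm{PT}_{IT}(x, I) = \sum_{t=1}^{T} p_t\, v_{pt}(r_{x,t} - I_t) \tag{28a}$$

$$\text{s.t.} \quad \sum_{i=1}^{N} w_i = 1, \tag{28b}$$

$$\epsilon_i z_i \le w_i \le \delta_i z_i, \quad i = 1, \dots, N \tag{28c}$$

$$\sum_{i=1}^{N} z_i \le K, \tag{28d}$$

$$z_i \in \{0, 1\}, \quad i = 1, \dots, N \tag{28e}$$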

The value and decision weight functions are defined as before in eq. (26) and eq. (27).

¹² Kahneman et al. (1979) further differentiate between decision weighting for probabilities of gains and losses. This aspect and the violation of stochastic dominance in prospect theory were purposefully omitted in this section, since decision weighting was not integrated in the model of Grishina et al. (2016).


Thus, this optimization problem explicitly makes the underperformance of a portfolio compared to the index subject to loss aversion. The short sale constraint $w_i \ge 0\ \forall i$ has not been explicitly mentioned in this model by Grishina et al. (2016) but is implicitly included if $\epsilon_i \ge 0$.
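To make the objective and constraints of eq. (28) concrete, the following Python sketch evaluates the objective for a given weight vector and checks feasibility. It is a minimal illustration under stated assumptions, not the implementation of Grishina et al. (2016): the portfolio return is assumed to be the weighted sum of asset returns, the probabilities $p_t$ are assumed to be $1/T$ unless supplied, and the names `ptit` and `is_feasible` are hypothetical.

```python
import numpy as np


def value(y, alpha=0.88, beta=0.88, lam=2.25):
    """Value function of eq. (26)."""
    y = np.asarray(y, dtype=float)
    magnitude = np.abs(y)
    return np.where(y >= 0, magnitude ** alpha, -lam * magnitude ** beta)


def ptit(weights, asset_returns, index_returns, probs=None):
    """Objective (28a). asset_returns has shape (T, N), index_returns shape (T,)."""
    asset_returns = np.asarray(asset_returns, dtype=float)
    index_returns = np.asarray(index_returns, dtype=float)
    n_periods = asset_returns.shape[0]
    if probs is None:
        # Assumption: every time period is equally likely
        probs = np.full(n_periods, 1.0 / n_periods)
    # Assumption: portfolio return = weighted sum of asset returns
    deviations = asset_returns @ np.asarray(weights, dtype=float) - index_returns
    return float(np.sum(probs * value(deviations)))


def is_feasible(weights, K, eps=0.01, delta=1.0, tol=1e-9):
    """Constraints (28b)-(28e), with z_i = 1 exactly where w_i > 0."""
    w = np.asarray(weights, dtype=float)
    held = w > 0
    return bool(
        abs(w.sum() - 1.0) <= tol
        and np.all(w >= 0)
        and held.sum() <= K
        and np.all(w[held] >= eps - tol)
        and np.all(w[held] <= delta + tol)
    )
```

A portfolio that reproduces the index exactly has an objective value of zero, while deviations of equal size above and below the index yield a negative value, because losses are scaled by $\lambda > 1$.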

Furthermore, a basic version of the prospect theory model (without index tracking or cardinality constraints), which Grishina et al. (2016) reportedly use for parameter optimization of the algorithms (see section 7.1), is formally included in appendix A.3 for completeness.

Now that the theoretical models and problems have been sufficiently specified, the remainder of this thesis elaborates on the implementation of solution techniques and then reports their results.

5. Adaptive Optimization Methods

As described in Grishina et al. (2016), the complexity of the prospect theory-based index tracking problem (eq. (28)) makes it difficult to solve with traditional optimization algorithms in a reasonable amount of time. Thus, heuristic algorithms, i.e. search techniques involving randomness, are used to reduce computation time and make optimization feasible (Oh et al. 2005; Woodside-Oriakhi et al. 2011).

Based on Storn et al. (1997) and Wilding (2003) and the demonstrations in Chang et al. (2000), a set of standards to determine the practicability of optimization methods is derived. A good method shares the following traits:

1. Usability that is independent of the objective function's complexity, i.e. being able to handle non-differentiable and non-linear functions

2. The ability to easily impose complex constraints, e.g. including integer or binary variables

3. Computational efficiency, e.g. usage of parallel computation

4. Ease of parameter choice, i.e. fewer control variables and consequently robust optimization without the need for complex parameter tweaking

5. "Good convergence properties", i.e. the algorithm should regularly and quickly reach the global optimum.

For example, Price et al. (2006) explain the disadvantage of an optimization technique that depends on knowledge of the objective function's gradient.

In the context of this thesis, it is necessary to use heuristics that are suitable for the maximization of the objective function $\mathrm{PT}_{IT}$, as defined in eq. (28), which is non-linear and non-differentiable due to the characteristics of the value function (eq. (26)). This is graphically visible through the existence of a kink in fig. 1a. Moreover, the heuristics must be able to handle the integer constraints efficiently, i.e. the inclusion of variable $z_i$ (eq. (28e)) and, by extension, the cardinality (eq. (28d)) and buy-in constraints (eq. (28c)).

Regarding good convergence properties, Holland (1992) makes the important distinction between enumerative and adaptive processes, where the former are "characterized by the fact that the order in which they test structures is unaffected by the outcome of previous tests", while the latter retain information about their past attempts. In other words, an adaptive heuristic is guided (evolving) by applying a specific (partially random) rule set. By this definition, fully random searches of the solution space would still be considered enumerative. The author suggests that the applicability of such plans is low, since the convergence is too slow, exactly because they search the solution space in such a uniform way.

Accordingly, Grishina et al. (2016) use two adaptive heuristics: a specific implementation of a Genetic Algorithm (GAGr) and a Differential Evolution algorithm (DEGr).

Genetic algorithms, as they are described in Holland (1992) and Goldberg (1989), make qualitative use of biological evolution principles and nomenclature, e.g. genetic mutation. This fact is leveraged in an attempt to describe both algorithms used in Grishina et al. (2016) within the same descriptive framework, in order to compare them more intuitively.

5.1. Genetic Algorithms

Holland (1992) (and earlier: Holland 1962) describes genetic algorithms as processes based on the mechanisms of genetic mutation of DNA, as they occur in nature,¹³ and describes their applicability to a number of different situations. The general structure of the algorithm is illustrated in fig. 2.

In the context of portfolio optimization, a population consists of many different portfolios, i.e. potential solutions. Fitness describes each portfolio's value of the objective function, i.e. $\mathrm{PT}_{IT}(x)$, but could potentially incorporate other metrics. A generation is defined as one iteration of the updating cycle, which consists of selection, cross-over and mutation operations.

Selection determines which portfolios of the old population are used to form new candidates for the next generation's population. This can happen based on their evaluations, where fitter portfolios have a higher chance to reproduce or directly continue to the next generation. The cross-over operation of the updating stage then determines how the selected parent portfolios are combined in order to form a new child portfolio, i.e. create new, potentially better solutions based on the old ones. Lastly, mutation incorporates randomness into the process. It is important to note that according to Holland (1992), "mutation's primary role is not one of generating new structures", i.e. in this case new portfolios, and the author continues: "... a role [that is] very efficiently filled by crossing-over" (Holland 1992, p. 124). Instead, it should help the algorithm to escape local optima. In other words, a genetic algorithm that exclusively uses mutation to generate new solutions is likely to behave like an enumerative process.¹⁴

Figure 2: General Genetic Algorithm flowchart. [Flowchart elements: Initial Population; Evaluation of Fitness; Updating (Selection, Cross-over, Mutation); New Population; Output. The cycle repeats if the stopping criteria are not reached, else the output is returned.]

¹³ For an overview of such mechanisms: https://www.nature.com/scitable/topicpage/genetic-mutation-441

The remainder of this chapter is about the specific implementation used by Grishina et al. (2016), which is illustrated in algorithms 1, 2, 4 and 5. As a reminder, $N$ is the number of total available assets, and $K$ is the cardinality limit set to make the portfolio sparse.

Initialization, i.e. the random creation of the first population of portfolios, is done by creating a matrix of $S = P^2$ portfolios. All initial portfolios are made to adhere to the optimization problem's constraints and to have differing cardinalities (see section 9.2). These initial dimensions are held constant over all other generations and are illustrated in algorithm 1. $P$ is an input parameter that is used twice: first, to determine the population size $S$, and then again, during the selection stage, to determine the number of fittest portfolios that are chosen to proceed into the next population directly ($2P$). Since these portfolios are chosen based on their ranked fitness and proceed without being changed, the new population cannot have a lower maximum fitness than the old one. This is an implementation of an elitist rank-based selection, as introduced by Baker (1985) and De Jong (1975). Only the remaining (less fit) portfolios are chosen randomly as pairs of parents.
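The initialization described above can be sketched in Python as follows. This is one possible reading of algorithm 1, not the original code: the function name, the NumPy random generator and the use of a random choice of positions to "scramble" the weights are assumptions. As in the pseudo-code, removing weights at or below $\epsilon$ can leave a portfolio with fewer than three assets in rare cases.

```python
import numpy as np


def initialize_population(P, K, N, eps=0.01, rng=None):
    """Create S = P**2 random portfolios with between 3 and K positive weights."""
    rng = np.random.default_rng() if rng is None else rng
    S = P ** 2
    X0 = np.zeros((S, N))
    for s in range(S):
        L = rng.integers(3, K + 1)              # number of positive weights in [3, K]
        raw = 0.1 + 0.9 * rng.uniform(size=L)   # raw weights in [0.1, 1.0)
        w = raw / raw.sum()                     # first normalization
        w[w <= eps] = 0.0                       # drop weights at or below the buy-in limit
        w = w / w.sum()                         # second normalization
        # "Scramble": spread the weights over randomly chosen assets
        positions = rng.choice(N, size=L, replace=False)
        X0[s, positions] = w
    return X0
```

For example, `initialize_population(P=15, K=15, N=31)` would correspond to the dimensions reported for the Hang Seng data set and returns a matrix with 225 rows, each summing to one.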

¹⁴ There is still some history dependence, since mutations tend to be only small changes from the past population. As the author clarifies, however, it is a lack of sophistication of the evolution that limits the usability of such a hypothetically pure mutation genetic algorithm and makes it comparable to enumerative processes.


In the crossover stage, a new portfolio (child) is created. Its asset weights depend on the respective asset weights of the parent pair: When both parents have a positive weight, the child's asset weight is a recombination based on a random variable $\chi \sim U(0, 1)$. When only one is positive, the non-zero asset weight is copied with probability $\pi_{CR}$. When neither is positive, the child's asset weight is set to zero.
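The three crossover cases can be sketched as follows. The exact recombination formula is not stated in this section, so the convex combination $\chi a_i + (1 - \chi) b_i$ used here is an assumption, as are drawing $\chi$ separately for every asset and the names `crossover` and `chi_range`.

```python
import numpy as np


def crossover(parent_a, parent_b, pi_cr=0.1, chi_range=(0.0, 1.0), rng=None):
    """Build a child from two parent weight vectors (three cases per asset)."""
    rng = np.random.default_rng() if rng is None else rng
    a = np.asarray(parent_a, dtype=float)
    b = np.asarray(parent_b, dtype=float)
    child = np.zeros_like(a)                       # case 3: neither positive -> zero

    both = (a > 0) & (b > 0)
    one = (a > 0) ^ (b > 0)

    # Case 1: both positive -> recombination (assumed form: chi*a + (1-chi)*b)
    chi = rng.uniform(chi_range[0], chi_range[1], size=a.shape)
    child[both] = chi[both] * a[both] + (1.0 - chi[both]) * b[both]

    # Case 2: exactly one positive -> copy that weight with probability pi_cr
    copy = one & (rng.uniform(size=a.shape) < pi_cr)
    child[copy] = np.maximum(a, b)[copy]
    return child
```

The resulting child generally does not sum to one, which is why normalization and constraint checks follow in the later steps.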

Afterwards, in the mutation step, some of the child's asset weights that are zero (Grishina 2014) are randomly chosen (with probability $z$) to be set to positive values within a single asset's buy-in constraint limits (eq. (28c)).

After normalization of the asset weights, i.e. re-calibration of the weights to ensure they sum to 1 in accordance with eq. (28b), the buy-in constraints and cardinality constraint are checked: Since Grishina et al. (2016) do not describe how the former is implemented, it is assumed that the buy-in constraints are checked simply by rejecting any weights that lie under the limit $\epsilon$, as described by Michalewicz et al. (2004, p. 239).

The cardinality check has been described in detail by Grishina (2014): If there are fewer than 3 positive elements in a portfolio of the population, the algorithm randomly replaces zero weights with randomly chosen positive values. If there are more than $K$ positive assets in the checked portfolio, the algorithm randomly eliminates positive weights, i.e. sets them to zero, until the condition is fulfilled.
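The mutation step and the subsequent checks can be sketched as below. Several details are assumptions rather than documented behavior: new positive values are drawn uniformly from $[\epsilon, \delta]$, and the normalization, buy-in and cardinality steps are repeated until all three conditions hold at once, because a final normalization can otherwise push a weight back under $\epsilon$. The function names are hypothetical.

```python
import numpy as np


def mutate(child, z=0.5, eps=0.01, delta=1.0, rng=None):
    """Turn each zero weight positive with probability z (value in [eps, delta])."""
    rng = np.random.default_rng() if rng is None else rng
    w = np.array(child, dtype=float)
    flip = (w == 0) & (rng.uniform(size=w.shape) < z)
    w[flip] = rng.uniform(eps, delta, size=int(flip.sum()))
    return w


def repair(weights, K, eps=0.01, min_assets=3, rng=None, max_rounds=100):
    """Normalize, reject weights under eps, and enforce min_assets <= cardinality <= K."""
    rng = np.random.default_rng() if rng is None else rng
    w = np.array(weights, dtype=float)
    for _ in range(max_rounds):
        if w.sum() > 0:
            w = w / w.sum()                      # normalization, eq. (28b)
        w[w < eps] = 0.0                         # buy-in check: reject small weights
        held = np.flatnonzero(w)
        if held.size < min_assets:               # too few assets: fill random zeros
            empty = np.flatnonzero(w == 0)
            add = rng.choice(empty, size=min_assets - held.size, replace=False)
            w[add] = rng.uniform(eps, 1.0, size=add.size)
        elif held.size > K:                      # too many assets: drop random ones
            drop = rng.choice(held, size=held.size - K, replace=False)
            w[drop] = 0.0
        w = w / w.sum()                          # assumed final re-normalization
        positive = w[w > 0]
        if min_assets <= positive.size <= K and positive.min() >= eps:
            return w
    raise RuntimeError("no feasible portfolio found within max_rounds")
```

A child would be processed as `repair(mutate(child), K)`. The sketch assumes `3 <= K <= N`.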

Finally, only the fittest portfolio from the family (consisting of two parent portfolios and the child portfolio) is stored in the new population. This is another elitist adjustment.

Grishina et al. (2016) include another step called assessment, which ensures that the new population is only used in the next generation if its maximum fitness is not lower than that of the old (unchanged) population of the current generation. The details and potential redundancy of this step are discussed in section 9.1.

After the algorithm reaches the final generation, as specified by parameter $G$, the portfolio with the maximum fitness in the final population, i.e. the one corresponding to the highest value of $\mathrm{PT}_{IT}$, is chosen as the optimal solution. Algorithm 2 summarizes the process. See appendix A.1 for the pseudo-code representations of the genetic algorithm's intermediate operations.
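One generation of the selection scheme described in this section (elitist rank-based selection, family selection and assessment) can be sketched in Python as follows. The fitness function and the child-producing function are passed in by the caller, so the sketch is independent of the objective; the name `next_generation` and the choice to draw two distinct parents are assumptions.

```python
import numpy as np


def next_generation(X, fitness, make_child, P, rng=None):
    """One updating cycle: 2P elites pass unchanged, the rest compete in families."""
    rng = np.random.default_rng() if rng is None else rng
    S = X.shape[0]
    scores = np.array([fitness(x) for x in X])
    ranked = X[np.argsort(-scores)]              # portfolios sorted by descending fitness

    Y = np.empty_like(X)
    Y[:2 * P] = ranked[:2 * P]                   # elitist selection: 2P fittest are kept

    rest = ranked[2 * P:]                        # less fit portfolios serve as parents
    for q in range(2 * P, S):
        j, k = rng.choice(rest.shape[0], size=2, replace=False)
        family = [make_child(rest[j], rest[k], rng), rest[j], rest[k]]
        Y[q] = max(family, key=fitness)          # only the fittest family member survives

    # Assessment stage; with the elites copied unchanged this always holds here
    if max(fitness(y) for y in Y) >= scores.max():
        return Y
    return X
```

Repeating this function $G$ times and returning the fittest row of the final matrix mirrors the outer loop of the procedure. Because the elites are never altered, the maximum fitness cannot decrease from one generation to the next.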

5.2. Differential Evolution

The Differential Evolution algorithm was first introduced by Storn et al. (1997) and further elaborated on by Price et al. (2006). The general idea is similar to the genetic algorithm; thus, the following section is written in terms of a comparison to the last section's genetic algorithm. Supplementary information is provided in appendix A.6.

Algorithm 1 Population Initialization for GAGr and DEGr

procedure Initialize Population($P$, $K$, $\epsilon$)
    $S \leftarrow P^2$    // population size
    Matrix $X_0 \leftarrow x_{s,i}$, with $s \in [1, \dots, S]$ and $i \in [1, \dots, N]$
    for $s \leftarrow 1$ to $S$ do    // each $x_s$ represents a portfolio of the population
        get random $L \in [3, K]$    // random number of positive asset weights
        get $u_l \sim U(0, 1)$, $l \in [1, \dots, L]$    // from uniform distribution
        $x_{s,l} \leftarrow 0.1 + 0.9 \cdot u_l$, $\forall l \in [1, \dots, L]$    // limits range of positive weights
        $x_{s,n} \leftarrow 0$, $\forall n \in [L + 1, \dots, N]$
        $x_{s,i} \leftarrow x_{s,i} / \sum_{n=1}^{N} |x_{s,n}|$, $\forall i$    // normalization
        if $x_{s,i} \le \epsilon$ then $x_{s,i} \leftarrow 0$    // eliminate weights below buy-in constraints
        end if
        $x_{s,i} \leftarrow x_{s,i} / \sum_{n=1}^{N} |x_{s,n}|$, $\forall i$    // normalization
        scramble $x_s$    // distributing the weights over different stocks
    end for
    return $X_0$
end procedure

Algorithm 2 Main Genetic Algorithm GAGr for $\mathrm{PT}_{IT}$ optimization

procedure GA for PTIT($K, P, G \in \mathbb{N}$; $z, \pi_i \in \mathbb{R}_{\ge 0}$)
    get initial population of portfolios $X_0$    // initialization, algorithm 1
    for $g \leftarrow 1$ to $G$ do
        if $g = 1$ then $X \leftarrow X_0$    // the population matrix in the first generation
        end if
        get $\mathrm{PT}_{IT}(x_s)$, $\forall s = 1, \dots, S$    // evaluate each portfolio's fitness; $S = P^2$
        set order $m_s$ such that $\mathrm{PT}_{IT}(x_{m_1}) \ge \dots \ge \mathrm{PT}_{IT}(x_{m_S})$    // sort based on fitness
        $y_f \leftarrow x_{m_f}$ for $f = 1, \dots, 2P$    // $2P$ fittest portfolios directly stored in new population
        for $q \leftarrow 1$ to $(S - 2P)$ do
            get random $\{x_{m_j}, x_{m_k} \mid 2P + 1 \le j \le P^2,\ 2P + 1 \le k \le P^2\}$    // "parents"
            get $c_q$    // "child" acquired by crossover & mutation, algorithm 4
            check $c_q$    // cardinality and buy-in constraints, and normalization, algorithm 5
            $y_{q+2P} \leftarrow \arg\max_{x \in \{c_q, x_{m_j}, x_{m_k}\}} \mathrm{PT}_{IT}(x)$    // fittest 'family' member
        end for
        $Y \leftarrow y_i\ \forall i$    // new population $Y$
        if $\max\{\mathrm{PT}_{IT}(y_q), \forall q\} \ge \max\{\mathrm{PT}_{IT}(x_s), \forall s\}$ then    // assessment stage
            $X \leftarrow Y$    // new population $Y$ overwrites current population $X$
        end if
    end for
    $x^* \leftarrow \arg\max_{x_i} \mathrm{PT}_{IT}(x_i)$    // get optimized portfolio in $g = G$
    return $x^*$
end procedure

As Grishina et al. (2016) describe, after initialization (algorithm 1), the algorithm forgoes the evaluation stage, where previously the fitness of all the population's portfolios was assessed. Instead, updating and evaluation are executed in a much more individual way: Each portfolio of the population ($x_s$) is compared to a mutated "candidate" version $\tilde{x}_s$ that is created through a mixture of $x_s$ and three other portfolios ($x_a$, $x_b$, $x_c$) taken randomly from the population. Grishina et al. (2016) determine the candidate's individual asset weights depending on the crossover probability $CR \in [0, 1]$:

$$\tilde{x}_{s,i} = \begin{cases} x_{a,i} + (F + z_1) \cdot (x_{b,i} - x_{c,i} + z_2) & \text{with probability } CR \\ x_{s,i} & \text{else} \end{cases} \tag{29}$$

where $x_{a,i}$, $x_{b,i}$, $x_{c,i}$ and $x_{s,i}$ denote each respective portfolio's weight of asset $i = 1, \dots, N$, and the differential weight $F \in [0, 2]$ is a parameter of the algorithm that determines the scaling of weights during the mixture. Additional randomness is introduced in the form of the terms $z_1$ and $z_2$, which are either normally distributed with zero mean and a low standard deviation (i.e. $\mathcal{N}(0, 0.02)$) or, with a very low probability, are just zero. According to the authors they are optional, but meant to introduce additional randomness, much like the mutation operation in the genetic algorithm.

This is done for all asset weights. In order to avoid the case where a candidate portfolio is equal to the original portfolio, one asset weight of each portfolio is chosen to be mutated into the recombined weight $\tilde{x}_{s,i}$ with certainty, i.e. with probability $CR = 1$.
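The construction of a single candidate according to eq. (29), including the one asset weight that is recombined with certainty, can be sketched as follows. The sketch treats 0.02 as the standard deviation of $z_1$ and $z_2$ and uses the probabilities 0.9999 and 0.9998 from algorithm 3; the function name and the vectorized drawing of all random terms are assumptions. The constraint checks that follow in the procedure are not included.

```python
import numpy as np


def de_candidate(X, s, F=0.05, CR=0.5, sigma=0.02, rng=None):
    """Build the candidate for portfolio s from three other random portfolios."""
    rng = np.random.default_rng() if rng is None else rng
    S, N = X.shape
    others = [i for i in range(S) if i != s]
    a, b, c = rng.choice(others, size=3, replace=False)   # distinct, all different from s
    nu = rng.integers(N)                                  # asset that mutates with certainty

    # Noise terms: normally distributed, or zero with a very low probability
    z1 = np.where(rng.uniform(size=N) < 0.9999, rng.normal(0.0, sigma, size=N), 0.0)
    z2 = np.where(rng.uniform(size=N) < 0.9998, rng.normal(0.0, sigma, size=N), 0.0)

    recombined = X[a] + (F + z1) * (X[b] - X[c] + z2)     # first case of eq. (29)
    use_recombined = rng.uniform(size=N) < CR
    use_recombined[nu] = True
    return np.where(use_recombined, recombined, X[s])     # else keep the old weight
```

The returned vector can contain negative entries and generally does not sum to one, so it still has to pass the normalization, buy-in and cardinality checks before its fitness is compared to that of `X[s]`.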

Additionally, the complete candidate is further altered to satisfy the cardinality and buy-in constraints and normalized to make sure the weights add up to one. This is done analogously to the genetic algorithm (see algorithm 5).¹⁵

After a candidate fulfills all the constraints of the optimization problem, it is finally compared to the unaltered portfolio, and whichever has the higher fitness proceeds to the next generation's population. This is comparable to the selection within a family in the genetic algorithm and again incorporates the elitist approach: Since these steps are done for all the portfolios of a population, the subsequent generation starts out with a population that is a mixture of mutated portfolios and unchanged portfolios, each having won their respective comparisons. This means that the new population can have neither a lower maximum fitness nor a lower average fitness than the previous generation's population.

Just as before, after the total number of generations $G$ has been reached, the optimal portfolio is the portfolio with the maximum objective function value in the last generation's population.

¹⁵ A separate check for $\delta$ is omitted in the explanations: the normalization limits the size of each asset weight to 1, which is equivalent to $\delta$ in all reported calculations of the original publication.


Arguably, the biggest difference between the two algorithms is their treatment of populations, since the differential evolution algorithm treats portfolios much more separately, while the genetic algorithm assesses them as a unit. Due to the broad nature and variability of genetic algorithms, it is conceivable that certain combinations of selection, cross-over and mutation could yield extremely similar results to the differential evolution's cross-over and mutation recombination.

Algorithm 3 Main Differential Evolution Algorithm

procedure DE for PTIT($P, K, G \in \mathbb{N}$; $CR, F, \epsilon \in \mathbb{R}_{\ge 0}$)
    get initial population of portfolios $X_0$    // initialization, algorithm 1
    for $g \leftarrow 1$ to $G$ do
        if $g = 1$ then $X \leftarrow X_0$    // the population matrix in the first generation
        end if
        for $s \leftarrow 1$ to $S$ do    // loop through the population; $S = P^2$
            get random $\{a, b, c \in \mathbb{N} \mid a, b, c \le S \wedge a \ne b \ne c \ne s\}$
            get random $\{\nu \in \mathbb{N} \mid \nu \le N\}$    // the asset chosen to certainly mutate
            for $i \leftarrow 1$ to $N$ do    // loop through each asset weight
                if with probability 0.9999 then $z_1 \sim \mathcal{N}(0, 0.02)$
                else $z_1 = 0$
                end if
                if with probability 0.9998 then $z_2 \sim \mathcal{N}(0, 0.02)$
                else $z_2 = 0$
                end if
                if with probability $CR$ or $i = \nu$ then
                    $\tilde{x}_{s,i} = x_{a,i} + (F + z_1) \cdot (x_{b,i} - x_{c,i} + z_2)$    // 'mutation'
                else $\tilde{x}_{s,i} = x_{s,i}$
                end if
            end for
            check constraints on $\tilde{x}_s$    // analogous to algorithm 5
            if $\mathrm{PT}_{IT}(\tilde{x}_s, I) \ge \mathrm{PT}_{IT}(x_s, I)$ then $x_s \leftarrow \tilde{x}_s$
            end if
        end for
    end for
    optimized portfolio $x^* \leftarrow \arg\max_{x_s} \mathrm{PT}_{IT}(x_s)$
    return $x^*$
end procedure

6. Replication Methodology

In the original work by Grishina et al. (2016), the authors report their main results in terms of the tracking error of the optimized portfolio solution (eq. (24)). However, since the aforementioned algorithms' objective function instructs them to maximize the prospect theory-based utility $\mathrm{PT}_{IT}$ (eq. (28)), the results cannot be directly compared. Thus, this thesis uses the results of a supplementary parameter optimization, included in the appendix of the original publication, to assess the quality of the replication algorithms. This is possible because these results are given in terms of the prospect theory utility, i.e. the actual objective function value of the optimization.

Consequently, the remainder of the thesis is structured in the following way: First, the replicated algorithms' performance is assessed in section 7, as described above. Based on the results, which show poor convergence compared to the original algorithm when using the originally reported parameters, alternative parameters are introduced, which improve the performance considerably. These are then used to derive replicated results for the main optimization problems analyzed by Grishina et al. (2016), which are compared to the original results. The subsequent discussion (section 9) elaborates on the details of the preceding computations.

6.1. Summary of the Optimization Problems

Grishina et al. (2016) and, accordingly, this thesis introduce several optimization problems and solution methods, which, paired with different data sets and possible algorithm parameters, result in a large number of different possible computations. Before continuing, an overview of the models is provided:

1. The index tracking problem, as it was introduced in section 3.2. Grishina et al. (2016) implement two different versions, one with integer-based constraints (ITCC), i.e. the full mixed-integer linear optimization problem, and a smaller version (IT) without any integer constraints, i.e. a linear optimization problem omitting eqs. (24d) and (24e). Both versions are replicated using direct implementations through the optimization software CPLEX¹⁶. The results of the replications are presented in section 8.1.

2. Prospect theory-based index tracking, which was introduced in section 4. Again, the problem is implemented once with and once without cardinality constraints. According to the authors, both implementations use $\mathrm{PT}_{IT,CC}$, as defined in eq. (28). The problem without cardinality constraints is then defined as the special case of setting $K = N$, $\epsilon = 0$ and $\delta = 1$ in the general problem. These problems are solved using the genetic algorithm described in section 5.1. The results of the replication are presented in section 8.2.

Additionally, for the performance assessment of the algorithms, the following model is used by the authors:

3. The basic Prospect Theory (PT) problem, which is used in the original paper for deriving the algorithm parameters for the aforementioned $\mathrm{PT}_{IT,CC}$ problem. Grishina et al. (2016) use the aforementioned GAGr and DEGr algorithms for these calculations, but instead of dynamic index returns, a constant return is used as the benchmark, cardinality is not limited, i.e. set to $N$, the whole asset space, and buy-in constraints are removed. Additionally, it includes a constraint on the mean return of the portfolio. It is formally described in appendix A.3. The results of the replication of the parameter optimization are presented in section 7.1.

¹⁶ https://www.ibm.com/analytics/cplex-optimizer

In the first two cases, the originally reported solutions consist only of the optimized portfolio's tracking error (TE) (eq. (24)), its separation into total absolute overperformance and underperformance of the portfolio in relation to the index ($o_{tot}$ and $u_{tot}$, respectively), and the number of positive assets in the final optimized portfolio, $n$. The conclusions of Grishina et al. (2016) are mainly derived through qualitative comparisons between the TE measures of the two main models.

6.2. Data

The asset and index data for the problems are included in the OR-library¹⁷. The data consist of weekly prices between March 1992 and September 1997 for five different stock indices and their individual stock components, according to the descriptions of Beasley et al. (2003).

After downloading the data from the OR-library website¹⁸, the data was loaded into R¹⁹ through the RStudio IDE²⁰. It was necessary to transform the price data (asset $i$ has price $p_{i,t}$ in time period $t$) into (log-)returns, which were calculated in accordance with the following formula by Grishina et al. (2016):

$$\mu_{i,t} = \ln\left(\frac{p_{i,t}}{p_{i,t-1}}\right), \quad i = 1, \dots, N, \quad t = 1, \dots, T + 1 \tag{30}$$

The resulting returns fulfill the role of $\mu_{i,t}$ as defined in section 3.1. The index returns are calculated analogously and were introduced previously as $I_t$ in the same section. There are 291 time periods of prices in each test set, which results in $T = 290$ log-returns. Summaries of the data set and the original parameters for the heuristic algorithms are presented in table 1 and table 2, respectively.
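The transformation in eq. (30) is a one-line operation on a price matrix. The thesis performs this step in R; the following Python sketch with the hypothetical function name `log_returns` is only meant to illustrate the arithmetic, assuming the prices are arranged with one row per time period and one column per asset.

```python
import numpy as np


def log_returns(prices):
    """Log-returns ln(p_t / p_{t-1}) along the time axis (rows)."""
    prices = np.asarray(prices, dtype=float)
    return np.log(prices[1:] / prices[:-1])
```

A price matrix with 291 rows therefore yields 290 rows of log-returns, matching the dimensions stated above.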

¹⁷ A set of operations research test problems, which was first introduced by Beasley (1990) and has been continuously expanded since then.

¹⁸ http://people.brunel.ac.uk/~mastjjb/jeb/orlib/indtrackinfo.html, specifically the files called indtrack1.txt, ..., indtrack5.txt

¹⁹ https://www.r-project.org, version 3.5.1

²⁰ https://www.rstudio.com/, version 1.1.463

                  Dimensions
Index             T     N     K
Hang Seng (HS)    290   31    15
DAX 100           290   85    20
FTSE 100          290   89    25
S&P 100           290   98    25
Nikkei 225        290   225   25

Table 1: Summary of data dimensions; $T$: number of time periods; $N$: total asset space (not including index); $K$: cardinality limit set by Grishina et al. (2016).

                 GAGr Par.                    DEGr Par.
Index      K     G     P     z     π_CR       G      P     F      CR
HS         15    70    15    0.5   0.1        100    20    0.05   0.5
DAX        20    180   40    0.5   0.1        275*   55*   0.05   0.5
FTSE       25    185   42    0.5   0.1        288*   58*   0.05   0.5
S&P        25    190   45    0.5   0.1        317*   64*   0.05   0.5
Nikkei²¹   25    190   45    0.5   0.1        317*   64*   0.05   0.5

Table 2: Summary of parameters for heuristics, set by Grishina et al. (2016); $K$: cardinality limit; $G$ and $P$: the number of iterations and the population parameter, respectively; * marks values that were derived from descriptions and were not used in a computation in the original paper; $z$: mutation rate of the GAGr; $\pi_{CR}$ and $CR$ regulate the crossover in their respective algorithms; $F$: the differential weight parameter of the DEGr algorithm.

²¹ For the Nikkei data set, Grishina et al. (2016) set the same parameters as for the S&P.

According to Grishina et al. (2016), while $K$ (the portfolio cardinality limit), $P$ (the population parameter) and $G$ (the number of generations) are based on the data set's dimensions, the other algorithm parameters are based on the optimization problem and thus stay constant throughout. The information about the parameter $\pi_{CR} = 0.1$ is only included in Grishina (2014).

The parameters for the differential evolution algorithm are included for completeness and have been derived proportionately from the Hang Seng data set (matching the description of the authors). For all computations with buy-in constraints, the authors chose $\epsilon = 0.01$ and $\delta = 1$.

7. Testing the Quality of the Algorithms

As described in the previous section, the following part reports results for the limited prospect theory model (appendix A.3), in order to compare the convergence properties of the replicated algorithms with the original computations.


7.1. Basic Prospect Theory Results

Replicating the smaller problem and getting very similar results for the objective function PT would confirm that the replication algorithm performs comparably to the original used by Grishina et al. (2016). Table 3 summarizes these results, including a stability measure $\xi$, which is also included in the original paper. It is defined as $\xi = \max(\mathrm{PT}(x^*)) - \min(\mathrm{PT}(x^*))$, i.e. the difference between the maximum and minimum values of the optimized objective function out of 10 independent executions of the respective algorithm. Similarly, the time and PT values in table 3 are means over the 10 executions. For the differential evolution algorithm, only the smallest data set's results have been published in the original paper.
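The summary statistics used in this comparison can be computed as in the following Python sketch. It assumes a hypothetical callable `run_once(rng)` that executes one complete optimization and returns the optimized objective value; the function name `summarize_runs`, the seeding scheme and the returned dictionary keys are illustrative choices.

```python
import time

import numpy as np


def summarize_runs(run_once, n_runs=10, seed=0):
    """Run an optimizer n_runs times and report mean time, mean/max PT and xi."""
    values, times = [], []
    for r in range(n_runs):
        rng = np.random.default_rng(seed + r)    # independent random stream per run
        start = time.perf_counter()
        values.append(float(run_once(rng)))
        times.append(time.perf_counter() - start)
    values = np.array(values)
    return {
        "mean_time": float(np.mean(times)),
        "mean_pt": float(values.mean()),
        "max_pt": float(values.max()),
        "xi": float(values.max() - values.min()),   # stability measure
    }
```

A value of $\xi$ close to zero indicates that independent executions end at practically the same objective value, whereas a large $\xi$ signals that the result depends strongly on the random seed.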

          Original                                  Replication
Index     mean time   mean PT    ξ         Algo.    mean time   mean PT    ξ
HS        61.8s       0.6237     0         (DEGr)   12.28s      0.6194     0.0028
HS        36.2s       0.62354    0.0002    (GAGr)   2.45s       0.5485     0.0590
DAX       550s        0.6442     0.0002    (GAGr)   60.22s      0.6020     0.0515
FTSE      630s        0.8429     0.0004    (GAGr)   68.08s      0.7900     0.0432
S&P       721s        0.7822     0.0006    (GAGr)   84.70s      0.7555     0.0519
Nikkei    1179s       -0.9894    0         (GAGr)   126.68s     -0.9566    0.0097

Table 3: Original results vs. the results of the replicated algorithms of the PT model (10 executions), using the original parameters of Grishina et al. (2016). Mean time (in seconds); mean PT: mean of the optimized objective function over all executions; $\xi$: difference between minimum PT and maximum PT of all executions.

The replicated genetic algorithm only partially achieves a similar quality of convergence, as shown by the lower mean objective function values and the distinctly higher variability of solutions, measured by $\xi$. The general similarity of results compared to the original paper, however, strongly points to a matching implementation of the optimization problem. The difference in computation time is most likely due to differences in the hardware set-up, since it is relatively linear between data sets, i.e. does not change in magnitude with either the difference in asset spaces or the parameters $G$ and $P$.

Based on these results, the replication is assumed to correctly implement the objective function $\mathrm{PT}_{IT}$. This is further confirmed by experiments that let the optimization continue beyond the original number of iterations. Consequently, by optimizing the algorithm's parameters while maintaining all the properties of the optimization problem, the algorithm should move towards the right goal, i.e. converge towards the same theoretical optimum as that of Grishina et al. (2016).


7.2. Alternative Parameters

Table 4 shows the parameters that have been optimized for improved performance of the replicated algorithms. The alternative genetic algorithm that uses the improved parameters is called GAEX and is additionally executed through parallel processing (using the "parallel" package, included in R).²²

          GAGr Par.                 GAEX Par.
Index     z      π_CR   [χ]         z      π_CR   [χ]
All       0.5    0.1    [0,1]       0.01   0.5    [-0.5,1.5]

Table 4: Summary of parameters for index tracking used in the alternative genetic algorithm GAEX, compared to the original parameters by Grishina et al. (2016); $z$: mutation rate; $[\chi]$: the interval of the uniformly distributed parameter $\chi$, which, in addition to parameter $\pi_{CR}$, regulates the crossover of weights from parents in the genetic algorithm as described in section 5.1.

None of the changes are related to the core implementation of the genetic algorithm as explained by Grishina et al. (2016), i.e. they should not affect the underlying optimization problem. Instead, they should only affect the convergence properties of the algorithm compared to the original implementation. This assumption is substantiated by the results presented in table 5: The alternative genetic algorithm GAEX achieves better convergence than both the originally reported results and the replicated algorithm GAGr across almost all metrics.²³ The alternative parameter choices are further elaborated on in section 9.1.

          Original                                  Replication (EX)
          mean      max        mean                 mean      max        mean
Index     time      PT         ξ         Algo.      time      PT         ξ
HS        36.2s     0.62354    0.0002    (GAEX)     5.03s     0.6235     <0.0005
DAX       550s      0.6442     0.0002    (GAEX)     27.26s    0.6573     <0.0001
FTSE      630s      0.8429     0.0004    (GAEX)     30.31s    0.8564     <0.0002
S&P       721s      0.7822     0.0006    (GAEX)     35.44s    0.8007     <0.0007
Nikkei    1179s     -0.9894    0         (GAEX)     55.26s    -0.9369    <0.0022

Table 5: Original results vs. the results of the alternative genetic algorithm for the PT model (10 executions), using altered parameters and parallel processing (see table 4). Mean time (in seconds); max PT: maximum of the objective function of all executions; $\xi$: difference between minimum PT and maximum PT of all executions.

The results of the full prospect theory-based index tracking model, presented in the next section, have been derived through the use of the alternative genetic algorithm GAEX. However, they still represent implementations of the optimization problems as described by Grishina et al. (2016), and thus it should be possible to compare the results: Due to the improved convergence properties of the algorithm, the qualitative results of the model comparison made by Grishina et al. (2016) should not only be visible but possibly more pronounced.

²² It also skips the assessment stage, as further explained in section 9.1.

²³ The mean elapsed time for the Hang Seng data set is larger than in the replicated algorithm (table 3), which is most likely due to the implemented parallelization, i.e. there is a trade-off between the computational time needed to set the parallelization up and the performance efficiency in longer computations.

8. Replication of Results

As described in section 6, the following part reports results for the replication of the full index tracking and prospect theory-based index tracking optimization problems, both with and without additional integer constraints.

8.1. CPLEX for Basic Index Tracking

As described earlier, the index tracking problem was solved directly by transferring the optimization problem into the commercial software CPLEX²⁴. The software is called through the R package "Rcplex"²⁵.

The basic index tracking problem (without integer constraints) can be used to confirm that the same data and tracking error definition as in the original paper were used, because its results can be determined exactly rather than heuristically. This is indeed the case, as table 6 shows, since the original and replicated optimal tracking errors match exactly. The difference in time measurements is further discussed in section 9.3.

The results of the index tracking problem with cardinality constraints (ITCC, eq. (24)) are shown in table 7. A time marked with "*" means that the respective optimization did not stop because it reached an optimum, but because it reached a previously set time limit. This failure to converge to the theoretical optimum is in line with comparable optimization results with cardinality constraints (Mutunge et al. 2018; Woodside-Oriakhi et al. 2011) and illustrates why heuristics can be good alternatives for such problems.

Furthermore, it should be noted that the time limit was set to 7200s (= 2h) for this thesis because no information was given by the original paper’s authors on why the optimization reached an end at their reported results. This is especially problematic for the optimizations that, given enough time, reach lower tracking errors than those the authors reported. Manually checking the portfolio solutions that were created by the replication algorithm after the time limit was reached confirms that the solutions indeed fulfill the conditions of the optimization problem at hand. It can be ruled out that the originally reported solutions have reached a global optimum in the cases other than the Hang Seng data set.

         Original   Replicated (CPLEX)   Same in both
Index    time       time                 n     TE       o_tot/TE
HS       0.047s     0.014s               30    0.4289   56.98%
DAX      0.109s     0.034s               69    0.3354   54.70%
FTSE     0.141s     0.035s               81    0.2855   58.05%
S&P      0.125s     0.041s               83    0.2682   57.88%
Nikkei   0.266s     0.081s               159   0.1686   54.65%

Table 6: Results of the IT optimization problem (no cardinality constraints); times in seconds; TE: Tracking Error as in eq. (20); o_tot/TE: the total overperformance of the optimal portfolio in relation to the total tracking error; n: the number of all positive asset weights in the optimal portfolio.

              Original                         Replicated (CPLEX)
Index    K    time   n    TE       o_tot/TE    time     n    TE       o_tot/TE
HS       15   102s   15   0.5760   57.57%      16.46s   15   0.5763   58.84%
DAX      20   200s   20   0.5889   55.70%      7200s*   20   0.5776   57.02%
FTSE     25   193s   25   0.6650   57.43%      7200s*   25   0.5707   56.15%
S&P      25   176s   25   0.5555   58.02%      7200s*   25   0.5121   55.83%
Nikkei   25   612s   25   0.7211   53.32%      7200s*   25   0.6169   58.07%

Table 7: Results of the ITCC optimization problem; K: cardinality limit; times in seconds; TE: Tracking Error as in eq. (20); o_tot/TE: the total overperformance of the optimal portfolio in relation to the total tracking error; n: the number of all positive asset weights in the optimal portfolio.

²⁴ Readers interested in solving such a problem "manually" may refer to the vast literature on solving mixed-integer problems, especially branch and bound algorithms such as the one by Bertsimas et al. (2009), who describe such an algorithm specifically in the context of cardinality constrained portfolio optimization and compare it to the implementation with CPLEX.

²⁵ https://CRAN.R-project.org/package=Rcplex, version 0.3-3

It should be further noted that for all data sets, the speed of convergence decreases rapidly, i.e. the optimization over 7200s only achieved slightly smaller values in terms of absolute TE in comparison to the optimization running for 100 seconds. However, this does not say anything about the theoretically achievable optimum value of TE since, after the quick convergence in the beginning, the algorithm improves in sudden steps.²⁶ An even longer optimization will likely result in better solutions.

²⁶ For instance, in the optimization of the Nikkei data, the algorithm’s best solution had a TE of 0.6507 for more than 75 min, until reaching a lower level of 0.6169 within a 5 min window.

With the exception of Hang Seng, the TE values of the replication are smaller than the values reported in Grishina et al. (2016). As mentioned previously, the Hang Seng optimization is the only one that reached the theoretical global optimum and did not stop because of the time limit. Thus, TE values under 0.5763 should not be possible for that particular data set. When looking at the originally reported performance values o_tot = 0.3316 and u_tot = 0.2448, it becomes clear that the reported solution 0.5760 is most likely due to a typographical or rounding error in Grishina et al. (2016) and not because of a better solution: o_tot + u_tot = TE_Gr must hold according to the definitions in eq. (24), and in this case the actual original result is most likely 0.3316 + 0.2448 = 0.5764.
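The consistency check behind this argument is simple arithmetic and can be reproduced in a few lines. The following Python sketch is only an illustration (the thesis itself works in R); the variable names are chosen here, while the numbers are the values quoted above.

```python
# Values quoted in the text for the Hang Seng ITCC result.
o_tot = 0.3316               # total overperformance reported in the original paper
u_tot = 0.2448               # total underperformance reported in the original paper
reported_te = 0.5760         # tracking error printed in the original paper
replicated_optimum = 0.5763  # global optimum found by CPLEX in the replication

# By definition, the two components must add up to the tracking error.
implied_te = round(o_tot + u_tot, 4)
gap = round(implied_te - reported_te, 4)

print(f"implied TE = {implied_te}, gap to reported TE = {gap}")
# The implied value does not undercut the replicated global optimum.
print(implied_te >= replicated_optimum)
```

The implied value lies slightly above the replicated optimum, which is what one would expect if the original authors' solution was close to, but not better than, the global optimum.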

8.2. Genetic Algorithm for Prospect Theory-based Index Tracking

The following section uses GAEX, which was defined as the genetic algorithm with alternative parameters and presented in section 7.2, to derive results for the full prospect theory-based index tracking model. The results are compared with the ones originally reported by Grishina et al. (2016): table 8 presents the variant without and table 9 the one with integer constraints (i.e. cardinality and buy-in constraints).

Grishina et al. (2016) make the following observations based on their results comparing the prospect theory-based (PTIT) and basic (IT) index tracking models:

• PTIT optimized portfolios are more sparse than IT optimized ones. This is confirmed by the results of the replication and more pronounced than in the original results.

• The o_tot/TE ratio for PTIT solutions is higher than the ratio for IT solutions, which, according to the authors, indicates a preference for higher returns in the former model. This is observable in the replicated results and partially more pronounced.

• The TE of the IT models is lower than the TE of the prospect theory-based models in all data sets. It represents a control for the correct implementation of the models.

These results can be explained through loss aversion, which is modeled in the prospect theory optimization problem through the value function. Unlike IT, the value function does not treat deviations from the index indifferently but instead favors assets that have preferred returns compared to the index.

These results let us conclude that the optimization models have been successfully replicated and that the algorithms have improved. All solution portfolios display fewer assets than the original results in table 8. Additionally, they all have higher overperformance ratios (o_tot/TE) while simultaneously displaying lower overall tracking errors. The results are reported in tables 8 and 9.

         Original                          Replicated (GAEX)
Index    time    n    TE       o_tot/TE    time     n    TE       o_tot/TE
HS       70s     20   0.8420   67.58%      5.48s    19   0.8263   68.35%
DAX      242s    51   1.1763   62.37%      38.61s   27   1.0493   73.79%
FTSE     250s    46   1.1463   69.08%      42.98s   40   0.9807   77.56%
S&P      347s    67   0.9409   62.50%      51.73s   45   0.7168   74.83%
Nikkei   1803s   69   0.9802   64.29%      83.83s   67   0.7222   80.24%

Table 8: Results of the PTIT optimization problem; times in seconds; TE: Tracking Error as in eq. (20); o_tot/TE: the total overperformance of the optimal portfolio in relation to the total tracking error; n: the number of all positive asset weights in the optimal portfolio.

              Original                         Replicated (GAEX)
Index    K    time   n    TE       o_tot/TE    time     n    TE       o_tot/TE
HS       15   74s    15   1.1871   65.94%      5.2s     15   0.8835   67.89%
DAX      20   275s   20   1.3309   72.25%      37.2s    20   1.2545   73.10%
FTSE     25   193s   25   1.4432   71.53%      41.98s   25   1.250    75.45%
S&P      25   176s   25   1.2972   70.24%      49.73s   25   1.0257   72.42%
Nikkei   25   612s   25   1.3179   73.12%      81.45s   25   1.1742   75.36%

Table 9: Results of the PTIT,CC optimization problem; K: cardinality limit; times in seconds; TE: Tracking Error as in eq. (20); o_tot/TE: the total overperformance of the optimal portfolio in relation to the total tracking error; n: the number of all positive asset weights in the optimal portfolio.


8.3. Experimenting on Different Market Environments

Grishina et al. (2016) also include "out-of-sample" data sets in their analysis. According to the authors, the new data is created through sampling returns from the aforementioned indices in bullish time periods, i.e. during a phase of continuing positive returns (in this case the year 2005). Furthermore, they take samples from the year 2008 (during the financial crisis) to create Bear data sets in a similar way²⁷.

Subsequently, they show that for their bullish simulations, the portfolio solutions of the index tracking and prospect theory-based index tracking optimization reveal a tracking error that is almost fully attributed to the overperformance of the portfolio. In other words, the values of o_tot/TE are all either exactly 100% or very close to it. For the Bear sample, they find that the prospect theory-based index tracking optimization performs similarly in terms of total TE but slightly worse in terms of o_tot/TE when compared to the IT model in all but one data set.

This section attempts to replicate these results using different data (summarized in table 10), since the data used by Grishina et al. (2016) could not be obtained. The alternative data is taken from the S&P 500 index and its components, specifically the daily adjusted returns during a recent bullish and a recent bearish market phase.

                                            Dimensions
Index            timeframe                  T     N     cum. log-return
S&P 500 (Bull)   2018-04-06 - 2018-09-21    118   503   11.77%
S&P 500 (Bear)   2018-09-21 - 2018-12-21    63    505   -17.17%

Table 10: Summary of data dimensions for out-of-sample optimization; T: number of (daily) time periods; N: total asset space (not including index); K: cardinality limit set by Grishina et al. (2016); cum. log-return: sum of log-returns over all time periods.

²⁷ By using the historical data’s correlation matrix to derive t-distributed sample returns.

During each of the chosen time periods, the index demonstrates directional movement, which is visible through the cumulative returns in table 10. The solution methods used are analogous to the previous section: directly translating the IT model to be solved in CPLEX (with a time limit of 100s) and using the alternative genetic algorithm GAEX for the PTIT model. In the latter implementation, the same parameters as shown earlier in table 4 and the dimension parameters G = 500 and P = 40 are used in order to handle the larger asset space of the S&P 500 compared to the previous data sets.

             ITCC                       PTIT,CC
Index   K    n    TE       o_tot/TE    n    TE       o_tot/TE
Bull    25   25   0.0596   52.69%      13   0.5527   85.64%
Bear    25   25   0.0175   38.64%      15   0.3044   92.18%

Table 11: Results of the optimization in specific market environments; K: cardinality limit; TE: Tracking Error; o_tot/TE: the total overperformance of the optimal portfolio in relation to the total tracking error; n: the number of all positive asset weights in the optimal portfolio.

The resulting optimal portfolios match the observed behavior of the original results for the bullish market environment but differ from Grishina et al. (2016) for the Bear simulation. As table 11 shows, the PTIT model displays a high ratio o_tot/TE for both the bearish and the bullish market environment. This is possibly related to the data that is used, but similar behavior is observed in the original Bear simulations for the Hang Seng data.

8.4. Out-of-sample Performance

Based on their description, it was assumed that Grishina et al. (2016) did not use the term out-of-sample to describe the performance of a portfolio solution in time periods that were not included in the preceding optimization, i.e. on data it was not trained on. The out-of-sample performance of a solution in this sense is generally expected to be worse than its in-sample performance (Bailey et al. 2014) and plays an important role for any predictive inference.

This section attempts to shed light on the latter out-of-sample definition by using the Bull and Bear data sets of the previous section. However, in this case, only the first third of the time periods is used to train the model, i.e. to acquire an optimized portfolio solution, which is subsequently checked for its performance during the remaining two thirds. The results of this additional computation are presented in table 12.

             ITCC                       PTIT,CC
Index   K    n    TE       o_tot/TE    n    TE       o_tot/TE
Bull    25   25   0.0399   71.93%      8    0.6785   50.15%
Bear    25   21   0.1877   35.32%      6    0.3445   46.38%

Table 12: Results of the (out-of-sample) optimization; K: cardinality limit; TE: out-of-sample Tracking Error; o_tot/TE: the total overperformance of the optimal portfolio in relation to the total tracking error; n: the number of all positive asset weights in the optimal portfolio.
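The split used for table 12 can be sketched as follows. This is a minimal Python illustration, not the code used in the thesis: the function names are hypothetical, and rounding the cut point down when T is not divisible by three is an assumption. Only the period counts T = 118 and T = 63 are taken from table 10.

```python
def split_first_third(periods):
    """Return (training, evaluation): the first third and the remaining two thirds."""
    cut = len(periods) // 3          # assumption: round down if T is not divisible by 3
    return periods[:cut], periods[cut:]

def portfolio_returns(weights, asset_returns):
    """Per-period portfolio return for fixed weights (one row of asset returns per period)."""
    return [sum(w * r for w, r in zip(weights, row)) for row in asset_returns]

bull_train, bull_test = split_first_third(list(range(118)))   # T = 118 in table 10
bear_train, bear_test = split_first_third(list(range(63)))    # T = 63 in table 10
print(len(bull_train), len(bull_test), len(bear_train), len(bear_test))

# Weights found on the training periods stay fixed when evaluated out of sample.
toy_weights = [0.5, 0.5]                                      # hypothetical two-asset portfolio
toy_test_rows = [[0.02, 0.00], [-0.01, 0.01]]                 # hypothetical out-of-sample returns
toy_out_of_sample = portfolio_returns(toy_weights, toy_test_rows)
```

The important design point is that the evaluation periods never enter the optimization, so the out-of-sample tracking error is computed from weights the optimizer could not tune to those periods.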

It can be noticed that the general conclusions described in section 8.2 only apply partially to the out-of-sample results. The prospect theory-based index tracking model results in lower cardinality n but does not demonstrate a higher overperformance ratio (o_tot/TE) than the basic index tracking model for the bullish market environment. Compared to the results in table 11, the tracking errors are larger, except in the aforementioned ITCC Bull case, where the TE is 0.0399, i.e. very low. This could explain the high overperformance ratio.

Noticeably, the overperformance ratios are lowered more strongly in the prospect theory-based computation than in the basic index tracking computation. This could be explained through the different objective functions, i.e. the different goals of the models: while the ITCC prioritizes diversification, the prospect theory-based model implements loss aversion and consequently overweights assets with better returns in the first third of the time periods.

Concretely, in the actual portfolio solution for the Bear market, the largest asset weight (≈ 40%) belonged to the stock with the highest return in the data used for optimization. The same stock, however, had slightly negative returns in the out-of-sample time periods. Thus, this observation ties back to the descriptions in the beginning of this thesis and the trade-off considerations every agent faces.

9. Discussions

The following subsections elaborate on the insights and results of the previous sections. The authors of Grishina et al. (2016) were not available for further clarification; thus, this section additionally aims to detail the assumptions made in order to derive the replicated algorithms GAGr and DEGr.

9.1. On the Genetic Algorithm and alternative parameters

A misunderstanding of the assessment stage as it is described in Grishina et al. (2016) is the most likely source of the poor convergence properties of the replication algorithm. As it is understood in this thesis, the assessment stage by Grishina et al. (2016) only allows the new population to proceed to the next generation if its maximum objective function value is larger than or equal to the maximum of the old population. The new population, however, includes the portfolio with the highest fitness, which the original authors define as the value of the objective function. This was previously categorized as an elitist selection scheme and means that the new population will always have at least as high a maximum fitness as the old population. This interpretation is based on the description of the operation and the pseudo-code in Grishina et al. (2016) and Grishina (2014). It makes the assessment stage redundant, however, which is unlikely to be intended since the stage has been explicitly included in the description by the authors. To account for this discrepancy, the assessment stage is omitted in the altered genetic algorithm GAEX but was included in the algorithm using the original parameters.
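The redundancy argument can be illustrated with a toy example. The Python sketch below is not the thesis's algorithm: individuals are single numbers instead of portfolios, the objective is a hypothetical one-dimensional function, and the update stage is reduced to "keep the fittest member, perturb the rest". It only shows that, under elitism, an assessment check of the form "new maximum is at least the old maximum" can never fail.

```python
import random

def fitness(x):
    # Hypothetical one-dimensional objective; stands in for the PT objective value.
    return -(x - 3.0) ** 2

def next_generation(population, rng):
    """Toy update stage: carry over the fittest member, fill the other slots with perturbed members."""
    elite = max(population, key=fitness)
    others = [x + rng.uniform(-1.0, 1.0) for x in population[1:]]
    return [elite] + others

def assessment_passes(old_population, new_population):
    """Assessment stage as read in this section: new maximum >= old maximum."""
    return max(map(fitness, new_population)) >= max(map(fitness, old_population))

rng = random.Random(0)
population = [rng.uniform(-10.0, 10.0) for _ in range(20)]
checks = []
for _ in range(100):
    new_population = next_generation(population, rng)
    checks.append(assessment_passes(population, new_population))
    population = new_population

print(all(checks))
```

Because the elite member is copied unchanged, the check holds in every generation regardless of how poor the other children are, which is why the stage has no effect under this reading.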

As table 5 in section 7.2 has shown, a lower mutation rate and a higher crossover probability π_CR lead to better convergence properties of the genetic algorithm. These alternative parameters are in accordance with implementations for similar problems by Chang et al. (2000), Gong et al. (2017), and Woodside-Oriakhi et al. (2011), but the nature of genetic algorithms makes a direct parameter comparison less meaningful because it depends on other factors of the algorithm, i.e. the detailed mechanics of the updating stage. This is why the following example is used instead to illustrate the reasoning in favor of a low mutation rate:

In the case of the Nikkei index, the asset space consists of 225 different stocks. If a cardinality of K = 25 is set in the optimization problem, a parent portfolio in the genetic algorithm, as it is described in Grishina et al. (2016), will consist of up to 25 positive asset weights. This also means that at least 200 asset weights are set to 0 in the portfolios used as parents. If a mutation probability of z = 0.5 is implemented, 200 · 0.5 = 100 of a child’s asset weights are expected to get a randomly determined value that is U(0, 1) distributed, which also means that these 100 assets each have an expected weight of 0.5 before normalization. The old (parent) asset weights are significantly lower since they have already been normalized in previous generations to add up to one. Moreover, the implementation of the cardinality constraint check sets positive weights randomly to zero, which means that out of the 25 + 100 = 125 positive asset weights, only 25/125 = 20% are going to remain after adjusting for cardinality. Thus, only 25 · 20% = 5 assets of the parent are expected to be included after these steps, each having a significantly lower relative asset weight than the 20 randomly chosen included assets. A low parameter π_CR increases this effect. This could be considered a "low" history dependence, according to the description of enumerative processes in section 5. As Holland (1992) explains, mutation should generally not be used to generate new candidates, and Grishina et al. (2016) themselves mention a "small probability" in the description of the mutation operation. It therefore appears justified to severely lower the mutation probability z.
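The expected-value calculation in the Nikkei example can be written out as a short Python sketch. The function name and the comparison value z = 0.01 are hypothetical and only serve as an illustration; the thesis's actual alternative parameters are those of table 4.

```python
def expected_survivors(n_assets, k, mutation_prob):
    """Expected composition of a child after mutation and a random cardinality cut."""
    parent_positive = k                       # parent holds up to k positive weights
    zero_weights = n_assets - k               # all remaining weights are zero
    mutated = zero_weights * mutation_prob    # zero weights expected to be redrawn from U(0, 1)
    keep_share = k / (parent_positive + mutated)   # share kept when cutting back to k at random
    return {
        "mutated": mutated,
        "keep_share": keep_share,
        "parent_kept": parent_positive * keep_share,
        "mutated_kept": mutated * keep_share,
    }

high = expected_survivors(n_assets=225, k=25, mutation_prob=0.5)
low = expected_survivors(n_assets=225, k=25, mutation_prob=0.01)   # hypothetical low rate
print(high)
print(low)
```

With z = 0.5, the sketch reproduces the figures from the text (100 mutated weights, 20% kept, 5 parent assets and 20 mutated assets surviving). With the hypothetical low rate, most of the parent's assets are expected to survive, which is the history dependence the argument asks for.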

Furthermore, the interval of χ is increased in accordance with the implementation by Gong et al. (2017). This is done in order to limit the averaging effect that occurs when using the interval [0, 1]: whenever two parents have positive weights for the same asset, the respective asset weight of the child will be the weighted average of the two parent weights. Extending the interval of χ to [-0.5, 1.5] can be interpreted as an increased focus on the inclusion of a weight versus the actual value of the weight, since it will likely test more extreme values for assets that are often found in parent portfolios.
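The averaging effect can be made concrete with a small Python sketch. It assumes the standard arithmetic crossover form child = χ · w_a + (1 − χ) · w_b, which matches the "weighted average" described above but is not confirmed as the exact formula of the original implementation; the parent weights are hypothetical.

```python
def blend(w_a, w_b, chi):
    """Arithmetic crossover of two parent weights for the same asset (assumed form)."""
    return chi * w_a + (1.0 - chi) * w_b

w_a, w_b = 0.10, 0.30                                            # hypothetical parent weights

inside = [blend(w_a, w_b, chi) for chi in (0.0, 0.5, 1.0)]       # chi drawn from [0, 1]
outside = [blend(w_a, w_b, chi) for chi in (-0.5, 1.5)]          # chi from the extended interval

print(inside)    # always between the two parent weights
print(outside)   # can leave the range spanned by the parents
```

For χ in [0, 1] the child weight can never leave the range spanned by the parents, so repeated crossover pulls weights towards each other. The extended interval allows values above the larger and below the smaller parent weight. How values that would become negative are handled is not specified in this section.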

Lastly, in the original pseudo-code by Grishina et al. (2016) (see appendix A.5), it was assumed that line 5 of the algorithm is inverted, i.e. should read y_1, ..., y_2P = x^m_1, ..., x^m_2P, since the description defines y as the portfolios of the new population.

9.2. On the Population Initialization

While Grishina et al. (2016) state that each portfolio of the initial population is created to have exactly K positive asset weights, Grishina (2014, p. 83) explains that "the first function generates an initial population [...] with limit on the number of assets K which are chosen for each vector randomly from [3,K]". The latter description is chosen in the implementation based on the following reasoning:

The difference is very important for the performance of the genetic algorithm in the case that no buy-in constraints are set and the cardinality is set to include the complete asset space N: since the algorithm has no effective means of eliminating asset weights (i.e. setting them to zero) in newly created child portfolios when both parents’ respective asset weights are positive as well, the implementation described by Grishina et al. (2016) results in very slow convergence. This is because the roles of both the parameter π_CR and the mutation rate z are diminished, since neither applies when both parents’ asset weights are non-zero.
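A minimal Python sketch of the chosen initialization is given below. Only the rule that the number of held assets is drawn randomly from [3, K] comes from the quoted description; drawing raw weights uniformly and normalizing them, as well as all names and the population size of 40, are assumptions made for this illustration.

```python
import random

def random_portfolio(n_assets, k_max, rng, k_min=3):
    """Draw one initial portfolio with between k_min and k_max positive weights."""
    k = rng.randint(k_min, k_max)                 # number of held assets, drawn from [3, K]
    held = rng.sample(range(n_assets), k)         # which assets are held
    raw = [1.0 - rng.random() for _ in held]      # raw weights in (0, 1], never exactly zero
    total = sum(raw)
    weights = [0.0] * n_assets
    for asset, value in zip(held, raw):
        weights[asset] = value / total            # normalize so the weights add up to one
    return weights

rng = random.Random(42)
population = [random_portfolio(n_assets=225, k_max=25, rng=rng) for _ in range(40)]
sizes = [sum(w > 0 for w in p) for p in population]
print(min(sizes), max(sizes))
```

Starting with portfolios of varying size means that zero weights exist from the first generation on, so crossover and mutation have something to act on even when K is set to the full asset space.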

9.3. On CPLEX Optimization and performance measurements

According to Grishina et al. (2016), the results of table 6 are calculated using optimization problem eq. (24) with K = N. This seems unlikely, based on the extremely low execution times (<1s) compared to similar calculations by Woodside-Oriakhi et al. (2011).

It is more likely that the authors actually implemented an optimization problem without integer constraints, making it a linear programming problem instead of a mixed-integer one. This is supported by the replication’s results, which were acquired through such a simple linear optimization, completely omitting the binary variable z_i (indicating whether an asset is held or not) and its related constraints. The solution, which has the exact same TE as reported in Grishina et al. (2016), includes weights that are smaller than those allowed by the buy-in constraint ε_i = 0.01. This means that the original solution was most likely also acquired based on this optimization problem. Another way of checking this is to attempt the optimization of the full problem (eq. (24)) and just set K = N: both the tracking error and the computation time are significantly larger with the inclusion of buy-in constraints (eq. (7)). For example, in the DAX data set, this optimization leads to a minimized TE of 0.3817 after 32.94s vs. a TE of 0.3354 after 0.046s, as reported in table 6. This confirms that when the authors mention cardinality constraints, they mean all integer constraints (i.e. cardinality as well as buy-in limits).
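The kind of check described here, i.e. inspecting a solution for weights below the buy-in threshold, can be sketched in Python as follows. The function name and the example weight vector are hypothetical; only the threshold ε_i = 0.01 is taken from the text.

```python
def integer_constraint_report(weights, k, eps=0.01):
    """Check a weight vector against cardinality limit k and buy-in threshold eps."""
    held = [w for w in weights if w > 0]
    below_buy_in = [w for w in held if w < eps]
    return {
        "n_held": len(held),
        "cardinality_ok": len(held) <= k,
        "buy_in_ok": not below_buy_in,
        "below_buy_in": below_buy_in,
    }

# Hypothetical LP-style solution: sums to one but contains a very small position.
lp_like = [0.500, 0.300, 0.195, 0.005, 0.0]
report = integer_constraint_report(lp_like, k=5)
print(report)
```

A solution that passes the cardinality check with K = N but fails the buy-in check, as in this toy case, cannot stem from the full mixed-integer problem, which is the reasoning applied to the results of table 6.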

Generally, the time measurements of the original paper are vaguely described, which makes it harder to compare them with the replicated measurements: as explained in the documentation of MATLAB²⁸ and AMPL²⁹ (the programs used by Grishina et al. (2016)), there are different ways to measure the time a script runs. The authors specifically call their measurement CPU time, which theoretically can be higher than wall-clock time since they use a multi-core/multi-thread CPU for their calculations, but most likely it is actually a lower estimate that excludes the processor’s workload outside of the optimization³⁰. No further definition is given, other than by Grishina (2014), who states that the measurement is in seconds. This thesis assumes that CPU time in Grishina et al. (2016) is comparable to wall-clock time³¹. This is because CPLEX’s own elapsed time output is based on wall-clock time, and the originally reported times are broadly comparable to the similar computations by Woodside-Oriakhi et al. (2011), which use the same data and a similar processor.

²⁸ https://www.mathworks.com/help/matlab/matlab_prog/measure-performance-of-your-program.html

²⁹ https://mathopt.de/AMPL/AMPL-RefMan.pdf, Table A-14

Furthermore, the replication uses CPLEX with parallel threads (6 due to the processor used), which (depending on the system) could mean that the "CPU time" measurement may be up to 6 times larger than the elapsed time. In general, comparisons of performance should be treated with caution: table 14 in appendix A.4 includes other differences of the set-up, each of which may lead to differences in performance.
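The difference between the two kinds of time measurement can be demonstrated with Python's standard library. This is an illustration only and not the measurement used in the thesis, which relies on Sys.time() in R: `time.perf_counter()` measures elapsed wall-clock time, while `time.process_time()` measures CPU time of the current process. The helper name `timed` and the workloads are hypothetical.

```python
import time

def timed(fn):
    """Run fn once and return (wall-clock seconds, CPU seconds of this process)."""
    wall_start, cpu_start = time.perf_counter(), time.process_time()
    fn()
    return time.perf_counter() - wall_start, time.process_time() - cpu_start

# Waiting: wall-clock time passes, but almost no CPU time is consumed.
wall_sleep, cpu_sleep = timed(lambda: time.sleep(0.2))

# Computing in a single thread: both measures advance together.
wall_busy, cpu_busy = timed(lambda: sum(i * i for i in range(200_000)))

print(f"sleep: wall={wall_sleep:.3f}s cpu={cpu_sleep:.3f}s")
print(f"busy:  wall={wall_busy:.3f}s cpu={cpu_busy:.3f}s")
```

The single-threaded sketch only shows the case in which CPU time is lower than elapsed time. With several worker threads running in parallel, as in the CPLEX runs, the summed CPU time can instead exceed the elapsed time, which is why the two measures should not be compared directly.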

9.4. On the Objective Function

In the prospect theory model with index tracking (section 4), the objective function reads as:

$$PT_{IT} = \sum_{t=1}^{T} p_t \, v_{pt}(r_{x,t} - I_t) \qquad (31)$$

which can be trivially shown to be equivalent to the equation used in Grishina et al. (2016). The authors, however, don’t explain the role of p_t, which in the original prospect theory model is defined as the probability of an outcome of the lottery. Since historical return data is used for the computations, it is plausible to define each historical time period as an outcome, i.e. each one should be given the same probability p_t = 1/T, where T is the total number of time periods. Then, because of theorem 3.1, the inclusion of p_t amounts to a monotonic transformation. This does not change the optimal portfolio solution but does affect the value of the objective function. As previously described, the only way to determine the quality of the algorithms, including the use of a matching objective function, is with the values given in table 3 and table 5 of this thesis, since the other results only include the tracking error values. Through experimentation over all data sets and comparison with the optimal values (as defined in the original paper), it is determined that the values in Grishina et al. (2016) must have been calculated with p_t = 1 ∀t in the objective function. The exact matching of the Hang Seng objective further justifies this.
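The effect of the two choices for p_t can be illustrated with a short Python sketch. It is a sketch under stated assumptions: the value function uses the common Tversky-Kahneman form with α = β = 0.88 and λ = 2.25, which may differ from the specification in section 4, and the return series are invented for illustration.

```python
def value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Prospect theory value function; form and parameter values are an assumption."""
    return x ** alpha if x >= 0 else -lam * (-x) ** beta

def pt_objective(portfolio_returns, index_returns, p_t):
    """Sum over all periods of p_t * v(r_x,t - I_t), with a constant p_t."""
    return sum(p_t * value(r - i) for r, i in zip(portfolio_returns, index_returns))

index = [0.010, -0.020, 0.005, 0.000]            # hypothetical index returns
candidates = {                                   # hypothetical portfolio return series
    "a": [0.012, -0.018, 0.004, 0.001],
    "b": [0.008, -0.025, 0.006, -0.001],
}
T = len(index)

unit = {name: pt_objective(r, index, p_t=1.0) for name, r in candidates.items()}
equal = {name: pt_objective(r, index, p_t=1.0 / T) for name, r in candidates.items()}

best_unit = max(unit, key=unit.get)
best_equal = max(equal, key=equal.get)
print(best_unit, best_equal)          # the ranking is the same under both choices
print(unit["a"] / equal["a"])         # the objective values differ by the factor T
```

Since a constant positive p_t only rescales the objective, the preferred portfolio is unaffected, but reported objective values are only comparable if the same p_t is used. This is why the choice matters for matching the values in table 3 and table 5.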

³⁰ Robert Fourer, the designer of AMPL, explains this in https://groups.google.com/d/msg/ampl/PcHyAeZ4AhM/3dzH1F99yr8J and https://groups.google.com/d/msg/ampl/attIOKNdYUE/_F53shn6nZYJ

³¹ Measured by taking the difference of Sys.time() before and after the optimization in R.

More generally, the practicability of such an objective function is debatable, i.e. whether behavioral economics and prospect theory lend themselves to being used as the objective function for portfolio optimization. This is because the theory is essentially descriptive (Stewart et al. 2015). In other words, it illustrates a way to systematically evaluate and model common judgment biases. Thus, its normative use, i.e. as the "goal" of the optimization, appears to be in need of further discussion. If mean-variance optimization is assumed to be rational, then portfolio managers ultimately should use their behavioral insights to steer investors towards the rational choice. This problem of reconciling normative and behavioral economics has been discussed intensively by others. As Bleichrodt et al. (2001) write:

"(...) inconsistencies are not the essence of the problem; instead they are symptoms. The essence of the problem lies in the biases, i.e., the discrepancies between elicited preferences and the true preferences according to a rational model in which these preferences are to be implemented. Observed inconsistencies prove that biases are present so that corrective procedures are called for."

Contrasting ideas have been brought up by McQuillin et al. (2012), who explain that deviations from rational choice can only be classified in relation to the definition of rationality. Subsequently, Sugden (2017) notes:

"Even if behavioral economists or policy makers feel confident that people’s lifestyle choices are based on some kind of error, they should not jump to the conclusion that the error is a self-acknowledged failure of self-control"

which ties in with the discussion about the "second generation" of behavioral finance and opens up the idea of individualized preferences.

Consequently, the best use of such objective functions may be in agent-based market models (Farmer et al. 2009; Hommes 2001), which could use the optimizations to model the actual portfolio behavior of different agents using the prospect theory-based index tracking model. The conclusions would then be based on observations of the simulated market behavior, from which policies can be derived.

10. Conclusion

This thesis aimed to implement the prospect theory-based index tracking model by Grishina et al. (2016) and replicate the results presented in the original publication. By elaborating on the underlying theoretical models and documenting the replication process, it tried to allow for its own replication.

Even though imitation of the original genetic algorithm, i.e. an exact recreation of its convergence properties, failed, a replication with alternative parameters actually led to improved performance compared to the originally reported results. In-sample, the main observations made by the original authors were more pronounced in the results of this thesis and could consequently be successfully replicated. Through experiments on the out-of-sample performance and a detailed analysis of the algorithm, this thesis attempted to give further insight into the implementation’s details. It concluded with an extensive discussion of the original work and its implications.

References

Adcock, C.J. and N. Meade (1994). “A simple algorithm to incorporate transactions costs in quadratic optimisation”. In: European Journal of Operational Research 79.1, pp. 85–94. issn: 0377-2217. doi: 10.1016/0377-2217(94)90397-2. url: http://www.sciencedirect.com/science/article/pii/0377221794903972.

Aggarwal, Charu C., Alexander Hinneburg, and Daniel A. Keim (2001). “On the Surprising Behavior of Distance Metrics in High Dimensional Space”. In: Database Theory — ICDT 2001. Ed. by Jan Van den Bussche and Victor Vianu. Springer Berlin Heidelberg, pp. 420–434. isbn: 978-3-540-41456-8. doi: 10.1007/3-540-44503-x_27.

Alm, James (2010). “A Call for Replication Studies”. In: Public Finance Review 38.3, pp. 275–281. doi: 10.1177/1091142110374569. eprint: https://doi.org/10.1177/1091142110374569.

Bailey, David H., Jonathan Borwein, Marcos López de Prado, and Qiji Jim Zhu (2014). “Pseudo-Mathematics and Financial Charlatanism: The Effects of Backtest Overfitting on Out-of-Sample Performance”. In: Notices of the American Mathematical Society 61.5. doi: 10.1090/noti1105.

Baker, James E. (1985). “Adaptive Selection Methods for Genetic Algorithms”. In: Proceedings of the 1st International Conference on Genetic Algorithms. Hillsdale, NJ, USA: L. Erlbaum Associates Inc., pp. 101–111. isbn: 0-8058-0426-9. url: http://dl.acm.org/citation.cfm?id=645511.657075.

Barber, Brad M and Terrance Odean (2000). “Trading is hazardous to your wealth: The common stock investment performance of individual investors”. In: The journal of Finance 55.2, pp. 773–806. doi: 10.1111/0022-1082.00226.

Beasley, J. E. (1990). “OR-Library: Distributing Test Problems by Electronic Mail”. In: The Journal of the Operational Research Society 41.11, p. 1069. doi: 10.2307/2582903. url: http://www.jstor.org/stable/2582903.

Beasley, J.E., N. Meade, and T.-J. Chang (2003). “An evolutionary heuristic for the index tracking problem”. In: European Journal of Operational Research 148.3, pp. 621–643. issn: 0377-2217. doi: 10.1016/s0377-2217(02)00425-3.

Bertsimas, Dimitris and Romy Shioda (2009). “Algorithm for cardinality-constrained quadratic optimization”. In: Computational Optimization and Applications 43.1, pp. 1–22. doi: 10.1007/s10589-007-9126-9.

Biau, Gérard and David M. Mason (2015). “High-Dimensional $p$-Norms”. In: Mathematical Statistics and Limit Theorems. Ed. by Marc Hallin, David M. Mason, Dietmar Pfeifer, and Josef G. Steinebach. Cham: Springer International Publishing, pp. 21–40. isbn: 978-3-319-12442-1. doi: 10.1007/978-3-319-12442-1_3.

Bienstock, Daniel (1996). “Computational study of a family of mixed-integer quadratic programming problems”. In: Mathematical Programming 74.2, pp. 121–140. issn: 1436-4646. doi: 10.1007/bf02592208.

Bleichrodt, Han, Jose Luis Pinto, and Peter P. Wakker (2001). “Making Descriptive Use of Prospect Theory to Improve the Prescriptive Use of Expected Utility”. In: Management Science 47.11, pp. 1498–1514. doi: 10.1287/mnsc.47.11.1498.10248.


Borchers, Brian and John E. Mitchell (1994). “An improved branch and bound algorithm for mixed integer nonlinear programs”. In: Computers & Operations Research 21.4, pp. 359–367. issn: 0305-0548. doi: 10.1016/0305-0548(94)90024-8.

Braun, Benjamin (2016). “From performativity to political economy: index investing, ETFs and asset manager capitalism”. In: New Political Economy 21.3, pp. 257–273. doi: 10.1080/13563467.2016.1094045.

Chang, Andrew C., Phillip Li, et al. (2018). “Is Economics Research Replicable? Sixty Published Papers from Thirteen Journals Say “Often Not””. In: Critical Finance Review 7.

Chang, T.-J., N. Meade, J.E. Beasley, and Y.M. Sharaiha (2000). “Heuristics for cardinality constrained portfolio optimisation”. In: Computers & Operations Research 27.13, pp. 1271–1302. issn: 0305-0548. doi: 10.1016/s0305-0548(99)00074-x.

De Jong, Kenneth Alan (1975). “An Analysis of the Behavior of a Class of Genetic Adaptive Systems.” AAI7609381. PhD thesis. Ann Arbor, MI, USA.

Deville, Laurent (2008). “Exchange Traded Funds: History, Trading, and Research”. In: Handbook of Financial Engineering. Ed. by Constantin Zopounidis, Michael Doumpos, and Panos M. Pardalos. Boston, MA: Springer US, pp. 67–98. isbn: 978-0-387-76682-9. doi: 10.1007/978-0-387-76682-9_4. url: https://doi.org/10.1007/978-0-387-76682-9_4.

Elton, Edwin J and Martin J Gruber (1997). “Modern portfolio theory, 1950 to date”. In: Journal of Banking & Finance 21.11-12, pp. 1743–1759. issn: 0378-4266. doi: 10.1016/s0378-4266(97)00048-4.

Farmer, J. Doyne and Duncan Foley (2009). “The economy needs agent-based modelling”. In: Nature 460.7256, pp. 685–686. doi: 10.1038/460685a. url: https://doi.org/10.1038/460685a.

French, Kenneth R. (2008). “Presidential Address: The Cost of Active Investing”. In: The Journal of Finance 63.4, pp. 1537–1573. doi: 10.1111/j.1540-6261.2008.01368.x.

Goldberg, David E. (1989). Genetic Algorithms in Search, Optimization and Machine Learning. 1st. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc. isbn: 0201157675.

Golub, Gene H. and Charles F. Van Loan (2013). Matrix computations. Vol. 4. The Johns Hopkins University Press.

Gong, Chao, Chunhui Xu, and Ji Wang (2017). “An Efficient Adaptive Real Coded Genetic Algorithm to Solve the Portfolio Choice Problem Under Cumulative Prospect Theory”. In: Computational Economics 52.1, pp. 227–252. issn: 1572-9974. doi: 10.1007/s10614-017-9669-5. url: https://doi.org/10.1007/s10614-017-9669-5.

Grishina, N., C. A. Lucas, and P. Date (2016). “Prospect theory–based portfolio optimization: an empirical study and analysis using intelligent algorithms”. In: Quantitative Finance 17.3, pp. 353–367. doi: 10.1080/14697688.2016.1149611.

Grishina, Nina (Aug. 2014). “A Behavioural Approach to Financial Portfolio Selection Problem: an Empirical Study Using Heuristics”. PhD thesis. School of Information Systems, Computing and Mathematics Brunel University.

Holland, John H. (1962). “Outline for a Logical Theory of Adaptive Systems”. In: Journal of the ACM (JACM) 9.3, pp. 297–314. doi: 10.1145/321127.321128.

Holland, John Henry (1992). Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. MIT press.

Homchenko, Andrei Anatol’evich, Nina Pavlovna Grishina, Cormac Lucas, and Sergei Petrovich Sidorov (2013). “Differential evolution algorithm for solving the portfolio optimization problem”. In: Izvestiya of Saratov University. New Series. Series Mathematics. Mechanics. Informatics 13.4, pp. 88–91.

Hommes, C.H. (2001). “Financial markets as nonlinear adaptive evolutionary systems”. In: Quantitative Finance 1.1, pp. 149–167. doi: 10.1080/713665542. eprint: https://doi.org/10.1080/713665542. url: https://doi.org/10.1080/713665542.

Jarrow, Robert (1980). “Heterogeneous Expectations, Restrictions on Short Sales, and Equilibrium Asset Prices”. In: The Journal of Finance 35.5, pp. 1105–1113. issn: 0022-1082. doi: 10.1111/j.1540-6261.1980.tb02198.x.

Jobst, N.J., M.D. Horniman, C.A. Lucas, and G. Mitra (2001). “Computational aspects of alternative portfolio selection models in the presence of discrete asset choice constraints”. In: Quantitative Finance 1.5, pp. 489–501. doi: 10.1088/1469-7688/1/5/301. eprint: https://doi.org/10.1088/1469-7688/1/5/301. url: https://doi.org/10.1088/1469-7688/1/5/301.

Kahneman, Daniel and Amos Tversky (1974). “Judgment under Uncertainty: Heuristics and Biases”. In: Science 185.4157, pp. 1124–1131. doi: 10.1126/science.185.4157.1124.

– (1979). “Prospect Theory: An Analysis of Decision under Risk”. In: Econometrica 47.2, pp. 263–291. doi: 10.2307/1914185.

– (1992). “Advances in prospect theory: Cumulative representation of uncertainty”. In: Journal of Risk and Uncertainty 5.4, pp. 297–323. doi: 10.1007/bf00122574.

Karlow, Dennis (2012). “Comparison and development of methods for index tracking”. en. Hochschulschrift, Dissertation, Thesis. url: http://nbn-resolving.de/urn:nbn:de:101:1-20140725581.

Kealy, Lisa, Andrew Melville, Kieran Daly, Pierre Kempeneer, Matt Forstenhausler, Mark D. Michel, and Julie Kerr (2017). Reshaping around the investor – Global ETF Research 2017. url: https://www.ey.com/Publication/vwLUAssets/ey-global-etf-survey-2017/%24FILE/ey-global-etf-survey-2017.pdf.

King, Gary (2006). “Publication, Publication”. In: PS: Political Science & Politics 39.01, pp. 119–125. doi: 10.1017/s1049096506060252. url: http://gking.harvard.edu/papers.

Konno, Hiroshi and Hiroaki Yamazaki (1991). “Mean-Absolute Deviation Portfolio Optimization Model and Its Applications to Tokyo Stock Market”. In: Management Science 37.5, pp. 519–531. doi: 10.1287/mnsc.37.5.519. eprint: https://doi.org/10.1287/mnsc.37.5.519. url: https://doi.org/10.1287/mnsc.37.5.519.

Mangasarian, O. L. (2006). “Absolute value equation solution via concave minimization”. In: Optimization Letters 1.1, pp. 3–8. issn: 1862-4480. doi: 10.1007/s11590-006-0005-6. url: https://doi.org/10.1007/s11590-006-0005-6.

Markowitz, Harry M. (1952). “Portfolio Selection”. In: The Journal of Finance 7.1, pp. 77–91. doi: 10.2307/2975974.

– (1959). Portfolio Selection. Efficient Diversification of Investments. Portfolio Selection: Efficient Diversification of Investments. John Wiley & Sons, New York, USA. 344 pp.

– (1991). “Foundations of Portfolio Theory”. In: The Journal of Finance 46.2, pp. 469–477. doi: 10.1111/j.1540-6261.1991.tb02669.x.

McQuillin, Ben and Robert Sugden (2012). “Reconciling normative and behavioural economics: the problems to be solved”. In: Social Choice and Welfare 38.4, pp. 553–567. issn: 1432-217X. doi: 10.1007/s00355-011-0627-1.

Michalewicz, Zbigniew and David B Fogel (2004). How to Solve It: Modern Heuristics. Springer Science & Business Media.

Mueller-Langer, Frank, Benedikt Fecher, Dietmar Harhoff, and Gert Wagner (2017). “The Economics of Replication”. In: Max Planck Institute for Innovation & Competition Research Paper SSRN Electronic Journal 17-03. doi: 10.2139/ssrn.2914225.

Mutunge, Purity and Dag Haugland (2018). “Minimizing the tracking error of cardinality constrained portfolios”. In: Computers & Operations Research 90, pp. 33–41. issn: 0305-0548. doi: 10.1016/j.cor.2017.09.002. url: http://www.sciencedirect.com/science/article/pii/S0305054817302265.

Oh, Kyong Joo, Tae Yoon Kim, and Sungky Min (2005). “Using genetic algorithm to support portfolio optimization for index fund management”. In: Expert Systems with Applications 28.2, pp. 371–379. doi: 10.1016/j.eswa.2004.10.014.

Pope, Peter F. and Pradeep K. Yadav (1994). “Discovering Errors in Tracking Error”. In: The Journal of Portfolio Management 20.2, pp. 27–32. issn: 0095-4918. doi: 10.3905/jpm.1994.409471. eprint: https://jpm.iijournals.com/content/20/2/27.full.pdf. url: https://jpm.iijournals.com/content/20/2/27.

Price, Kenneth, Rainer M. Storn, and Jouni A. Lampinen (2006). Differential evolution: a practical approach to global optimization. Springer Science & Business Media.

Rieger, Marc Oliver, Mei Wang, and Thorsten Hens (2017). “Estimating cumulative prospect theory parameters from an international survey”. In: Theory and Decision 82.4, pp. 567–596. issn: 1573-7187. doi: 10.1007/s11238-016-9582-8. url: https://doi.org/10.1007/s11238-016-9582-8.

Roll, Richard (1992). “A Mean/Variance Analysis of Tracking Error”. In: The Journal of Portfolio Management 18.4, pp. 13–22. doi: 10.3905/jpm.1992.701922.

Rudd, Andrew (1980). “Optimal Selection of Passive Portfolios”. In: Financial Management 9.1, pp. 57–66. doi: 10.2307/3665314.

Scozzari, Andrea, Fabio Tardella, Sandra Paterlini, and Thiemo Krink (2012). “Exact and heuristic approaches for the index tracking problem with UCITS constraints”. In: Annals of Operations Research 205.1, pp. 235–250. issn: 1572-9338. doi: 10.1007/s10479-012-1207-1. url: https://doi.org/10.1007/s10479-012-1207-1.

Sharpe, William F. (1963). “A Simplified Model for Portfolio Analysis”. In: Management Science 9.2, pp. 277–293. issn: 00251909, 15265501. doi: 10.1287/mnsc.9.2.277.

Sharpe, William F. (1964). “Capital Asset Prices: A Theory of Market Equilibrium under Conditions of Risk”. In: The Journal of Finance 19.3, pp. 425–442. doi: 10.2307/2977928.

Speranza, M. Grazia (1996). “A heuristic algorithm for a portfolio optimization model applied to the Milan stock market”. In: Computers & Operations Research 23.5, pp. 433–441. issn: 0305-0548. doi: 10.1016/0305-0548(95)00030-5. url: http://www.sciencedirect.com/science/article/pii/0305054895000305.

Statman, Meir (2018). “Behavioral Finance Lessons for Asset Managers”. In: The Journal of Portfolio Management 44.7, pp. 135–147. doi: 10.3905/jpm.2018.44.7.135.

Stewart, Neil, Stian Reimers, and Adam J. L. Harris (2015). “On the Origin of Utility, Weighting, and Discounting Functions: How They Get Their Shapes and How to Change Their Shapes”. In: Management Science 61.3, pp. 687–705. doi: 10.1287/mnsc.2013.1853. eprint: https://doi.org/10.1287/mnsc.2013.1853. url: https://doi.org/10.1287/mnsc.2013.1853.

Stone, M. (1974). “Cross-Validatory Choice and Assessment of Statistical Predictions”. In: Journal of the Royal Statistical Society: Series B (Methodological) 36.2, pp. 111–133. doi: 10.1111/j.2517-6161.1974.tb00994.x.

Storn, Rainer and Kenneth Price (1997). “Differential Evolution – A Simple and Efficient Heuristic for global Optimization over Continuous Spaces”. In: Journal of Global Optimization 11.4, pp. 341–359. issn: 1573-2916. doi: 10.1023/a:1008202821328. url: https://doi.org/10.1023/A:1008202821328.

Sugden, Robert (2017). “Do people really want to be nudged towards healthy lifestyles?” In: International Review of Economics 64.2, pp. 113–123. issn: 1863-4613. doi: 10.1007/s12232-016-0264-1.

Takeda, Akiko, Mahesan Niranjan, Jun ya Gotoh, and Yoshinobu Kawahara (2012). “Simultaneous pursuit of out-of-sample performance and sparsity in index tracking portfolios”. In: Computational Management Science 10.1, pp. 21–49. doi: 10.1007/s10287-012-0158-y.

Treynor, Jack L. and Fischer Black (1973). “How to Use Security Analysis to Improve Portfolio Selection”. In: The Journal of Business 46.1. doi: 10.1086/295508.

Wilding, T (2003). “Using genetic algorithms to construct portfolios”. In: Advances in Portfolio Construction and Implementation. Ed. by Stephen Satchell and Alan Scowcroft. Quantitative Finance. Oxford: Elsevier, pp. 135–160. isbn: 978-0-7506-5448-7. doi: 10.1016/b978-075065448-7.50007-0. url: http://www.sciencedirect.com/science/article/pii/B9780750654487500070.

Woodside-Oriakhi, M., C. Lucas, and J.E. Beasley (2011). “Heuristic algorithms for the cardinality constrained efficient frontier”. In: European Journal of Operational Research 213.3, pp. 538–550. doi: 10.1016/j.ejor.2011.03.030.

A. Appendix

A.1. Replicated Genetic Algorithm Pseudo-codes

This section provides further pseudo-code representations for the specific operations of the replicated Genetic Algorithm.

Algorithm 4 Genetic Algorithm Operation: Crossover and Mutation
for $n \leftarrow 1$ to $N$ do // loop through asset weights, with $x_{s,n}$ being the $n$th element of $x_s$
    if $x_{m_j,n} > 0 \wedge x_{m_k,n} > 0$ then // crossover from both parents
        $c_{i,n} \leftarrow \chi \cdot x_{m_j,n} + (1 - \chi) \cdot x_{m_k,n}$ with $\chi \sim U(0, 1)$
    end if
    if $x_{m_j,n} = 0 \wedge x_{m_k,n} > 0$ then // crossover only from parent $m_k$
        with probability $\pi_{CR}$: $c_{i,n} \leftarrow x_{m_k,n}$ else $c_{i,n} \leftarrow 0$
    end if
    if $x_{m_j,n} > 0 \wedge x_{m_k,n} = 0$ then // crossover only from parent $m_j$
        with probability $\pi_{CR}$: $c_{i,n} \leftarrow x_{m_j,n}$ else $c_{i,n} \leftarrow 0$
    end if
    if $x_{m_j,n} = 0 \wedge x_{m_k,n} = 0$ then $c_{i,n} \leftarrow 0$ // crossover from neither
    end if
    if $c_{i,n} = 0$ then with probability $z$: $c_{i,n} \leftarrow u \sim U(0, 1)$ // mutation
    end if
end for
return $c_i$
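The crossover and mutation rules of algorithm 4 can be sketched in Python as follows. This is only an illustrative sketch, not the code used for the replication (which was written in R): the function name `crossover_and_mutate`, the argument names `pi_cr` and `z`, and the optional `rng` argument are assumptions introduced here.

```python
import random


def crossover_and_mutate(parent_j, parent_k, pi_cr, z, rng=random):
    """Build one child from two parent portfolios, asset by asset.

    pi_cr: probability of inheriting a weight held by only one parent.
    z:     probability of mutating a zero weight (hypothetical argument name).
    """
    child = []
    for x_j, x_k in zip(parent_j, parent_k):
        if x_j > 0 and x_k > 0:
            chi = rng.random()                     # chi ~ U(0, 1)
            c = chi * x_j + (1.0 - chi) * x_k      # blend of both parents
        elif x_j > 0 or x_k > 0:
            held = x_j if x_j > 0 else x_k         # weight of the single holder
            c = held if rng.random() < pi_cr else 0.0
        else:
            c = 0.0                                # neither parent holds the asset
        if c == 0.0 and rng.random() < z:
            c = rng.random()                       # mutation: u ~ U(0, 1)
        child.append(c)
    return child
```

The returned child is not yet a valid portfolio; the weights still have to pass the normalization and constraint checks of algorithm 5.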

Algorithm 5 Genetic Algorithm Operation: Cardinality and buy-in constraints check
$c_{i,n} \leftarrow c_{i,n} / \sum_{n=1}^{N} |c_{i,n}|, \ \forall n$ // normalization of weights, eq. (28b)
if $c_{i,n} < \varepsilon$ or $c_{i,n} > \delta$ then $c_{i,n} \leftarrow 0$ // buy-in 'elimination', eq. (28c), for all $n$
end if
while $\#\{c_{i,n} \mid c_{i,n} > 0, \forall n\} > K$ do // cardinality constraint, eq. (28d)
    get random $\{n \mid c_{i,n} > 0\}$ // choose random positive asset weight
    $c_{i,n} \leftarrow 0$ // replace with zero until CC satisfied
end while
while $\#\{c_{i,n} \mid c_{i,n} > 0, \forall n\} < 3$ do // avoiding portfolios under 3 assets
    get random $\{n \mid c_{i,n} = 0\}$ // choose random zero asset weight
    $c_{i,n} \leftarrow 0.1 + 0.9 \cdot u$ with $u \sim U(0, 1)$ // replace with random positive weight
end while
$c_{i,n} \leftarrow c_{i,n} / \sum_{n=1}^{N} |c_{i,n}|, \ \forall n$ // normalization of weights, eq. (28b)
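A minimal Python sketch of the checks in algorithm 5 is given below. The names `normalize`, `repair`, `eps`, `delta` and `min_assets` are assumptions made for illustration, as is the guard against an all-zero input; the sketch further assumes that the portfolio has at least three assets.

```python
import random


def normalize(weights):
    """Scale weights by the sum of their absolute values (cf. eq. (28b))."""
    total = sum(abs(w) for w in weights)
    if total == 0:                      # assumption: leave an all-zero vector untouched
        return list(weights)
    return [w / total for w in weights]


def repair(child, eps, delta, K, min_assets=3, rng=random):
    """Apply buy-in elimination, the cardinality limit and the 3-asset minimum."""
    c = normalize(child)
    # buy-in 'elimination': weights outside [eps, delta] are set to zero
    c = [w if eps <= w <= delta else 0.0 for w in c]
    held = [n for n, w in enumerate(c) if w > 0]
    while len(held) > K:                # drop random positions until at most K remain
        n = rng.choice(held)
        c[n] = 0.0
        held.remove(n)
    empty = [n for n, w in enumerate(c) if w == 0]
    while len(held) < min_assets:       # refill random zero positions
        n = rng.choice(empty)
        c[n] = 0.1 + 0.9 * rng.random()
        empty.remove(n)
        held.append(n)
    return normalize(c)
```

As in the pseudo-code, the final normalization is not followed by a second buy-in check, so a rescaled weight may in principle leave the interval $[\varepsilon, \delta]$ again.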

A.2. Proofs

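Lemma 3.1 Proof. Let $f : \mathbb{R}^n \to \mathbb{R}$ and $x \in \mathbb{R}^n$. If $\exists x^*$ such that $x^* \in \{x \mid f(x) = \min_x f(x)\}$, then $f(x^*) \le f(x)$, $\forall x \in \mathbb{R}^n$. Furthermore, let $g : \mathbb{R} \to \mathbb{R}$ be a monotonically increasing transformation; then, if $a, b \in \mathbb{R}$ and $a \le b$, it must follow that $g(a) \le g(b)$. Thus, $g(f(x^*)) \le g(f(x))$, $\forall x$. $\blacksquare$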
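The statement of the lemma can be checked numerically on a small grid. The following Python sketch uses an arbitrarily chosen $f$ (a shifted parabola) and $g$ (the square root); both are assumptions made purely for illustration and are not taken from the thesis.

```python
import math


def argmin(values):
    """Index of the smallest element (the first one in case of ties)."""
    return min(range(len(values)), key=values.__getitem__)


grid = [i / 10 for i in range(-30, 31)]        # x in [-3, 3] in steps of 0.1
f = [(x - 1.0) ** 2 + 0.5 for x in grid]       # hypothetical objective f(x)
g_of_f = [math.sqrt(v) for v in f]             # g = sqrt is increasing on [0, inf)

# the minimizer of f is also the minimizer of g(f(x))
same_minimizer = argmin(f) == argmin(g_of_f)
```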

A.3. Basic PT model

The basic or 'limited' prospect theory model (PT) is used by Grishina et al. (2016) and Grishina (2014) to derive the parameters of both the differential evolution and the genetic algorithm used for the complete PTIT model. Additionally, its results are used to determine the quality of the replicated algorithm in section 7.1.

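$$
\begin{aligned}
\max\ & \mathrm{PT}(x, r_0) = \sum_{t=1}^{T} p_t\, v_{pt}(r_{x,t} - r_0) && \text{(32a)} \\
\text{s.t.}\ & r_{x,t} = \sum_{i=1}^{N} w_i \mu_i \ge d, && \text{(32b)} \\
& \sum_{i=1}^{N} w_i = 1, && \text{(32c)} \\
& w_i \ge 0, \quad i = 1, \dots, N && \text{(32d)}
\end{aligned}
$$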

$r_0$ is a constant value set by Grishina (2014, p. 51). $r_{x,t} = \sum_{i=1}^{N} w_i \mu_i$ is the mean portfolio return, with each asset having mean return $\mu_i = \frac{1}{T} \sum_{t=1}^{T} \mu_{i,t}$ over the analyzed time periods $t$. $d = \max(r_x) - (\max(r_x) - \min(r_x)) \cdot 0.25$ is the minimum return constraint, where $\max(r_{x,t})$ and $\min(r_{x,t})$ are calculated within each data-set (Grishina 2014, p. 50). Grishina (2014, p. 51) uses the first 100 periods for the calculations of $d$ and the basic PT model values. If no portfolio in a population satisfies eq. (32b), $d$ is lowered iteratively (each iteration lowers the limit by 0.0002) until a solution is found (Grishina 2014, p. 84). The values for the static reference point $r_0$ and the return constraint $d$ are displayed below, in table 13.

Basic PT variables

Index        d         r0
Hang Seng    0.0118    0.0005
DAX          0.006     0.000025
FTSE         0.0077    0.000025
S&P          0.0109    0.00005
Nikkei       0.0005    0.000001

Table 13: PT model variables; d: mean return minimum; r0: static reference point as chosen by Grishina (2014, p. 51).
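The rule for the minimum return $d$ and its iterative relaxation can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that `returns` holds the values over which the maximum and minimum are taken and that `portfolio_mean_returns` holds the mean returns of the portfolios in the current population; how exactly these values were obtained is not reproduced here.

```python
def min_return_threshold(returns, fraction=0.25):
    """d = max - (max - min) * fraction, as in the definition of d above."""
    hi, lo = max(returns), min(returns)
    return hi - (hi - lo) * fraction


def relax_until_feasible(d, portfolio_mean_returns, step=0.0002):
    """Lower d by `step` until at least one portfolio satisfies r >= d."""
    if not portfolio_mean_returns:
        raise ValueError("population must contain at least one portfolio")
    while not any(r >= d for r in portfolio_mean_returns):
        d -= step
    return d


# hypothetical numbers, for illustration only
d = min_return_threshold([0.002, 0.010, 0.006])
relaxed = relax_until_feasible(d, [0.0071, 0.0065])
```

With these made-up inputs, $d$ starts at 0.008 and is lowered in steps of 0.0002 until the best portfolio (mean return 0.0071) becomes feasible.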

A.4. System Specifications

Component       Original                          Replicated
CPU             Intel Core i3-2310M @2.10 GHz     Intel Core i5-8400 CPU @2.80GHz
RAM             8.0GB                             16GB
OS              MS Windows 7 (64-bit SP 1)        Windows 10 Education 64-bit
CPLEX           AMPL with v12.5.1.0               Rcplex 0-0.3 with v12.8.0
Software        MATLAB (v.unknown)                R 3.5.1
time measure    "CPU time" (specifics unknown)    "wall-clock" time (function Sys.time())

Table 14: Differences in hardware and software between Grishina et al. (2016) (Original) and this thesis (Replicated).
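The two time measures in table 14 are not interchangeable: CPU time only counts the time the processor spends on the process, while wall-clock time also includes waiting. The following Python sketch illustrates the difference using the standard-library timers `time.process_time()` and `time.perf_counter()`; it is not the measurement code of either study, and the helper `timed` is a hypothetical name.

```python
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, wall-clock seconds, CPU seconds)."""
    wall_start = time.perf_counter()    # elapsed real time
    cpu_start = time.process_time()     # CPU time of the current process
    result = fn(*args)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return result, wall, cpu


# sleeping consumes wall-clock time but almost no CPU time
_, wall_seconds, cpu_seconds = timed(time.sleep, 0.05)
```

For a purely computational task on an otherwise idle machine the two values are typically close, but they can diverge whenever the process waits or is preempted.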

A.5. Original Genetic Algorithm Pseudocode

Original Genetic Algorithm pseudo-code in (Grishina et al. 2016)
1: Generate initial population $x_i \in D_k$, $i = 1, \dots, P^2$
2: cycle of $G$ generations
3: sort $\mathrm{PTIT}(x_{m_1}) \ge \dots \ge \mathrm{PTIT}(x_{m_{P^2}})$
4: save $\max \mathrm{PTIT}(x_i)$
5: $x_{m_1}, \dots, x_{m_{2P}} = y_1, \dots, y_{2P}$ proceed to the next generation
6: randomly pick $x_j$ and $x_k$ in the set $\{x_{m_{2P}}, \dots, x_{m_{P^2}}\}$
7: $\forall\, i, j, k, l$; $i, j, k = 2P + 1, \dots, P^2$; $l = 1, \dots, N$
8: if $x_j = w_j$ and $x_j = w_k$
9: then $a_{ij} = \chi \cdot w_j + (1 - \chi) \cdot w_k$, $\chi \in U(0, 1)$
10: else if $x_j = 0$ and $x_j = 0$
11: then $a_{ij} = 0$
12: else if $x_j = w_j$ and $x_j = 0$
13: then with $\pi$ $a_{ij} = w_j$
14: with mutation probability $\zeta > 0$
15: $a_{ij} \leftarrow a_{ij}$, $a_{ij} \in U(0, 1)$
16: choose $\max\{\mathrm{PTIT} : y_i \in \{a_i, x_j, x_k\}\}$
17: find $\mathrm{PTIT}(y_i^*) = \max\{\max \mathrm{PTIT}(y_i), \max \mathrm{PTIT}(y_{P^2})\}$
18: $y_i^*$ is an optimal solution

A.6. Original Differential Evolution Algorithm

The Differential Evolution algorithm as described by Grishina et al. (2016) is assumed to be based on the preceding doctoral thesis of Grishina (2014) and the work of Homchenko et al. (2013). The latter work was cited by Grishina et al. (2016) directly with regard to the optimization by their differential evolution algorithm.

The original pseudo-code included in Grishina et al. (2016) for the Differential Evolution algorithm can be seen in algorithm 6.

Algorithm 6 Original Differential Evolution algorithm pseudo-code for the PTIT (Grishina et al. 2016)
1: Generate initial population $x_i \in D_k$, $i = 1, \dots, P^2$
2: cycle of $G$ generations
3: for each $x_i$ in population $P$ do
4:     choose 3 random vectors $x_a \neq x_b \neq x_c \neq x_i$
5:     for each component $j$ of $x_i$ do
6:         with probability $\pi_1$: $z_1 \leftarrow N(0, \sigma_1)$,
7:         else $z_1 = 0$
8:         with probability $\pi_2$: $z_2 \leftarrow N(0, \sigma_2)$,
9:         else $z_2 = 0$
10:        pick $u_j \sim U(0, 1)$
11:        if $u_j < CR$ or $j = R$
12:            $\tilde{x}_{ij} = x_{aj} + (F + z_1)(x_{bj} - x_{cj} + z_2)$
13:        else $\tilde{x}_{ij} = x_{ij}$
14:        if $\mathrm{PTIT}(\tilde{x}_i) > \mathrm{PTIT}(x_i)$
15:            $y_i = \tilde{x}_i$
16:        else $y_i = x_i$
17:    end for
18: end for
19: In $g = G$ find $y_i = \{y_i \mid \max\{\mathrm{PTIT,CC}(y_1), \dots, \mathrm{PTIT,CC}(y_{P^2})\}\}$

Here, the $P^2$ portfolios of the initial population lie in the set $D_k$, which is defined to contain exactly $K$ positive weights (see section 9.2). The other formulations are directly comparable to the formulations in this thesis.
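To make the component-wise update of algorithm 6 concrete, a minimal Python sketch of the trial-vector construction and the greedy selection is given below. It deliberately contains no normalization or constraint check, mirroring the original pseudo-code. The function names are hypothetical; treating $\sigma_1$ and $\sigma_2$ as standard deviations and $R$ as a randomly drawn component index (as in the scheme of Storn et al. (1997)) are assumptions.

```python
import random


def de_trial_vector(x_i, x_a, x_b, x_c, F, CR,
                    pi1, sigma1, pi2, sigma2, rng=random):
    """Build the trial vector for x_i from three other population members."""
    n = len(x_i)
    R = rng.randrange(n)                 # assumed: index that is always perturbed
    trial = list(x_i)
    for j in range(n):
        # optional Gaussian noise terms, each active with its own probability
        z1 = rng.gauss(0.0, sigma1) if rng.random() < pi1 else 0.0
        z2 = rng.gauss(0.0, sigma2) if rng.random() < pi2 else 0.0
        if rng.random() < CR or j == R:
            trial[j] = x_a[j] + (F + z1) * (x_b[j] - x_c[j] + z2)
        # otherwise the component of x_i is kept unchanged
    return trial


def de_select(x_i, trial, objective):
    """Keep the trial vector only if it strictly improves the objective."""
    return trial if objective(trial) > objective(x_i) else x_i
```

Here `objective` stands in for the PTIT objective function, which is not reproduced in this sketch.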

The following list illustrates the unclear or inconsistent parts of algorithm 6 and of the descriptions in Grishina et al. (2016), which had to be addressed in order to arrive at the replication (algorithm 3):

• $\tilde{x}_{ij} = v_{aj} + (F + z_1)(x_{bj} - x_{cj} + z_2)$ in the text description is assumed to be a typing error, i.e. it should be $x_{aj}$ instead of $v_{aj}$, since $v_{aj}$ has not been defined and the pseudo-code description includes $x_{aj}$ in a way comparable to the operation described in Storn et al. (1997). Additionally, in the descriptions in Grishina (2014, p. 40), $v$ is used as the symbol for all vectors of the equation.

• Definition of "P": In the paper's text description, $P$ is simply defined as the parameter $P \in \mathbb{N}$ determining the initial population size $P^2$, with no further explanation. This definition is inconsistent with its usage in line 3, where it is most likely used as a descriptor of the "current" population, similar to the role of $x_i$ in line 5. It is highly unlikely that it is meant to describe the population size as in line 1, since the output of the loop, $y$, has the size $P^2$. The notation is most likely just carried over from Homchenko et al. (2013), which is cited in the text and in fact denotes the current population as $P$. For clarity in this thesis, the notation for the size of the populations is changed to $S = P^2$.

• Missing normalization: In order for the solution of the differential evolution, i.e. the optimal portfolio, to satisfy the constraints set in eqs. (28b) to (28e), one of the following is necessary:

– The portfolios of the initial population all satisfy the conditions, and the mutation/crossover operations can under no circumstances change the weights in such a way that the conditions are no longer met.

– An appropriate step to filter out portfolios that do not satisfy the conditions is included, such that only the portfolios satisfying the conditions are transferred to the next population.

– An appropriate normalization step is included after the manipulations to change the portfolio so that it adheres to the conditions.

None of the above is the case in the pseudo-code of the differential evolution algorithm by Grishina et al. (2016). As an example, if $x_{aj} = 0.01$, $x_{bj} < x_{cj}$, $z_1 = z_2 = 0$ and $F = 0.05$, then the resulting transformed weight $\tilde{x}_{ij} < 0.01$ will be under the limit set in eq. (28c), i.e. the lower buy-in threshold 0.01, making the portfolio not a valid solution to the problem. The replication algorithm thus includes normalization steps based on the descriptions of the normalization steps for the genetic algorithm in Grishina et al. (2016) and other constraint checks as described by Grishina (2014).
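The example can be verified with a few lines of Python. The values of $x_{bj}$ and $x_{cj}$ below are hypothetical (any pair with $x_{bj} < x_{cj}$ leads to the same conclusion), and the function name `mutate_component` is introduced only for this sketch.

```python
LOWER_BUY_IN = 0.01  # lower buy-in threshold used in the example


def mutate_component(x_aj, x_bj, x_cj, F, z1=0.0, z2=0.0):
    """Mutation step of the differential evolution for one weight."""
    return x_aj + (F + z1) * (x_bj - x_cj + z2)


# x_aj sits exactly on the threshold; x_bj < x_cj pushes the weight below it
w = mutate_component(x_aj=0.01, x_bj=0.10, x_cj=0.20, F=0.05)
violates_buy_in = 0 < w < LOWER_BUY_IN
```

With these numbers the mutated weight is roughly 0.005, i.e. positive but below the lower buy-in threshold, so the portfolio would be infeasible without a subsequent repair step.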

Declaration

I declare that I have written my master's thesis “Implementation of Genetic Algorithms in Prospect Theory-based Portfolio Optimization” independently and without the use of any aids other than those indicated, and that I have marked as such all passages taken verbatim or in substance from publications. The thesis has not previously been submitted to any examination authority in the same or a similar form, or in excerpts.

I affirm that the submitted written version corresponds to the version stored on the enclosed medium.

Date, signature
