

Page 1: [IEEE 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012) - Istanbul (2012.08.26-2012.08.29)] 2012 IEEE/ACM International Conference on Advances

Stock Market Investment Advice: A Social Network Approach

Negar Koochakzadeh
Computer Science Department, University of Calgary
Calgary, Canada
[email protected]

Keivan Kianmehr
Electrical and Computer Engineering Department, Western University
London, Canada
[email protected]

Atieh Sarraf
Computer Science Department, University of Calgary
Calgary, Canada
[email protected]

Reda Alhajj
Computer Science Department, University of Calgary
Calgary, Canada
[email protected]

Abstract: Making investment decisions on the various stocks available in the market is a challenging task. Econometric and statistical models, as well as machine learning and data mining techniques, have produced heuristic-based solutions with limited long-range success. In practice, the capabilities and intelligence of financial experts are required to build a managed portfolio of stocks. However, it is too complicated for non-professional investors to make subjective judgments on available stocks, and thus they might be interested in following an expert's investment decisions. For this purpose, it is critical to find an expert with similar investment preferences. In this work, we propose to benefit from the power of Social Network Analysis in this domain. We first build a social network of financial experts based on their publicly available portfolios. This social network is then analyzed to recommend an appropriate managed portfolio to non-professional investors based on their behavioral similarity to the expert investors. The approach is evaluated through a case study on real portfolios. The results show that the proposed portfolio recommendation approach works well in terms of the Sharpe ratio as the portfolio performance metric.

Keywords: Stock Market Investment Decision, Social Network Analysis, Clustering, Classification, Sharpe Ratio.

I. INTRODUCTION

Stock market investment is a very complex and multifaceted decision problem. This decision process involves stock selection and weighting, such that the collection of stocks satisfies an investor’s objectives. It involves forecasting the performance and the volatility of available stocks, as well as models for using these predictions to obtain a portfolio of stocks that suits the investor’s preference profile. Financial theorists and investors have been dealing with this issue for many years.

Econometric and statistical models, as well as machine learning and data mining techniques, have been used by many researchers and analysts to propose heuristic solutions. In the 1950s Markowitz proposed modern portfolio theory, and since then researchers have considered the correlation between stocks and the trade-off between the return and risk of an investment [1]. Several researchers have also concentrated on developing realistic models which, in addition to the two basic criteria of return and risk, consider other equally important criteria derived from fundamental analysis [2, 3]. Fundamental analysis is the study of sector and company financial indicators to determine the value of a stock. It aims to determine the financial health of the company through ratios based on the company’s financial statements [2]. In general, multiple criteria have to be considered when making an investment decision [3]. Each stock instance can be described by a set of features that represent important financial information about the company that the stock represents.

Although a great deal of effort has been devoted to developing systems for stock investment decisions, limited success has been achieved and reported. Quantitative models for stock selection and portfolio management face the challenge of determining the most efficacious factors. In addition, it is believed that the main reason for the slow progress is the abrupt change over time in the structural relationship between the stock price and its determinants. This phenomenon of unstable structural parameters in asset price models is a special case of a general fundamental critique of econometric and statistical models [4].

In practice, financial experts apply their personal capabilities and intelligence (human thinking and skills) to solve the problem based on their knowledge of existing theories and strategies. In this case, the results of the investment decisions constitute what is called a managed portfolio. However, integrating domain expert knowledge into the process of making investment decisions is very costly, and experts need to concentrate on a limited number of available assets, since studying a huge number of them through this process is not feasible.

Although professional analysts and fund managers make subjective judgments based on objective technical indicators, it is too complicated for non-professionals to do so. This motivates our proposed approach, which studies the available portfolios of expert investors and recommends the most appropriate portfolio to the non-professional investor. Since behavioral similarity between investors is an important aspect of this approach, Social Network Analysis (SNA) is applied as a solution strategy. SNA has been emphasized since 1934 as a subarea of sociology and anthropology that studies the connectedness of people in groups. A social network consists of a set of actors and their relationships; in graph-theoretic terms, it is represented as a graph in which actors and their relationships correspond to nodes and links, respectively.

2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

978-0-7695-4799-2/12 $26.00 © 2012 IEEE

DOI 10.1109/ASONAM.2012.22

In this work, we first construct a social network of expert investors based on their publicly available portfolios. Each expert is a node in this network, and the weighted link between two nodes shows how similar the two experts are based on their portfolios. In the next step, communities of experts are detected in this social network. A non-professional investor is then assigned to an appropriate community of experts based on his/her similarity to the experts in each of the existing communities. Finally, one expert is selected as the representative of that community, and this expert’s portfolio is suggested to the non-professional investor.

Our paper proceeds as follows. Background and related studies are presented in Section II. In Section III, experts’ social network construction is explained in detail. Next in Section IV, we discuss the recommendation system to suggest a managed portfolio built by an expert to a non-professional investor. The proposed approach is then evaluated in a case study described in Section V. Finally, Section VI presents the conclusions drawn from our research and directions for future work.

II. BACKGROUND AND RELATED WORKS

In traditional investment decision approaches, investors focus only on maximizing the expected return without considering the concept of investment risk [1, 5]. In financial investments, it is important for investors to control and manage the risk to which they subject themselves while searching for high returns [6]. In general, investment opportunities that offer higher returns also entail higher risks [6]. Therefore, there is always a trade-off between risk and return in the investment decision process.

Markowitz was the first to quantify the link that exists between portfolio risk and return through which he founded the modern portfolio theory [1]. He demonstrated that the portfolio risk comes from the covariance of the assets making up the portfolio. Following this theory, other financial models (such as CAPM and APT) considered two criteria of risk and return as well as other fundamental criteria in the valuation of assets.

During the 1950s and 1960s, Markowitz and Sharpe introduced a model called the Capital Asset Pricing Model (CAPM) [3]. It evaluates the asset return in relation to the market return and the sensitivity of the asset to the market [3]. In other words, by measuring the market risk of each asset, CAPM yields a risk-adjusted expected return. This model is based on the assumption of the Efficient Market Hypothesis (EMH). According to EMH, any useful pattern should already be reflected in the current price [2].
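For reference, the relation described above is usually written in the following textbook form. The notation is not taken from this paper and is only an assumption for illustration: $R_f$ denotes a risk-free rate, $R_m$ the market return, and $\beta_i$ the sensitivity of asset i to the market.

```latex
E[R_i] = R_f + \beta_i \left( E[R_m] - R_f \right),
\qquad
\beta_i = \frac{\mathrm{Cov}(R_i, R_m)}{\mathrm{Var}(R_m)}
```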

Subsequently, the Arbitrage Pricing Theory (APT) came to be widely used in portfolio management as an alternative to CAPM. APT has the benefit of being a more powerful theory, requiring less stringent assumptions than CAPM while producing similar results. The difficulty with APT is that it shows that there is a way to forecast expected asset returns but does not specify how the process works. The key idea of APT is that there exists a set of factors such that expected returns can be described as a linear combination of each asset’s exposure to these factors [2].
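The linear combination mentioned above is commonly written as follows. The symbols are assumptions introduced here for illustration and do not appear in the paper: $\beta_{ik}$ is the exposure of asset i to factor k, $\lambda_k$ is the premium associated with factor k, $\lambda_0$ is the return of an asset with no factor exposure, and m is the number of factors.

```latex
E[R_i] = \lambda_0 + \sum_{k=1}^{m} \beta_{ik} \, \lambda_k
```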

Unrealistic assumptions and the time complexity of the required calculations make financial theories inapplicable to real-world problems [5]. Therefore, in practice, a more comprehensive solution is needed. Professional analysts and fund managers make subjective judgments based on objective technical indicators. Consequently, computer scientists have tried to apply AI with the purpose of replacing the intelligence of financial professionals. Currently, soft computing techniques are widely accepted in studying investment management and evaluating market behavior [7]. Various techniques have been proposed for this purpose, such as Neural Networks (NN), Genetic Algorithms (GA) and Support Vector Machines (SVM). Several researchers have presented encouraging results on stock selection using data mining techniques, e.g., [4, 8-14].

In [3] and [2], the interaction between the decision maker and the system was taken into consideration to evaluate stocks based on the investor’s preferences, in order to rank them and single out eligible ones to be included in the portfolio. The ranking carried out by [3] is based on the multi-criteria ranking method UTASTAR, which is an improved version of the UTA method. In [2] it is noted that, since the preferences of an investor are expressed in verbal form, the problem is usually burdened with uncertainty. Therefore, the authors focused on the synthesis of fuzzy logic methods. In their work, the rating of each stock and the weight of each criterion are expressed as linguistic terms represented by triangular fuzzy membership functions.

With the rise of eCommerce systems in the past decade, major internet retailers have begun to build recommender systems that personalize the content shown to their users through an information filtering process. Amazon.com was an early adopter, showing users personalized recommendations of items that the system predicted they would like based on the items they had bought or rated in the past. Since then, recommender systems have been widely and successfully used for movies (e.g., the EachMovie database), music, books, and documents. Since collaborative filtering recommendation systems carry the social characteristics of users, different concepts of social network analysis can be utilized to improve the accuracy and reliability of recommendations. Several studies have been conducted on the use of social networks in recommendation systems. For example, in [15] the authors use two different social networks in a system to recommend possible collaborations for individuals. Ogata et al. [16] use social networks to facilitate finding a person to collaborate with. In [17], trust clusters are used to improve the recommendation; the clusters are based on trust rather than similarity. Further, several trust-aware recommendation methods have been proposed [18], in which it is shown that the performance of traditional recommender systems can be improved by using users’ trust relations.

The main contribution of this paper is the design and development of a general framework that detects communities of experts in a social network and builds a recommendation system to support stock investment decisions. To the best of our knowledge, there has been no research so far on mining the available portfolios of financial experts with the purpose of recommending one of them to a non-professional investor. We believe that although financial experts can make professional investment decisions on stock markets, it is too complicated for non-professional investors to do so. Therefore, in this work we propose to study the portfolios of experts by applying social network analysis, with the purpose of considering the investment behavior of the users and suggesting a managed portfolio (built by an expert) to a naïve investor.

III. SOCIAL NETWORK OF FINANCIAL EXPERTS

A social network is a set of actors and the relations between them. Actors are represented as nodes of a graph and relations as links. This model is general enough to be applied to any domain where the entities can be separated into groups of actors and the relationships between actors can be realized as links.

We believe that constructing a social network of financial experts, based on their investment behavior, can be useful for further analysis with the purpose of recommending investments to non-professional investors. Financial experts differ in their investment behavior and preferences, and thus the similarity between them can be defined based on how similar their preferences are. In this work, we construct a virtual social network containing financial experts as the actors and their investment preference similarities as the links between them.

An investor’s portfolio is an important source of information that can be used to extract human investment behavior. For instance, characteristics such as how risk-tolerant the investor is can be measured from his/her available portfolio. Next, we explain how features about investors are extracted from their publicly available portfolios and how the social network is constructed based on these features. This social network is then used to categorize the experts into clusters and to apply further analysis in the proposed recommendation process.

A. Feature Extraction

In order to extract features about investors, we propose to measure the performance of their portfolios. In portfolio performance analysis, the return on an asset is the basic element employed to determine its performance [5]. To measure the return, we define a period as an interval of time during which an asset is held without being modified, and we assume that a company makes payments to its shareholders in the form of dividends at the end of the period. When a company earns a profit, the profit can be put to two uses: (1) re-invested in the business, or (2) paid to the shareholders as a dividend. Many companies retain a portion of their earnings and pay the remainder as a dividend [5]. The return of stock i at time T is calculated by Equation 1 [5].

$$R_{i,T} = \ln\left(\frac{P_{i,T} + D_{i,T}}{P_{i,T-\Delta t}}\right) \qquad (1)$$

where: $P_{i,T}$ is the price of the stock at time T; $P_{i,T-\Delta t}$ is the price of the stock at time $T-\Delta t$; and $D_{i,T}$ is the dividend paid at time T.

The average value of the return of stock i over a period of time is considered the expected return of stock i at the end of that period ($ER_{i,T}$) [5]. The expected returns of the stocks that make up the portfolio are then used to measure the expected return of the portfolio. Assuming that the portfolio has a fixed composition throughout the evaluation period, the expected return of a portfolio of n stocks at time T is given by Equation 2 [5].

$$ER_{P,T} = \sum_{i=1}^{n} x_{i,T} \, ER_{i,T} \qquad (2)$$

where: $ER_{P,T}$ is the expected return of the portfolio at time T; $ER_{i,T}$ is the expected return of stock i at time T; and $x_{i,T}$ is the weight of stock i in the portfolio at time T.
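The return computations of Equations 1 and 2 can be sketched as follows. This is a minimal illustration, assuming the log-return form $\ln((P_{i,T} + D_{i,T}) / P_{i,T-\Delta t})$ for Equation 1 and a weighted sum for Equation 2; the function names and the price series are hypothetical and are not taken from the case study.

```python
import math

def log_return(price_now, price_prev, dividend=0.0):
    """Return over one holding period: ln((P_T + D_T) / P_{T - dt})."""
    return math.log((price_now + dividend) / price_prev)

def expected_return(returns):
    """Expected return of one stock: the mean of its historical returns."""
    return sum(returns) / len(returns)

def portfolio_expected_return(weights, expected_returns):
    """Weighted sum of the stocks' expected returns (fixed composition)."""
    return sum(x * er for x, er in zip(weights, expected_returns))

# Hypothetical price histories for two stocks (illustrative values only).
prices_a = [100.0, 105.0, 110.25]   # grows 5% per period, no dividend
prices_b = [50.0, 50.0, 50.0]       # flat price, pays a dividend of 1.0 per period

returns_a = [log_return(p1, p0) for p0, p1 in zip(prices_a, prices_a[1:])]
returns_b = [log_return(p1, p0, dividend=1.0) for p0, p1 in zip(prices_b, prices_b[1:])]

er_a = expected_return(returns_a)
er_b = expected_return(returns_b)
er_portfolio = portfolio_expected_return([0.6, 0.4], [er_a, er_b])
```

Here the weights 0.6 and 0.4 stand for $x_{i,T}$ and are assumed to sum to 1.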

As discussed in modern portfolio theory, the concept of return is not sufficient on its own to analyze the performance of a portfolio [5]. To analyze portfolio performance more precisely, we need a quantitative measurement of risk. Investment risk can be defined as the possibility of earning a return that is less than expected [6]. To measure this concept, portfolio theory treats return as a random variable, because it is influenced by a significant number of uncertainties in both the future price and the future dividend [6]. If the mean of the historical returns of an asset is taken as its expected return, the probability of earning less than the mean can be measured as the risk of that asset. Theorists have noticed that return distributions are usually relatively symmetrical. In other words, return is approximately normally distributed, so a large left tail implies a large right tail as well. Based on this assumption, portfolio theory treats risk as the variability of the return: the risk of an asset is defined as the standard deviation of the probability distribution of its return [6]. Equation 3 gives the risk of stock i during the period ending at T [5].

$$\mathrm{Risk}_{i,T} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(R_{i,t} - \bar{R}_i\right)^2} \qquad (3)$$

where: $R_{i,t}$ is the t-th of the N returns of stock i observed during the period ending at T; and $\bar{R}_i$ is the average return of stock i during the period ending at T.
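As a small illustration of Equation 3, the risk of a stock can be computed as the standard deviation of its historical returns. The sketch below assumes the population form (dividing by the number of observations); the paper's extracted text does not show which normalization is used, and the function name is hypothetical.

```python
import math

def stock_risk(returns):
    """Risk of a stock: standard deviation of its returns around their mean.

    Assumption: population standard deviation (divide by N, not N - 1).
    """
    mean = sum(returns) / len(returns)
    variance = sum((r - mean) ** 2 for r in returns) / len(returns)
    return math.sqrt(variance)

# Hypothetical return series (illustrative values only).
volatile = stock_risk([0.02, 0.04, 0.06])
steady = stock_risk([0.05, 0.05])   # identical returns carry no variability
```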

The relative risks of stocks change entirely when the stocks are held in a portfolio rather than individually [6]. A complete characterization of portfolio risk requires that the behaviour of each stock’s return be compared with that of the other stocks. For this purpose, the covariance of the returns of each pair of stocks should be considered. Equation 4 shows how Markowitz measures the risk of the portfolio during the period ending at T [1]. Based on Equation 4, the correlation between stocks has a direct influence on the portfolio risk. In other words, the weaker the correlations between the stocks in the portfolio, the greater the reduction in portfolio risk.

$$\mathrm{Risk}_{P,T} = \sqrt{\sum_{i=1}^{n}\sum_{j=1}^{n} x_{i,T}\, x_{j,T}\, \rho_{ij,T}\, \mathrm{Risk}_{i,T}\, \mathrm{Risk}_{j,T}} \qquad (4)$$

where: $\rho_{ij,T}$ is the correlation coefficient (Equation 5); $\mathrm{Risk}_{i,T}$ is the risk of stock i (Equation 3); and $x_{i,T}$ is the weight of stock i in the portfolio at time T.

$$\rho_{ij,T} = \frac{\mathrm{Cov}(R_i, R_j)}{\mathrm{Risk}_{i,T}\, \mathrm{Risk}_{j,T}} \qquad (5)$$

where: $\mathrm{Cov}(R_i, R_j)$ is the covariance of the returns of stocks i and j during the period ending at T.

To measure the above two portfolio performance metrics (return and risk), we need to decide on the required parameters. First, we need to set $\Delta t$ (in Equation 1), which is usually equal to 1, 3 or 6 months; then we should specify the period of time considered as the available history of the stocks in the portfolio (in Equation 3).
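The effect of correlation on portfolio risk can be illustrated with a short sketch of Equations 4 and 5. It assumes the Markowitz form in which portfolio risk is the square root of the weighted double sum over pairwise correlations and individual risks, with population statistics; the helper names and return series are hypothetical.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def std(xs):
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def correlation(xs, ys):
    """Correlation coefficient: covariance divided by the product of the risks.

    Assumes both series have non-zero standard deviation.
    """
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (std(xs) * std(ys))

def portfolio_risk(weights, return_series):
    """Portfolio risk from weights, pairwise correlations and stock risks."""
    n = len(weights)
    risks = [std(series) for series in return_series]
    total = 0.0
    for i in range(n):
        for j in range(n):
            rho = 1.0 if i == j else correlation(return_series[i], return_series[j])
            total += weights[i] * weights[j] * rho * risks[i] * risks[j]
    return math.sqrt(max(total, 0.0))  # guard against tiny negative rounding error

# Hypothetical return series: b moves exactly opposite to a.
returns_a = [0.01, 0.03]
returns_b = [0.03, 0.01]

undiversified = portfolio_risk([0.5, 0.5], [returns_a, returns_a])
diversified = portfolio_risk([0.5, 0.5], [returns_a, returns_b])
```

With perfectly correlated holdings the portfolio keeps the full risk of the individual stock, while the perfectly negatively correlated pair cancels it almost entirely, which matches the observation that weaker correlations reduce portfolio risk.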

B. Social Network Construction

After extracting characteristic features about investors from their portfolios, the social network of investors is constructed using one of the network construction approaches we proposed and implemented in our SNA tool called NetDriller [19]. Next, we briefly explain the approach used in this paper, which is based on the results of clustering the actors.

In many applications, there are several items, each identified by a feature vector (a set of features). To apply SNA and obtain useful knowledge from the relationships that exist between the items, we need to construct a network of these items. To measure the similarity between each pair of items, we propose to apply the data mining technique of clustering. The items are clustered by the K-means algorithm [20] with different values of K (the number of clusters). The number of clustering solutions in which two items share a cluster is then used to measure the similarity between them.

When there are n items, n-2 clustering solutions are constructed by applying the K-means algorithm with K equal to 2, 3, …, n-1. The similarity of two items $a_i$ and $a_j$ is calculated by Equation 6, where CommonCluster($a_i$, $a_j$, k) is 1 if $a_i$ and $a_j$ appear in the same cluster in the clustering solution with k clusters, and 0 otherwise. Instead of applying all n-2 possible clustering solutions, it is also possible to take a specific range of numbers of clusters from the user.

$$\mathrm{Similarity}(a_i, a_j) = \frac{1}{n-2}\sum_{k=2}^{n-1} \mathrm{CommonCluster}(a_i, a_j, k) \qquad (6)$$

To demonstrate how the above methodology works, we apply it to a simple example. Figure 1 shows four items with three features and the network constructed by NetDriller based on the clustering solutions with 2 and 3 clusters. The 2 clusters are {a1, a3} and {a2, a4}, and the 3 clusters are {a1, a3}, {a2} and {a4}. Therefore, a1 and a3 are in the same cluster in both clustering solutions, and thus the weight between them is 1 (2 out of 2). a2 and a4 are in the same cluster in one of the clustering solutions, and thus the weight between them is 0.5 (1 out of 2).
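The worked example above can be reproduced with a few lines of code. This is only a sketch of the co-clustering similarity, not NetDriller's implementation: the clustering solutions are taken as given (for instance, as produced by K-means), and the function names are hypothetical.

```python
from itertools import combinations

def common_cluster(clustering, a, b):
    """1 if items a and b share a cluster in this clustering solution, else 0."""
    return int(any(a in cluster and b in cluster for cluster in clustering))

def similarity(clusterings, a, b):
    """Fraction of the clustering solutions in which a and b share a cluster."""
    return sum(common_cluster(c, a, b) for c in clusterings) / len(clusterings)

# The two clustering solutions from the four-item example (K = 2 and K = 3).
clusterings = [
    [{"a1", "a3"}, {"a2", "a4"}],
    [{"a1", "a3"}, {"a2"}, {"a4"}],
]
items = ["a1", "a2", "a3", "a4"]

# Link weight for every pair of items; pairs with weight 0 get no link.
weights = {(a, b): similarity(clusterings, a, b) for a, b in combinations(items, 2)}
links = {pair: w for pair, w in weights.items() if w > 0}
```

Dividing by the number of clustering solutions actually used, rather than by a fixed n-2, also covers the case where the user supplies a restricted range of cluster counts.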

According to modern portfolio theory, higher return comes with higher risk. Thus we believe that, by considering the two features of return and risk (extracted from each investor’s portfolio), the weight of a link in the social network (constructed as explained above) shows how similar two investors are in terms of their propensity for risk.

Figure 1. Network Construction based on Clustering

IV. INVESTMENT RECOMMENDATION SYSTEM

The constructed social network of expert investors is then used in our investment recommendation system. The goal of this system is to select the most appropriate managed portfolio (built by an expert) and recommend it to the non-professional investor. The propensity for risk is the key characteristic used to find the expert most similar to the investor. Next, we explain the clustering of experts and the classification of a non-professional investor, which assigns him/her to one of the discovered clusters of experts. After the most appropriate cluster for the investor is found, one of the experts in this cluster is picked as the representative of the cluster, and this expert’s portfolio is suggested to the investor.

A. Experts Clustering

As discussed in the social network construction section, the link between two experts is weighted based on clustering solutions with various numbers of clusters. In this step, we take one of these clustering solutions, discovered by the K-means algorithm, as the set of communities of experts.

Set                                Name  Risk  Return  Class Label
Experts (Training Set)             j     100   0.8     High Risk
Experts (Training Set)             d     25    0.23    Low Risk
Experts (Training Set)             k     103   0.76    High Risk
Experts (Training Set)             b     20    0.2     Low Risk
Experts (Training Set)             l     110   0.88    High Risk
Experts (Training Set)             g     70    0.44    Mid Risk
Experts (Training Set)             i     80    0.81    High Risk
Experts (Training Set)             f     50    0.4     Mid Risk
Experts (Training Set)             e     60    0.35    Mid Risk
Experts (Training Set)             a     30    0.1     Low Risk
Experts (Training Set)             h     90    0.72    High Risk
Experts (Training Set)             c     25    0.21    Low Risk
Non-professionals (Testing Set)    G     93    0.94    ?
Non-professionals (Testing Set)    C     26    0.3     ?
Non-professionals (Testing Set)    F     90    0.86    ?
Non-professionals (Testing Set)    D     27    0.25    ?
Non-professionals (Testing Set)    B     20    0.3     ?
Non-professionals (Testing Set)    E     70    0.79    ?
Non-professionals (Testing Set)    A     10    0.12    ?

Figure 2. Sample Training and Testing sets

The goal is to detect clusters of people with different propensities for risk, so that each cluster can be labeled by how risk-tolerant its members are. For this purpose, we use a five-point Likert scale: very high, high, medium, low and very low. The distribution of the risk values in each cluster is then used to label that cluster. Accepting the assumption from modern portfolio theory of a trade-off between risk and return, we believe that the risk values should be uniformly distributed among the five discovered clusters. This assumption is tested in our case study explained in the next section.

B. Non-Professional Investors Classification

The result of the experts’ clustering is then used as the training set to learn a classification model, with the purpose of assigning non-professionals to an appropriate expert community. Figure 2 illustrates sample training and testing sets containing 12 expert investors and 7 non-professionals. In this example, the two features of risk and return are extracted for both experts and non-professionals from their available portfolios.

Different classification techniques, such as Support Vector Machine, Naïve Bayes, or Decision Tree [20], can be used to build the classifier from the experts’ communities.
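To make the classification step concrete, the sketch below assigns non-professionals from Figure 2 to expert communities. It deliberately uses a nearest-centroid rule with min-max scaling as a simple stand-in; this is not one of the classifiers named above, and the function names are hypothetical. The expert data are the training rows of Figure 2.

```python
import math

# Training set from Figure 2: (risk, return, community label) per expert.
experts = [
    (100, 0.80, "High Risk"), (25, 0.23, "Low Risk"), (103, 0.76, "High Risk"),
    (20, 0.20, "Low Risk"), (110, 0.88, "High Risk"), (70, 0.44, "Mid Risk"),
    (80, 0.81, "High Risk"), (50, 0.40, "Mid Risk"), (60, 0.35, "Mid Risk"),
    (30, 0.10, "Low Risk"), (90, 0.72, "High Risk"), (25, 0.21, "Low Risk"),
]

def min_max_scaler(rows):
    """Build a function that rescales each feature to [0, 1] on the training data."""
    lows = [min(col) for col in zip(*rows)]
    highs = [max(col) for col in zip(*rows)]
    return lambda row: tuple((v - lo) / (hi - lo) for v, lo, hi in zip(row, lows, highs))

def fit_centroids(rows, labels, scale):
    """Mean scaled feature vector of each community."""
    grouped = {}
    for row, label in zip(rows, labels):
        grouped.setdefault(label, []).append(scale(row))
    return {label: tuple(sum(col) / len(col) for col in zip(*members))
            for label, members in grouped.items()}

def classify(row, centroids, scale):
    """Assign an investor to the community with the closest centroid."""
    scaled = scale(row)
    return min(centroids, key=lambda label: math.dist(scaled, centroids[label]))

features = [(risk, ret) for risk, ret, _ in experts]
labels = [label for _, _, label in experts]
scale = min_max_scaler(features)
centroids = fit_centroids(features, labels, scale)

# Three of the non-professionals from the testing set of Figure 2.
community_G = classify((93, 0.94), centroids, scale)
community_D = classify((27, 0.25), centroids, scale)
community_A = classify((10, 0.12), centroids, scale)
```

Scaling matters here because the risk values in Figure 2 are two orders of magnitude larger than the return values; without it, distance would be dominated by risk alone.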

C. Finding the Representative Expert in a Community

After assigning the non-professional investor to one of the experts’ communities, we suggest a representative expert in that community to the investor. To find the representative person in a social network, we apply SNA by measuring centrality metrics. Within the scope of graph theory and network analysis, there are various measures of the centrality of a vertex within a graph that determine its relative importance. Degree centrality and closeness centrality are two measures that are widely used in network analysis [21].

Degree centrality is defined as the number of ties that a node has in a graph. It can be interpreted in terms of the immediate risk of a node catching whatever is flowing through the network [21]. Closeness centrality is a measure of how long it takes to spread information from a node to all other nodes in the graph sequentially. In graphs, the distance between a pair of nodes is defined as the length of the shortest path between them. The closeness of a node is the inverse of how far the node is from all other nodes (the sum of its distances to all other nodes) [21].
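The two centrality measures can be sketched on a small community as follows. The sketch treats the community as an unweighted, connected graph (link weights are ignored, which is a simplifying assumption), and the edges between the four nodes are hypothetical rather than taken from the case study.

```python
from collections import deque

def degree_centrality(graph):
    """Number of ties of each node."""
    return {node: len(neighbors) for node, neighbors in graph.items()}

def closeness_centrality(graph):
    """Inverse of the sum of shortest-path distances to all other nodes.

    Distances are hop counts found by breadth-first search; the graph is
    assumed to be connected.
    """
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total = sum(dist.values())
        result[source] = 1.0 / total if total > 0 else 0.0
    return result

# Hypothetical community: expert b is linked to a, c and d.
community = {"a": {"b"}, "b": {"a", "c", "d"}, "c": {"b"}, "d": {"b"}}

degree = degree_centrality(community)
closeness = closeness_centrality(community)

# The node with the highest centrality serves as the representative.
representative = max(community, key=lambda node: closeness[node])
```

Restricting the graph to the members of one community before computing centrality corresponds to considering only the current subset of the network.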

In each community of investors, we measure centrality of all the nodes by only considering the current subset of the network, and then we select the node with the highest centrality as the representative of that community. To make the proposed recommendation system clear, we illustrate the whole process in an example. Figures 3 and 4 show social network construction and further steps on investors shown in Figure 2.

Figure 3 shows 8 clustering solutions, with k from 2 to 9, on the 12 expert investors (a, b, c, …, l) in our example. These clustering solutions are the result of the k-means algorithm on a dataset containing two features, Risk and Return, for the expert investors. As the number of clusters increases, only the more similar experts remain in the same cluster. For instance, k and j stay in the same cluster in all 8 solutions, while h and g are in the same cluster in only one solution out of 8.

[Figure 3: eight scatter-plot panels (K=2 through K=9) of experts a to l; axes: Return and Risk]

Figure 3. Different Clustering solutions for sample experts in Figure 2

[Figure 4: (a) network of experts a to l with centrality values shown next to the nodes (values as extracted, node assignment not recoverable: 0.375, 0.542, 0.937, 0.937, 0.583, 0.583, 0.875, 0.563, 0.593, 0.313, 0.313, 0.593); (b) non-professionals A to G]

Figure 4. (a) Experts’ Social Network and Communities (b) Assigning non-professionals to Experts’ communities

The social network of experts is then constructed based on these clustering solutions (Figure 4 (a)). As an example, clusters with k=3 show people with high, mid and low risk (respectively shown by red, yellow and green in Figure 4 (a)). These clusters are considered as different communities of experts used in further analysis in our recommendation system.
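One way to turn a set of clustering solutions into a weighted network is sketched below. It assumes, as an illustration only, that the tie between two experts is weighted by the fraction of solutions in which they share a cluster; the function name and the toy cluster assignments are hypothetical.

```python
from itertools import combinations

def co_clustering_weights(solutions):
    """solutions: list of dicts mapping expert -> cluster id, one dict per k.
    Returns {(u, v): weight} where weight is the fraction of solutions in
    which u and v fall in the same cluster; pairs never clustered together
    get no edge."""
    experts = sorted(solutions[0])
    weights = {}
    for u, v in combinations(experts, 2):
        together = sum(1 for labels in solutions if labels[u] == labels[v])
        if together:
            weights[(u, v)] = together / len(solutions)
    return weights

# Hypothetical cluster assignments for four experts under three values of k.
solutions = [
    {"g": 0, "h": 0, "j": 1, "k": 1},
    {"g": 0, "h": 2, "j": 1, "k": 1},
    {"g": 0, "h": 2, "j": 1, "k": 1},
]
weights = co_clustering_weights(solutions)
```

In this toy input, j and k share a cluster in every solution and receive the strongest tie, while g and h share a cluster only once and receive a weak one.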


Whenever a non-professional investor looks for the most appropriate expert whose investment behavior to mimic, it is very important to consider the similarity of their preferences. Propensity for risk is an important characteristic of an investor, and it was our main focus in discovering the communities of experts in the constructed social network.

Each non-professional is first assigned to one of the experts’ communities. For this purpose, the required features of the investor (risk and return) are extracted from their available portfolios and form the classification testing set. Figure 4 (b) shows the classification result for the non-professional investors, with the color of each node indicating the assigned cluster.

The degree centrality of each expert investor is shown adjacent to its node in Figure 4 (a). The expert with the highest centrality is the one whose portfolio is recommended to the non-professional investor. For instance, in the low risk community (green cluster), experts c and d have the highest centrality value (0.583), and thus the portfolio of either of them would be appropriate for a non-professional investor who has been assigned to this community (for example A and B).

V. CASE STUDY

A. Experiment Setup

To evaluate the proposed approach, we used publicly available portfolios from the StockPickr website [22]. StockPickr is a financial services site that combines investment ideas with social networking. This community, known as the Stock Idea Network, contains insight from professional investors as well as community members (non-professionals). In our study, we downloaded the portfolios of 125 experts and 57 non-professional users in January 2012. Each portfolio shows the stocks it holds and their proportional weights.

For the feature extraction phase, the historical prices of the stocks in the portfolios are needed. Daily prices of these stocks were downloaded from Yahoo! Finance for the period of January 2010 to January 2012. The 6-month returns of all of the stocks are calculated based on Equation 1 (Δt = 6 months) for the period of January 2011 to January 2012. The expected return of each of these stocks is then calculated; it is used to calculate the expected return of each portfolio (based on Equation 2). The risk of the portfolio is also measured based on Equation 4 during the period starting in January 2011.
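The feature extraction can be sketched as follows. Equations 1, 2 and 4 are not reproduced in this section, so the sketch assumes common textbook forms: a simple return over a window of `step` observations, a portfolio return equal to the weighted sum of its stocks' returns, and risk measured as the population standard deviation of the portfolio returns. Tickers, prices and weights are hypothetical.

```python
from statistics import mean, pstdev

def period_returns(prices, step):
    """Simple return over a window of `step` observations (assumed form):
    (P_t - P_{t-step}) / P_{t-step} for every t with a full window."""
    return [(prices[t] - prices[t - step]) / prices[t - step]
            for t in range(step, len(prices))]

def portfolio_features(price_series, weights, step):
    """Return (expected_return, risk) of a portfolio.
    price_series: dict ticker -> list of prices, all of the same length.
    weights: dict ticker -> proportion of the portfolio (summing to 1)."""
    per_stock = {s: period_returns(p, step) for s, p in price_series.items()}
    n = min(len(r) for r in per_stock.values())
    # Portfolio return in each window is the weighted sum of stock returns.
    portfolio = [sum(weights[s] * per_stock[s][i] for s in per_stock)
                 for i in range(n)]
    return mean(portfolio), pstdev(portfolio)

# Hypothetical two-stock portfolio with three price observations each.
prices = {"AAA": [100.0, 110.0, 99.0], "BBB": [50.0, 50.0, 50.0]}
weights = {"AAA": 0.5, "BBB": 0.5}
expected_return, risk = portfolio_features(prices, weights, step=1)
```

With daily prices, a 6-month window would correspond to a `step` of roughly 126 trading days, which is also an assumption.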

The extracted risk and return of the experts’ portfolios are then used in social network construction. For this purpose, the k-means algorithm is run in NetDriller with k = 2, 3, …, 9. Figure 5 shows the constructed social network and the distribution of risk values across 5 clusters (the clustering solution with k=5). As can be seen in this figure, the risk values of all investors are almost uniformly distributed across these 5 clusters, which conforms to the risk and return trade-off proposed in modern portfolio theory. Therefore, we can label these clusters on a five-point Likert scale: very high, high, medium, low and very low.

These 5 clusters are then used as the training set of our classification step (a Support Vector Machine classifier in this case study) to assign each non-professional investor to one of these 5 communities. In the last step, we measure the closeness centrality of all the nodes in each community and pick the one with the highest value as the representative of that group. The portfolio of this node is recommended to the investors assigned to this group. To evaluate whether the recommended portfolio is better than the non-professional’s own current portfolio, we compare the performance of those portfolios. Next we explain the metric used for measuring portfolio performance, followed by our experimental result.

[Figure 5: network of investors with a plot of risk value per investor; cluster legend: Very High Risk, High Risk, Medium Risk, Low Risk, Very Low Risk]

Figure 5. Social Network of 125 professional investors in StockPickr with the risk value distribution throughout the discovered clusters

B. Portfolio Performance Metric

For measuring the performance of a portfolio, Sharpe (1966) defined the reward-to-variability ratio, which is known as the Sharpe Ratio [5]. This metric combines a measure of the portfolio risk with the portfolio return. Equation 6 gives the Sharpe Ratio of a portfolio p at time T.

$$\mathrm{SharpeRatio}_{p,T} = \frac{ER_{p,T} - ER_{RF,T}}{\sigma_{p,T}} \qquad (6)$$

where $ER_{p,T}$ is the expected return of the portfolio at time T (Equation 2); $ER_{RF,T}$ is the risk-free rate at time T; and $\sigma_{p,T}$ is the risk of the portfolio at time T, measured by the standard deviation of the portfolio (Equation 4). The risk-free rate represents the interest that an investor would expect from an absolutely risk-free investment over a given period of time. Risk-free assets usually refer to government bonds such as US Treasury bills or German government bills.

Sharpe ratio measures the amount of return added to the portfolio per unit of risk [14]. This is a popular performance metric for comparing the managed portfolio with benchmarks [5]. In this work, we measure this ratio for the recommended portfolios from experts and the portfolios constructed by non-professionals by considering the US Treasury bill as risk-free rate.
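The ratio described above, excess return over the risk-free rate divided by portfolio risk, can be sketched in a few lines. The function name and all numeric inputs are hypothetical and serve only to show how a recommended portfolio would be compared with a non-professional's own portfolio.

```python
def sharpe_ratio(expected_return, risk_free_rate, risk):
    """Excess return over the risk-free rate per unit of risk.
    `risk` is the standard deviation of the portfolio and must be positive."""
    if risk <= 0:
        raise ValueError("risk must be positive")
    return (expected_return - risk_free_rate) / risk

# Hypothetical values: both portfolios carry the same risk, and the
# risk-free rate stands in for a Treasury bill yield.
risk_free = 0.01
own_portfolio = sharpe_ratio(0.04, risk_free, 0.30)
recommended_portfolio = sharpe_ratio(0.07, risk_free, 0.30)
is_improvement = recommended_portfolio > own_portfolio
```

A negative ratio, as seen for some portfolios in the case study, simply means that the portfolio's expected return was below the risk-free rate.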

C. Experimental Result

Based on the classification process result, the 57 non-professional investors used in our study have been assigned to three communities: Very High Risk, Medium Risk and Low Risk. To evaluate whether the recommended portfolio is an appropriate one compared to managed portfolios from other communities, we measure the Sharpe ratio of the representative portfolios in all 5 communities, and compare them with the Sharpe ratio of the portfolios constructed by non-professionals.

Figure 6 illustrates the aggregated result in the form of boxplots of Sharpe ratios for the different communities. For each community, it shows the Sharpe ratio of the representative person, the Sharpe ratios of the naïve (non-professional) investors assigned to this community, and the Sharpe ratios of the representatives of the four other communities. As can be seen, in all three groups, the Sharpe ratio of the representative expert (the recommended expert) is higher than the Sharpe ratio of all naïve investors. This confirms that we are recommending to the non-professional a portfolio that has a higher Sharpe ratio than their current portfolio. In other words, non-professionals obtain a higher return per unit of risk.

However, the Sharpe ratios of representatives from other communities are not necessarily higher than those of the naïve investors. This shows that some non-professionals may outperform professionals in terms of Sharpe ratio, and thus not every professional portfolio is a good replacement for a given non-professional portfolio. Propensity for risk is an important investment preference that needs to be considered when recommending a managed portfolio to a non-professional investor. For instance, naïve investors assigned to the Low Risk community have lower risk than the representatives of other clusters, so, as Figure 6 shows, their Sharpe ratio might be higher than that of managed portfolios from other clusters (based on Equation 6). Thus those managed portfolios are not necessarily better investment options for them than their current portfolios.

[Figure 6: boxplots of Sharpe ratio (vertical axis, ticks from -0.2 to 0.2) for the Very High Risk, Medium Risk and Low Risk communities; each community shows three groups: Naïve Investors, Recommended Professional, Other Clusters Representatives]

Figure 6. Boxplot of Sharpe ratio of portfolios in three clusters



VI. CONCLUSION AND FUTURE WORKS

Investment decision is a complicated problem involving many factors. In practice, expert investors consider multiple criteria to make wise decisions. However, it is not easy for non-professionals to do so. Therefore, a non-professional investor can mimic a professional by investing in the same portfolio. However, investment preferences vary from person to person, and thus different experts should be found for different people. In this work, we showed how publicly available portfolios of experts and non-professionals can be used to extract features describing their investment preferences, which are then used to recommend an appropriate expert to each non-professional. The result of our case study confirms that the proposed technique recommends the portfolio of an expert with similar preferences that has higher performance in terms of Sharpe ratio. However, like any experimental study, ours has threats to validity that need further investigation in the future. For instance, different points in time, different lengths of price history, and parameter values other than those set in this study need to be examined before the result can be generalized.

REFERENCES

[1] H. Markowitz, "Portfolio Selection," The Journal of Finance, vol. 7, pp. 77-91, 1952.

[2] G. D. Samaras, N. F. Matsatsinis, and C. Zopounidis, "A multicriteria DSS for stock evaluation using fundamental analysis," European Journal of Operational Research, vol. 187, pp. 1380-1401, 2008.

[3] P. Sevastjanov and L. Dymova, "Stock screening with use of multiple criteria decision making and optimization," Omega, vol. 37, pp. 659-671, 2009.

[4] A. N. Refenes, M. Azema-Barac, and A. D. Zapranis, "Stock ranking: neural networks vs multiple linear regression," in Proceeding of the IEEE International Conference on Neural Networks, 1993, pp. 1419-1426.

[5] N. Amenc and V. L. Sourd, Portfolio Theory and Performance Analysis, 2003.

[6] W. R. Lasher, P. L. Hedges, and T. Fegarty, Practical Financial Management: Second Canadian Edition, 2009.

[7] G. S. Atsalakis and K. P. Valavanis, "Surveying stock market forecasting techniques - Part II: Soft computing methods," Expert Systems with Applications, vol. 36, pp. 5932-5941, 2009.

[8] R. Riolo, T. Soule, B. Worzel, Y. L. Becker, H. Fox, and P. Fei, "An Empirical Study of Multi-Objective Algorithms for Stock Ranking," in Genetic Programming Theory and Practice V: Springer US, 2008, pp. 239-259.

[9] R. Apostolos Nicholas, Z. Achileas, and F. Gavin, "Stock performance modeling using neural networks: a comparative study with regression models," Neural Netw., vol. 7, pp. 375-388, 1994.

[10] K. Kim, "Financial time series forecasting using support vector machines," Neurocomputing, vol. 55, pp. 307-319, 2003.

[11] P. F. Pai and C. S. Lin, "A hybrid ARIMA and support vector machines model in stock price forecasting," Omega, vol. 33, pp. 497-505, 2005.

[12] D. Enke and S. Thawornwong, "The use of data mining and neural networks for forecasting stock market returns," Expert Syst. Appl., vol. 29, pp. 927-940, 2005.

[13] A. Fan and M. Palaniswami, "Stock selection using support vector machines," in Proceeding of the International Joint Conference on Neural Networks (IJCNN '01), 2001, vol. 3, pp. 1793-1798.

[14] G. H. John, P. Miller, and R. Kerber, "Stock selection using rule induction," IEEE Expert, vol. 11, pp. 52-58, 1996.

[15] D. W. McDonald, "Recommending collaboration with social networks: a comparative evaluation," in SIGCHI Conference on Human Factors in Computing Systems, 2003, pp. 593–600.

[16] H. Ogata, Y. Yano, N. Furugori, and Q. Jin, "Computer supported social networking for augmenting cooperation," Comput. Supported Coop. Work, vol. 10, pp. 189–209, 2001.

[17] T. DuBois, J. Golbeck, J. Kleint, and A. Srinivasan, "Improving Recommendation Accuracy by Clustering Social Networks with Trust," in ACM RecSys 2009 Workshop on Recommender Systems and the Social Web, 2009.

[18] H. Ma, I. King, and M. R. Lyu, "Learning to recommend with social trust ensemble," in 32nd International ACM SIGIR, 2009, pp. 203–210.

[19] N. Koochakzadeh, A. Sarraf, K. Kianmehr, J. Rokne, and R. Alhajj, "NetDriller: A Powerful Social Network Analysis Tool," in IEEE 11th International Conference on Data Mining Workshops, 2011.

[20] J. Han and M. Kamber, "Data Mining: Concepts and Techniques," in The Morgan Kaufmann Series in Data Management Systems, 2006.

[21] S. Wasserman and K. Faust, Social Network Analysis: Methods and Applications. New York: Cambridge University Press, 1994.

[22] "About Stockpickr - Stockpickr! Your Source for Stock Ideas, www.Stockpickr.com.," accessed at 2012-01-04.
