a distance-based fuzzy time series model for exchange rates forecasting

8
A distance-based fuzzy time series model for exchange rates forecasting Yungho Leu * , Chien-Pang Lee, Yie-Zu Jou Department of Information Management, National Taiwan University of Science and Technology, 43, Keelung Road, Section 4, Taipei 10607, Taiwan, ROC article info Keywords: Distance-based fuzzy time series Fuzzy time series Exchange rate forecasting abstract Fuzzy time series model has been successfully employed in predicting stock prices and foreign exchange rates. In this paper, we propose a new fuzzy time series model termed as distance-based fuzzy time series (DBFTS) to predict the exchange rate. Unlike the existing fuzzy time series models which require exact match of the fuzzy logic relationships (FLRs), the distance-based fuzzy time series model uses the dis- tance between two FLRs in selecting prediction rules. To predict the exchange rate, a two factors dis- tance-based fuzzy time series model is constructed. The first factor of the model is the exchange rate itself and the second factor comprises many candidate variables affecting the fluctuation of exchange rates. Using the exchange rate data released by the Central Bank of Taiwan, we conducted several exper- iments on exchange rate forecasting. The experiment results showed that the distance-based fuzzy time series outperformed the random walk model and the artificial neural network model in terms of mean square error. Ó 2008 Elsevier Ltd. All rights reserved. 1. Introduction Forecasting financial time series such as the stock prices or the exchange rates is important to the investors and the government. A good forecasting of a financial time series requires strong domain knowledge and good analysis tools. Many well-established meth- ods, such as autoregressive (AR), autoregressive moving average (ARMA) and generalized autoregressive conditional heteroscedas- ticity (GARCH), have been successfully applied for financial fore- casting. However, there are some limitations on applying the existing methods. For example, in applying ARMA method one needs to make sure that the residue is normally distributed. Re- cently, due to the advance in artificial intelligence, many research- ers have focused on developing computational intelligence methods for forecasting financial time series. For example, Panda and Narasimhan predicted the exchange rate of Rupee and US dol- lar by artificial neural networks (Panda & Narasimhan, 2007). Cao et al. used support vector machines to predict the USD/GBP ex- change rate (Cao, Pang, & Bai, 2005). Shin and Han proposed an integrated approach (Shin & Han, 2000) which uses genetic algo- rithms to choose the correct threshold parameters for wavelet so as to produce significant signal for a neural network. The inte- grated approach was used to predict the exchange rate of Korean won/US dollar. The fuzzy time series method is one of the compu- tational intelligence methods that draw much attention today. The Fuzzy time series model was first proposed by Song and Chissom (1993a, 1993b). Since then, many fuzzy time series models have been developed for forecasting index prices (Cheng, Chen, Teoh, & Chiang, 2008; Lee, Wang, Chen, & Leu, 2006; Yu, 2005) and enrollment of universities (Chen, 1996, 2002; Song & Chissom, 1993a, 1993b). Recently, many extensions on fuzzy time series have been proposed. For example, Lee et al. (2006) proposed to al- low more than one factor in a fuzzy time series to improve the forecasting accuracy. Yu (2005) proposed to assign different weights on fuzzy logic relationships (FLRs) based on their recency. The weights on fuzzy logic relationships are then used in defuzzifi- cation to determine the forecasted value of TAIEX. The existing models for exchange rate forecasting use only the historical exchange rate itself to build the forecasting model. How- ever, according to the literatures, there are many factors that affect the exchange rate of two currencies. For example, government pol- icies, expectations of the investors, and the stock indices are all shown to affect the fluctuation of the exchange rate (Benjamin, 2006; Nieh & Lee, 2001). Thus, we propose to use the two-factor fuzzy time series to forecast the exchange rate. In an attempt to use the two-factor fuzzy time series model for exchange rate fore- casting, we found that it was very difficult to find matching rules for calculating the forecasted value. To remedy this shortcoming, we propose to extend the fuzzy time series model by using the Euclidean distance between two fuzzy logic relationships to find similar fuzzy logic relationships for calculating the forecasted va- lue. The new fuzzy time series model proposed in this paper is termed as distance-based fuzzy time series (DBFTS). DBFTS is a two-factor high-order fuzzy time series model. The first factor of DBFTS is the historical exchange rate, and the second factor com- prises several important variables that have an effect of on the ex- change rate. 0957-4174/$ - see front matter Ó 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2008.10.034 * Corresponding author. E-mail address: [email protected] (Y. Leu). Expert Systems with Applications 36 (2009) 8107–8114 Contents lists available at ScienceDirect Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa

Upload: yungho-leu

Post on 26-Jun-2016

215 views

Category:

Documents


3 download

TRANSCRIPT

Expert Systems with Applications 36 (2009) 8107–8114

Contents lists available at ScienceDirect

Expert Systems with Applications

journal homepage: www.elsevier .com/locate /eswa

A distance-based fuzzy time series model for exchange rates forecasting

Yungho Leu *, Chien-Pang Lee, Yie-Zu JouDepartment of Information Management, National Taiwan University of Science and Technology, 43, Keelung Road, Section 4, Taipei 10607, Taiwan, ROC

a r t i c l e i n f o

Keywords:Distance-based fuzzy time seriesFuzzy time seriesExchange rate forecasting

0957-4174/$ - see front matter � 2008 Elsevier Ltd. Adoi:10.1016/j.eswa.2008.10.034

* Corresponding author.E-mail address: [email protected] (Y. Leu).

a b s t r a c t

Fuzzy time series model has been successfully employed in predicting stock prices and foreign exchangerates. In this paper, we propose a new fuzzy time series model termed as distance-based fuzzy time series(DBFTS) to predict the exchange rate. Unlike the existing fuzzy time series models which require exactmatch of the fuzzy logic relationships (FLRs), the distance-based fuzzy time series model uses the dis-tance between two FLRs in selecting prediction rules. To predict the exchange rate, a two factors dis-tance-based fuzzy time series model is constructed. The first factor of the model is the exchange rateitself and the second factor comprises many candidate variables affecting the fluctuation of exchangerates. Using the exchange rate data released by the Central Bank of Taiwan, we conducted several exper-iments on exchange rate forecasting. The experiment results showed that the distance-based fuzzy timeseries outperformed the random walk model and the artificial neural network model in terms of meansquare error.

� 2008 Elsevier Ltd. All rights reserved.

1. Introduction

Forecasting financial time series such as the stock prices or theexchange rates is important to the investors and the government. Agood forecasting of a financial time series requires strong domainknowledge and good analysis tools. Many well-established meth-ods, such as autoregressive (AR), autoregressive moving average(ARMA) and generalized autoregressive conditional heteroscedas-ticity (GARCH), have been successfully applied for financial fore-casting. However, there are some limitations on applying theexisting methods. For example, in applying ARMA method oneneeds to make sure that the residue is normally distributed. Re-cently, due to the advance in artificial intelligence, many research-ers have focused on developing computational intelligencemethods for forecasting financial time series. For example, Pandaand Narasimhan predicted the exchange rate of Rupee and US dol-lar by artificial neural networks (Panda & Narasimhan, 2007). Caoet al. used support vector machines to predict the USD/GBP ex-change rate (Cao, Pang, & Bai, 2005). Shin and Han proposed anintegrated approach (Shin & Han, 2000) which uses genetic algo-rithms to choose the correct threshold parameters for wavelet soas to produce significant signal for a neural network. The inte-grated approach was used to predict the exchange rate of Koreanwon/US dollar. The fuzzy time series method is one of the compu-tational intelligence methods that draw much attention today. TheFuzzy time series model was first proposed by Song and Chissom(1993a, 1993b). Since then, many fuzzy time series models have

ll rights reserved.

been developed for forecasting index prices (Cheng, Chen, Teoh,& Chiang, 2008; Lee, Wang, Chen, & Leu, 2006; Yu, 2005) andenrollment of universities (Chen, 1996, 2002; Song & Chissom,1993a, 1993b). Recently, many extensions on fuzzy time serieshave been proposed. For example, Lee et al. (2006) proposed to al-low more than one factor in a fuzzy time series to improve theforecasting accuracy. Yu (2005) proposed to assign differentweights on fuzzy logic relationships (FLRs) based on their recency.The weights on fuzzy logic relationships are then used in defuzzifi-cation to determine the forecasted value of TAIEX.

The existing models for exchange rate forecasting use only thehistorical exchange rate itself to build the forecasting model. How-ever, according to the literatures, there are many factors that affectthe exchange rate of two currencies. For example, government pol-icies, expectations of the investors, and the stock indices are allshown to affect the fluctuation of the exchange rate (Benjamin,2006; Nieh & Lee, 2001). Thus, we propose to use the two-factorfuzzy time series to forecast the exchange rate. In an attempt touse the two-factor fuzzy time series model for exchange rate fore-casting, we found that it was very difficult to find matching rulesfor calculating the forecasted value. To remedy this shortcoming,we propose to extend the fuzzy time series model by using theEuclidean distance between two fuzzy logic relationships to findsimilar fuzzy logic relationships for calculating the forecasted va-lue. The new fuzzy time series model proposed in this paper istermed as distance-based fuzzy time series (DBFTS). DBFTS is atwo-factor high-order fuzzy time series model. The first factor ofDBFTS is the historical exchange rate, and the second factor com-prises several important variables that have an effect of on the ex-change rate.

8108 Y. Leu et al. / Expert Systems with Applications 36 (2009) 8107–8114

The contributions of this paper are threefold: (1) we propose touse the distance between two FLRs for selecting prediction ruleswhich makes the fuzzy time series model applicable for real lifeapplications in which the exact match of FLRs is not possible; (2)we propose to dynamically adjust the number of intervals of thesecond factor to improve the prediction accuracy;(3) we proposeto use PCA to derive the second factor, which allows incorporatingmultiple influential variables into the second factor.

The remainder of this paper is organized as follows. Section 2reviews several definitions of fuzzy time series models. Section 3introduces the DBFTS model. Section 4 presents the procedure ofusing DBFTS in exchange rate forecasting. In Section 5, we reportthe results of several experiments and compare the performanceof DBFTS, random walk model and RBFNN model in terms of meansquare error (MSE) and directional symmetry (DS). Finally, we con-clude this paper in Section 6.

2. Review of fuzzy time series models

According to the literature, the following definitions are givenfor fuzzy time series models.

Definition 1. Let Y(t) (t = . . .,0,1,2, . . .), a subset of R1, be theuniverse of discourse on which fuzzy sets fi(t) (i = 1,2, . . .) aredefined. If F(t) is a collection of fi(t). Then F(t) is called a fuzzy timeseries defined on Y(t).

Definition 2. If for any fj(t) 2 F(t), there exists an fi(t � 1) 2 F(t � 1),such that there exists a fuzzy relation Rij(t, t � 1) and fj(t) =fi(t � 1) � Rij(t, t � l) where ‘�’ is the max–min composition, thenF(t) is said to be caused by F(t � 1) only. Let R(t, t � 1)=[ijRij(t, t � 1) where ‘[’ is the union operator. Then R(t, t � 1) iscalled the fuzzy relation between F(t) and F(t � 1). We can writeF(t) = F(t � 1) � R(t, t � 1), and denote this as F(t � 1) ? F(t).

Definition 3. If F(t) is caused by F(t-1),F(t � 2), . . ., and F(t � n), F(t)is called a one-factor n-order fuzzy time series, and is denoted by

Fðt � nÞ; . . . ; Fðt � 2Þ; Fðt � 1Þ ! FðtÞ:

Based on the one-factor n-order fuzzy time series, Lee et al. (2006)proposed the two-factor high-order fuzzy time series as is definedin the following.

Definition 4. If F1(t) is caused by (F1(t � 1),F2(t � 1)),(F1(t�2),F2(t � 2)), . . ., (F1(t � n),F2(t � n)), F1(t) is called a two-fac-tor n-order fuzzy time series, which is denoted by

ðF1ðt � nÞ; F2ðt � nÞÞ; . . . ; ðF1ðt � 2Þ; F2ðt � 2ÞÞ; ðF1ðt � 1Þ; F2ðt � 1ÞÞ! F1ðtÞ:

Let F1(t) = Xt and F2(t) = Yt, where Xt and Yt are fuzzy sets on day t.Then, a two-factor n-order fuzzy relationship can be expressed as

ðXt�n;Yt�nÞ; . . . ; ðXt�2;Yt�2Þ; ðXt�1; Yt�1Þ ! Xt;

where (Xt�n,Yt�n), . . ., (Xt�2,Yt�2) and (Xt�1,Yt�1), are referred to asthe left-hand side (LHS) of the relationship, and Xt is referred toas the right-hand side (RHS) of the relationship. A fuzzy relationshipis also called a fuzzy logic relationship (FLR) in Chen (1996) andSong and Chissom (1993a, 1993b).

3. Distance-based fuzzy time series

As discussed in Section 1, the fluctuation of exchange rates in anexchange rate market is affected by many important variables. Inthe beginning, we employed the two-factor high-order fuzzy time

series proposed by Lee et al. (2006) to forecast the exchange ratesof Taiwan exchange rate market. Unfortunately, there are to toomany fuzzy sets at the LHS of an FLR, it was very difficult to finda matched FLR for predicting the exchange rate. This motivatesus to propose the DBFTS model which features in selecting fore-casting FLRs based on a distance metric defined on the LHS of anFLR, instead of a complete match of the LHS. The details of theDBFTS are explained in the following procedure. Note that differentfrom the Lee’s model, our approach contains two more steps fordetermining the second factor.

3.1. Step 1: Test of correlation coefficient

As defined in Definition 4, the first factor is caused by the firstand the second factors. Furthermore, the second factor may bederived from many candidate variables. To decide whether a can-didate variable is suitable for a component of the second factor,one has to test the correlation between the candidate variableand the first factor. If the correlation coefficient of the first factorand the candidate variable is not significant, the candidate vari-able should not be considered in the derivation of the secondfactor.

3.2. Step 2: Principal components analysis

After testing of correlation coefficient, the second factor canbe expressed as a linear combination of the candidate variablesthrough principal components analysis (PCA). For exchange rateforecasting, the second factor is the absolute score value of thefirst principal component, which is denoted by |PRIN1|. Note thatusually more than one component is needed for the secondfactor. However, for exchange rate forecasting, the first compo-nent has already explained 50–70% of the proportion of variance;we therefore choose only the first component as the secondfactor.

3.3. Step 3: Divide the universe of discourse

The universe of discourse of the first factor is defined asU = [Dmin � D1,Dmax + D2], where Dmin and Dmax are the minimumand maximum values of the first factor, respectively; D1 and D2

are two positive real numbers to divide the universe of discourseinto n equal length intervals (Song & Chissom, 1994). The universeof discourse of the second factor is defined as V = [Vmin � V1,Vmax + V2], where Vmin and Vmax are the minimum and maximumvalues of the second factor, respectively; V1 and V2 are two positivereal numbers used to divide the universe of discourse of the secondfactor into m equal length intervals.

3.4. Step 4: Define fuzzy sets

Linguistic term Ai, 1 6 i 6 n, is defined as a fuzzy set on theintervals of the first factor. Ai is defined as follows:

A1 ¼ 1=u1 þ 0:5=u2 þ 0=u3 þ � � � þ 0=un�2 þ 0=un�1 þ 0=un;

A2 ¼ 0:5=u1 þ 1=u2 þ 0:5=u3 þ � � � þ 0=un�2 þ 0=un�1 þ 0=un;

..

.

An�1 ¼ 0=u1 þ 0=u2 þ 0=u3 þ � � � þ 0:5=un�2 þ 1=un�1 þ 0:5=un;

An ¼ 0=u1 þ 0=u2 þ 0=u3 þ � � � þ 0=un�2 þ 0:5=un�1 þ 1=un:

Similarly, linguistic term Bj, 16 j 6m, is defined as a fuzzy set on theintervals of the second factor. Bj is defined as follows:

Y. Leu et al. / Expert Systems with Applications 36 (2009) 8107–8114 8109

B1 ¼ 1=v1 þ 0:5=v2 þ 0=v3 þ � � � þ 0=vm�2 þ 0=vm�1 þ 0=vm;

B2 ¼ 0:5=v1 þ 1=v2 þ 0:5=v3 þ � � � þ 0=vm�2 þ 0=vm�1 þ 0=vm;

..

.

Bm�1 ¼ 0=v1 þ 0=v2 þ 0=v3 þ � � � þ 0:5=vm�2 þ 1=vm�1 þ 0:5=vm;

Bm ¼ 0=v1 þ 0=v2 þ 0=v3 þ � � � þ 0=vm�2 þ 0:5=vm�1 þ 1=vm:

3.5. Step 5: Adjust the number of intervals of the second factor

Since the distance between two FLRs is used for rule selection,the number of intervals of the universes of discourse of the firstand the second factors plays an important role in selecting fore-casting rules. If the number of intervals of the first factor is muchgreater than that of the second factor, the first factor will dominatein selecting FLR forecasting rules; consequently, the forecast resultwill be primarily based on first factor. To balance the power inselecting forecast FLR rules, we adjust the number of intervals ofthe second factor to improve the forecasting accuracy.

3.6. Step 6: Fuzzification

Fuzzify the historical data of the first factor and the second fac-tor. An example of fuzzification is given in Section 4.

3.7. Step 7: Construct fuzzy logic relationships (FLR) database

For historical data on day i, let Xi�k, Yi�k denote the values ofF1(i � k) and F2(i � k) of the fuzzy time series, respectively, where1 6 k 6 n, and let Xi denote the value of F1(i). An FLR rule of the fuz-zy time series on day i can be represented as follows:

ðXi�n;Yi�nÞ; . . . ; ðXi�2;Yi�2Þ; ðXi�1;Yi�1Þ ! Xi:

3.8. Step 8: Forecast the first factor on day t

The value of the first factor on day t can be obtained through thefollowing procedure.

3.8.1. Construct the LHS of the FLR on day tConstruct the LHS of the FLR on day t and denote it as follows:

ðXt�n;Yt�nÞ; . . . ; ðXt�2;Yt�2Þ; ðXt�1;Yt�1Þ: ð1Þ

3.8.2. Search for similar FLRsCalculate the Euclidean distance (ED) between the LHS of the

FLR on day t and each LHS of a candidate FLRs in FLR database.Let ED denote the distance between the LHS on day t and theLHS of a candidate FLR rule on day i, then we have

ED ¼

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiXn

j¼1

ððIXt�j � RXi�jÞ2 þ ðIYt�j � RYi�jÞ2vuut Þ; ð2Þ

where IXt�j and IYt�j are the subscripts of the fuzzy sets of the LHS ofday t’s FLR, respectively; that is, IXt�j is the subscript of fuzzy setXt�j and IYt�j is the subscript of fuzzy set Yt�j in formula (1). Simi-larly, RXi�j and RYi�j are the subscripts of their corresponding fuzzysets at the LHS of day i’s FLR.

Fig. 1. The framework of an artificial neural network.

3.8.3. Forecast the first factorTo forecast the value of the first factor on day t, we choose the

FLRs from FLR database with k smallest distances from the FLR onday t, and use the RHSs of these k FLRs to generate the forecast va-lue. The value of the first factor on day t can be calculated with thefollowing formula:

forecast value ¼ 1W

Xk

j¼1

1ED½j�M½j�; ð3Þ

where W ¼Pk

j¼11=ED½j�; ED[j] is the jth smallest Euclidean distance;M[j] is the midpoint value of the fuzzy set at the RHS of the FLR withthe jth smallest Euclidean distance. If ED[j] = 0 for any j where1 6 j 6 k, then the forecast_value is equal to the average of M[j]whose corresponding ED[j] equals 0.

4. Exchange rate forecasting

To study the performance of DBFTS, we compare it with the RWmodel and the RBFNN model in exchange rate forecasting. The RWmodel is deemed a simple and accurate model for financial timeseries (Meese & Rogoff, 1983), and the RBFNN model is a state-of-the-art model which is also used in financial time series fore-casting. We briefly review these two models in the following.

4.1. Random walk

Let St denote the exchange rate at time t. The random walkmodel for St can be written as

St ¼ St�1 þ et ; ð4Þ

where et is a white noise at time t satisfying E(et) = 0, var(et) = r2

and cov(et,es) = 0 for t – s. formula. (4) states that the exchange rateat time t is equal to the exchange rate at time t � 1 plus a whitenoise value at time t.

4.2. Artificial neural networks

The concept of artificial neural networks (ANN) was first intro-duced in 1950s. Since then many artificial neural network modelshave been proposed. Among them, the feedforward neural networkand the radial basis function neural network are most well-known.The framework of an artificial neural network, as shown in Fig. 1,contains three layers: the input layer, the hidden layer and the out-put layer. Since RBFNN model has been successfully applied inmany financial time series applications (Panda & Narasimhan,2007), we choose it as a target for comparison. We used the statis-tical software package R (R Project for Statistical Computing) tobuild an RBFNN model for exchange rate forecasting. In the RBFNNmodel, the input layer contains 3 nodes (one node for the exchangerate on day t � 3, day t � 2 and day t � 1, respectively); the hidden

Table 3Principal components analysis of the second factor.

Eigenvectors of the first principal component

JPY 0.463KRW 0.666CNY 0.118TAIEX �0.573

8110 Y. Leu et al. / Expert Systems with Applications 36 (2009) 8107–8114

layer contains 4 nodes and the output layer contains a single nodefor the exchange rate (i.e., the forecasted exchange rate on day t).

4.3. Data

The NTD/USD, JPY/USD, CNY/USD and KRW/USD exchange ratesare collected from the website of the Central Bank of Taiwan, andthe Taiwan Stock Exchange Capitalization Weighted Stock Index(TAIEX) is collected from the website of the Taiwan Stock ExchangeCorporation. The exchange rates and TAIEX prices are the closeprices of each trading day. The experimental data comprise ex-change rates and TAIEX prices from March 1, 2006 to March 1,2007. Training data from March 1, 2006 to October 26, 2006 areused to construct FLR database for predicting the exchange rateon October 27, 2006. To predict the exchange rate on October 28,2006, we add the corresponding FLR on October 27, 2006 to theFLR database, and so on. The variables of the second factor areshown in Table 1. Note that due to the similarity in the economicstructures of Taiwan, China, Japan and South Korea, we chooseJPY/USD, CNY/USD and KRW/USD, along with TAIEX, as candidatevariables for the second factor.

4.4. Forecast exchange rates using the distance-based fuzzy time series

To demonstrate the way of using DBFTS in exchange rate fore-casting, we describe the procedure of forecasting the exchange rateof NTD/USD on October 27, 2006. Note that the exchange rates ofNTD/USD on the other days can be forecasted in a similar way.We have implemented DBFTS using statistical software packageR on a personal computer. The implemented program calls a sub-routine of R to perform the principal components analysis.

4.4.1. Step 1: Test of correlation coefficientTo decide whether a candidate variable is required for forecast-

ing NTD/USD, one has to test the correlation between the candidatevariable and NTD/USD. The results of the tests of correlation coef-ficients are shown in Table 2. Since the p values of all the tests aresignificant (at a significant level a = 0.05), all candidate variablesare required for NTD/USD.

4.4.2. Step 2: Principal components analysisThe second factor is created by inputting the training data into

R’s principal components analysis. The eigenvectors of the firstprincipal component are shown in Table 3. The second factor canbe written as

jPRIN1j ¼ j0:463 � JPY� þ 0:666 � KRW� þ 0:118 � CNY�

� 0:573 � TAIEX�j; ð5Þ

Table 1The candidate variables for the second factor.

Candidatevariable 1

Candidatevariable 2

Candidatevariable 3

Candidatevariable 4

The secondfactor

JPY/USD KRW/USD CNY/USD TAIEX

Table 2Correlation coefficients test.

Pearson correlation coefficients Prob > |r| under H0: Rho = 0

JPY/USD KRW/USD CNY/USD TAIEXNTD/USD <0.001 <0.001 <0.001 <0.001

Note: numbers in table are p value.

where JPY*, KRW*, CNY*, and TAIEX* are the values of JPY/USD, KRW/USD, CNY/USD, and TAIEX after normalization, respectively.

4.4.3. Step 3: Divide the universe of discourseLet U denote the universe of discourse of the first factor and V

denote the universe of discourse of the second factor. For forecast-ing the exchange rate on October 27, 2006, Dmin = 31.338, Dmax =33.316, D1 = 0 and D2 = 0.002; consequently, we have U =[31.338,33.318]. Similarly, we have Vmin = 0.01057013, Vmax =4.144669, V1 = 0, V2 = 0.0302023, and V = [0.01057013,4.14466920].

4.4.4. Step 4: Define fuzzy setsSince the minimum fluctuation of NTD/USD is 0.001 in the

training data set, the interval length of the first factor is set to0.005. Accordingly, the first factor is divided into 396 inter-vals, which are u1 = [31.338,31.343),u2 = [31.343,31.348), . . .,u396 = [33.313,33.318). The fuzzy sets of the first factor are repre-sented as follows:

A1 ¼ 1=u1 þ 0:5=u2 þ 0=u3 þ � � � þ 0=u394 þ 0=u395 þ 0=u396;

A2 ¼ 0:5=u1 þ 1=u2 þ 0:5=u3 þ � � � þ 0=u196 þ 0=u197 þ 0=u198;

..

.

A395 ¼ 0=u1 þ 0=u2 þ 0=u3 þ � � � þ 0:5=u394 þ 1=u395 þ 0:5=u396;

A396 ¼ 0=u1 þ 0=u2 þ 0=u3 þ � � � þ 0=u394 þ 0:5=u395 þ 1=u396:

Because the second factor is a linear combination of several vari-ables, it is not easy to choose an appropriate interval length. Instead,we decide the number of intervals first and then divide the universeof discourse evenly. To predict NTD/USD on October 27, 2006, weassume that the number of intervals of the second factor equals thatof the first factor. Consequently, the second factor is dividedinto 396 intervals, which are v1 = [0.01057013,0.02100977),v2 = [0.02100977,0.03144942), . . .,v396 = [4.13422955,4.14466920).The fuzzy sets of the second factor are represented as follows:

B1 ¼ 1=v1 þ 0:5=v2 þ 0=v3 þ � � � þ 0=v394 þ 0=v395 þ 0=v396;

B2 ¼ 0:5=v1 þ 1=v2 þ 0:5=v3 þ � � � þ 0=v394 þ 0=v395 þ 0=v396;

..

.

B395 ¼ 0=v1 þ 0=v2 þ 0=v3 þ � � � þ 0:5=v394 þ 1=v395 þ 0:5=v396;

B396 ¼ 0=v1 þ 0=v2 þ 0=v3 þ � � � þ 0=v394 þ 0:5=v395 þ 1=v396:

4.4.5. Step 5: Adjust the number of intervals of the second factorTo balance the power in selecting the forecasting FLR rules, we

adjust the number of intervals of the second factor. Let p denotethe proportion of the number of intervals of the first factor to thenumber of intervals of the second factor. The value of p which of-fers the minimal MSE is termed as the optimal p value. The optimalp value depends on the training data set. In our experiment, wevary p from 1 to 50 and choose the one with the smallest MSE asthe optimal p value. Notice that the optimal p value equals 1 forpredicting NTD/USD on October 27, 2006. If the optimal p valueis not equal to 1, the number of intervals of the second factor mustbe adjusted according to the optimal p value. For example, if theoptimal p value equals 2, then the number of intervals of the sec-ond factor is half of that of the first factor. Consequently, V is

Table 5Fuzzy sets of the second factor.

Date Value of the second factor Fuzzy set

2006/03/01 1.318329 B126

2006/03/02 1.19954 B114

2006/03/03 1.56683 B150

2006/03/06 1.868528 B178

2006/03/07 2.328661 B223

2006/03/08 2.709579 B259

2006/03/09 2.719682 B260

2006/03/10 2.666628 B255

2006/03/13 2.706004 B259

2006/03/14 2.607744 B

Y. Leu et al. / Expert Systems with Applications 36 (2009) 8107–8114 8111

divided into 198 intervals, which are v1 = [0.01057013,0.03144942),v2 = [0.03144942,0.05232870), . . ., and v198 =[4.12378991,4.14466920). The corresponding fuzzy sets of the sec-ond factor are shown as follows:

B1 ¼ 1=v1 þ 0:5=v2 þ 0=v3 þ � � � þ 0=v196 þ 0=v197 þ 0=v198;

B2 ¼ 0:5=v1 þ 1=v2 þ 0:5=v3 þ � � � þ 0=v196 þ 0=v197 þ 0=v198;

..

.

B197 ¼ 0=v1 þ 0=v2 þ 0=v3 þ � � � þ 0:5=v196 þ 1=v197 þ 0:5=v198;

B198 ¼ 0=v1 þ 0=v2 þ 0=v3 þ � � � þ 0=v196 þ 0:5=v197 þ 1=v198:

249

. . . . . . . . .

2006/10/13 0.307075 B29

2006/10/16 0.404901 B38

2006/10/17 0.404847 B38

2006/10/18 0.295565 B28

2006/10/19 0.136572 B13

2006/10/20 0.2735 B26

4.4.6. Step 6: FuzzificationAfter defining the fuzzy sets and adjusting the number of inter-

vals of the second factor, we fuzzify the historical data set of thefirst factor and the second factor. The fuzzy sets of the first factorand the second factor are shown in Tables 4 and 5, respectively.

2006/10/23 0.029515 B2

2006/10/24 0.098702 B9

2006/10/25 0.272792 B26

2006/10/26 0.772426 B73

4.4.7. Step 7: Construct fuzzy logic relationship (FLR) database

After completing step 6, we establish the FLR database. The FLRdatabase is shown in Table 6.

Table 6The FLR database.

FLR1: A190B126A195B114A213B150 ? A228

FLR2: A195B114A213B150A228B178 ? A239

FLR3: A213B150A228B178A239B223 ? A239

FLR4: A228B178A239B223A239B259 ? A238

FLR5: A239B223A239B259A238B260 ? A236

4.4.8. Step 8: Forecast NTD/USD on October 27, 20064.4.8.1. Construct the LHS of the FLR on October 27, 2006. In this step,we first construct the LHS of the FLR on October 27, 2006, and thenuse it to select the forecasting FLR rules from the FLRs database.The LHS of the FLR on October 27, 2006 is shown in the following:

ðA395; B9Þ; ðA396;B26Þ; ðA386;B73Þ:

. . . . . . . . . . . .

FLR159: A365B38A369B28A373B13 ? A373

FLR160: A369B28A373B13A373B26 ? A388

FLR161: A373B13A373B26A388B2 ? A395

FLR162: A373B26A388B2A395B9 ? A396

FLR163: A388B2A395B9A396B26 ? A386

4.4.8.2. Search for forecasting FLR rules. The k forecasting FLR rulesare selected from the FLR database. In this experiment, k is set to3. The EDs between the LHS of the FLR on October 27, 2006 andthe FLRs from the FLR database are calculated and shown inTable 7.

The FLRs whose EDs are among the 3 smallest EDs are FLR163,FLR156 and FLR155, whose corresponding EDs are 51.93265,53.28227 and 55.95534.

Table 4Fuzzy sets of the first factor.

Date Exchange rate Fuzzy set

2006/03/01 32.283 A190

2006/03/02 32.310 A195

2006/03/03 32.401 A213

2006/03/06 32.473 A228

2006/03/07 32.529 A239

2006/03/08 32.530 A239

2006/03/09 32.523 A238

2006/03/10 32.513 A236

2006/03/13 32.500 A233

2006/03/14 32.434 A220

. . . . . . . . .

2006/10/13 33.130 A359

2006/10/16 33.234 A380

2006/10/17 33.158 A365

2006/10/18 33.180 A369

2006/10/19 33.199 A373

2006/10/20 33.202 A373

2006/10/23 33.276 A388

2006/10/24 33.308 A395

2006/10/25 33.316 A396

2006/10/26 33.265 A386

4.4.8.3. Forecast the exchange rate onOctober 27, 2006. The exchangerate of NTD/USD on October 27, 2006 can be forecasted by usingformula. (3) with k equal to 3. Substituting ED[1] by 51.93265,ED[2] by 53.28227 and ED[3] by 55.95534, we haveW ¼

P3i¼11=ED½i� = 0.05589508. Substituting M[1] by 33.2655,

M[2] by 33.1605 and M[3] by 33.2355 into formula. (3), the ex-change rate of NTD/USD on October 27, 2006 is equal to33.22065. The forecasted exchange rates of NTD/USD from October27, 2006 to March 1, 2007 are shown in Table 8. Note that since aniteration of this procedure (step 1–8) produces only one predictionof the exchange rate, we perform this procedure 81 times to get theresults in Table 8.

5. Results and analysis

5.1. Results

We have conducted the experiments on one-day, three-day,five-day and seven-day forecasting based on DBFTS, RW andRBFNN models, and compared their performance in terms of meansquare error (MSE) and directional symmetry (DS). The meansquare error is used to measure the accuracy of a prediction. TheMSE is calculated with the following formula:

MSE ¼Pn

t¼1ðActual value on dayt � Forecasting value on daytÞ2

n:

ð6Þ

Table 8The forecast exchange rate for one-day forecasting.

Date Forecast exchangerate

Actual exchangerate

Error:(actual � forecast)2

2006/10/27

33.22065 33.28 0.0035222

2006/10/30

33.28216 33.221 0.0037821

2006/10/31

33.21221 33.26 0.0022834

2006/11/01

33.2587 33.167 0.0084083

2006/11/02

33.15976 33.045 0.0131687

2006/11/03

33.12557 32.92 0.042258

2006/11/06

33.0023 32.95 0.0027353

2006/11/07

32.95344 32.84 0.0128687

2006/11/08

32.91563 32.869 0.002174

2006/11/09

32.8539 32.87 0.0002591

. . . . . . . . . . . .

2007/02/06

32.91121 32.968 0.0032257

2007/02/07

32.96553 32.995 0.0008687

2007/02/08

32.97682 33.01 0.0011008

2007/02/09

32.94318 32.971 0.0007742

2007/02/12

32.97672 33.024 0.0022354

2007/02/13

32.99844 33.081 0.0068154

2007/02/14

33.01324 32.99 0.00054

2007/02/26

32.99533 32.974 0.0004551

2007/02/27

32.99262 32.95 0.0018169

2007/03/01

32.92636 32.91 0.0002677

Table 9Exchange rate forecasting using RW, RBFNN, and DBFTS.

RW RBFNN DBFTS

MSE DS (%) MSE DS (%) MSE DS (%)

One-day forecasting 0.0111 50.00 0.0359 55.50 0.0065 58.75Three-day forecasting 0.0376 57.89 0.0772 52.63 0.0283 57.89Five-day forecasting 0.0582 51.39 0.0870 52.78 0.0501 58.33Seven-day forecasting 0.1010 48.53 0.0650 52.94 0.0532 58.82

Fig. 2. Comparison of DBFTS, RBFNN and RW on MSE.

Fig. 3. Time series of one-day forecasting.

Table 7The distance between FLR on October 27, 2006 and FLRs from the FLR database.

FLR1:ED ¼

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffið395� 190Þ2 þ ð9� 126Þ2 þ ð396� 195Þ2 þ ð26� 114Þ2 þ ð386� 213Þ2 þ ð73� 150Þ2

q¼ 373:78737g

FLR2: ED ¼ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffið395� 195Þ2 þ ð9� 114Þ2 þ ð396� 213Þ2 þ ð26� 150Þ2 þ ð386� 228Þ2 þ ð73� 178Þ2

q¼ 368:61769

. . . . . .

FLR162: ED ¼ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffið395� 373Þ2 þ ð9� 26Þ2 þ ð396� 388Þ2 þ ð26� 2Þ2 þ ð386� 395Þ2 þ ð73� 9Þ2

q¼ 74:76630

FLR163: ED ¼ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffið395� 388Þ2 þ ð9� 2Þ2 þ ð396� 395Þ2 þ ð26� 9Þ2 þ ð386� 396Þ2 þ ð73� 26Þ2

q¼ 51:93265

8112 Y. Leu et al. / Expert Systems with Applications 36 (2009) 8107–8114

The directional symmetry is used to measure the coincidence in thedirections of tendency (i.e., up or down) of the forecasted values andthe actual values. DS is defined by the following formula:

DS¼Pn

t¼1dt

n�100%; where dt ¼

1 if ðAt�At�1Þ�ðFt�Ft�1Þ>0;0 otherwise;

ð7Þ

where At and At�1 are the actual values at time t and time t�1,respectively; Ft and Ft�1 are the forecasted values at time t and timet�1, respectively. A good forecasting is a forecasting with small MSEand large DS.

The procedure for one-day forecasting has been explained inSection 4. The procedure for three-day, five-day and seven-dayforecasting is similar to that of one-day forecasting except thattheir FLRs are different. For example, for a three-day forecastingon day t, an FLR rule can be represented as follows:

ðXt�9;Yt�9Þ; ðXt�6; Yt�6Þ; ðXt�3;Yt�3Þ ! Xt :

The results of MSE and DS of RW, RBFNN and DBFTS for one-day,three-day, five-day and seven-day forecasting of the exchange rateof NTD/USD are shown in Table 9. Fig. 2 is a bar chart to comparethe MSEs of the three models. Figs. 3–6 are time series for one-

Y. Leu et al. / Expert Systems with Applications 36 (2009) 8107–8114 8113

day, three-day, five-day and seven-day exchange rate forecasting,respectively. As shown in Table 9, DBFTS has the smallest MSEand largest DS among the three models for one-day forecasting. Thisresult is consistent with Fig. 3, in which the line of DBFTS is muchcloser to the line of the actual exchange rate than those of RWand RBFNN are. It also shows that when the exchange rate changesrapidly, RW and RBFNN models produce many spikes which aredetrimental to the prediction accuracy. Fortunately, this phenome-non does not exist in DBFTS. To summarize, the experimental re-sults show the following: (1) the MSE of DBFTS is smaller thanthose of RW and RBFNN models; (2) the DS of DBFTS is close to60%, which is much larger than those of RW and RBFNN models.

Fig. 5. Time series of five-day forecasting.

Fig. 6. Time series of seven-day forecasting.

Fig. 4. Time series of three-day forecasting.

5.2. Performance testing

Although the experimental results in the previous section sug-gest that DBFTS outperforms RW and RBFNN models, we cannotjump into conclusion before performing statistical testing. To testwhether the MSE of the three models are equal or not, we performan F-test on the MSE of these three models. The testing results areshown in Tables 10–13. At significant level a = 0.05, the MSE ofDBFTS is significantly smaller than those of RW and RBFNN forone-day exchange rate forecasting. For three-day and five-dayforecasting, the MSE of DBFTS is significantly smaller than that ofthe RBFNN model. Finally, for seven-day forecasting, the MSE ofDBFTS is significantly smaller than that of RW model.

Since DS is a proportion, we use two-sample proportion test totest the difference in DS among three models. The testing resultsare shown in Tables 14–17. According to the results in these tables,the DS of three models are not significantly different. The perfor-mance on DS for three models can be deemed the same.

Table 10The MSE test matrix of RW, RBFNN and DBFTS for one-day forecasting.

DBFTS RW RBFNN

DBFTSRW 0.0095RBFNN <0.001 <0.001

Note: numbers in table are p value.

Table 11The MSE test matrix of RW, RBFNN and DBFTS for three-day forecasting.

DBFTS RW RBFNN

DBFTSRW 0.1071RBFNN <0.001 0.0009

Table 12The MSE test matrix of RW, RBFNN and DBFTS for five-day forecasting.

DBFTS RW RBFNN

DBFTSRW 0.2584RBFNN 0.0087 0.0410

Table 13The MSE test matrix of RW, RBFNN and DBFTS for seven-day forecasting.

DBFTS RBFNN RW

DBFTSRBFNN 0.1937RW 0.0032 0.0299

Table 14The DS test matrix of RW, RBFNN and DBFTS for one-day forecasting.

DBFTS RBFNN RW

DBFTSRBFNN 0.2448RW 0.1143 0.3034

Note: numbers in table are p value.

Table 15The DS test matrix of RW, RBFNN and DBFTS for three-day forecasting.

DBFTS RW RBFNN

DBFTSRW 0.5RBFNN 0.2570 0.2570

Table 16The DS test matrix of RW, RBFNN and DBFTS for five-day forecasting.

DBFTS RBFNN RW

DBFTSRBFNN 0.2512RW 0.2012 0.5662

Table 17The DS test matrix of RW, RBFNN and DBFTS for seven-day forecasting.

DBFTS RBFNN RW

DBFTSRBFNN 0.2448RW 0.1143 0.3034

8114 Y. Leu et al. / Expert Systems with Applications 36 (2009) 8107–8114

6. Conclusion

Fuzzy time series has been successfully applied in many appli-cations. To overcome the difficulty in finding matching rules forforecasting in the fuzzy time series model, we have proposed touse the Euclidean distance between two fuzzy logic relationshipsas a metric for selecting matching rules. This extension makesthe fuzzy time series model more applicable for applications forwhich complete matching of the fuzzy logic relationships is notpossible. To show the feasibility of the proposed model, we applied

it on forecasting the exchange rate of NTD/USD. Through experi-ments, we showed that the proposed model outperformed the ran-dom walk model and the artificial neural network model in termsof mean square error.

References

Benjamin, M. T. (2006). The dynamic relationship between stock prices andexchange rates: Evidence for Brazil. International Journal of Theoretical andApplied Finance, 9(8), 1377–1396.

Cao, D. Z., Pang, S. L., & Bai, Y. H. (2005). Forecasting exchange rate using supportvector machines. In Proceedings of the 4th international conference on machinelearning and cybernetics, Guangzhou, 18–21 August 2005.

Chen, S. M. (1996). Forecasting enrollments based on fuzzy time series. Fuzzy Setsand Systems, 81(3), 311–319.

Chen, S. M. (2002). Forecasting enrollments based on high-order fuzzy time series.Cybernetics and Systems, 33(1), 1–16.

Cheng, C. H., Chen, T. L., Teoh, H. J., & Chiang, C. H. (2008). Fuzzy time-series basedon adaptive expectation model for TAIEX forecasting. Expert Systems withApplications, 34, 1126–1132.

Lee, L. W., Wang, L. H., Chen, S. M., & Leu, Y. H. (2006). Handling forecastingproblems based on two-factors high-order fuzzy time series. IEEE Transactionson Fuzzy Systems, 14(3), 468–477.

Meese, R. A., & Rogoff, K. (1983). Empirical exchange rate models of the seventies:Do they fit out of sample? Journal of International Economics, 14, 3–24.

Nieh, C. C., & Lee, C. F. (2001). Dynamic relationship between stock prices andexchange rates for G-7 countries. Quarterly Review of Economics and Finance,41(4), 477–490.

Panda, C., & Narasimhan, V. (2007). Forecasting exchange rate better with artificialneural network. Journal of Policy Modeling, 29, 227–236.

R Project for Statistical Computing. The official website of the R Project forStatistical Computing. <http://www.r-project.org/>.

Shin, T., & Han, I. (2000). Optimal signal multi-resolution by genetic algorithms tosupport artificial neural networks for exchange-rate forecasting. Expert Systemswith Applications, 18, 257–269.

Song, Q., & Chissom, B. S. (1993a). Fuzzy time series and its models. Fuzzy Sets andSystems, 54(3), 269–277.

Song, Q., & Chissom, B. S. (1993b). Forecasting enrollments with fuzzy time series –Part I. Fuzzy Sets and Systems, 54(1), 1–9.

Song, Q., & Chissom, B. S. (1994). Forecasting enrollments with fuzzy time series –Part II. Fuzzy Sets and Systems, 62(1), 1–8.

Yu, H. K. (2005). Weighted fuzzy time series models for TAIEX forecasting. Physica A,349, 609–624.