
Page 1: Explaining the market price of Bitcoin and other Cryptocurrencies with Statistical ...814478/... · 2015. 5. 27. · rency data from January 2012 to January 2015. ... för kryptovalutor

Explaining the market price of Bitcoin and other Cryptocurrencies with Statistical Analysis

ERIK PÄRLSTRAND AND OTTO RYDÉN

Stockholm 2015

Supervisor: Henrik Hult
Department of Mathematical Statistics
Department of Mathematics
Kungliga Tekniska Högskolan


Abstract

This thesis attempts to model the market price of cryptocurrencies. Since 2010 cryptocurrencies have gone from being fairly unknown to being familiar amongst the general public, which increases the need for knowledge of what affects their market price.

These connections are sought through statistical analysis applied to cryptocurrency data from January 2012 to January 2015. The data are modeled by linear regression, implemented in R after the data have been formatted in Excel. The results suggest that the price of a cryptocurrency depends heavily on the search traffic for that cryptocurrency's name on Google's search engine.


Sammanfattning (summary, translated from Swedish)

This thesis makes a structural interpretation of the price of cryptocurrencies. Since 2010 cryptocurrencies have gone from being unknown to being well known, which makes knowledge of what affects their price important.

To find these underlying relationships, a statistical analysis is carried out on cryptocurrency data between January 2012 and January 2015. The models used are based on linear regression and are implemented in R after the data have been formatted in Excel. The result of the study is that the price of a cryptocurrency depends strongly on how many people have searched for the cryptocurrency's name on Google's search engine.


Acknowledgments

We would like to thank our supervisor Henrik Hult for his suggestions and guidance.

Furthermore we would like to thank our fellow students for providing thoughts and criticism on our thesis.


Contents

1 Introduction
  1.1 Delimitation
2 Theoretical Background
  2.1 Modeling Theory
    2.1.1 Linear regression model
    2.1.2 Ordinary Least Squares
      2.1.2.1 BLUES
    2.1.3 Dummy Variables
    2.1.4 R²-value
    2.1.5 Choice of covariates
      2.1.5.1 Akaike Information Criterion
      2.1.5.2 Bayesian Information Criterion
      2.1.5.3 Differences between AIC and BIC
    2.1.6 Effect size and Cohen's rule
    2.1.7 Theory of Distribution and Hypothesis
      2.1.7.1 Type I and Type II errors
    2.1.8 F-distribution
    2.1.9 Hypothesis testing for linear models
  2.2 Problems with the least square estimation
    2.2.1 Heteroscedasticity
      2.2.1.1 Breusch-Pagan test
      2.2.1.2 White's Consistent Variance Estimator
      2.2.1.3 Reformulating the model
    2.2.2 Endogeneity
      2.2.2.1 Causes for endogeneity
      2.2.2.2 Instrumental Variables
      2.2.2.3 Durbin-Wu-Hausmann test
      2.2.2.4 TSLS
    2.2.3 Multicollinearity
      2.2.3.1 Variance Inflation Factor
3 Information about cryptocurrencies and paper specific terms
  3.1 Cryptocurrencies
    3.1.1 Bitcoin
    3.1.2 Ripple
    3.1.3 Litecoin
  3.2 Paper specific terms
    3.2.1 Google Trends
    3.2.2 Volatility
    3.2.3 Other terms
4 Model and Experimental Setup
  4.1 Experimental Setup
    4.1.1 Cryptocurrencies setup
    4.1.2 Stock setup
  4.2 Modeling
5 Cryptocurrency analysis
  5.1 Bitcoin Analysis
    5.1.1 The linear model
    5.1.2 Reducing the model
    5.1.3 Multicollinearity
    5.1.4 Endogeneity
  5.2 XRP Analysis
    5.2.1 The linear model
    5.2.2 Reducing the model
    5.2.3 Multicollinearity
    5.2.4 Endogeneity
  5.3 Litecoin Analysis
    5.3.1 The linear model
    5.3.2 Reducing the model
    5.3.3 Multicollinearity
    5.3.4 Endogeneity
6 Stock analysis
  6.1 Genworth
  6.2 Clorox
  6.3 Dentsply
  6.4 Equifax
  6.5 Mylan
  6.6 Stericycle
7 Discussion and Conclusions
  7.1 Discussion
    7.1.1 Heteroscedasticity and reducing the model
    7.1.2 Multicollinearity
    7.1.3 Endogeneity
    7.1.4 Missing covariates
    7.1.5 Google
    7.1.6 Currency specific covariates
    7.1.7 Volatility
    7.1.8 Index covariates
    7.1.9 Dummy7, Dummy30 and Dummy365
    7.1.10 BTC price as a covariate
    7.1.11 Other covariates
  7.2 Conclusions
8 Appendix
  8.1 Bitcoin Appendix
  8.2 XRP Appendix
  8.3 Litecoin Appendix
  8.4 Stock Appendix
  8.5 General
  8.6 Data sources


List of Figures

1 Example of heteroscedasticity
2 Example of endogeneity
3 Overview of cryptocurrencies
4 Describing the price of Bitcoin
5 Overview of Bitcoin
6 Describing the price of XRP
7 Describing the price of Litecoin
8 Residual and BTC plot
9 Residual and log(BTC) plot
10 Residual and log(Google) plot (BTC)
11 Residual and XRP plot
12 Residual and log(XRP) plot
13 Residual and log(Google) plot (XRP)
14 Residual and LTC plot
15 Residual and log(LTC) plot
16 Residual and LTC plot
17 Residual and log(Google) plot (LTC)
18 BTC specific variables normalized
19 BTC normalized to the initial value

List of Tables

1 Bitcoin covariates
2 Ripple covariates
3 Litecoin covariates
4 Stock covariates
5 VIF-values for BTC
6 F-statistic for IVs
7 Initial VIF-values for XRP
8 VIF-values for the final XRP model
9 VIF-values for LTC
10 The initial regression for BTC
11 η²-values for Google trend index
12 η²-values for the currency specific variables for BTC
13 η²-values for volatility
14 η²-values for the index covariates for BTC
15 η²-values for the index covariates for LTC
16 η²-values for the index covariates for XRP
17 η²-values for the dummy variables
18 The initial regression for BTC
19 First log-log regression for BTC
20 Regression with two stock currency exchange and one stock exchange removed for BTC
21 The first LS-regression for BTC
22 Regression with the first reduction for BTC
23 Regression with the second reduction for BTC
24 Wu-Hausmann test for BTC
25 TSLS regression for BTC
26 The initial log-log regression for XRP
27 Benchmark model for XRP
28 Final regression for XRP
29 The initial log-log regression for LTC
30 The final model for LTC
31 The Genworth regression
32 The Clorox regression
33 The Dentsply regression
34 The Equifax regression
35 The Mylan regression
36 The Stericycle regression


Abbreviations

OLS Ordinary Least Squares

BLUES Best Linear Unbiased EStimator

AIC Akaike Information Criterion

BIC Bayesian Information Criterion

TSLS Two Stage Least Squares

IV Instrumental Variable

VIF Variance Inflation Factor


Symbols

y Regression vector

$\hat{y}$ Predicted regression vector

X A matrix containing all the covariates

Xi The i:th column in X

β,γ Coefficient vector for regression

βi, γi The i:th coefficient in β, γ

$\hat{\beta}$, $\hat{\gamma}$ The least-squares coefficient vector

$\hat{\beta}_i$, $\hat{\gamma}_i$ The i:th coefficient in $\hat{\beta}$, $\hat{\gamma}$

e, ε The error vector after a regression

e∗, ε∗ The error vector of the restricted regression

σ The standard deviation

f Degrees of freedom

k The number of covariates

n The number of data points

s The estimated standard error, $s^2 = \frac{1}{n-k-1}|e|^2$

H0 The null hypothesis

H1 The alternative hypothesis

α The error rate for which the true hypothesis H0 is falsely rejected

E(X) The expected value for a random variable X

Var(X) The variance for a random variable X

Cov(X) The covariance for a random variable X

log(X) The natural logarithm of X

a · b The inner product of the vectors a and b

|a| The norm of vector a

XT The transpose of matrix X

I The identity matrix


1 Introduction

In recent years a new way to make payments has arisen: cryptocurrencies, with Bitcoin being the largest one. Modeling and understanding the behavior of these currencies is essential to those who decide to invest in this new asset. Cryptocurrencies are also used for transactions between users, which makes the stability of the currency crucial.

The largest cryptocurrency, Bitcoin, was created in 2009 and since then numerous other cryptocurrencies have been created. However, it is only during the last few years that they have gained recognition. For example, Bitcoin will be introduced as a certificate at Nasdaq Stockholm on the 18th of May 2015 [34].

Cryptocurrencies are decentralized currencies that are not controlled by economic systems or central banks and are therefore protected from political influence applied by governments and corporations.

Transactions between users are anonymous and cannot be traced. Due to this untraceability Bitcoin was, and still is, used on illegal markets such as Silk Road to pay for drugs and other black market items. Because of these criminal activities, cryptocurrencies have been and remain a controversial topic. In 2013 the Chinese government decided to ban Bitcoin.

The price of Bitcoin has increased drastically since its creation and the total market value of Bitcoin is over three billion USD as of 2015. However, the price peaked in 2013 at over 1200 USD per Bitcoin. There are a few cases where individuals have made bad decisions and lost Bitcoins worth millions of dollars, for example one individual who threw away a hard drive that contained Bitcoins worth over 5 million USD [2].

This thesis will determine which factors influence the market price of cryptocurrencies by linear regression. Here are some questions that will be addressed:

• Which parameters will influence the market price of Bitcoin and other cryptocurrencies?
• Which trends can be found in the market price of cryptocurrencies?
• Are the behaviors of cryptocurrencies comparable to other assets such as stocks?

The following cryptocurrencies will be researched: Bitcoin, Ripple and Litecoin. An attempt will be made to create a structural interpretation where the market price of cryptocurrencies will be the dependent variable and different covariates will be chosen. Some examples of covariates are: cryptocurrency specific covariates, volatility, dummy variables and a measure of demand.

The same method will be used on other assets such as stocks in order to see if there are any commonalities with cryptocurrencies.

The next chapter will discuss the mathematical concepts that will be used in this thesis, which are needed to draw some conclusions about the data. In chapter three the setup of the model will be explained and which modifications will be made to the data. Thereafter, the analysis will be done and the results will be obtained. Lastly the results will be discussed and some general conclusions will be drawn.


1.1 Delimitation

Only three cryptocurrencies were included because the main topic is Bitcoin and the other two are complementary. It also proved hard to find data on the smaller cryptocurrencies, which reinforced the decision.

Due to limited time only six stocks were analyzed. Furthermore, only cryptocurrencies were investigated, not regular currencies, because that would deviate from the main topic, which is Bitcoin.


2 Theoretical Background

2.1 Modeling Theory

2.1.1 Linear regression model

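Linear regression is a method to model a relation between a dependent variable and a set of independent variables according to the following relationship:

$$Y = X\beta + e \tag{1}$$

where

$$Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix},\quad X = \begin{pmatrix} 1 & x_{1,1} & \cdots & x_{1,m} \\ 1 & x_{2,1} & \cdots & x_{2,m} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n,1} & \cdots & x_{n,m} \end{pmatrix},\quad \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_m \end{pmatrix},\quad e = \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix}.$$

$y_i$ is the observed variable, known as the dependent variable, which depends on the covariates $x_{i,*}$ and an error term $e_i$.

$\beta_i$ is interpreted as the change in $y_i$ that corresponds to a unit change in $x_i$, that is:

$$\beta_i = \frac{\partial y_i}{\partial x_i} \tag{2}$$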

The error terms $e_i$ are independent random variables with the following properties:

$$E(e_i) = 0 \quad\text{and}\quad E(e_i^2) = \sigma^2$$

where $\sigma$ is unknown. This is the homoscedastic version, which requires that the variances of the error terms $e_i$ are identical. If $E(e_i^2) = \sigma_i^2$ then the model suffers from heteroscedasticity, which is dealt with in section 2.2.1 [25, p. 3-5].

2.1.2 Ordinary Least Squares

OLS is an estimation of $\beta$ that minimizes $e^t e = |e|^2$. The estimate of $\beta$ is denoted $\hat{\beta}$ and is obtained by solving the normal equation, with $e = Y - X\hat{\beta}$:

$$X^t e = 0 \tag{3}$$

Combining equation 1 and equation 3 yields the following OLS estimate of $\beta$:

$$\hat{\beta} = (X^t X)^{-1} X^t Y \tag{4}$$

The covariance matrix for $\hat{\beta}$ is defined as:

$$Cov(\hat{\beta}) = E[(\hat{\beta} - \beta)(\hat{\beta} - \beta)^t] = (X^t X)^{-1} X^t (\sigma^2 I) X (X^t X)^{-1} = (X^t X)^{-1} \sigma^2$$

An unbiased estimator for $\sigma^2$ is:

$$s^2 = \frac{1}{n-k-1} |e|^2$$

The covariance matrix can now be estimated as:

$$\widehat{Cov}(\hat{\beta}) = (X^t X)^{-1} s^2$$

This is only valid if the model is homoscedastic; otherwise the covariance matrix estimate will be inconsistent and White's Consistent Variance Estimator has to be used (see section 2.2.1.2) [25, p. 5-6] [25, p. 15-18].
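The estimates above can be illustrated with a minimal sketch in plain Python. The thesis itself uses R, so this is not the authors' implementation; the helper names `solve` and `ols` and the toy data are assumptions made for illustration. The sketch solves the normal equations for $\hat{\beta}$ (equation 4), then computes $s^2$ and the estimated covariance matrix $(X^t X)^{-1} s^2$.

```python
def solve(a, b):
    """Solve the square system a z = b by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        # partial pivoting: move the largest remaining entry into the pivot row
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(X, y):
    """Return (beta_hat, residuals, s2, cov_hat) for the model y = X beta + e."""
    n, p = len(X), len(X[0])          # p = k + 1 (intercept column included)
    cols = list(zip(*X))              # columns of X, i.e. rows of X^t
    XtX = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    Xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(XtX, Xty)            # equation 4
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
    s2 = sum(e * e for e in resid) / (n - p)   # n - k - 1 degrees of freedom
    # (X^t X)^{-1} built column by column; it is symmetric, so rows equal columns
    inv = [solve(XtX, [1.0 if i == j else 0.0 for i in range(p)]) for j in range(p)]
    cov = [[s2 * v for v in row] for row in inv]
    return beta, resid, s2, cov


# Made-up data, roughly y = 1 + 2x
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.0]
X = [[1.0, xi] for xi in x]
beta, resid, s2, cov = ols(X, y)
print("beta_hat =", beta)
print("s^2 =", s2, " Var(slope) =", cov[1][1])
```

The residuals returned by `ols` satisfy the normal equation 3 up to rounding, which is a convenient sanity check on any implementation.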


2.1.2.1 BLUES

The Gauss-Markov theorem states that OLS is the Best Linear Unbiased EStimator (BLUES). Four conditions need to be fulfilled for this to be true [25, p. 7]:

1: The expected value of the error terms is zero (that is $E(e_i) = 0$).
2: The error terms all have the same variance (that is $E(e_i^2) = \sigma^2$).
3: The covariates cannot be correlated with the error terms.
4: There needs to be a linear relationship between the dependent variable and the covariates.

2.1.3 Dummy Variables

Some data are not quantifiable and thus need to be transformed in order to be usable. In these cases dummy variables are appropriate. Let $x_i$ be a dummy variable with the following properties [25, p. 20]:

$x_i = 1$ if the observation is active.
$x_i = 0$ if the observation is inactive.

When $x_i$ is active the dependent variable increases by $\beta_i$.

For example, if the causes of lung cancer were examined, a dummy variable could be introduced to indicate whether an individual was a smoker: $x_i = 1$ if the individual was a smoker and $x_i = 0$ otherwise.
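As a small illustration of how a dummy coefficient can be read, the sketch below fits a regression with a single dummy covariate. The data and the helper `simple_ols` are hypothetical. With only a dummy in the model, the intercept equals the mean of the inactive group and the coefficient equals the difference between the two group means.

```python
from statistics import mean


def simple_ols(x, y):
    """Intercept and slope of the least-squares line y = b0 + b1 * x."""
    mx, my = mean(x), mean(y)
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - b1 * mx, b1


# Hypothetical data: dummy = 1 marks an "active" observation
dummy = [0, 0, 0, 1, 1, 1]
y = [10.0, 12.0, 11.0, 15.0, 17.0, 16.0]
b0, b1 = simple_ols(dummy, y)

mean_inactive = mean(v for d, v in zip(dummy, y) if d == 0)
mean_active = mean(v for d, v in zip(dummy, y) if d == 1)
print(b0, b1, mean_inactive, mean_active - mean_inactive)
```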

2.1.4 $R^2$-value

In statistics the $R^2$-value is often used. It describes what proportion of the variance of the dependent variable is explained by the regression, compared with a model containing only the constant term. It can be described as a measure of the goodness of fit [11].

The definition of the $R^2$-value is:

$$R^2 = \frac{Var(X\hat{\beta})}{Var(Y)} = 1 - \frac{Var(e)}{Var(Y)} \tag{5}$$
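The two forms of equation 5 can be checked numerically. The following sketch uses made-up data and a hypothetical helper `simple_ols` for a one-covariate model with intercept; for an OLS fit of that kind both expressions give the same value.

```python
from statistics import mean, pvariance


def simple_ols(x, y):
    """Intercept and slope of the least-squares line y = b0 + b1 * x."""
    mx, my = mean(x), mean(y)
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - b1 * mx, b1


# Made-up data, roughly y = 1 + 2x
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.0]
b0, b1 = simple_ols(x, y)
fitted = [b0 + b1 * a for a in x]
resid = [b - f for b, f in zip(y, fitted)]

r2_explained = pvariance(fitted) / pvariance(y)      # Var(X beta_hat) / Var(Y)
r2_residual = 1 - pvariance(resid) / pvariance(y)    # 1 - Var(e) / Var(Y)
print(r2_explained, r2_residual)
```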

2.1.5 Choice of covariates

The question of whether a covariate should be included often arises in model selection. Two tests are commonly used to answer this question, the AIC test and the BIC test [25, p. 21].

2.1.5.1 Akaike Information Criterion

AIC measures the relative quality of a statistical model given a set of data. AIC can be viewed as a trade-off between the goodness of fit and the complexity (number of covariates) [31].

The AIC test minimizes the following expression:

$$AIC = n \ln(|e|^2) + 2k \tag{6}$$

Equation 6 discourages overfitted models and encourages models with a good fit [25, p. 21] [13].


2.1.5.2 Bayesian Information Criterion

BIC measures the relative quality of a statistical model given a set of data. BIC can be viewed as a trade-off between the goodness of fit and the complexity (number of covariates) [31].

The BIC test minimizes the following expression:

$$BIC = n \ln(|e|^2) + k \ln(n) \tag{7}$$

Equation 7 discourages overfitted models and encourages models with a good fit [25, p. 21] [14].

2.1.5.3 Differences between AIC and BIC

The AIC and BIC tests are very similar according to equations 6 and 7. They differ in the last term: AIC has a $2k$ term and BIC has a $k \ln(n)$ term.

AIC and BIC are similar because they are derived from the same theory (information theory) and the same framework but have different priorities. BIC usually reduces the model more than AIC does. Which test should be used depends on the model [3].
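The difference in penalty can be made concrete with a short sketch of equations 6 and 7. The residual sums of squares below are invented numbers, chosen so that the two criteria disagree: the extra covariate lowers $|e|^2$ enough to pay the AIC penalty of 2 but not the BIC penalty of $\ln(100) \approx 4.6$.

```python
import math


def aic(sse, n, k):
    """Equation 6: n ln(|e|^2) + 2k."""
    return n * math.log(sse) + 2 * k


def bic(sse, n, k):
    """Equation 7: n ln(|e|^2) + k ln(n)."""
    return n * math.log(sse) + k * math.log(n)


n = 100
small = {"sse": 50.0, "k": 2}   # hypothetical model with 2 covariates
large = {"sse": 48.0, "k": 3}   # hypothetical model with 1 extra covariate

aic_prefers_large = aic(large["sse"], n, large["k"]) < aic(small["sse"], n, small["k"])
bic_prefers_small = bic(small["sse"], n, small["k"]) < bic(large["sse"], n, large["k"])
print(aic_prefers_large, bic_prefers_small)
```

Since $\ln(n) > 2$ as soon as $n \geq 8$, the BIC penalty per covariate is the larger one for all but very small samples, which is consistent with BIC usually giving the smaller model.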

2.1.6 Effect size and Cohen’s rule

Effect size measures the strength of a covariate. Unlike significance tests, effect size is independent of sample size [36].

There are many formulas describing effect size but only $\eta^2$ is used in this thesis. $\eta^2$ is defined as:

$$\eta^2 = \frac{e^2_{effect}}{e^2_{total}} \tag{8}$$

where $e^2_{effect}$ is the sum of squares of the effect of interest and $e^2_{total}$ is the sum of squares of all effects.

To determine whether the effect has a small, medium or big impact, Cohen's rule of thumb is applied, which states [38]:

Impact              Small   Medium   Big
$\eta^2 \approx$    0.02    0.13     0.26

Cohen's rule of thumb.
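A sketch of how equation 8 and the rule of thumb might be applied in code is given below. Treating the approximate values 0.02, 0.13 and 0.26 as lower cut-offs, and the extra label "negligible" for values below 0.02, are assumptions made here for illustration; the source only gives the three approximate values.

```python
def eta_squared(ss_effect, ss_total):
    """Equation 8: share of the total sum of squares due to one effect."""
    return ss_effect / ss_total


def cohen_label(eta2):
    """Map eta^2 to an impact label, using the table values as lower cut-offs (assumption)."""
    if eta2 >= 0.26:
        return "big"
    if eta2 >= 0.13:
        return "medium"
    if eta2 >= 0.02:
        return "small"
    return "negligible"   # label not in the source table


# Hypothetical sums of squares
for ss_effect in (1.0, 5.0, 15.0, 30.0):
    e2 = eta_squared(ss_effect, 100.0)
    print(e2, cohen_label(e2))
```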

2.1.7 Theory of Distribution and Hypothesis

It is often necessary to decide whether a given data set has certain attributes. Hypothesis testing is used to do this. The usual process of hypothesis testing is:

1: Begin by defining the null hypothesis $H_0$ and an alternative hypothesis $H_1$.

2: Make a statistical assumption about the sample, for example that the observations are statistically independent or that they follow a certain distribution.

3: Decide which test should be used and state the relevant test statistic $F$. The distribution of the test statistic under the null hypothesis needs to be derived. It can for example be normally distributed or F-distributed.

4: Define a significance level $\alpha$, which is the probability threshold below which the null hypothesis will be rejected. A common level is 5%.


5: The derived distribution divides the possible values of the test statistic into two parts: one where the null hypothesis will be rejected, which has probability $\alpha$ under the null hypothesis (the critical region), and one where the null hypothesis will not be rejected.

6: Compute the observed value $F_{obs}$ of the test statistic $F$.

7: The final step is to either reject the null hypothesis in favor of the alternative or not reject it. The null hypothesis $H_0$ is rejected if the observed value $F_{obs}$ is in the critical region and is otherwise not rejected [33].

2.1.7.1 Type I and Type II errors

Sometimes the wrong conclusion is drawn from a hypothesis test. There are two types of errors, type I and type II. The risk $\alpha$ of a type I error is predefined and is the accepted error risk.

Type I is the incorrect rejection of a true null hypothesis (also called "false positive").

Type II is the failure to reject a false null hypothesis (also called "false negative") [35].

2.1.8 F-distribution

The F-distribution is a probability distribution that is commonly used in statistics. It is the distribution of the ratio of two independent chi-square distributed variables, each divided by its degrees of freedom, and is therefore right skewed [32].

The F-distribution is defined as [28]:

$$F(n, p) = \frac{\chi^2(n)/n}{\chi^2(p)/p} \tag{9}$$

2.1.9 Hypothesis testing for linear models

Testing whether multiple covariates are significant is important in this thesis. To do so, hypotheses need to be defined in order to test the significance of a group of $r$ covariates at a type I error rate $\alpha$:

$$H_0: \beta_i = 0 \quad \forall\, i = 1, 2, \ldots, r \tag{10}$$

$$H_1: \beta_i \neq 0 \quad \text{for at least one } i \in \{1, 2, \ldots, r\} \tag{11}$$

When a coefficient $\beta_i$ is tested for significance, a restriction is applied to the model. If $r$ restrictions are tested, regress without the restrictions ($\beta_i$ unrestricted) and obtain the error vector $e$. Then regress with the restrictions ($\beta_i = 0$) and obtain the error vector $e^*$. The F-statistic is then:

$$F = \frac{n-k-1}{r}\left(\frac{|e^*|^2}{|e|^2} - 1\right) \tag{12}$$

Under $H_0$, $F$ is $F(r, n-k-1)$ distributed [25, p. 9]. The test is now:

Reject $H_0$ if $F > F_\alpha(r, n-k-1)$

where $F_\alpha(r, n-k-1)$ is the critical value of the $F(r, n-k-1)$ distribution, that is, the value exceeded with probability $\alpha$ under $H_0$.
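The test can be sketched for the simplest case: one covariate ($k = 1$) and one restriction ($r = 1$), so that the restricted model contains only the constant term. The data, the helper names and the hard-coded critical value are assumptions for illustration; the value 10.13 is the 5% critical value of $F(1, 3)$ taken from a standard table, since the Python standard library has no F-distribution.

```python
from statistics import mean


def sse_simple_regression(x, y):
    """|e|^2 for the unrestricted model y = b0 + b1 * x."""
    mx, my = mean(x), mean(y)
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    b0 = my - b1 * mx
    return sum((b - b0 - b1 * a) ** 2 for a, b in zip(x, y))


def sse_constant_only(y):
    """|e*|^2 for the restricted model y = b0 (that is, b1 = 0)."""
    my = mean(y)
    return sum((b - my) ** 2 for b in y)


def f_statistic(sse_restricted, sse_unrestricted, n, k, r):
    """Equation 12."""
    return (n - k - 1) / r * (sse_restricted / sse_unrestricted - 1)


# Made-up data, roughly y = 1 + 2x
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.0]
F = f_statistic(sse_constant_only(y), sse_simple_regression(x, y), n=len(y), k=1, r=1)

F_CRIT = 10.13   # table value for F_0.05(1, 3), an assumption of this sketch
reject_h0 = F > F_CRIT
print(F, reject_h0)
```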


2.2 Problems with the least square estimation

2.2.1 Heteroscedasticity

In the homoscedastic case the assumption is that:

$$E(e_i) = 0 \quad\text{and}\quad E(e_i^2) = \sigma^2$$

These conditions are not always met. Sometimes the error terms have different variances, that is:

$$E(e_i) = 0 \quad\text{and}\quad E(e_i^2) = \sigma_i^2$$

This is called heteroscedasticity. In this case the F-test is not valid without modification [25, p. 3-5].

A common way to detect whether a model is heteroscedastic is to plot the error term $e$ against the dependent variable $Y$. Under homoscedasticity the error terms should be spread out evenly. Figure 1 illustrates the different cases.

Figure 1: Example plots of heteroscedasticity. Y is the error terms and X is the dependent variable. The error terms are evenly spread out around the line in A (homoscedasticity) but not in B-D (heteroscedasticity).

2.2.1.1 Breusch-Pagan test

It is often valuable to know whether there is heteroscedasticity in the model, and the Breusch-Pagan test is used to check this. It tests whether the estimated variance of the error terms $Var(e)$ depends on the covariates. If it does, the model is heteroscedastic.

The condition for homoscedasticity is that $E(e_i^2) = \sigma^2$, that is, the variance does not depend on the covariates. This variance can be estimated by taking the average of the squared error terms $e^2$.

If the model is heteroscedastic in a way that makes the variance linearly dependent on the covariates, this can be detected by hypothesis testing:

$H_0$: The model is homoscedastic
$H_1$: The model is heteroscedastic

Proceed by performing a regression with $e^2$ as the dependent variable on the covariates $X$, that is:

$$e^2 = X\gamma + \varepsilon \tag{13}$$

If an F-test confirms that the covariates are jointly significant then the null hypothesis can be rejected [29].
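The procedure can be sketched for a model with one covariate, following the F-test variant described above. The two data sets are constructed by hand: in `hetero` the size of the error grows with $x$, in `homo` it is constant. The helper names and the critical value 5.32 (a table value for $F_{0.05}(1, 8)$) are assumptions of this sketch.

```python
from statistics import mean


def fit(x, y):
    """Simple regression y = b0 + b1 * x; returns the residuals."""
    mx, my = mean(x), mean(y)
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    b0 = my - b1 * mx
    return [b - b0 - b1 * a for a, b in zip(x, y)]


def breusch_pagan_f(x, y):
    """F-statistic for the slope in the auxiliary regression e^2 = g0 + g1 * x."""
    e2 = [v * v for v in fit(x, y)]
    sse_unrestricted = sum(v * v for v in fit(x, e2))     # equation 13
    m = mean(e2)
    sse_restricted = sum((v - m) ** 2 for v in e2)        # g1 = 0: constant only
    n, k, r = len(x), 1, 1
    return (n - k - 1) / r * (sse_restricted / sse_unrestricted - 1)


x = [float(i) for i in range(1, 11)]
# error size proportional to x (heteroscedastic by construction)
hetero = [2 * xi + (-1) ** i * 0.3 * xi for i, xi in enumerate(x, start=1)]
# error size constant (homoscedastic by construction)
homo = [2 * xi + (-1) ** i * 0.5 for i, xi in enumerate(x, start=1)]

F_CRIT = 5.32   # table value for F_0.05(1, 8), an assumption of this sketch
f_hetero = breusch_pagan_f(x, hetero)
f_homo = breusch_pagan_f(x, homo)
print(f_hetero, f_hetero > F_CRIT)
print(f_homo, f_homo > F_CRIT)
```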


2.2.1.2 White’s Consistent Variance Estimator

One approach to handling heteroscedasticity in a model is to use a more robust covariance matrix estimator. White’s Consistent Variance Estimator is one of those and is defined as:

$$\mathrm{Cov}^*(\hat\beta) = (X^tX)^{-1} X^t D(e^2) X (X^tX)^{-1} \tag{14}$$

Here $D(e^2)$ is an $n \times n$ diagonal matrix with the i:th diagonal element being $e_i^2$.

In order to make it more robust the covariance matrix can be scaled by $n/(n-k-1)$, that is [25, p. 15-18]:

$$\mathrm{Cov}_{Robust}(\hat\beta) = \mathrm{Cov}^*(\hat\beta)\,\frac{n}{n-k-1} \tag{15}$$
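As an illustration, the scaled estimator can be computed by hand for a regression with an intercept and one covariate (k = 1), where all matrices are 2 x 2. This is a sketch under that assumption; the function names and the three data points are hypothetical and chosen so that the result is easy to check.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def white_robust_cov(x, y):
    """Scaled White covariance for the model y = b0 + b1*x (so k = 1)."""
    n, k = len(x), 1
    # X^t X for the design matrix X = [1, x], and its inverse
    s1 = sum(x)
    s2 = sum(xi * xi for xi in x)
    det = n * s2 - s1 * s1
    xtx_inv = [[s2 / det, -s1 / det], [-s1 / det, n / det]]
    # OLS estimate: beta = (X^t X)^{-1} X^t y
    t0 = sum(y)
    t1 = sum(xi * yi for xi, yi in zip(x, y))
    b0 = xtx_inv[0][0] * t0 + xtx_inv[0][1] * t1
    b1 = xtx_inv[1][0] * t0 + xtx_inv[1][1] * t1
    # Squared residuals: the diagonal of D(e^2)
    e2 = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]
    # Middle term X^t D(e^2) X
    m01 = sum(w * xi for w, xi in zip(e2, x))
    meat = [[sum(e2), m01],
            [m01, sum(w * xi * xi for w, xi in zip(e2, x))]]
    cov = matmul2(matmul2(xtx_inv, meat), xtx_inv)
    scale = n / (n - k - 1)
    return [[scale * v for v in row] for row in cov]


cov = white_robust_cov([-1.0, 0.0, 1.0], [0.0, 0.0, 3.0])
print(cov)
```

For these three points the OLS fit is y = 1 + 1.5x with residuals 0.5, −1 and 0.5, which gives robust variances of 0.5 for the intercept and 0.375 for the slope after scaling by n/(n−k−1) = 3.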

2.2.1.3 Reformulating the model

Heteroscedasticity can be dealt with by reformulating the model. If the error terms increase linearly (or almost linearly) with Y, then the error terms can be scaled by Y, that is:

$$e_i = \frac{y_i - \hat{y}_i}{y_i} \tag{16}$$

This can be achieved by taking the natural logarithm of both the dependent variable and the covariates, which yields:

$$\log(Y) = 1 \cdot \beta_0 + X_1\beta_1 + X_2\beta_2 + \varepsilon \tag{17}$$

Where

$$X_1 = \log\begin{pmatrix} x_{1,1} & \cdots & x_{1,m} \\ x_{2,1} & \cdots & x_{2,m} \\ \vdots & \ddots & \vdots \\ x_{n,1} & \cdots & x_{n,m} \end{pmatrix}, \quad \beta_1 = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_n \end{pmatrix}$$

and $X_2$ is a dummy variable matrix.

The interpretation of $\beta_i$ is now that if $x_i$ is changed by one percent then $y_i$ is expected to change by $\beta_i$ percent, that is [24]:

$$\%\Delta y_i = \beta_i \cdot \%\Delta x_i \tag{18}$$
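The percentage interpretation can be checked numerically with a small sketch. The data below are synthetic and follow y = 3·x^1.5 exactly, so the log-log slope should be 1.5; the function name is hypothetical.

```python
import math


def loglog_slope(x, y):
    """Slope of the least-squares line of log(y) on log(x)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_lx = sum(lx) / n
    mean_ly = sum(ly) / n
    sxx = sum((a - mean_lx) ** 2 for a in lx)
    sxy = sum((a - mean_lx) * (b - mean_ly) for a, b in zip(lx, ly))
    return sxy / sxx


x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [3.0 * v ** 1.5 for v in x]  # synthetic: elasticity of 1.5 by construction
beta = loglog_slope(x, y)

# Effect on y (in percent) of increasing x by one percent
pct_change_y = ((1.01 ** beta) - 1.0) * 100.0
print(beta, pct_change_y)
```

The estimated slope is 1.5, and a one percent increase in x raises y by roughly 1.5 percent, which is what equation (18) states. The relation is an approximation that holds for small percentage changes.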

2.2.2 Endogeneity

If the error term e is correlated with one or more covariates then the model suffers from endogeneity. The problem with endogeneity is that the OLS estimation becomes inconsistent [37].

A common way to detect if a model contains endogeneity is to plot the error term e versus the covariate that is suspected to be endogenous. If there are obvious linear patterns then the model contains endogeneity. Figure 2 illustrates a model with endogeneity.

Figure 2: Example plots of endogeneity. Y is the error terms and X is the suspected covariate. It is endogenous because the error terms display a dependency on X and therefore have a linear trend.

2.2.2.1 Causes for endogeneity

There are many reasons why a model contains endogeneity. In this thesis the two main endogeneity problems will be simultaneity and missing relevant covariates.

Simultaneity

Sometimes the dependent variable Y influences one or more covariates. This becomes problematic since the causality will go in more than one direction [25, p. 26].

Missing relevant covariates

Sometimes a component of the error term that correlates with some covariate can be identified. A remedy for this is simply to add the missing covariate to the model.

2.2.2.2 Instrumental Variables

Assume the model in equation 1 has i endogenous covariates and n − i exogenous covariates. Another i new covariates $T_i$ need to be found that are correlated with the endogenous covariates but not with the error term e. These new covariates $T_i$ together with the exogenous covariates $X_{n-i}$ are known as instrumental variables [4].

2.2.2.3 Durbin-Wu-Hausman test

The Durbin-Wu-Hausman test is used to see if a covariate ($X_i$) is endogenous. To perform the test, first OLS regress the suspected endogenous variables $\{X_i\}$ onto the instrumental variables, that is:

$$X_i = Z\gamma_i + u \tag{19}$$

Where $Z = [1 \; X_{n-i} \; X_{n-i+1} \; \cdots \; X_n \; T_1 \; T_2 \; \cdots \; T_i]$.

Then extract the residual vector $u$ and include it as an additional regressor alongside $X$ in an OLS regression of $Y$, that is:

$$Y = X\beta + \Gamma u + e \tag{20}$$

Now determine if $u$ is significant by hypothesis testing [8]:

H0: $\Gamma = 0$, the hypothesis that $X_i$ is exogenous.

H1: $\Gamma \neq 0$, the hypothesis that $X_i$ is endogenous.

2.2.2.4 TSLS

If endogeneity occurs in the model it can be remedied by using Two Stage Least Square (TSLS) estimation, which requires instrumental variables.

Begin by forming the matrix $Z = [X_{n-i} \; X_{n-i+1} \; \cdots \; X_n \; T_1 \; T_2 \; \cdots \; T_i]$, where $\{X_{n-i}\}$ are the exogenous covariates and $\{T_i\}$ are the "new" covariates.

Then project X onto Z [25, p. 27-28]:

$$\hat{X} = Z(Z^tZ)^{-1}Z^tX \tag{21}$$

The point estimate of β is now:

$$\hat\beta = (\hat{X}^t\hat{X})^{-1}\hat{X}^tY \tag{22}$$

The White’s Robust covariance matrix estimator is now:

$$\mathrm{Cov}_{Robust}(\hat\beta) = \frac{n}{n-k-1}(\hat{X}^t\hat{X})^{-1}\hat{X}^t D(e^2)\hat{X}(\hat{X}^t\hat{X})^{-1} \tag{23}$$
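The two stages can be sketched for the simplest case of one endogenous covariate x and one instrument z, both with an intercept. The function names and numbers are hypothetical; with several covariates the projection in equation (21) would be carried out with matrices instead.

```python
def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def tsls_slope(y, x, z):
    """Two-stage least squares slope of y on x, using z as instrument."""
    # Stage 1: project the endogenous covariate x onto the instrument z
    g0, g1 = fit_line(z, x)
    x_hat = [g0 + g1 * zi for zi in z]
    # Stage 2: regress y on the projected covariate
    _, beta = fit_line(x_hat, y)
    return beta


z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # instrument (hypothetical values)
x = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]  # suspected endogenous covariate
y = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0]  # dependent variable
beta_tsls = tsls_slope(y, x, z)
print(beta_tsls)
```

In this one-instrument case the TSLS slope coincides with the ratio cov(z, y) / cov(z, x), which offers a quick sanity check of the implementation.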

2.2.3 Multicollinearity

Multicollinearity occurs when the OLS estimation does not have a unique solution. This happens when two or more covariates are linearly dependent.

One simple way of detecting if there is multicollinearity in the model is to examine the standard errors of $\beta_i$. If they are large then the model most likely suffers from multicollinearity.

Multicollinearity can be dealt with by removing covariates that are linearly dependent. Those covariates can be found by calculating their VIF-values.

2.2.3.1 Variance Inflation Factor

Multicollinearity can be detected by looking at the covariates’ VIF-values. The VIF describes how much larger the variance of an estimated coefficient is compared with what it would be if the variable were uncorrelated with the other covariates.

If there is a model with k different covariates that are suspected to be correlated, calculate k VIF-values, one for each $X_i$. To do this, OLS regress $X_i$ onto the other covariates:

$$X_i = 1\cdot\gamma_0 + \gamma_1 X_1 + \dots + \gamma_{i-1}X_{i-1} + \gamma_{i+1}X_{i+1} + \dots + \gamma_k X_k + e_i \tag{24}$$

Here $i = 1, 2, \dots, k$. Then calculate $VIF(\beta_i)$, defined as:

$$VIF(\beta_i) = \frac{1}{1-R_i^2} \tag{25}$$

where $R_i^2$ is the coefficient of determination of regression (24). A rule of thumb is that if $VIF(\beta_i) > 10$ then there is high collinearity [7].
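A minimal sketch of equation (25) for a model with only two covariates follows. In that special case the R² of regression (24) is simply the squared correlation between the two covariates. The function name and the data are hypothetical.

```python
def vif_two_covariates(x_i, x_other):
    """VIF of x_i when x_other is the only other covariate in the model."""
    n = len(x_i)
    mean_i = sum(x_i) / n
    mean_o = sum(x_other) / n
    sxx = sum((a - mean_i) ** 2 for a in x_i)
    syy = sum((b - mean_o) ** 2 for b in x_other)
    sxy = sum((a - mean_i) * (b - mean_o) for a, b in zip(x_i, x_other))
    r2 = sxy ** 2 / (sxx * syy)  # R^2 of regressing x_i on x_other
    return 1.0 / (1.0 - r2)


# Two nearly proportional covariates: high VIF
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
vif_high = vif_two_covariates(x1, x2)

# Two uncorrelated covariates: VIF of exactly 1
vif_none = vif_two_covariates([-1.0, 0.0, 1.0], [1.0, -2.0, 1.0])
print(vif_high, vif_none)
```

The first pair lies far above the rule-of-thumb limit of 10, while the uncorrelated pair gives the smallest possible value, 1.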

3 Information about cryptocurrencies and paper specific terms

3.1 Cryptocurrencies

Cryptocurrencies are digital currencies which, unlike other currencies, are not controlled by economic systems and centralized banks, see figure 3. Most of these cryptocurrencies are anonymous (which means that they do not identify who is making transactions while still making the transaction data public).

The rate at which new currency is created is defined from the beginning and is known to the public, unlike centralized systems where governments or banks can issue new currency or add more to the digital bank ledgers. This protects the currency from political influence applied by governments and corporations.

The majority of cryptocurrencies have been designed to progressively decrease the production of new units, so that the total amount of currency in existence reaches a cap, in order to prevent high inflation.

The transactions are secured through cryptography and are recorded in a public ledger. The ledgers are secured by a community of so-called miners, who verify transactions using computational power and in return receive a certain amount of cryptocurrency. This is how new currency is added to the system.

Cryptocurrencies apply different timestamping schemes in order to remove the need for a trusted third party to timestamp the transactions and add them to the block chain ledger. The first and most common one is the proof-of-work scheme. As the name suggests, this scheme ensures that the miners actually contribute computational work and do not trick the system [12]. As of April 2015 there are hundreds of different cryptocurrencies and the three biggest are Bitcoin, Ripple and Litecoin [5].

Figure 3: Centralized versus decentralized transaction systems.

3.1.1 Bitcoin

Bitcoin (also called BTC) is the biggest cryptocurrency as of April 2015 with a total worth of over 3 billion US dollars; the price development can be seen in figure 4. It was created in 2009 by Satoshi Nakamoto (which many speculate is a pseudonym) [6]. Bitcoin uses public-key cryptography, where two cryptographic keys are used, one private and one public. Bitcoin users use digital wallets, where these keys are stored, see figure 5. These keys are used to sign and verify the transactions made between users.

All the transaction data is stored permanently in so-called "blocks". These blocks form chains which contain the previous transactions. Money is not transferred until the block to which the transaction belongs is added to the chain. This process requires computational power, which individual users contribute (in order to receive Bitcoins).

These users are called computational nodes, or just nodes. The nodes that are ready to assist with computational power collect all unverified transactions (those that have not been added to the chain yet) into a candidate block.

The nodes then create a cryptographic hash for this candidate block, which requires a certain amount of trial-and-error calculation. When a node finds this hash it notifies the rest of the network, which verifies it and adds the block to the chain. This block is now "solved". These chains contain all transactions ever made within the currency.
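The trial-and-error search can be illustrated with a simplified sketch. It is not Bitcoin's actual procedure, which hashes a binary block header twice with SHA-256 and compares the result against a numeric target. Here a single SHA-256 hash of a text string is used, and a block counts as "solved" when the hexadecimal hash starts with a given number of zeros. The function name and block content are hypothetical.

```python
import hashlib


def mine(block_data, difficulty):
    """Try nonces 0, 1, 2, ... until the hash has `difficulty` leading zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


# Each extra required zero multiplies the expected number of attempts by 16.
nonce, digest = mine("candidate block with unverified transactions", 3)
print(nonce, digest)
```

Finding the nonce takes many attempts, but any other node can verify the result with a single hash computation, which is what makes the scheme practical for the rest of the network.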

If a user tries to spend Bitcoins that it has already spent, the network will detect this and deny the transaction.

Whenever a node solves a block (that is, moves a candidate block into the chain) it receives a reward in Bitcoins. The amount of Bitcoins received decreases with time and will eventually tend to zero, so the total amount of Bitcoins created will never exceed a certain threshold, which for Bitcoin is 21 million.
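The 21 million limit can be reproduced with a short calculation. The sketch assumes the reward schedule of the Bitcoin protocol, which is not spelled out in the text above: the reward starts at 50 Bitcoins per block and is halved every 210,000 blocks, with amounts counted in the smallest unit (one hundred-millionth of a Bitcoin) and rounded down.

```python
INITIAL_REWARD = 50 * 10**8   # 50 BTC expressed in the smallest unit
HALVING_INTERVAL = 210_000    # number of blocks between reward halvings


def total_supply():
    """Sum the rewards of all blocks until the reward has shrunk to zero."""
    total = 0
    reward = INITIAL_REWARD
    while reward > 0:
        total += HALVING_INTERVAL * reward
        reward //= 2  # integer halving, rounded down
    return total


supply = total_supply()
print(supply / 10**8)  # total number of Bitcoins ever created
```

The rewards form a geometric series, 210,000 · 50 · (1 + 1/2 + 1/4 + ...), whose limit is 21 million. Because of the rounding, the computed total ends up slightly below that limit.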

The program that mines Bitcoin is designed so that a block is completed on average every 10 minutes, so the difficulty of solving a block changes with time (which means that the amount of trial-and-error computation changes).

The user can pay a transaction fee, which goes to the node that solves the candidate block. This makes it more attractive to solve blocks that contain transaction fees, which gives these transactions priority.

This is how the network is intended to work when miners no longer receive Bitcoins from solving blocks (from the "program"). The users will pay a transaction fee which will go to the node that solves the block, and the network will remain active [10].

Figure 4: The price of BTC from January 2012 to January 2015.

Figure 5: An overview of how Bitcoin works.

3.1.2 Ripple

Ripple is a payment system distributed by Ripple Labs Inc. It was released as open source software in 2013, but the development of Ripple had been going on since 2004. Ripple is designed as an open source Internet protocol that has its own currency, ripples (also known as XRP). It is the second biggest cryptocurrency; see figure 6 for the price development of XRP.

When Ripple was released, 100 billion XRP were created and no more can be generated according to the rules of the Ripple protocol. Of these 100 billion, 20 billion XRP were divided between the creators, investors and other individuals who were part of the creation of Ripple. The other 80 billion were given to Ripple Labs Inc. As of 2012, 7.2 billion XRP have been distributed to different individuals and organizations (including various charities) [22].

Ripple is constructed around a shared public database (also known as a ledger). This ledger holds information about different offers to buy and sell currencies. Whenever a user participates in the Ripple network it agrees to change the ledger via a process called "consensus", which happens every 2-5 seconds. This consensus process allows payments and exchanges to take place between users.

In Ripple, users make transactions between each other (secured through cryptography) in either XRP or other assets (this includes real-world assets such as dollars, yen, gold etc.).

Transactions in XRP use the internal ledger. For payments in other assets, however, the Ripple ledger only records the amount that one user owes another, which can be viewed as a debt. Since the transaction is only recorded in the ledger (and Ripple has no real-world enforcement power), payments in assets other than XRP require trust between the users to be completed. Users can specify whom they trust and the amount of debt that can be owed [26].

Figure 6: The price of XRP from February 2013 to January 2015.

3.1.3 Litecoin

Litecoin (also known as LTC) is the third biggest cryptocurrency as of 2014; see figure 7 for the price development of Litecoin. It was released on 7 October 2011 by Charles Lee. However, it was not until 2013 that it picked up momentum and became more recognized as an alternative to Bitcoin. It differs from Bitcoin in three ways:

i) The transactions are supposedly faster since a new block gets processed every 2.5 minutes (compared to Bitcoin’s 10 minutes). However, this means that the number of blocks in the ledger is larger.
ii) Litecoin has a cap of 84 million Litecoins, which is four times as many as Bitcoin’s.
iii) The Litecoin network uses scrypt in the proof-of-work algorithm, which requires asymptotically more memory.

Aside from these differences Bitcoin and Litecoin are very similar, and Litecoin was inspired by Bitcoin. Litecoin uses the same blockchain system as Bitcoin and the mining process is the same [27].

Figure 7: The price of LTC from January 2014 to January 2015.

3.2 Paper specific terms

3.2.1 Google Trends

Google Trends is a tool developed by Google Inc. to see how search behavior changes over time. Every time someone searches for a word or phrase on Google’s search engine (www.google.com), that search is counted into the total volume of searches on that word or phrase that week, which is then normalized. Google then publishes the normalized volume of searches done on a specific word every week. If the search volume is low then monthly data is published instead [30].

3.2.2 Volatility

Volatility measures the size of changes in the value of an asset. A high volatility means that the value can be spread out over a larger range of values, which can change dramatically over a short period of time. A low volatility means that the value does not fluctuate dramatically [21].
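The thesis does not state which formula was used for its volatility covariates. One common choice, shown here purely as an assumption, is the sample standard deviation of the logarithmic returns over a window of prices. The function name and price series are hypothetical.

```python
import math
import statistics


def log_return_volatility(prices):
    """Sample standard deviation of the log returns of a price series."""
    returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    return statistics.stdev(returns)


steady = [100.0 * 1.1 ** t for t in range(5)]  # grows 10 % every period
jumpy = [100.0, 120.0, 90.0, 130.0]            # fluctuates strongly

vol_steady = log_return_volatility(steady)
vol_jumpy = log_return_volatility(jumpy)
print(vol_steady, vol_jumpy)
```

The steadily growing series has a volatility of essentially zero even though its price changes every period, because all of its returns are equal. The fluctuating series has a clearly higher value.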

3.2.3 Other terms

SP500 is an index based on the capitalization of the 500 biggest companies listed on NYSE and NASDAQ (American stock exchanges).

Nikkei225 is a stock market index for the Tokyo Stock Exchange.

4 Model and Experimental Setup

4.1 Experimental Setup

This section explains which covariates were used. The data collection and the modifications made to the data are also discussed.

4.1.1 Cryptocurrencies setup

The data was collected from multiple sources (see Appendix for the exact sources). The following covariates were used in the initial analysis:

Bitcoin covariates

Table 1: Bitcoin covariates

Covariate      Explanation
GoogleBTC      Google trends index for the word "Bitcoin"
VolBTC         The volatility of Bitcoin
NumBTC         The total number of Bitcoin
TransBTC       Total number of Bitcoin transactions made each day
UnicBTC        The number of unique addresses that have traded Bitcoin the given date
DifBTC         The difficulty to mine Bitcoin
ProdBTC        The total number of Bitcoin produced (mined) each day
Weekend        A dummy variable to see if the given date is during the weekend
Dummy365BTC    A dummy variable to see if the value of Bitcoin has risen the last 365 days
Dummy30BTC     A dummy variable to see if the value of Bitcoin has risen the last 30 days
Dummy7BTC      A dummy variable to see if the value of Bitcoin has risen the last 7 days
DummyChina     A dummy variable to see if Bitcoin was banned in China the given date
BrentOil       The Crude Oil Prices: Brent in Europe
JapanUS        Japan / U.S. Foreign Exchange Rate
EUUS           E.U. / U.S. Foreign Exchange Rate
UKUS           U.K. / U.S. Foreign Exchange Rate
Gold           The price of gold in USD
SP500          The Standard and Poor’s 500 index
Nikkei225      The Nikkei 225 index

Ripple covariates

Table 2: Ripple covariates

Covariate      Explanation
GoogleXRP      Google trends index for the word "Ripple cryptocurrency"
VolXRP         The volatility of XRP
ValBTC         The market price of Bitcoin in USD
Dummy30XRP     A dummy variable to see if the value of XRP has risen the last 30 days
Dummy7XRP      A dummy variable to see if the value of XRP has risen the last 7 days
BrentOil       The Crude Oil Prices: Brent in Europe
JapanUS        Japan / U.S. Foreign Exchange Rate
EUUS           E.U. / U.S. Foreign Exchange Rate
UKUS           U.K. / U.S. Foreign Exchange Rate
Gold           The price of gold in USD
SP500          The Standard and Poor’s 500 index
Nikkei225      The Nikkei 225 index

Litecoin covariates

Table 3: Litecoin covariates

Covariate      Explanation
GoogleLTC      Google trends index for the word "Litecoin"
VolLTC         The volatility of LTC
ValBTC         The market price of Bitcoin in USD
Dummy30LTC     A dummy variable to see if the value of LTC has risen the last 30 days
Dummy7LTC      A dummy variable to see if the value of LTC has risen the last 7 days
BrentOil       The Crude Oil Prices: Brent in Europe
JapanUS        Japan / U.S. Foreign Exchange Rate
EUUS           E.U. / U.S. Foreign Exchange Rate
UKUS           U.K. / U.S. Foreign Exchange Rate
Gold           The price of gold in USD
SP500          The Standard and Poor’s 500 index
Nikkei225      The Nikkei 225 index

Data was collected for three different cryptocurrencies: Bitcoin, Ripple and Litecoin. The Bitcoin data was gathered from January 2011 to January 2015, the Ripple data from February 2013 to January 2015 and the Litecoin data from January 2014 to January 2015, with one data point per day.

There were more data points available for Bitcoin, but these were left out. This was because these values were much smaller than the later values. Also, since Bitcoin had not yet become well known, including these data points would have made the regression worse.

The data was stored and modified in Microsoft Excel. Since the raw data was collected from different sources it needed to be modified in order to make it uniform, which was done with VBA (a programming language designed to help modify data in Excel). After the data was modified, the statistical analysis was carried out in R (a programming language specialized for statistics).

The following modifications were made to make the data uniform: for certain stock exchanges there were missing data points, which were replaced by the values from the day before. Furthermore, the stock exchanges were closed on weekends, so the weekend values were filled with that Friday’s value instead.
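Both rules amount to carrying the last known value forward. The thesis did this in Excel with VBA; the following is only an equivalent sketch in Python, with a hypothetical function name and missing values represented by `None`.

```python
def forward_fill(values):
    """Replace each missing value (None) with the latest known value."""
    filled = []
    last = None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Thursday, Friday, Saturday, Sunday, Monday: the market is closed on the
# weekend, so Saturday and Sunday receive Friday's value.
index_values = [2050.0, 2061.5, None, None, 2058.2]
filled_values = forward_fill(index_values)
print(filled_values)
```

A value that is missing at the very start of a series has no earlier value to copy and therefore stays missing in this sketch.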

There were also missing data points for the market price of Ripple and Litecoin. In those cases data from the day before was used.

Some Google trend values for Ripple and Litecoin were 0, which would cause problems when the natural logarithm was applied since log(0) is undefined. In those cases log(0) was replaced by −0.1.
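The replacement rule can be written as a small helper. This is a sketch with a hypothetical function name; the value −0.1 is the one stated above.

```python
import math


def log_or_placeholder(value, placeholder=-0.1):
    """Natural logarithm, with log(0) replaced by a fixed placeholder."""
    if value == 0:
        return placeholder
    return math.log(value)


google_index = [0, 1, 12, 100]
log_index = [log_or_placeholder(v) for v in google_index]
print(log_index)
```

Since log(1) = 0, the placeholder −0.1 puts a zero search index just below the smallest value that an index of 1 or more can produce, so the ordering of the observations is preserved.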

The model reduction was made by looking at the p-values and using AIC. The covariates with the highest p-values were removed. If removing a covariate increased the AIC value, the covariate was reintroduced.
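The decision rule can be sketched as follows. The AIC formula shown is the usual one for a linear model with normally distributed errors, stated up to an additive constant; it is given as an assumption, since the thesis relied on the AIC values reported by R. The function names are hypothetical. The AIC values in the example are the ones reported for the Bitcoin reduction in section 5.1.2.

```python
import math


def aic_gaussian(rss, n, num_params):
    """AIC of a Gaussian linear model, up to an additive constant."""
    return n * math.log(rss / n) + 2 * num_params


def keep_reduction(aic_before, aic_after):
    """A covariate stays removed only if the AIC did not increase."""
    return aic_after <= aic_before


# Removing Weekend and logDifBTC lowered the AIC, so they stay removed.
first_step = keep_reduction(4236.0, 4234.0)
# Removing VolBTC raised the AIC, so VolBTC is reintroduced.
second_step = keep_reduction(4234.0, 4238.4)
print(first_step, second_step)
```

The formula shows the trade-off behind the rule: dropping a covariate lowers the penalty term by 2, but it is only worthwhile if the residual sum of squares does not grow too much in return.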

4.1.2 Stock setup

The data was collected from multiple sources (see Appendix for the exact source). The following covariates were used in the initial analysis:

Stock covariates

Table 4: Stock covariates

Covariate      Explanation
Google         Google trend index of the stock name
Vol            The volatility of the stock
Volume         The trade volume of the specific stock
Dummy30        A dummy variable to see if the value of the stock has risen the last 30 days
Dummy7         A dummy variable to see if the value of the stock has risen the last 7 days
BrentOil       The Crude Oil Prices: Brent in Europe
JapanUS        Japan / U.S. Foreign Exchange Rate
Gold           The price of gold in USD
SP500          The Standard and Poor’s 500 index
Nikkei225      The Nikkei 225 index

Six companies from the SP500 list were chosen for the analysis. These companies do not produce any consumer products and are therefore less known to the public. They were chosen because otherwise the Google trend index would get hits for the product instead of the stock and therefore be less accurate.

Data was gathered from 1 January 2011 to 31 January 2015, which is a similar period to the Bitcoin data. However, the stock data is only available on trading days while the Bitcoin data is available every day. The first 30 trading days are not used in the analysis because they were used to create the dummy variables.
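The construction of such a dummy variable, and the reason why the first observations are lost, can be sketched as follows. The exact definition is not given in the text; the sketch assumes that the dummy is 1 when the current price is higher than the price a given number of observations earlier. The function name is hypothetical.

```python
def rise_dummy(prices, lag):
    """1 if the price is higher than `lag` observations earlier, else 0.

    The first `lag` entries are None because no earlier price exists.
    """
    dummies = []
    for t in range(len(prices)):
        if t < lag:
            dummies.append(None)
        else:
            dummies.append(int(prices[t] > prices[t - lag]))
    return dummies


prices = [1.0, 2.0, 3.0, 2.0, 1.0]
dummy2 = rise_dummy(prices, 2)
print(dummy2)  # [None, None, 1, 0, 0]
```

With a lag of 30, the first 30 observations have no defined value, which matches the remark that the first 30 trading days could not be used in the regression.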

The model reduction was made by looking at the p-values and using AIC. The covariates with the highest p-values were removed. If removing a covariate increased the AIC value, the covariate was reintroduced.

4.2 Modeling

The collected data was used to estimate a linear regression model. The assumption was that the model was given by:

$$Y = X\beta + e \tag{26}$$

Here Y is the market price of the specific cryptocurrency/stock and X is a matrix containing the specific covariates described in the experimental setup.

In certain cases a log-log model will be used instead:

$$\log(Y) = 1 \cdot \beta_0 + \beta_1 X_{1i} + \beta_2\log(X_{2i}) + \varepsilon \tag{27}$$

Here $X_{1i}$ are dummy variables and $\log(X_{2i})$ are the natural logarithms of the rest of the covariates.

5 Cryptocurrency analysis

5.1 Bitcoin Analysis

5.1.1 The linear model

The initial goal is to decrease the complexity (number of covariates) of the model by removing insignificant variables and to obtain the best possible model. The decision whether a variable should stay in the model depends on its p-value and on the AIC. The p-value tests the null hypothesis that the coefficient in front of a covariate can be set to zero. This is reported by the functions lm and LS in R, using F-tests. The significance level α = 0.05 was chosen.

The analysis was initiated by running a regression with the Bitcoin market price as the dependent variable and all the proposed covariates, see results in table 18, Appendix. The model had a good fit with $R^2 = 0.937$, but many of the proposed covariates had high p-values and should therefore be removed from the model.

Before any reductions were made, however, the model was checked for heteroscedasticity. This was done by plotting the residuals from the regression in table 18 against the price of Bitcoin.

Figure 8: The residuals plotted against the value of Bitcoin from the first OLS regression, table 18, Appendix. (Plot not reproduced; horizontal axis: ValBTC, 0 to 1000; vertical axis: residuals, approximately −200 to 200.)

After examining figure 8 one could argue that the model contains heteroscedasticity, since the absolute value of the error term increases when ValBTC increases. This can be verified by using the Breusch-Pagan test. Applying it to the model in table 18 gives a p-value of $p = 2.848 \cdot 10^{-188}$ for the null hypothesis. The null hypothesis of homoscedasticity can thus be rejected.

This shows that this is not a good model formulation. An attempt was made to fix this by using a log-log model instead. The same covariates as in table 18 were used but the model was:

$$\log(ValBTC) = 1 \cdot \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3\log(X_{3i}) + e \tag{28}$$

Here $X_{1i}$ is VolBTC, $X_{2i}$ are dummy variables and $\log(X_{3i})$ are the natural logarithms of the rest of the covariates.

The results of this regression are shown in table 19, Appendix. This model is better since it contains less heteroscedasticity, which is shown in figure 9.

Figure 9: The residuals plotted against the natural logarithm of the Bitcoin price from the first OLS regression of the log-log-model. (Plot not reproduced; horizontal axis: logValBTC, 2 to 7; vertical axis: residuals, approximately −0.6 to 0.6.)

Figure 9 shows that the model is now less heteroscedastic, with the standard deviation of the residuals roughly constant over the interval. This is also supported by a Breusch-Pagan test, which yields a p-value of p = 0.797 for the null hypothesis, which therefore cannot be rejected. Since the log-log model does not show any signs of heteroscedasticity it will be used in the rest of the analysis of BTC.

The model was reduced so that it only contained one exchange rate and one stock index. The VIF values for the covariates were examined and the results can be found in table 19, Appendix.

The reduction was made by removing logEUUS, logUKUS and logNikkei225. SP500 is still in the model since it reflects the world index better than Nikkei225 does [1]. JapanUS was kept since it is one of the largest exchange rates. The VIF values were examined after this reduction and can be viewed in table 20, Appendix.

The Breusch-Pagan test was used on the reduced model, see table 20, Appendix. The p-value was p = 0.628 for the null hypothesis, which means that homoscedasticity still cannot be rejected. However, the p-value is lower than before, which suggests that the model becomes more heteroscedastic when it is reduced. Therefore White’s Robust Estimator will be used in the rest of the analysis of BTC. Also, the model will not get worse by including White’s Robust Estimator.

5.1.2 Reducing the model

The model will be reduced by using the AIC method. As seen from the first regression (table 21, Appendix) of the log-log-model, all covariates except Weekend and logDifBTC have p-values below 0.05. This model has AIC = 4236.0 and $R^2 = 0.98865$.

The dummy variable Weekend and logDifBTC were removed, see the regression in table 22, Appendix. This model has AIC = 4234.0 and $R^2 = 0.98863$, which is a lower AIC-value than the previous model.

Then VolBTC was removed since it had the highest p-value, see the regression in table 23, Appendix. This model has AIC = 4238.4, which is larger than for the previous model. Thus the model in table 22 will be used from here on.

5.1.3 Multicollinearity

VIF-values were calculated to see if the covariates suffer from multicollinearity. This was done by using the function VIF in R.

Table 5: VIF-values for BTC

Covariate      VIF-value
VolBTC         1.8
logProdBTC     2.2
Dummy365BTC    3.4
logGoogleBTC   8.7
logTransBTC    24.3
Dummy30BTC     1.4
logNumbBTC     51.4
logGold        8.6
Dummy7BTC      1.2
logUnicBTC     56.9
logBrentOil    4.4
DummyChina     6.7
logJapanUS     23.0
logSP500       52.1

VIF-values for the covariates in table 22, Appendix.

From table 5 it is clear that many covariates experience high multicollinearity because they have VIF-values over 10. This would be a problem if the model contained fewer data points. However, since the model has many data points, the estimates still have low standard deviations, so this is not an issue.

5.1.4 Endogeneity

An important question is whether there is endogeneity in the model. This can be investigated by plotting the error terms against the different covariates to see if there are any patterns. The covariate that seemed to experience the most endogeneity was logGoogleBTC, see figure 10.

Figure 10: The residuals plotted against the natural logarithm of GOOGLE. (Plot not reproduced; horizontal axis: logGoogleBTC, 1 to 4; vertical axis: residuals, approximately −0.8 to 0.4.)

The Wu-Hausman test was performed on logGoogleBTC to check if it is endogenous. The Wu-Hausman test requires IVs for logGoogleBTC, which need to be found and tested for weakness.

Two instruments were considered: Google trend on the word "SilkRoadMarket" and Google trend on the word "Litecoin". Silk Road market was an illegal internet market for drug trades. Because BTC are hard to track, they were used on the Silk Road market, and thus there could be a good correlation between the variable logGoogleBTC and the variable logGoogleSilk, which measures the search volume for the word Silk road market. Litecoin’s Google trend result is another possible instrument.

The following regression was made to test for weakness:

logGoogleBTC = γ1X1 + γ2Z + ε (29)

Where X1 are the exogenous variables in the model and Z are the IVs. An F-statistic was computed with the null hypothesis γ2 = 0. The F-statistic should be at least 10 for an instrument to be considered non-weak [32].

The results are shown in table 6:

Table 6: F-statistic for IVs

IV              F-statistic
logGoogleLTC    434.8
logGoogleSilk   10.6

The result from the test for weak instruments.

Table 6 reveals that logGoogleSilk is almost a weak IV and logGoogleLTC is a non-weak IV, so logGoogleLTC will be used as the IV for the Wu-Hausman test. The results of the test are shown in table 24, Appendix. The p-value for the coefficient of res.WU is the p-value for the null hypothesis in the Wu-Hausman test. The test indicates that the model is endogenous, but with a small margin (p = 0.042).

A TSLS regression was performed and the results are shown in table 25, Appendix. This regression has $R^2 = 0.9881$, which is similar to the model without TSLS.

5.2 XRP Analysis

5.2.1 The linear model

The regular model was tested for heteroscedasticity with the Breusch-Pagan test, which gave $p = 6.392 \cdot 10^{-76}$. This means that the model suffers from heteroscedasticity, which can be seen in figure 11. Therefore the log-log version was used instead.

Figure 11: The residuals plotted against the XRP price for the regular model. (Plot not reproduced; horizontal axis: ValXRP, 0.00 to 0.06; vertical axis: residuals, approximately −0.01 to 0.03.)

logNikkei225, logUKUS and logJapanUS were removed from the regression. logJapanUS was removed instead of logEUUS because this gave a better fit. The result from the regression is shown in table 26, Appendix.

The log-log model was tested for heteroscedasticity with the Breusch-Pagan test, which yielded a p-value of p = 0.012 for the null hypothesis. Therefore the null hypothesis can be rejected at the chosen α. Figure 12 also displays patterns of heteroscedasticity since the residuals increase with the value of XRP.


[Plot omitted: residuals (y-axis) against logValXRP (x-axis).]

Figure 12: The residuals plotted against the logarithm of the XRP price from the first OLS regression (table 26, Appendix) of the log-log model.

The Breusch-Pagan test showed that the model is heteroscedastic, so White’s Robust Estimator is used in the remaining analysis of XRP. Moreover, if more variables are removed during the reduction, the model might become even more heteroscedastic. The VIF values for the covariates were then examined:

Table 7: Initial VIF-values for XRP

Covariate       VIF-value
logValBTC       11.8
logGoogleXRP    3.4
logBrentOil     8.8
logEUUS         7.6
logGold         2.1
logSP500        12.7
Dummy7XRP       1.1
Dummy30XRP      1.3
VolXRP          1.3

VIF-values for the covariates in table 26, Appendix.

5.2.2 Reducing the model

The model will be reduced by using the AIC method. A regression with White’s Robust Estimator was made to get a benchmark value for AIC; the results are shown in table 27, Appendix. This model has AIC = 3493.6 and R² = 0.6262. In table 27, Appendix Dummy7XRP has a high p-value, so that variable is removed and a new regression is performed. The new model has AIC = 3491.6, which is lower than the model before.

logValBTC was removed next since it had the highest p-value in the new model without Dummy7XRP. This model has AIC = 3490.7, which is even lower.

VolXRP is then removed, and this model has AIC = 3493.0, which is higher than the previous model. Thus the model from the second reduction will be used, see table 28, Appendix. VolXRP is still in the model even though it has a p-value of over 0.05, because the AIC value was lower with this variable included. This model has R² = 0.6256.
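The reduction steps above follow a simple stopping rule, sketched here in Python with the AIC values quoted in the text. The function name `select_by_aic` and the step labels are hypothetical; the regressions themselves are not re-run.

```python
# (model description, AIC) in the order the reductions were tried in the text.
steps = [
    ("full model (table 27)", 3493.6),
    ("without Dummy7XRP", 3491.6),
    ("without Dummy7XRP and logValBTC", 3490.7),
    ("without Dummy7XRP, logValBTC and VolXRP", 3493.0),
]


def select_by_aic(candidates):
    """Walk the reductions in order and stop as soon as AIC stops decreasing."""
    best_name, best_aic = candidates[0]
    for name, aic in candidates[1:]:
        if aic >= best_aic:
            break  # the last removal made the model worse, so it is undone
        best_name, best_aic = name, aic
    return best_name, best_aic


chosen = select_by_aic(steps)
print(chosen)
```

The rule stops at the second reduction, which is why VolXRP stays in the model despite its p-value.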

5.2.3 Multicollinearity

VIF-values were calculated to see if the covariates suffer from multicollinearity.


Table 8: VIF-values for the final XRP model

Covariate       VIF-value
logGoogleXRP    1.2
logBrentOil     8.1
logEUUS         6.8
logGold         2.1
logSP500        3.1
Dummy30XRP      1.2
VolXRP          1.3

VIF-values for the covariates in table 27, Appendix.

From table 8 it is clear that none of the covariates experience high multicollinearity, because they all have VIF-values below 10. Multicollinearity is therefore not a problem in this model.
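The threshold of 10 has a direct reading: the VIF of a covariate equals 1 / (1 − R²), where R² comes from regressing that covariate on all the others, so VIF = 10 corresponds to R² = 0.9. The sketch below applies this standard relation to the values in table 8; the function names are hypothetical and the implied R² values are derived here, not reported in the thesis.

```python
def vif_from_r2(r_squared):
    """VIF of a covariate given R^2 from regressing it on the other covariates."""
    return 1.0 / (1.0 - r_squared)


def r2_from_vif(vif):
    """Invert the relation: the R^2 implied by a reported VIF value."""
    return 1.0 - 1.0 / vif


# VIF-values reported in table 8.
table_8 = {
    "logGoogleXRP": 1.2, "logBrentOil": 8.1, "logEUUS": 6.8, "logGold": 2.1,
    "logSP500": 3.1, "Dummy30XRP": 1.2, "VolXRP": 1.3,
}

for name, vif in table_8.items():
    flag = "high" if vif > 10 else "ok"
    print(f"{name:13s} VIF = {vif:4.1f}  implied R^2 = {r2_from_vif(vif):.2f}  {flag}")
```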

5.2.4 Endogeneity

The error terms (from the model in table 28, Appendix) were plotted against the value of XRP to inspect endogeneity:

[Plot omitted: residuals (y-axis) against logGoogleXRP (x-axis).]

Figure 13: The residuals plotted against the value of GOOGLE.

There might be some patterns in figure 13 that suggest endogeneity. However, since it is hard to find an IV and this is not the main topic of this thesis, the final model is not corrected for endogeneity. It is shown in table 28, Appendix.

5.3 Litecoin Analysis

5.3.1 The linear model

The regular model was tested for heteroscedasticity with the Breusch-Pagan test, which gave p = 9.388 · 10^-9. This means that the model suffers from heteroscedasticity, which can be seen in figure 14. We therefore use the log-log version instead.


[Plot omitted: residuals (y-axis) against ValLTC (x-axis).]

Figure 14: The residuals plotted against the Litecoin price for the regular model.

logNikkei225, logUKUS and logEUUS were removed from the regression, similar to BTC. The results from the regression are shown in table 29, Appendix.

All the covariates are significant in the model. The log-log model was tested for heteroscedasticity with the Breusch-Pagan test, which gave a p-value of p = 0.206 for the null hypothesis. Therefore the null hypothesis cannot be rejected at the chosen α, so the log-log model will be used throughout the analysis of LTC.

The error terms from this regression are plotted against the natural logarithm of the Litecoin price in figure 15. The error terms are evenly spread out for all values. However, White’s Robust Estimator will still be used because the model tends to become more heteroscedastic when reduced.

[Plot omitted: residuals (y-axis) against logValLTC (x-axis).]

Figure 15: The residuals plotted against the natural logarithm of the Litecoin price from the first OLS regression of the log-log model.

5.3.2 Reducing the model

The model will be reduced by using the AIC method. The model in table 30, Appendix has an AIC-value of 812.1 and R² = 0.950. In table 30, Appendix logGold has a high p-value, so that variable is removed. The resulting model has AIC = 816.3, which is larger than before, so no reduction is made and the model in table 30 will be used.


5.3.3 Multicollinearity

VIF-values were calculated to see if the covariates suffer from multicollinearity.

Table 9: VIF-values for LTC

Covariate       VIF-value
logValBTC       4.0
logGoogleLTC    5.7
logBrentOil     153.3
logJapanUS      1805.6
logGold         1423.3
logSP500        5.8
Dummy7LTC       1.6
Dummy30LTC      1.1
VolLTC          1.3

VIF-values for the covariates in table 30, Appendix.

From table 9 it is clear that several covariates experience high multicollinearity, because they have VIF-values above 10. This would be a problem if the model contained fewer data points. However, since the model has many data points, the coefficient estimates get low standard errors anyway, so this is not a problem.

5.3.4 Endogeneity

The error terms (from the model in table 12, Appendix) were plotted against the value of LTC and BTC to inspect endogeneity:

[Plot omitted: residuals (y-axis) against logValBTC (x-axis).]

Figure 16: The residuals plotted against the value of LTC.


[Plot omitted: residuals (y-axis) against logGoogleLTC (x-axis).]

Figure 17: The residuals plotted against the value of GOOGLE.

There are no apparent patterns of endogeneity in these plots (figures 16 and 17). Also, since this is not the main topic, we will not perform a Wu-Hausman test, so the final model will be the one in table 30, Appendix.


6 Stock analysis

A valid question to ask is whether the result from the cryptocurrency section, that the market price depends on the Google trend index, also holds true for other assets such as stocks from the SP500 index. All covariates in the model setup were used.

There were two models that could be chosen, the regular one and the log-log model. If the data sample showed signs of heteroscedasticity then the log-log model was chosen, otherwise the regular model was selected. The Breusch-Pagan test was performed on both models to see which one was less heteroscedastic.

The models were reduced by using AIC. However, no tests were done to look for multicollinearity or endogeneity since the stock analysis is only complementary to the cryptocurrency analysis. Volume was divided by 10^6 in the regular model since the LS-function could not handle large numbers.

6.1 Genworth

A regression was made on Genworth, which is a financial security company. The company provides insurance, wealth management, investment and financial solutions [18].

Both the log-log and regular model were tested for heteroscedasticity with the Breusch-Pagan test. The p-value for the null hypothesis was p = 4.063 · 10^-8 for the log-log model and p = 0.534 for the regular model. Since the regular model had the higher p-value, that model was chosen (see table 31, Appendix).

The best fit for Genworth had R² = 0.859 and η² = 0.052 for Google.

6.2 Clorox

A regression was made on The Clorox Company (Clorox), which is a United States-based manufacturer and marketer of consumer and professional products [15].

Both the log-log and regular model were tested for heteroscedasticity with the Breusch-Pagan test. The p-value for the null hypothesis was p = 0.865 for the log-log model and p = 5.290 · 10^-10 for the regular model. Since the log-log model had the higher p-value, that model was chosen (see table 32, Appendix).

The best fit for Clorox had R² = 0.962 and η² = 0.013 for Google.

6.3 Dentsply

A regression was made on Dentsply International Inc., which is a designer, developer, manufacturer and marketer of consumable dental products for the professional dental market [16].

Both the log-log and regular model were tested for heteroscedasticity with the Breusch-Pagan test. The p-value for the null hypothesis was p = 0.0002 for the log-log model and p = 0.0005 for the regular model. Since the regular model had the higher p-value, that model was chosen (see table 33, Appendix).

The best fit for Dentsply had R² = 0.955 and η² = 0.029 for Google.

6.4 Equifax

A regression was made on Equifax Inc., which is a provider of information solutions for businesses and consumers [17].

Both the log-log and regular model were tested for heteroscedasticity with the Breusch-Pagan test. The p-value for the null hypothesis was p = 4.680 · 10^-18 for the log-log model and p = 0.0001 for the regular model. Since the regular model had the higher p-value, that model was chosen (see table 34, Appendix).


The best fit for Equifax had R² = 0.971 and η² = 0.082 for Google.

6.5 Mylan

A regression was made on Mylan N.V., formerly Mylan Inc., which is a global pharmaceutical company that develops, licenses, manufactures, markets and distributes generic, branded generic and specialty pharmaceuticals [19].

Both the log-log and regular model were tested for heteroscedasticity with the Breusch-Pagan test. The p-value for the null hypothesis was p = 0.005 for the log-log model and p = 2.864 · 10^-13 for the regular model. Since the log-log model had the higher p-value, that model was chosen (see table 35, Appendix).

The best fit for Mylan had R² = 0.966 and η² = 0.038 for Google.

6.6 Stericycle

A regression was made on Stericycle, Inc., which is engaged in the business of providing regulated and compliance solutions to healthcare and commercial businesses. This includes the collection and processing of specialized waste for disposal, and a variety of training, consulting, recall/return, communication, and compliance services [20].

Both the log-log and regular model were tested for heteroscedasticity with the Breusch-Pagan test. The p-value for the null hypothesis was p = 8.030 · 10^-5 for the log-log model and p = 0.964 for the regular model. Since the regular model had the higher p-value, that model was chosen (see table 36, Appendix).

The best fit for Stericycle had R² = 0.955 and η² = 0.015 for Google.
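The selection rule applied to the six stocks can be summarised in a short Python sketch using the Breusch-Pagan p-values quoted in sections 6.1 to 6.6. The function name `choose_model` is hypothetical, and the rule is written as it is applied in the text: the specification with the larger p-value is kept.

```python
# Breusch-Pagan p-values reported in sections 6.1 to 6.6: (log-log, regular).
bp_p_values = {
    "Genworth": (4.063e-8, 0.534),
    "Clorox": (0.865, 5.290e-10),
    "Dentsply": (0.0002, 0.0005),
    "Equifax": (4.680e-18, 0.0001),
    "Mylan": (0.005, 2.864e-13),
    "Stericycle": (8.030e-5, 0.964),
}


def choose_model(p_loglog, p_regular):
    """Keep the specification with the larger p-value (less evidence of heteroscedasticity)."""
    return "log-log" if p_loglog > p_regular else "regular"


choices = {stock: choose_model(*p) for stock, p in bp_p_values.items()}
for stock, model in choices.items():
    print(f"{stock:10s} -> {model}")
```

For Dentsply, Equifax and Mylan both p-values are below 0.05, so under a 5% level (an assumed threshold) neither specification passes the test and the rule only picks the less heteroscedastic of the two.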


7 Discussion and Conclusions

7.1 Discussion

The table below shows references to the best models for each cryptocurrency and stock:

Table 10: Best model for each cryptocurrency and stock

Investigated    Best model
Bitcoin         Table 22, Appendix
Ripple          Table 28, Appendix
Litecoin        Table 30, Appendix
Genworth        Table 31, Appendix
Clorox          Table 32, Appendix
Dentsply        Table 33, Appendix
Equifax         Table 34, Appendix
Mylan           Table 35, Appendix
Stericycle      Table 36, Appendix

The best model for each cryptocurrency and stock

Both the best Bitcoin model and the best Litecoin model had high R²-values, unlike Ripple, which did not show as good a result. Litecoin and Bitcoin have a similar structure, unlike Ripple, which could explain the difference.

7.1.1 Heteroscedasticity and reducing the model

The regular linear model was replaced by a log-log model because the regular one suffered from heteroscedasticity. When the log-log model was used this phenomenon disappeared. In the regular model the differences in magnitude between the largest and smallest values were significant. That led to an uneven distribution in the error terms, which caused heteroscedasticity, as can be seen in figure 8. When the log-log model was used all values became the same order of magnitude, which reduced the heteroscedasticity.
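Besides stabilising the variance, the log-log form changes how a coefficient is read: it acts as an elasticity, so a relative change in a covariate maps to a relative change in the price. The sketch below is only an illustration of that reading. It uses the logGoogleBTC estimate from table 20 (which is not the final model), and the function name `price_multiplier` is hypothetical.

```python
def price_multiplier(beta, covariate_ratio):
    """Multiplicative price change implied by a log-log coefficient beta."""
    # log(price) = ... + beta * log(covariate)  =>  price scales as covariate**beta
    return covariate_ratio ** beta


beta_google = 0.643  # logGoogleBTC estimate in table 20 (not the final model)
multiplier = price_multiplier(beta_google, 1.10)  # Google index up by 10%
print(f"Implied price change: {100 * (multiplier - 1):.1f}%")
```

This holds all other covariates fixed and describes an association in the fitted model, not a causal effect.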

Only one stock index and one exchange rate were kept because they describe the same phenomenon. The AIC-method and the F-test said that all stock indices and exchange rates should be kept in the model of Bitcoin. However, the theory that was used could not explain why the variables had the signs they did (see table 19, Appendix). Also, all the major exchange rates are correlated: if the United States economy grows then the American dollar will strengthen against the Euro, Pound and Yen. By removing two exchange rates and one stock index the VIF-values of the model got better, although this was not the main reason why the reductions were made. By including these covariates the model would have been adapted to the values that were desired, which is a bad approach. The covariate SP500 was used because SP500 is the largest national stock index in the world. JapanUS was chosen because it covers the four biggest exchange markets [9]. Bitcoin was also invented in Japan, which makes it a good covariate to include in the model.

White’s Robust Estimator was included in the final model even though neither Bitcoin nor Litecoin showed signs of heteroscedasticity. This was done because the model became more heteroscedastic if reductions were made, and White’s Robust Estimator was added in order not to deteriorate the model.
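What White's estimator changes can be seen in a minimal sketch for a single regressor. This is an assumption-based illustration, not the thesis's R implementation: it uses the HC0 form of the estimator and synthetic data whose error spread grows with the covariate, and the function name is hypothetical. The coefficient estimate is the ordinary least-squares one in both cases; only the standard error differs.

```python
import math


def slope_standard_errors(x, y):
    """Return (classical SE, White HC0 SE) for the slope of y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dev = [xi - mean_x for xi in x]
    sxx = sum(d ** 2 for d in dev)
    b = sum(d * (yi - mean_y) for d, yi in zip(dev, y)) / sxx
    a = mean_y - b * mean_x
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    # Classical estimator: one common error variance for all observations.
    classical_var = sum(r ** 2 for r in resid) / (n - 2) / sxx
    # White (HC0): each observation contributes its own squared residual.
    robust_var = sum((d * r) ** 2 for d, r in zip(dev, resid)) / sxx ** 2
    return math.sqrt(classical_var), math.sqrt(robust_var)


# Synthetic heteroscedastic data (assumed for illustration only).
x = list(range(1, 41))
y = [xi + 0.1 * xi * (1 if i % 2 == 0 else -1) for i, xi in enumerate(x)]

classical_se, robust_se = slope_standard_errors(x, y)
print(f"classical SE: {classical_se:.4f}")
print(f"White HC0 SE: {robust_se:.4f}")
```

When the errors are homoscedastic the two standard errors are close, so using the robust version costs little, which is consistent with keeping it as a safeguard.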


The covariates that had the highest p-values were removed, since a high p-value indicates that the covariate is not significant and therefore does not contribute to the model. If the AIC value increased then the variable was reintroduced. The model with the lowest AIC-value was chosen, which meant, for example, that VolXRP was kept in the XRP model even though it had a p-value of over 0.05. AIC was used instead of BIC since AIC does not reduce the model as strongly as BIC [23].
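The difference between the two criteria lies in the penalty per parameter: 2 for AIC and ln(n) for BIC, so BIC is stricter whenever n exceeds about 7 observations. The sketch below uses the standard definitions with hypothetical log-likelihoods and a hypothetical sample size; none of the numbers come from the thesis.

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion for a model with k estimated parameters."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_likelihood


# Hypothetical numbers: dropping one covariate lowers the log-likelihood by 1.5.
n = 1000
full = {"log_likelihood": -400.0, "k": 10}
reduced = {"log_likelihood": -401.5, "k": 9}

aic_prefers_reduced = aic(**reduced) < aic(**full)
bic_prefers_reduced = bic(n=n, **reduced) < bic(n=n, **full)
print("AIC keeps the covariate:", not aic_prefers_reduced)
print("BIC keeps the covariate:", not bic_prefers_reduced)
```

In this example AIC keeps the covariate while BIC drops it, which is the behaviour the paragraph above refers to.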

7.1.2 Multicollinearity

When multicollinearity was investigated the VIF-values for the covariates were examined. VIF measures how much the variance of a coefficient estimate is inflated by correlation with the other covariates. A high VIF means a high standard error, which causes an increase in the p-value. This increase can cause significant covariates to be removed from the model. However, when the VIF-values were examined (see table 20, Appendix) it was clear that covariates with high VIF-values are still in the model, so this is not a problem.

If the Bitcoin specific covariates are examined, for example, they all have large VIF-values. These are kept, however, because they give a lower AIC-value and their p-values are low. Since a lot of data points were available, the problems that occur with multicollinearity are reduced.

7.1.3 Endogeneity

A Wu-Hausman test was performed on the covariate GoogleBTC to look for endogeneity. The result was that GoogleBTC was endogenous but with a small margin (the p-value for the Wu-Hausman test was p = 0.042). However, this test gives a false picture of the real situation, since a lot of data points were used, which increased the relevance of the IV and therefore gave a positive outcome of the test. In the plot that investigated endogeneity no apparent linear connection could be found, which makes it harder to decide whether there is endogeneity in the model or not.

A TSLS regression was performed, which gave completely different results than the non-TSLS regression. For example, the η² for GoogleBTC differed by 0.35, which makes that covariate go from high impact (non-TSLS) to medium impact (TSLS) according to Cohen’s rule. The TSLS regression was therefore not used in the final regression: throughout the analysis there has been a strong correlation between Google trend index and the market price of cryptocurrencies, and it should not change drastically because a TSLS regression was made. One reason this might occur is that the IV was not strong enough.

If the result of the Wu-Hausman test had been followed then the TSLS regression would have been used instead of the regular one, but the weak result from the test in combination with the arguments above led to the conclusion that GoogleBTC is not endogenous. One could argue that GoogleBTC suffers from simultaneity with the market price of Bitcoin, however it could be the other way around as well.

From the endogeneity results for Bitcoin there was a decision not to do the sameanalysis for the other currencies. Bitcoin is the main topic of this thesis and the othercurrencies are complementary.

If the market price is plotted versus Google trend index then it is apparent thatthey are strongly correlated. This correlation with Google Trend index is the main topicof this thesis and is discussed further down in the Google section.

7.1.4 Missing covariates

During the examined time period Litecoin had a strong downwards trend, which makes it almost possible to fit it linearly. It would have been interesting to look at Litecoin over a longer period of time (one that includes both upward and downward trends) and analyse it, however no such data was found.

Litecoin received a large R²-value, probably due to the fact that it had a downwards trend during the time period, which made the regression good. XRP had a worse R²-value, which can be caused by fewer covariates and by both upward and downward trends.

It is today possible to use Bitcoin and other cryptocurrencies to purchase goods in stores, both online and physical. It would have been interesting to see how the number of stores that accept Bitcoin affects the market price of Bitcoin, however no such data could be found.

7.1.5 Google

The main result of this thesis is that the market price of Bitcoin and the other cryptocurrencies depends heavily on Google trend index, which therefore can be viewed as a measurement of demand. The results can be seen in table 11.

Table 11: η²-values for Google trend index

Regression    Sign on β for Google    η²
BTC           Positive                0.593
XRP           Positive                0.182
LiteCoin      Positive                0.312
Genworth      Positive                0.052
Clorox        Positive                0.0128
Dentsply      Positive                0.029
Equifax       Negative                0.082
Mylan         Negative                0.038
Stericycle    Negative                0.015

The η²-values for the variable Google trend index in the different models.

This result is stronger for the cryptocurrencies than it is for the stocks that were examined. XRP had a lower dependence on Google than LTC and BTC had. This can be caused by the fact that the search term for XRP was "Ripple cryptocurrency", since Ripple has another meaning in the English language than the desired one. Searching for "Ripple" alone would have caused problems since unwanted search volume would have been added.

All cryptocurrencies had a positive sign on Google, while the stocks had 3 positive signs and 3 negative signs. This can explain why cryptocurrencies had higher η²-values than the stocks. For cryptocurrencies all publicity is good because the common knowledge about the currencies is low, which makes Google searches a positive thing. For the stocks and their underlying companies not all publicity is positive. A bad quarterly report can lead to a lot of bad publicity, which decreases the stock price. The negative publicity can also increase the number of Google searches, which could explain the negative signs.

For Bitcoin there are major peaks in the search volume as the currency increases rapidly (see figure 19). This raises the question of whether the Google trend index could be used for investment purposes. The data is collected over one week for all currencies except for XRP, for which it was collected over one month. If there was access to real time data on the search volume for a specific word then this could be used for investments in both cryptocurrencies and stocks. This is an interesting topic that should be explored in the future.

The positive correlation between Google and the market price of cryptocurrencies raises the question of whether a cryptocurrency can gain in value through a marketing campaign, as regular companies do. It would also be interesting to see if this correlation remains in the future when the cryptocurrencies are more mature and well known.

7.1.6 Currency specific covariates

The XRP and LTC models do not show as good results as the Bitcoin model does. This can be caused by many different things, for example that the BTC regression had more data points than XRP and LTC. Fewer data points give higher p-values in the F-test, which leads to more reduction in the model. However, this is not the biggest difference. There were no available data for currency specific covariates like ProdBTC for XRP and LTC, which made the models less complex. The currency specific covariates are shown in table 12.

Table 12: η²-values for the currency specific variables for BTC

Regression    Sign on β for currency specific variables    η²
NumBTC        Positive                                     0.024
UnicBTC       Positive                                     0.029
ProdBTC       Negative                                     0.151
TransBTC      Positive                                     0.012

The η²-values for the currency specific variables for BTC which were included in the final model (DifBTC is not included because it was removed from the model).

The sign of NumBTC is surprising because more units of BTC should decrease the value. This covariate increases constantly since the number of Bitcoin will increase until it reaches the cap. The positive sign of this covariate can thus be explained by the fact that BTC value as a whole has increased during the examined time period but NumBTC has not changed a lot. The covariate (not log) has increased from 8.0 · 10^6 to 13.8 · 10^6 during the examined period.

ProdBTC has a medium impact according to Cohen’s rule and this is the BTC specific covariate that has the biggest impact on the model. The sign is expected because if more BTC are on the market then the value of Bitcoin will decrease.

UnicBTC and TransBTC have the expected signs, because many transactions and transactions from many different addresses are a sign of high liquidity, which is good for currencies or investment assets. This can also be seen as a measurement of demand (in combination with Google). The η²-values for these variables are low. UnicBTC has a small impact on the model and TransBTC less than small impact according to Cohen’s rule. A possible explanation for this is the high multicollinearity between UnicBTC, TransBTC and ProdBTC.

DifBTC was removed from the model by the AIC criterion. The cause of this could be the high correlation between DifBTC and the other Bitcoin specific covariates. DifBTC also experiences an exponential growth which is absent for the Bitcoin market price.

The Bitcoin specific covariates that were included in the model can be examined in figure 18, Appendix. The figure reveals that UnicBTC and TransBTC have high collinearity.

The currency specific values had a lower impact than expected. Only one of five variables had an η²-value of over 0.13, which is the lower limit for medium impact in Cohen’s rule.
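The labels used in this discussion can be reproduced with a small classifier. Only the medium limit of 0.13 is stated in the text. The small and large limits of 0.02 and 0.26 are assumptions based on the conventional benchmarks for η², although they are consistent with every label given in the thesis. The function name `cohen_impact` is hypothetical.

```python
def cohen_impact(eta_squared, small=0.02, medium=0.13, large=0.26):
    """Label an eta-squared value; the small and large cut-offs are assumed."""
    if eta_squared >= large:
        return "high"
    if eta_squared >= medium:
        return "medium"
    if eta_squared >= small:
        return "small"
    return "less than small"


# eta-squared values for the Bitcoin specific covariates in table 12.
table_12 = {"NumBTC": 0.024, "UnicBTC": 0.029, "ProdBTC": 0.151, "TransBTC": 0.012}

labels = {name: cohen_impact(value) for name, value in table_12.items()}
print(labels)
```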


7.1.7 Volatility

Table 13: η²-values for volatility

Regression    Sign on β for volatility    η²
BTC           Negative                    0.006
XRP           Negative                    0.006
LTC           Negative                    0.069

The η²-values for the variable volatility in the different models.

Table 13 reveals that the sign for volatility is negative. This implies that a high volatility is negative for the price of the currency. When the volatility is high the value of the currency is uncertain and this is unfavorable for a currency. Therefore the negative sign on the covariate volatility was expected. This is also the case for the stocks that were examined. The impact of the volatility was low, especially for the stocks, where no stock had an η²-value of over 0.02. The impact was considerably stronger for the cryptocurrencies, but only for Litecoin did volatility reach a level that counts as an impact according to Cohen’s rule. It was surprising that volatility did not have a larger impact than it had. This could be explained by the fact that the volatility gets a high value in both rises and falls, so the effect averages out. Litecoin had a higher volatility dependency than the other cryptocurrencies. This could be explained by the fact that Litecoin experienced a downwards trend (see figure 7), and therefore the contributions from the volatility do not cancel each other as they do for the other cryptocurrencies.

7.1.8 Index covariates

There were no expectations on the signs for the index covariates that were included in the model. It is interesting to observe the differences between the cryptocurrencies. However, there also exist similarities. For example, Gold and SP500 have negative signs for all currencies. It was clear that most of the examined stocks had a high positive correlation with SP500. This is expected, and it is common knowledge that most stocks in the SP500 index are highly correlated with the stock index.

Table 14: η²-values for the index covariates for BTC

Regression    Sign on β for index covariates (BTC)    η²
Gold          Negative                                0.234
SP500         Negative                                0.067
JapanUS       Negative                                0.151
BrentOil      Positive                                0.099

The η²-values for the index covariates for BTC which are included in the final model.


Table 15: η²-values for the index covariates for LTC

Regression    Sign on β for index covariates (LTC)    η²
Gold          Negative                                0.017
SP500         Negative                                0.115
JapanUS       Negative                                0.041
BrentOil      Positive                                0.093

The η²-values for the index covariates for LTC which are included in the final model.

Table 16: η²-values for the index covariates for XRP

Regression    Sign on β for index covariates (XRP)    η²
Gold          Negative                                0.394
SP500         Negative                                0.264
EUUS          Positive                                0.171
BrentOil      Negative                                0.232

The η²-values for the index covariates for XRP which are included in the final model.

7.1.9 Dummy7, Dummy30 and Dummy365

Table 17: η²-values for the dummy variables

Regression      Sign on β for dummy variables    η²
Dummy7BTC       Positive                         0.016
Dummy30BTC      Positive                         0.007
Dummy365BTC     Positive                         0.019
Dummy30XRP      Positive                         0.014
Dummy7LTC       Positive                         0.037
Dummy30LTC      Positive                         0.028

The η²-values for the dummy variables in the different cryptocurrency models.

Table 17 shows that every dummy variable has a positive sign, which is expected because it reflects whether the market value has increased. All dummy variables have low η²-values, which means that they have low impact, which is somewhat surprising. The reason Dummy365 was not included for XRP and LTC is that these cryptocurrencies have fewer data points, so by removing 365 data points the model would have been worse.

7.1.10 BTC price as a covariate

In the XRP and LTC models the BTC market price was included as a covariate. In the XRP model the BTC price had no impact and was removed. For LTC the BTC price got η² = 0.445, which is a big impact according to Cohen’s rule. The difference between the currencies is big and can be explained by the different properties that the currencies have.


7.1.11 Other covariates

The covariate China was included in the BTC model and received η² = 0.057. The sign of the covariate China was not expected, since the market price should have decreased when Bitcoin got banned in China. However, the ban might have increased the exposure of Bitcoin, which led to an increase of the market price. If the Bitcoin data is divided into two parts, one before and one after the ban, then the mean value of the market price is higher in the part where Bitcoin is banned, because the price of Bitcoin has increased since the ban. This can explain the unexpected sign.

7.2 Conclusions

The result from the analysis was that the market price of cryptocurrencies was impacted strongly by Google trend index according to Cohen’s rule: Bitcoin - high impact, XRP - medium impact, Litecoin - high impact. Because of this, Google trend index can be viewed as a measurement of demand according to economic theory. Cryptocurrencies do not have an underlying value in the way stocks have, which means that cryptocurrencies are more demand sensitive, so the market value depends more on how well known the currency is.

The price of cryptocurrencies did not depend heavily on the volatility because the η² was small: according to Cohen’s rule there was no impact, except for Litecoin, which had a small impact.

The results for Bitcoin and Litecoin are fairly similar, however Ripple differs substantially. This could be explained by the fact that Bitcoin and Litecoin have the same structure, which Ripple does not share.


8 Appendix

8.1 Bitcoin Appendix

Table 18: The initial regression for BTC

Covariate      Estimate     Std.Error   t-value   p-value
Intercept      7.0e+02      1.3e+02     5.5       4e-08
GoogleBTC      6.07e+00     2.0e-01     30.2      < 2e-16
VolBTC         -3.84e+01    3.7e+00     -10.4     < 2e-16
NumBTC         -8.3e-06     6.8e-06     -1.2      0.221
TransBTC       -2.9e-04     3.1e-04     -0.9      0.346
UnicBTC        7.17e-05     1.6e-04     0.4       0.656
DifBTC         -4.43e-09    4.4e-10     -10.0     < 2e-16
ProdBTC        -1.08e-06    9.5e-07     -1.1      0.255
Dummy30BTC     2.91e+01     4.8e+00     6.1       1e-09
Dummy7BTC      1.96e+01     4.3e+00     4.5       7e-06
Dummy365BTC    1.21e+01     8.1e+00     1.5       0.135
DummyChina     -3.43e+02    1.2e+01     -29.3     < 2e-16
BrentOil       -3.0e-01     3.5e-01     -0.9      0.384
JapanUS        -1.43e+01    1.4e+00     -9.9      < 2e-16
EUUS           -2.11e-01    5.6e-02     -3.8      0.0001
UKUS           -7.1e-02     6.2e-02     -1.1      0.251
Gold           -6.16e-02    7.6e-03     -8.1      1e-15
SP500          3.91e-01     7.0e-02     5.6       2e-08
Nikkei225      4.76e-02     6.2e-03     7.7       3e-14
Weekend        -2.5e+02     1.6e+02     -1.5      0.133

Model: regular. R²-value: 0.937. Currency: Bitcoin. General information: The initial regression.


Table 19: First log-log regression for BTC

Covariate      Estimate    Std.Error  t-value  p-value   VIF-value
Intercept      -3.20e+01   4.1e+00    -7.8     1e-14     -
VolBTC         -3.0e-02    1.1e-02    -2.6     0.009     1.9
logGoogleBTC   5.56e-01    1.7e-02    32.5     < 2e-16   11.3
logNumBTC      2.17e+00    3.0e-01    7.3      4e-13     63.3
logUnicBTC     3.43e-01    5.0e-02    6.9      1e-11     65.2
logDifBTC      3.8e-02     1.3e-02    3.0      0.003     83.3
logProdBTC     -1.36e-01   1.1e-02    -11.9    < 2e-16   2.4
logBrentOil    3.64e-01    9.5e-02    3.8      0.0001    8.5
logJapanUS     -6.29e+00   4.4e-01    -14.2    < 2e-16   107.8
logTransBTC    1.48e-01    4.1e-02    3.6      0.0003    29.7
logEUUS        3.60e+00    4.3e-01    8.4      < 2e-16   8.7
logUKUS        -4.33e+00   6.0e-01    -7.2     1e-12     13.1
logGold        -1.77e+00   1.6e-01    -11.5    < 2e-16   13.9
logSP500       1.8e+00     3.8e-01    4.8      2e-06     105.6
logNikkei225   2.47e+00    2.6e-01    9.4      < 2e-16   132.5
Dummy365BTC    7.4e-02     2.9e-02    2.5      0.011     3.7
Dummy30BTC     4.7e-02     1.3e-02    3.6      0.0003    1.4
Dummy7BTC      5.2e-02     1.2e-02    4.3      2e-05     1.2
DummyChina     -4.53e-01   4.0e-02    -11.2    < 2e-16   12.6
Weekend        -6.1e-04    1.3e-02    -0.05    0.962     1.1

Model: log-log. R²-value: 0.990. Currency: Bitcoin. General information: First log-log regression.
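The VIF column of Table 19 can be read through the standard identity VIF = 1 / (1 - R²), where R² comes from regressing one covariate on all the others. The Python sketch below inverts that identity for the tabulated values. It is an illustration only: the limit of 10 is a common rule of thumb assumed here, not a threshold taken from this appendix, and the names `implied_r2` and `ASSUMED_VIF_LIMIT` are hypothetical.

```python
def implied_r2(vif):
    """R-squared of a covariate regressed on the others, given its VIF."""
    return 1.0 - 1.0 / vif


# VIF values copied from Table 19 (the intercept has no VIF).
vif_table_19 = {
    "VolBTC": 1.9, "logGoogleBTC": 11.3, "logNumBTC": 63.3,
    "logUnicBTC": 65.2, "logDifBTC": 83.3, "logProdBTC": 2.4,
    "logBrentOil": 8.5, "logJapanUS": 107.8, "logTransBTC": 29.7,
    "logEUUS": 8.7, "logUKUS": 13.1, "logGold": 13.9,
    "logSP500": 105.6, "logNikkei225": 132.5, "Dummy365BTC": 3.7,
    "Dummy30BTC": 1.4, "Dummy7BTC": 1.2, "DummyChina": 12.6,
    "Weekend": 1.1,
}

# Assumption: 10 is a rule-of-thumb limit, not a value stated in the thesis.
ASSUMED_VIF_LIMIT = 10.0

# Covariates above the assumed limit, largest VIF first.
flagged = sorted(
    (name for name, vif in vif_table_19.items() if vif > ASSUMED_VIF_LIMIT),
    key=lambda name: vif_table_19[name],
    reverse=True,
)

for name in flagged:
    print(f"{name}: VIF={vif_table_19[name]}, implied R2={implied_r2(vif_table_19[name]):.4f}")
```

A VIF above 100 corresponds to an implied R² above 0.99, meaning the covariate is almost a linear combination of the remaining covariates.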


Table 20: Regression with two currency exchange rates and one stock index removed for BTC

Covariate      Estimate    Std.Error  t-value  p-value   VIF-value
Intercept      -29.4       4.2        -7.0     4e-12     -
VolBTC         -0.029      0.012      -2.5     0.013     1.8
logGoogleBTC   0.643       0.016      39.3     < 2e-16   9.0
logNumBTC      1.62        0.30       5.3      1e-07     58.4
logUnicBTC     0.29        0.053      5.5      4e-08     64.1
logJapanUS     -2.04       0.24       -8.4     < 2e-16   28.6
logDifBTC      -0.0096     0.013      -0.8     0.443     69.1
logProdBTC     -0.168      0.012      -14.0    < 2e-16   2.2
logBrentOil    0.7907      0.076      10.5     < 2e-16   4.7
logTransBTC    0.1305      0.043      3.0      0.003     28.6
logGold        -2.42       0.14       -17.2    < 2e-16   10.1
logSP500       3.66        0.34       10.7     < 2e-16   74.2
Dummy365BTC    0.135       0.030      4.4      1e-05     3.5
Dummy30BTC     0.0411      0.014      2.9      0.003     1.4
Dummy7BTC      0.05614     0.013      4.3      1e-05     1.2
DummyChina     -0.2751     0.039      -7.1     2e-12     10.2
Weekend        -0.01696    0.014      -1.2     0.213     1.1

Model: log-log. R²-value: 0.9887. Currency: Bitcoin. Breusch-Pagan test: 0.628. General information: The regression with two currency exchange rates and the Nikkei index removed.


Table 21: The first LS-regression for BTC

Covariate      Estimate    Std.Error  Eta.sq   p-value
Intercept      -29.4       4.6        0.042    0.000
VolBTC         -0.0294     0.015      0.006    0.047
logGoogleBTC   0.6428      0.019      0.584    0.000
logNumBTC      1.612       0.31       0.025    0.000
logUnicBTC     0.2927      0.055      0.027    0.000
logJapanUS     -2.041      0.22       0.060    0.000
logDifBTC      -0.0096     0.012      0.0005   0.438
logProdBTC     -0.1676     0.014      0.151    0.000
logTransBTC    0.1305      0.042      0.008    0.002
logGold        -2.416      0.15       0.211    0.000
logBrentOil    0.7907      0.070      0.090    0.000
logSP500       3.659       0.33       0.093    0.000
Dummy365BTC    0.1350      0.037      0.017    0.0002
Dummy30BTC     0.0411      0.015      0.008    0.005
Dummy7BTC      0.0561      0.014      0.017    0.000
DummyChina     -0.2751     0.039      0.043    0.000
Weekend        -0.01698    0.014      0.001    0.222

Model: log-log. R²-value: 0.98865. Currency: Bitcoin. AIC-value: 4236.0. General information: The first LS-regression.

Table 22: Regression with the first reduction for BTC

Covariate      Estimate    Std.Error  Eta.sq   p-value
Intercept      -27.3       4.4        0.044    0.000
VolBTC         -0.0296     0.015      0.006    0.042
logGoogle      0.6432      0.019      0.593    0.000
logNumBTC      1.481       0.29       0.024    0.000
logUnicBTC     0.2843      0.051      0.029    0.000
logJapanUS     -1.942      0.21       0.067    0.000
logProdBTC     -0.1645     0.014      0.151    0.000
logTransBTC    0.1477      0.039      0.012    0.0001
logGold        -2.3832     0.14       0.234    0.000
logBrentOil    0.8046      0.070      0.099    0.000
logSP500       3.529       0.28       0.120    0.000
Dummy365BTC    0.1414      0.036      0.019    0.0001
Dummy30BTC     0.04015     0.015      0.007    0.007
Dummy7BTC      0.0543      0.014      0.016    0.0001
DummyChina     -0.2583     0.031      0.057    0.000

Model: log-log. R²-value: 0.98863. Currency: Bitcoin. AIC-value: 4234.0. General information: The results after the first reduction. This is the best model for Bitcoin.
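In a log-log model the coefficient of a logged covariate is an elasticity, so the estimates in Table 22 can be translated into percentage effects on the price. The Python sketch below is a reading aid under stated assumptions: the function names are hypothetical, the covariate changes are example values, and the dummy-variable conversion assumes that the price was transformed with the natural logarithm, which this appendix does not state.

```python
import math


def pct_change_loglog(beta, covariate_change):
    """Relative price change when a logged covariate grows by covariate_change.

    Example: covariate_change=0.10 means the covariate rises by 10 %.
    """
    return (1.0 + covariate_change) ** beta - 1.0


def pct_change_dummy(beta):
    """Relative price change when a 0/1 dummy switches on.

    Assumption: the dependent variable is the natural log of the price.
    """
    return math.exp(beta) - 1.0


# Estimates copied from Table 22.
beta_google = 0.6432
beta_china = -0.2583

google_effect = pct_change_loglog(beta_google, 0.10)
china_effect = pct_change_dummy(beta_china)

print(f"10 % more Google searches: {google_effect:+.1%} price change")
print(f"DummyChina switched on:    {china_effect:+.1%} price change")
```

For small changes the elasticity can be read directly: a 1 % increase in the Google trend index corresponds to roughly a 0.64 % increase in the modeled Bitcoin price, holding the other covariates fixed.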


Table 23: Regression with the second reduction for BTC

Covariate      Estimate    Std.Error  Eta.sq   p-value
Intercept      -29.0       4.3        0.051    0.000
logGoogleBTC   0.6263      0.018      0.625    0.000
logNumBTC      1.575       0.29       0.027    0.000
logUnicBTC     0.2698      0.053      0.026    0.000
logJapanUS     -2.007      0.21       0.071    0.000
logProdBTC     -0.1680     0.014      0.158    0.000
logTransBTC    0.1543      0.039      0.013    0.0001
logGold        -2.403      0.14       0.237    0.000
logBrentOil    0.8413      0.069      0.111    0.000
logSP500       3.618       0.29       0.126    0.000
Dummy365BTC    0.1338      0.036      0.017    0.0002
Dummy30BTC     0.04043     0.0148     0.007    0.006
Dummy7BTC      0.0553      0.014      0.016    0.0001
DummyChina     -0.2651     0.032      0.060    0.000

Model: log-log. R²-value: 0.98857. Currency: Bitcoin. AIC-value: 4238.4. General information: The results after the second reduction.
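The choice of the first reduction as the best Bitcoin model follows from comparing the AIC-values reported under Tables 21, 22 and 23, where a lower value is preferred. A minimal Python sketch of that comparison, using only the reported numbers (the variable names are illustrative):

```python
# AIC-values copied from the notes under Tables 21, 22 and 23.
aic_by_model = {
    "Table 21": 4236.0,  # first LS-regression
    "Table 22": 4234.0,  # first reduction
    "Table 23": 4238.4,  # second reduction
}

# The preferred model is the one with the lowest AIC.
best_model = min(aic_by_model, key=aic_by_model.get)

# Difference to the best model, rounded to the precision of the reported values.
delta_aic = {
    name: round(aic - aic_by_model[best_model], 1)
    for name, aic in aic_by_model.items()
}

print(best_model)
print(delta_aic)
```

The second reduction raises the AIC again, which is consistent with stopping the reduction after the first step.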

Table 24: Wu-Hausman test for BTC

Covariate      Estimate    Std.Error  Eta.sq   p-value
Intercept      -27.3       4.4        0.044    0.000
WU.res         -0.0296     0.015      0.007    0.042
VolBTC         0.6432      0.019      0.593    0.000
logGoogleBTC   1.481       0.29       0.024    0.000
logNumBTC      0.2843      0.051      0.029    0.000
logUnicBTC     -1.942      0.21       0.067    0.000
logJapanUS     0.8046      0.069      0.099    0.000
logBrentOil    -0.1645     0.014      0.151    0.000
logProdBTC     0.14769     0.039      0.012    0.0001
logTransBTC    -2.383      0.14       0.234    0.000
logGold        3.529       0.28       0.112    0.000
logSP500       0.1414      0.036      0.019    0.0001
Dummy365BTC    0.0402      0.015      0.007    0.007
Dummy30BTC     0.0543      0.014      0.016    0.0001
Dummy7BTC      -0.2583     0.031      0.057    0.000

Model: log-log. Currency: Bitcoin. General information: The result from the Wu-Hausman test.


Table 25: TSLS regression for BTC

Covariate      Estimate    Std.Error  Eta.sq   p-value
Intercept      -23.7       4.5        0.030    0.000
logGoogleBTC   0.5278      0.044      0.156    0.000
VolBTC         0.0059      0.022      0.0001   0.786
logNumBTC      1.188       0.30       0.014    0.0001
logUnicBTC     0.4326      0.074      0.037    0.000
logJapanUS     -1.403      0.27       0.023    0.000
logProdBTC     -0.1616     0.014      0.140    0.000
logTransBTC    0.1227      0.042      0.008    0.004
logGold        -2.561      0.15       0.227    0.000
logBrentOil    0.9526      0.090      0.100    0.000
logSP500       3.273       0.30       0.095    0.000
Dummy365BTC    0.1912      0.039      0.028    0.000
Dummy30BTC     0.0520      0.015      0.011    0.0007
Dummy7BTC      0.0531      0.014      0.0146   0.0001
DummyChina     -0.2428     0.035      0.048    0.000

Model: log-log. R²-value: 0.9881. Currency: Bitcoin. General information: The result from the TSLS regression.

8.2 XRP Appendix

Table 26: The initial log-log regression for XRP

Covariate      Estimate    Std.Error  t-value  p-value
Intercept      114.7       7.4        15.5     < 2e-16
logValBTC      0.0802      0.079      1.0      0.308
logGoogleXRP   0.3530      0.053      6.6      6e-11
logBrentOil    -3.678      0.26       -14.0    < 2e-16
logEUUS        12.774      1.17       11.0     < 2e-16
logGold        -8.162      0.39       -21.0    < 2e-16
logSP500       -6.566      0.76       -8.6     < 2e-16
Dummy7XRP      0.0057      0.039      0.1      0.882
Dummy30XRP     0.1222      0.041      3.0      0.003
VolXRP         -0.00176    0.00086    -2.0     0.042

Model: log-log. R²-value: 0.6260. Currency: Ripple. General information: The initial log-log regression.


Table 27: Benchmark model for XRP

Covariate      Estimate    Std.Error  Eta.sq   p-value
Intercept      114.7       6.9        0.262    0.000
logValBTC      0.0802      0.085      0.002    0.347
logGoogleXRP   0.3530      0.054      0.061    0.000
logBrentOil    -3.678      0.22       0.223    0.000
logEUUS        12.77       1.1        0.150    0.000
logGold        -8.162      0.42       0.393    0.000
logSP500       -6.566      0.76       0.098    0.000
Dummy7XRP      0.0057      0.039      0.00003  0.882
Dummy30XRP     0.1222      0.037      0.013    0.001
VolXRP         -0.00176    0.0010     0.006    0.086

Model: log-log. R²-value: 0.6262. Currency: Ripple. AIC-value: 3493.6. General information: The benchmark model for the reduction.

Table 28: Final regression for XRP

Covariate      Estimate    Std.Error  Eta.sq   p-value
Intercept      109.4       4.5        0.378    0.000
logGoogleXRP   0.3965      0.032      0.182    0.000
logBrentOil    -3.598      0.21       0.232    0.000
logEUUS        13.116      1.01       0.171    0.000
logGold        -8.167      0.42       0.395    0.000
logSP500       -5.882      0.29       0.264    0.000
Dummy30XRP     0.1256      0.036      0.015    0.0004
VolXRP         -0.00177    0.0010     0.006    0.079

Model: log-log. R²-value: 0.6256. Currency: Ripple. AIC-value: 3490.7. General information: The best model for XRP.


8.3 Litecoin Appendix

Table 29: The initial log-log regression for LTC

Covariate      Estimate    Std.Error  t-value  p-value
Intercept      28.4        3.4        8.3      2e-15
logValBTC      0.8948      0.053      16.9     < 2e-16
logGoogleLTC   0.3862      0.030      12.7     < 2e-16
logBrentOil    0.3191      0.053      6.1      3e-09
logJapanUS     -0.723      0.18       -3.9     0.0001
logGold        -0.669      0.27       -2.5     0.014
logSP500       -3.465      0.51       -6.8     4e-11
Dummy7LTC      0.0683      0.018      3.7      0.0002
Dummy30LTC     0.0789      0.024      3.2      0.001
VolLTC         -0.003587   0.00070    -5.1     4e-07

Model: log-log. Currency: Litecoin. General information: The initial log-log regression.

Table 30: The final model for LTC

Covariate      Estimate    Std.Error  Eta.sq   p-value
Intercept      28.4        4.3        0.162    0.000
logValBTC      0.8948      0.061      0.445    0.000
logGoogleLTC   0.3862      0.036      0.312    0.000
logBrentOil    0.3191      0.058      0.094    0.000
logJapanUS     -0.723      0.16       0.042    0.000
logGold        -0.669      0.22       0.017    0.003
logSP500       -3.465      0.63       0.115    0.000
Dummy7LTC      0.0683      0.017      0.037    0.0001
Dummy30LTC     0.0789      0.019      0.030    0.0001
VolLTC         -0.003587   0.00065    0.070    0.000

Model: log-log. R²-value: 0.950. Currency: Litecoin. AIC-value: 812.1. General information: The best model for LTC.


8.4 Stock Appendix

Table 31: The Genworth regression

Covariate      Estimate    Std.Error  Eta.sq   p-value
Intercept      -2.752      1.97       0.002    0.162
Google         0.04144     0.0058     0.052    0.000
Volume         -0.05638    0.0010     0.038    0.000
Dummy7         0.185       0.10       0.003    0.066
Dummy30        0.356       0.11       0.011    0.001
Volatility     -0.759      0.25       0.016    0.002
JapanUS        0.0490      0.015      0.010    0.001
SP500          0.004993    0.00061    0.067    0.000
Gold           -0.012991   0.00054    0.354    0.000
BrentOil       0.16835     0.0044     0.527    0.000

Model: regular. R²-value: 0.859. Stock: Genworth. General information: The best model according to AIC.

Table 32: The Clorox regression

Covariate      Estimate    Std.Error  Eta.sq   p-value
Intercept      -1.492      0.20       0.058    0.000
logGoogle      0.02619     0.0078     0.013    0.001
Dummy7         0.00805     0.0018     0.019    0.000
Dummy30        0.02428     0.0018     0.140    0.000
logJapanUS     0.6458      0.024      0.386    0.000
logSP500       0.3205      0.015      0.263    0.000
logGold        0.1046      0.016      0.049    0.000
logBrentOil    -0.06260    0.0057     0.077    0.000

Model: log-log. R²-value: 0.962. Stock: Clorox. General information: The best model according to AIC.

Table 33: The Dentsply regression

Covariate      Estimate    Std.Error  Eta.sq   p-value
Intercept      7.557       1.45       0.020    0.000
Google         0.02406     0.0044     0.029    0.000
Dummy7         0.5125      0.083      0.037    0.000
Dummy30        1.3687      0.085      0.200    0.000
JapanUS        0.1342      0.011      0.100    0.000
SP500          0.012925    0.00046    0.408    0.000
Gold           0.000926    0.00040    0.004    0.022
BrentOil       -0.02357    0.0038     0.033    0.000

Model: regular. R²-value: 0.945. Stock: Dentsply. General information: The best model according to AIC.


Table 34: The Equifax regression

Covariate      Estimate    Std.Error  Eta.sq   p-value
Intercept      -47.8       4.1        0.15966  0.000
GOOGLE         -0.09246    0.0095     0.082    0.000
Volume         0.884       0.24       0.012    0.000
Dummy30        1.835       0.18       0.087    0.000
Volatility     -1.578      0.25       0.028    0.000
JapanUS        0.2081      0.030      0.055    0.000
SP500          0.048213    0.00097    0.668    0.000
Gold           0.01167     0.0010     0.129    0.000
BrentOil       -0.04498    0.0098     0.027    0.000

Model: regular. R²-value: 0.971. Stock: Equifax. General information: The best model according to AIC.

Table 35: The Mylan regression

Covariate      Estimate    Std.Error  Eta.sq   p-value
Intercept      -9.369      0.38       0.338    0.000
logGoogle      -0.1925     0.030      0.038    0.000
logVolume      0.04004     0.0050     0.062    0.000
Dummy30        0.02515     0.0048     0.027    0.000
logSP500       1.96952     0.026      0.841    0.000
logGold        -0.2048     0.032      0.035    0.000

Model: log-log. R²-value: 0.966. Stock: Mylan. General information: The best model according to AIC.

Table 36: The Stericycle regression

Covariate      Estimate    Std.Error  Eta.sq   p-value
Intercept      33.49       4.9        0.054    0.000
Google         -0.0507     0.014      0.015    0.0002
Volume         -1.823      0.54       0.012    0.001
Dummy7         1.139       0.23       0.026    0.000
Dummy30        1.872       0.24       0.060    0.000
Volatility     -0.970      0.24       0.007    0.000
JapanUS        0.5121      0.034      0.181    0.000
SP500          0.02434     0.0012     0.249    0.000
Gold           -0.00837    0.0013     0.046    0.000
Brentoil       -0.0197     0.010      0.003    0.061

Model: regular. R²-value: 0.955. Stock: Stericycle. General information: The best model according to AIC.


8.5 General


Figure 18: The BTC-specific variables normalized and plotted in the same graph over the whole examined time for BTC.


Figure 19: The variable Google and the value of one unit of BTC, normalized to the initial value, plotted over the whole examined time.


8.6 Data sources

Google trend index
All Google trend index data were gathered from http://www.google.se/trends/ with the specific search word.
Taken: 30 January 2015 (Bitcoin), 19 February 2015 (Ripple), 16 February 2015 (Litecoin), 14 April 2015 (Stocks).

Litecoin to Bitcoin
From: http://bitcoincharts.com/charts/krakenLTC#rg730zigDailyztgCzm1g10zm2g25.
Taken: 11 February 2015.

XRP to Bitcoin
From: http://bitcoincharts.com/charts/krakenLTC/krakenXRP#rg730zigDailyztgCzm1g10zm2g25.
Taken: 11 February 2015.

Bitcoin to USD
From: http://bitcoincharts.com/charts/krakenUSD#rg730zigDailyzczsg2014-01-07zeg2015-02-10ztgCzm1g10zm2g25.
Taken: 28 January 2015.

Stock data
All stock data were collected from https://www.google.com/finance/ with the specific stock name.
All data were taken 14 April 2015.

General data
The data on the gold price, SP500, Japan/US exchange rate, EU/US exchange rate, UK/US exchange rate, Nikkei 225 and Brent oil price were collected from https://research.stlouisfed.org/.
Taken: 10 February 2015.

Bitcoin specific data
All Bitcoin specific data were collected from https://blockchain.info/.
Taken: 28 January 2015.


BIBLIOGRAPHY BIBLIOGRAPHY

Bibliography[1] Amadeo, K. What is the s and p 500?, April 2015.

[2] Bie, N. Slängde hårddisk värd 50 miljoner på soptippen, May 2015.

[3] Burnham, K. P., A. D. R. Model Selection and Multimodel Inference: PracticalInformation-Theoretic Approach, 2nd ed. ISBN 0-387-95364-7. Springer-Verlag,2002.

[4] cameron.econ.ucdavis.edu. Instrumental variable, March 2015.

[5] coinmarketcap.com. Crypto-currency market capitalizations, April 2015.

[6] Desjardins, J. The definitive history of bitcoin, March 2015.

[7] econweb.ucsd.edu. More on multicollinearity (mc), March 2015.

[8] Espinoza, J. Hausman test for endogeneity: Parents education as iv for offspringeducation-transmission of inate ability, August 2010.

[9] finance.mapsofworld.com. Foreign exchange market japan, April 2015.

[10] Foundation, B. Bitcoin developer guide, March 2015.

[11] Frost, J. Regression analysis: How do i interpret r-squared and assess thegoodness-of-fit?, April 2015.

[12] Graydon, C. What is cryptocurrency?, March 2015.

[13] Harri Merisaari, Jambor I, V. O. Akaike information criterium (aic) in modelselection, March 2015.

[14] http://www.statoek.wiso.uni goettingen.de/. Bayesian information crite-rion, akaike information criterion, adjusted r2, March 2015.

[15] investing.com. Clorox company (clx), April 2015.

[16] investing.com. Dentsply international inc (xray), April 2015.

[17] investing.com. Equifax inc (efx), April 2015.

[18] investing.com. Genworth financial inc (gnw), April 2015.

[19] investing.com. Mylan inc (myl), April 2015.

[20] investing.com. Stericycle inc (srcl), April 2015.

[21] investopedia.com. Volatility, April 2015.

[22] Jackson, B. Beyond bitcoin: Top 5 cryptocurrencies by market cap, March 2015.

[23] John J. Dziak, Donna L. Coffman, S. T. L. R. L. Sensi-tivity and specificity of information criteria. The Methodology Center .http://methodology.psu.edu/media/techreports/12-119.pdf.

[24] Kephart, C. Interpret regression coefficient estimates, February 2015.

[25] Lang, H. Elements of regression analysis, November 2014.

[26] Lewis, A. Ripple explained: Medieval banking with a digital twist, March 2015.

52

Page 63: Explaining the market price of Bitcoin and other Cryptocurrencies with Statistical ...814478/... · 2015. 5. 27. · rency data from January 2012 to January 2015. ... för kryptovalutor

BIBLIOGRAPHY BIBLIOGRAPHY

[27] Litecoin.info. Litecoin, March 2015.

[28] mathworld.wolfram.com. F distribution, April 2015.

[29] Pedace, R. The role of the breusch-pagan test in econometrics, March 2015.

[30] Smarty, A. Understanding google insights, March 2015.

[31] Songsiri, J. Model selection and model validation, March 2015.

[32] statstodo.com. F test explained, April 2015.

[33] StatTrek.com. What is hypothesis testing?, March 2015.

[34] SVD.se. Bitcoin till stockholmsbörsen, May 2015.

[35] Taylor, C. What is the difference between type i and type ii errors?, March 2015.

[36] uccs.edu. Effect size (es), April 2015.

[37] ucl.ac.uk. Endogeneity, March 2015.

[38] Watson, P. Rules of thumb on magnitudes of effect sizes, April 2015.

53