
Exponential Kernels with Latency in Hawkes Processes: Applications in Finance

Marcos Costa Santos Carreira*

Jan-2021

Abstract

The Tick library [tick] allows researchers in market microstructure to simulate and learn Hawkes processes in high-frequency data, with optimized parametric and non-parametric learners. But one challenge is to take into account the correct causality of order book events considering latency: the only way one order book event can influence another is if the time difference between them (by the central order book timestamps) is greater than the minimum amount of time for an event to be (i) published in the order book, (ii) reach the trader responsible for the second event, (iii) influence the decision (processing time at the trader) and (iv) the second event reach the order book and be processed. For this we can use exponential kernels shifted to the right by the latency amount. We derive the expression for the log-likelihood to be minimized for the 1-D and the multidimensional cases, and test this method with simulated data and real data. On real data we find that, although not all decays are the same, the latency itself will determine most of the decays. We also show how the decays are related to the latency. Code is available on GitHub at https://github.com/MarcosCarreira/Hawkes-With-Latency.

Keywords: Hawkes processes; Kernel estimations; High-frequency; Order book dynamics; Market microstructure; Python

*Ɖcole Polytechnique, CMAP - PhD under the Quantitative Regulation chair

arXiv:2101.06348v1 [stat.ML] 16 Jan 2021


1 Introduction

It is well known that markets function in response both to external (or exogenous) information - e.g. news - and internal (endogenous) information, like the behavior of market participants and the patterns of price movements. The systematization of markets (synthesized by the Central Limit Order Book) led to two important developments in these responses: order book information is now available in an organized and timely form (which enables the study of intraday patterns in historical data), and algorithms can be deployed to process all kinds of information fast enough to trade (even before the information is received or processed by all the market participants).

That combination increases the importance of understanding how endogenous trading interacts with itself: activity in the order book leads to activity from fast traders, which leads to more activity, and so on. While the arrival of orders in an order book due to exogenous information can be described as a Poisson process, the interaction of the agents corresponds to a self-exciting process, which can be described by Hawkes processes. The importance of Hawkes processes in finance has been described quite well in [Bacry et al 2015], and recent work by [Euch et al 2016] has established the connection between Hawkes processes, stylized market microstructure facts and rough volatility. More recently, [Bacry et al 2019] improve the Queue-Reactive model of [Huang et al 2015] by adding Hawkes components to the arrival rates.

We start by focusing on [Bacry et al 2015], which applies Hawkes process learners as implemented in the Tick library [tick] to futures. There are two limitations in these applications: for parametric learners the decay(s) must be given and be the same for all kernels, and influences are assumed to be instantaneous. In this short paper we show how to code a parametric learner for an exponential kernel (or kernels, in the multivariate case) in which we learn the decay and consider a (given) latency. We won't revisit the mathematical details of the known results; for these, we refer the reader to [Toke 2011] and [Abergel et al 2016]. One of the goals is to show a working Python script accelerated by [Numba]; this can be optimized further for improved performance. The other goal is to understand what the learned parameters of the kernels mean given the trading dynamics of the BUND futures.

2 One-dimensional Hawkes processes

2.1 Definition

As explained in [Toke 2011], a Hawkes process with an exponential kernel has its intensity described by:

$$\lambda(t) = \lambda_0(t) + \sum_{t_i < t} \sum_{j=1}^{P} \alpha_j \cdot \exp\left(-\beta_j \cdot (t - t_i)\right) \tag{1}$$

For $P = 1$ and $\lambda_0(t) = \lambda_0$:

$$\lambda(t) = \lambda_0 + \sum_{t_i < t} \alpha \cdot \exp\left(-\beta \cdot (t - t_i)\right) \tag{2}$$


With stationarity condition:

$$\frac{\alpha}{\beta} < 1 \tag{3}$$

And unconditional expected value of the intensity:

$$\mathbb{E}[\lambda(t)] = \frac{\lambda_0}{1 - \frac{\alpha}{\beta}} \tag{4}$$
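As a quick sanity check with the parameters used in the simulations of Section 2.3 (λ0 = 1.20, α = 0.60, β = 0.80):

$$\mathbb{E}[\lambda(t)] = \frac{1.20}{1 - 0.60/0.80} = \frac{1.20}{0.25} = 4.80$$

so each simulated path should show on average 4.8 events per unit of time.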

2.2 Maximum-likelihood estimation

Following [Toke 2011]:

$$LL = \int_0^T \left(1 - \lambda(s)\right) ds + \int_0^T \ln\left(\lambda(s)\right) dN(s) \tag{5}$$

Which is computed as:

$$LL = t_N - \int_0^T \lambda(s)\, ds + \sum_{i=1}^{N} \ln\left[\lambda_0(t_i) + \sum_{j=1}^{P} \sum_{k=1}^{i-1} \alpha_j \cdot \exp\left(-\beta_j \cdot (t_i - t_k)\right)\right] \tag{6}$$

Defining the function:

$$R_j(i) = \sum_{k=1}^{i-1} \exp\left(-\beta_j \cdot (t_i - t_k)\right) \tag{7}$$

The recursion observed by [Ogata 1981] is:

$$R_j(i) = \exp\left(-\beta_j \cdot (t_i - t_{i-1})\right) \cdot \left(1 + R_j(i-1)\right) \tag{8}$$
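This recursion turns the O(N²) double sum inside equation (6) into an O(N) pass over the events. A minimal numerical check (a sketch with arbitrary event times, not data from the paper) confirms that the recursion matches the direct sum of equation (7):

import numpy as np

# arbitrary event times for the check
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 10.0, size=50))
beta = 0.8

# direct O(N^2) evaluation: R(i) = sum_{k<i} exp(-beta * (t_i - t_k))
R_direct = np.array([np.exp(-beta * (t[i] - t[:i])).sum() for i in range(len(t))])

# O(N) recursion: R(i) = exp(-beta * (t_i - t_{i-1})) * (1 + R(i-1)), R(1) = 0
R_rec = np.zeros(len(t))
for i in range(1, len(t)):
    R_rec[i] = np.exp(-beta * (t[i] - t[i - 1])) * (1.0 + R_rec[i - 1])

assert np.allclose(R_direct, R_rec)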

And therefore the log-likelihood is:

$$LL = t_N - \int_0^T \lambda_0(s)\, ds - \sum_{i=1}^{N} \sum_{j=1}^{P} \frac{\alpha_j}{\beta_j} \left(1 - \exp\left(-\beta_j \cdot (t_N - t_i)\right)\right) + \sum_{i=1}^{N} \ln\left[\lambda_0(t_i) + \sum_{j=1}^{P} \alpha_j \cdot R_j(i)\right] \tag{9}$$

For $P = 1$ and $\lambda_0(t) = \lambda_0$:

$$R(i) = \exp\left(-\beta \cdot (t_i - t_{i-1})\right) \cdot \left(1 + R(i-1)\right) \tag{10}$$

And:

$$LL = t_N - \lambda_0 \cdot T - \sum_{i=1}^{N} \frac{\alpha}{\beta} \left(1 - \exp\left(-\beta \cdot (t_N - t_i)\right)\right) + \sum_{i=1}^{N} \ln\left[\lambda_0 + \alpha \cdot R(i)\right] \tag{11}$$


2.3 Simulation and learning

Using tick we simulate 100 paths for different end times (100, 1000, 10000), for both the builtin ExpKernel and for a user-defined TimeFunction that mirrors an ExpKernel:

Algorithm 1 Definitions of Exponential Kernels (builtin and custom) and simulation
Initialize {λ0, α, β}
Define the kernel function f(α, β, t)
Define the support s
Build the TimeFunction with:
    n + 1 steps for the time interval [0, s]
    n + 1 evaluations of f at these points
    an interpolation mode (e.g. TimeFunction.InterConstRight)
Build a kernel with HawkesKernelTimeFunc
Define the builtin HawkesKernelExp with parameters {α/β, β}
Define the number of simulations NS
Define the end time T
Build the SimuHawkes object with λ0, a kernel and T
Build the SimuHawkesMulti object with SimuHawkes and NS
Simulate SimuHawkesMulti
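A minimal Python sketch of Algorithm 1 follows. It assumes tick's simulation API (TimeFunction, HawkesKernelExp, HawkesKernelTimeFunc, SimuHawkes, SimuHawkesMulti) and uses the parameter values of Table 1; the seed and grid sizes are arbitrary choices:

import numpy as np
from tick.base import TimeFunction
from tick.hawkes import (HawkesKernelExp, HawkesKernelTimeFunc,
                         SimuHawkes, SimuHawkesMulti)

lambda0, alpha, beta = 1.2, 0.6, 0.8
support, n = 10.0, 1000

# custom kernel: tabulate alpha * exp(-beta * t) on n + 1 points of [0, support]
t_values = np.linspace(0.0, support, n + 1)
y_values = alpha * np.exp(-beta * t_values)
tf = TimeFunction((t_values, y_values), inter_mode=TimeFunction.InterConstRight)
custom_kernel = HawkesKernelTimeFunc(tf)

# builtin kernel: HawkesKernelExp(intensity, decay) encodes
# intensity * decay * exp(-decay * t), hence intensity = alpha / beta
builtin_kernel = HawkesKernelExp(alpha / beta, beta)

hawkes = SimuHawkes(baseline=[lambda0], end_time=1000.0, verbose=False, seed=1039)
hawkes.set_kernel(0, 0, builtin_kernel)  # or custom_kernel

multi = SimuHawkesMulti(hawkes, n_simulations=100)
multi.simulate()
paths = multi.timestamps  # one list of per-node arrays for each of the 100 paths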

We use minimize from [scipy] with method 'SLSQP'; as pointed out in [Kisel 2017], it can handle both bounds (ensuring positivity of parameters) and constraints (ensuring the stationarity condition). The time differences should be calculated at the start of the optimization process and passed to the likelihood function, instead of being recalculated inside it.


Algorithm 2 Negative log-likelihood and its minimization with scipy's minimize
def ll(θ, ts, Δts, δts):
    {λ0, α, β} = θ (parameters are packed into one argument for the optimization)
    ts is one of the SimuHawkesMulti.timestamps
    Δts is the series t_i − t_{i−1}
    δts is the series t_N − t_{i−1}
    Define R as an array of zeros with the same size as ts
    Keep its first element R_1 = 0
    Calculate R_i = exp(−β · Δts_{i−1}) · (1 + R_{i−1}) recursively for i > 1
    Return:
        −[t_N · (1 − λ0) − (α/β) · Σ_{δts} (1 − exp(−β · δts)) + Σ_{R_i} log(λ0 + α · R_i)]
Minimize ll(θ, ts, Δts, δts) with:
    bounds on λ0, α, β (all must be positive) and
    constraint α < β
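A sketch of Algorithm 2 in plain Python/NumPy (ts, dts, Dts mirror ts, Δts, δts; in practice the inner loop would be jitted with [Numba], and here Dts = t_N − t_i as in equation (11)):

import numpy as np
from scipy.optimize import minimize

def ll(theta, ts, dts, Dts):
    # negative log-likelihood of equation (11)
    lam0, alpha, beta = theta
    R = np.zeros(len(ts))  # Ogata recursion of equation (10), R_1 = 0
    for i in range(1, len(ts)):
        R[i] = np.exp(-beta * dts[i - 1]) * (1.0 + R[i - 1])
    return -(ts[-1] * (1.0 - lam0)
             - (alpha / beta) * np.sum(1.0 - np.exp(-beta * Dts))
             + np.sum(np.log(lam0 + alpha * R)))

ts = paths[0][0]                     # one simulated path from the sketch above
dts, Dts = np.diff(ts), ts[-1] - ts  # precomputed once, outside the objective
res = minimize(ll, x0=np.array([1.0, 0.5, 1.0]), args=(ts, dts, Dts),
               method="SLSQP",
               bounds=[(1e-6, None)] * 3,               # positivity
               constraints=[{"type": "ineq",            # alpha < beta
                             "fun": lambda th: th[2] - th[1]}])
print(res.x)  # learned (lambda0, alpha, beta)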

2.4 Kernels with latency

For $P = 1$ and $\lambda_0(t) = \lambda_0$, with latency $\tau$:

$$\lambda(t) = \lambda_0 + \sum_{t_i < t - \tau} \alpha \cdot \exp\left(-\beta \cdot (t - \tau - t_i)\right) \tag{12}$$

We can use the same (exponential) function and shift it to the right by the latency:

Page 5 of 22

Page 6: Exponential Kernels with Latency in Hawkes Processes

Algorithm 3 Exponential kernel with latency
Initialize {λ0, α, β}
Define the kernel function f(α, β, t)
Define the support s
Build the TimeFunction with:
    n + 1 steps for the time interval [0, s]
    n + 1 evaluations of f at these points
    an interpolation mode (e.g. TimeFunction.InterConstRight)
    If latency τ > 0:
        shift [0, s] and f([0, s]) to the right by τ
        add zeros on the interval [0, τ)
Build a kernel with HawkesKernelTimeFunc
Define the number of simulations NS
Define the end time T
Build the SimuHawkes object with λ0, the kernel and T
Build the SimuHawkesMulti object with SimuHawkes and NS
Simulate SimuHawkesMulti
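The shift step of Algorithm 3 as a Python sketch, reusing the names of the previous sketch (the exact jump behaviour at τ depends on the interpolation mode chosen for the TimeFunction):

tau = 2.0  # latency

# original grid of the exponential kernel on [0, support]
t_values = np.linspace(0.0, support, n + 1)
y_values = alpha * np.exp(-beta * t_values)

# shift right by tau and prepend a zero so the kernel vanishes on [0, tau)
t_shifted = np.concatenate(([0.0], t_values + tau))
y_shifted = np.concatenate(([0.0], y_values))

tf_lat = TimeFunction((t_shifted, y_shifted),
                      inter_mode=TimeFunction.InterConstRight)
kernel_lat = HawkesKernelTimeFunc(tf_lat)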

The new likelihoods are defined (for $P = 1$ and $\lambda_0(t) = \lambda_0$) as:

$$R_\tau(i) = R_\tau(i-1) \cdot \exp\left(-\beta \cdot (t_i - t_{i-1})\right) + \begin{cases} \sum_{t_k} \exp\left(-\beta \cdot (t_i - \tau - t_k)\right) & t_{i-1} - \tau \le t_k < t_i - \tau \\ 0 & \text{otherwise} \end{cases} \tag{13-14}$$

With $R_\tau(1) = 0$, and:

$$LL_\delta = t_N - \lambda_0 \cdot t_N - \sum_{t_i < t_N - \tau} \frac{\alpha}{\beta} \left(1 - \exp\left(-\beta \cdot (t_N - \tau - t_i)\right)\right) + \sum_{i=1}^{N} \ln\left[\lambda_0 + \alpha \cdot R_\tau(i)\right] \tag{15}$$

Even with latency τ = 0 (when for each t_i we pass only t_{i−1}), this machinery is enough to slow down the optimization, as observed while testing that the algorithm returns the same results as the previous one for latency zero, and as seen in Table 1.


Algorithm 4 Log-likelihood with latency
def lllat(θ, t, Δt, δt, τt):
    {λ0, α, β} = θ (parameters are packed into one argument for the optimization)
    t is one of the SimuHawkesMulti.timestamps
    Δt is the series t_i − t_{i−1} (excluding t_1)
    δt is the series t_N − τ − t_{i−1}
    τt is the series of arrays t_i − τ − t_k for all t_k such that t_{i−1} − τ ≤ t_k < t_i − τ
    Define R as an array of zeros with the same size as t
    Keep its first element R_1 = 0
    Calculate R_i = exp(−β · Δt_{i−1}) · R_{i−1} + Σ_{τt_i} exp(−β · τt_i) recursively for i > 1
    Return:
        −[t_N · (1 − λ0) − (α/β) · Σ_{δt} (1 − exp(−β · δt)) + Σ_{R_i} log(λ0 + α · R_i)]
Minimize lllat(θ, t, Δt, δt, τt) with:
    bounds on λ0, α, β (all must be positive) and
    constraint α < β
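A Python sketch of Algorithm 4; tau_terms plays the role of τt and, like the other differences, would be precomputed once per path:

def lllat(theta, t, dt, Dt, tau_terms):
    # negative log-likelihood of equation (15):
    #   dt[i-1]      = t_i - t_{i-1}
    #   Dt           = t_N - tau - t_i for events with t_i < t_N - tau
    #   tau_terms[i] = array of t_i - tau - t_k for the t_k with
    #                  t_{i-1} - tau <= t_k < t_i - tau (possibly empty)
    lam0, alpha, beta = theta
    R = np.zeros(len(t))  # recursion of equations (13)-(14), R_tau(1) = 0
    for i in range(1, len(t)):
        R[i] = (np.exp(-beta * dt[i - 1]) * R[i - 1]
                + np.sum(np.exp(-beta * tau_terms[i])))
    return -(t[-1] * (1.0 - lam0)
             - (alpha / beta) * np.sum(1.0 - np.exp(-beta * Dt))
             + np.sum(np.log(lam0 + alpha * R)))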

We obtain the following results for 100 paths (parameters λ0 = 1.20, α = 0.60 and β = 0.80), with the total running times in seconds (on a MacBook Pro 16-inch 2019, 2.4 GHz 8-core Intel Core i9, 64 GB 2667 MHz DDR4 RAM):

End Time  Kernel   Algorithm  runtime (s)  λ0    α     β
100       Builtin  ll         0.535        1.40  0.59  0.87
100       Builtin  lllat      1.49         1.40  0.59  0.87
100       Custom   ll         0.521        1.41  0.63  0.88
100       Custom   lllat      1.6          1.41  0.63  0.88
1000      Builtin  ll         0.992        1.22  0.60  0.80
1000      Builtin  lllat      11.3         1.22  0.60  0.80
1000      Custom   ll         1.13         1.23  0.63  0.81
1000      Custom   lllat      12.8         1.23  0.63  0.81
10000     Builtin  ll         6.02         1.20  0.60  0.80
10000     Builtin  lllat      143.         1.20  0.60  0.80
10000     Custom   ll         7.29         1.20  0.62  0.80
10000     Custom   lllat      181.         1.20  0.62  0.80

Table 1: Learning results and times

With latency τ = 2, we find that we can still recover the parameters of the kernel quite well (still 100 paths):


End Time  runtime (s)  λ0    α     β
100       1.76         1.41  0.59  0.81
1000      15.6         1.26  0.62  0.80
10000     223.         1.21  0.61  0.79

Table 2: Learning results and times for latency τ = 2

3 Multidimensional Hawkes

3.1 No latency

For $P = 1$ and $\lambda_0(t) = \lambda_0$:

$$\lambda^m(t^m) = \lambda_0^m + \sum_{n=1}^{M} \sum_{t_i^n < t^m} \alpha^{m,n} \cdot \exp\left(-\beta^{m,n} \cdot (t^m - t_i^n)\right) \tag{16}$$

With the recursion now:

$$R^{m,n}(i) = R^{m,n}(i-1) \cdot \exp\left(-\beta^{m,n} \cdot (t_i^m - t_{i-1}^m)\right) + \sum_{t_{i-1}^m \le t_k^n < t_i^m} \exp\left(-\beta^{m,n} \cdot (t_i^m - t_k^n)\right) \tag{17}$$

And log-likelihood for each node $m$:

$$LL^m = t_N^m \cdot (1 - \lambda_0^m) - \sum_{n=1}^{M} \sum_{t_i^n < t_N^m} \frac{\alpha^{m,n}}{\beta^{m,n}} \left(1 - \exp\left(-\beta^{m,n} \cdot (t_N^m - t_i^n)\right)\right) + \sum_{t_i^m < t_N^m} \ln\left[\lambda_0^m + \sum_{n=1}^{M} \alpha^{m,n} \cdot R^{m,n}(i)\right] \tag{18}$$

3.2 Latency

For $P = 1$ and $\lambda_0(t) = \lambda_0$:

$$\lambda^m(t^m) = \lambda_0^m + \sum_{n=1}^{M} \sum_{t_i^n < t^m - \tau} \alpha^{m,n} \cdot \exp\left(-\beta^{m,n} \cdot (t^m - \tau - t_i^n)\right) \tag{19}$$

With the recursion now:

$$R^{m,n}(i) = R^{m,n}(i-1) \cdot \exp\left(-\beta^{m,n} \cdot (t_i^m - t_{i-1}^m)\right) + \sum_{t_{i-1}^m - \tau \le t_k^n < t_i^m - \tau} \exp\left(-\beta^{m,n} \cdot (t_i^m - \tau - t_k^n)\right) \tag{20}$$

And log-likelihood for each node $m$:


$$LL^m = t_N^m \cdot (1 - \lambda_0^m) - \sum_{n=1}^{M} \sum_{t_i^n < t_N^m - \tau} \frac{\alpha^{m,n}}{\beta^{m,n}} \left(1 - \exp\left(-\beta^{m,n} \cdot (t_N^m - \tau - t_i^n)\right)\right) + \sum_{t_i^m < t_N^m - \tau} \ln\left[\lambda_0^m + \sum_{n=1}^{M} \alpha^{m,n} \cdot R^{m,n}(i)\right] \tag{21}$$

Now there is a choice on how to efficiently select the $t_k^n$ in the recursion: either find the corresponding arrays (which could be empty) for each $\{t_{i-1}^m, t_i^m\}$ pair with numpy.extract, or find the appropriate $t_i^m$ for each $t_k^n$ using numpy.searchsorted and build a dictionary; we chose the latter. To optimize further, all the arrays are padded so that all accesses are on numpy.ndarrays (more details in the code; also see [numpy]).
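A sketch of the searchsorted approach (array and function names here are ours, not from the paper's code):

def assign_events(tm, tn, tau):
    # for each source event t_k^n, find the target index i such that
    # t_{i-1}^m - tau <= t_k^n < t_i^m - tau, and store t_i^m - tau - t_k^n
    idx = np.searchsorted(tm - tau, tn, side="right")
    groups = {}
    for k, i in enumerate(idx):
        if i < len(tm):  # events after t_N^m - tau excite nothing
            groups.setdefault(i, []).append(tm[i] - tau - tn[k])
    return {i: np.asarray(v) for i, v in groups.items()}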


Algorithm 5 Log-likelihood with latency - multidimensional for one time series m
def lllatm(θ, m, nts, tN, Δt, δt, τt):
    {λ0^1, λ0^2, ..., λ0^M, α^{1,1}, α^{1,2}, ..., α^{1,M}, ..., α^{M,M}, β^{1,1}, β^{1,2}, ..., β^{1,M}, ..., β^{M,M}} = θ
        (parameters are packed into one argument for the optimization)
    m is the index of the time series in the timestamps object
    nts is the array of lengths of all the time series of the timestamps object
    tN is the last timestamp over all of the time series of the timestamps object
    Δt is the collection of series t_i^m − t_{i−1}^m for all 1 ≤ m ≤ M, including t_1^m
    δt is the collection of collections of series t_N^m − τ − t_{i−1}^n, for all 1 ≤ m ≤ M and 1 ≤ n ≤ M
    τt is the collection of collections of series of arrays t_i^m − τ − t_k^n, for all t_k^n such that
        t_{i−1}^m − τ ≤ t_k^n < t_i^m − τ, for all 1 ≤ m ≤ M and 1 ≤ n ≤ M
    For each n:
        Define R^{m,n} as an array of zeros with the same size as t^m
        Keep its first element R^{m,n}_1 = 0
        Calculate R^{m,n}_i = exp(−β^{m,n} · Δt_i^m) · R^{m,n}_{i−1} + Σ_{τt_i} exp(−β^{m,n} · τt_i^{m,n})
            recursively for i > 1
    Return:
        −[t_N^m · (1 − λ0^m) − Σ_{δt} (α^{m,n}/β^{m,n}) · (1 − exp(−β^{m,n} · δt^{m,n}))
          + Σ_{R_i} log(λ0^m + α^{m,n} · R_i^{m,n})]
Minimize lllatm(θ, m, nts, tN, Δt, δt, τt) with:
    bounds on λ0, α, β (all must be positive)
    no constraints for just m
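A condensed Python sketch of Algorithm 5 (the layouts of θ and of the precomputed arrays are assumptions that mirror the notation above):

def lllatm(theta, m, ts, dts, Dts, tau_terms, M):
    # negative log-likelihood of equation (21) for node m:
    #   theta           = [lambda0 (M), alpha (M*M, row-major), beta (M*M)]
    #   ts              = list of the M event-time arrays
    #   dts[i-1]        = t_i^m - t_{i-1}^m for the target node m
    #   Dts[n]          = array of t_N^m - tau - t_i^n over source events
    #   tau_terms[n][i] = array of t_i^m - tau - t_k^n for the t_k^n with
    #                     t_{i-1}^m - tau <= t_k^n < t_i^m - tau
    lam0 = theta[:M]
    alpha = theta[M:M + M * M].reshape(M, M)
    beta = theta[M + M * M:].reshape(M, M)
    tm = ts[m]
    intensity = np.full(len(tm), lam0[m])   # intensity at each event of node m
    compensator = tm[-1] * (1.0 - lam0[m])  # t_N^m * (1 - lambda0^m)
    for n in range(M):
        a, b = alpha[m, n], beta[m, n]
        R = np.zeros(len(tm))               # recursion of equation (20)
        for i in range(1, len(tm)):
            R[i] = (np.exp(-b * dts[i - 1]) * R[i - 1]
                    + np.sum(np.exp(-b * tau_terms[n][i])))
        intensity += a * R
        compensator -= (a / b) * np.sum(1.0 - np.exp(-b * Dts[n]))
    return -(compensator + np.sum(np.log(intensity)))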

A slower optimization can be done for more than one time series in parallel, with the advantage of enforcing symmetries on the coefficients by re-defining the input θ with the same decays for blocks (e.g. in [Bacry et al 2016], instead of 16 different decays we can use 4 decays, as described further down in this paper); the algorithm below shows the case where we optimize over all the time series, but we can modify it to run on bid-ask pairs of time series.


Algorithm 6 Log-likelihood with latency - multidimensional for all time series
def lllatall(θ, nts, tN, Δt, δt, τt):
    {λ0^1, λ0^2, ..., λ0^M, α^{1,1}, ..., α^{M,M}, β^{1,1}, ..., β^{M,M}} = θ
        (parameters are packed into one argument for the optimization)
    nts is the array of lengths of all the time series of the timestamps object
    tN is the last timestamp over all of the time series of the timestamps object
    Δt, δt and τt are as in Algorithm 5
    For each m:
        For each n:
            Define R^{m,n} as an array of zeros with the same size as t^m
            Keep its first element R^{m,n}_1 = 0
            Calculate R^{m,n}_i = exp(−β^{m,n} · Δt_i^m) · R^{m,n}_{i−1} + Σ_{τt_i} exp(−β^{m,n} · τt_i^{m,n})
                recursively for i > 1
        LL^m = t_N^m · (1 − λ0^m) − Σ_{δt} (α^{m,n}/β^{m,n}) · (1 − exp(−β^{m,n} · δt^{m,n}))
               + Σ_{R_i} log(λ0^m + α^{m,n} · R_i^{m,n})
    Return −Σ_m LL^m
Minimize lllatall(θ, nts, tN, Δt, δt, τt) with:
    bounds on λ0, α, β (all must be positive)
    inequality constraints: stationarity condition
    equality constraints: symmetries on coefficients
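The all-series objective then reduces to summing the per-node terms, with the symmetries expressed as equality constraints on the flat θ vector (a sketch; precomp and the flat indices i_a, i_b are hypothetical):

def lllatall(theta, precomp, M):
    # precomp[m] bundles the (ts, dts, Dts, tau_terms) arrays used by lllatm
    return sum(lllatm(theta, m, *precomp[m], M) for m in range(M))

# example equality constraint enforcing a block symmetry such as
# alpha(Ta -> Pu) == alpha(Tb -> Pd) on the flat parameter vector
sym = {"type": "eq", "fun": lambda th, i_a=5, i_b=10: th[i_a] - th[i_b]}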


3.3 Simulations and learning

With latency τ = 2 and two kernels we get good results (100 paths, method 'SLSQP'):

End Time   runtime (s)  λ0^1  λ0^2  α^{1,1}  α^{1,2}  α^{2,1}  α^{2,2}  β^{1,1}  β^{1,2}  β^{2,1}  β^{2,2}
Simulated  -            0.6   0.2   0.5      0.7      0.9      0.3      1.4      1.8      2.2      1.0
100        7.1          0.68  0.28  0.55     0.73     1.03     0.37     1.87     1.93     2.44     2.65
1000       67.          0.63  0.21  0.52     0.75     0.98     0.31     1.44     1.78     2.18     1.05
10000      804.         0.60  0.20  0.52     0.75     0.97     0.31     1.37     1.76     2.12     1.00

Table 3: Learning results and times for latency τ = 2

4 Results on real data

4.1 Estimation of Slowly Decreasing Hawkes Kernels: Application to High Frequency Order Book Dynamics

Here one must consider the latency of 250µs (as informed in [Eurex 2017]):

Figure 1: Eurex Latency


We have available data for 20 days (Apr-2014) on events P and T, from 8h to 22h (14 hours on each day), as described in https://github.com/X-DataInitiative/tick-datasets/tree/master/hawkes/bund. Because on some days we have late starts, and because liquidity late at night is no better, we consider events only from 10h to 18h, which represents more than 75% of all events (daily counts available on GitHub).

To get a better idea of the time scales involved, we calculate the descriptive statistics of each time series:

              Pu    Pd    Ta     Tb
n (events)    4569  4558  11027  12811
Δt 10% (µs)   196   196   147    229
Δt 15% (µs)   241   242   317    471
Δt 25% (µs)   342   340   1890   11756
Δt 50% (ms)   14.1  13.0  159.7  183.2
Δt 75% (s)    2.5   2.4   1.9    1.5
Δt 90% (s)    16.9  17.1  7.5    6.3
mean Δt (s)   6.3   6.3   2.6    2.2

Table 4: Statistics for the time intervals within each series

And we can see that between 10% and 15% of events within each time series happen less than 250µs after the previous event, and therefore should not be considered as being caused by that previous event. We can also see that the average time interval is considerably higher than the median time interval, which means that this distribution is more fat-tailed than an exponential distribution.

We run optimizations for the sum of (Bid+Ask) log-likelihoods, where the kernel intensities are the same within diagonals: α(Ta → Pu) = α(Tb → Pd) and α(Tb → Pu) = α(Ta → Pd); and the kernel decays are the same within blocks: β(Ta → Pu) = β(Tb → Pd) = β(Tb → Pu) = β(Ta → Pd). The total LL is therefore optimized as $LL^{opt} = LL^{opt}_{P_u+P_d} + LL^{opt}_{T_a+T_b}$. We also ran exploratory optimizations with completely independent intensities and decays, and tested different optimizer algorithms, in order to reduce the bounds for the parameters. The code (including initial values and bounds) is on the GitHub page, together with the spreadsheet with the results of the optimization (using the Differential Evolution, Powell and L-BFGS-B methods from [scipy]; Powell was the best-performing method). The parameters for T → T were the most unstable (charts in the Appendix).
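The optimizer comparison can be scripted directly with [scipy] (a sketch; nll, bounds and args stand for the negative log-likelihood and its precomputed inputs):

from scipy.optimize import differential_evolution, minimize

# global search first, then the two local methods from the same starting point
res_de = differential_evolution(nll, bounds=bounds, args=args, seed=0)
res_pw = minimize(nll, x0=res_de.x, args=args, method="Powell", bounds=bounds)
res_lb = minimize(nll, x0=res_de.x, args=args, method="L-BFGS-B", bounds=bounds)
best = min((res_de, res_pw, res_lb), key=lambda r: r.fun)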


        Pu       Pd       Ta       Tb
λ0      4.0%     4.0%     14.3%    16.1%
R       25%      25%      38%      37%

        Pu → Pu  Pd → Pu  Ta → Pu  Tb → Pu
α       737      488      401      14.5
β       3113     3113     3125     3125
α/β     23.7%    15.7%    12.8%    0.46%

        Pu → Ta  Pd → Ta  Ta → Ta        Tb → Ta
α       23.6     316      3.8 (median)   0.27 (median)
β       1122     1122     7.8 (median)   7.8 (median)
α/β     2.1%     28.3%    48.0%          3.1%

Table 5: Learned parameters (decays are shared within blocks: one β each for P → P, T → P, P → T and T → T)

We calculate the exogeneity ratio $R$ between the number of exogenous events and the total number of events $n$, for each event type $i$ and day $d$, as:

$$R_{i,d} = \frac{(\lambda_0)_{i,d}}{n_{i,d} / (8 \cdot 3600)} \tag{22}$$
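Reading λ0 in Table 5 as an intensity in events per second, equation (22) is consistent with the reported ratios; for example, for Ta:

$$R_{T_a} \approx \frac{0.143}{11027 / (8 \cdot 3600)} = \frac{0.143}{0.383} \approx 37\%$$

close to the 38% shown in Table 5 (which aggregates the daily estimates).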

And these ratios are much higher than those found in [Bacry et al 2016], which can be explained by the reassignment of the events within the latency. Even then, we find that it is reasonable to expect that a good part of the trades are exogenous (i.e. caused by reasons other than the short-term dynamics of the order book). Most interestingly, events that change the mid-price (trades, cancels and replenishments) are (i) faster and (ii) more endogenous than trades that do not change the mid-price. Please note how the first chart of Figure 2 is similar to the first chart of Figure 7 in [Bacry et al 2016] after approximately 250µs.


Figure 2: Learned Exponential Kernels

It's not by chance that the decay on the T → P and P → P kernels is about 3000; we can plot the first chart again, changing the x-axis to units of latency:


Figure 3: P, T → P kernels in units of latency

It's expected that fast participants will act quickly on (i) sharp imbalances, by either canceling or consuming the remaining queue (Ta → Pu), and (ii) recent gaps, by filling them in (P → P); and that they will be able to react to the reactions as quickly as before, so the influence of the first event decays fast. We expect that Figure 3 will be reproduced in different markets and different years (e.g. BUND futures in 2012, which had a latency of 1250µs instead of 250µs).

The main differences found with respect to [Bacry et al 2016] are on the anti-diagonals. For P → P we find that diagonals are stronger than anti-diagonals (even on a large-tick asset such as the BUND future); to investigate this further, a better breakdown of P into Cancels and Trades would help. For P → T our method is not able to learn negative intensities because of the log-likelihood, but none of the values reached the lower bound, and the average intensity was about 2%. Although a change in mid-price will make limit orders with the same sign miss the fill (becoming diagonals in the sub-matrix P → P), market orders would not be affected; more importantly, the speed and intensity of the P → P kernels ensure that gaps close quickly. We think that the introduction of the latency is enough to simplify the model such that negative intensities are not necessary.

We have not tested a Sum of Exponentials model, but it might be interesting to fix the decays of the first exponential kernel at "known" values and see what we can find.

5 Conclusions

We show (with formulas and code) how to simulate and estimate exponential kernels of Hawkes processes with a latency, and we interpret the results of applying this method to a (limited) set of BUND futures data. We show how some features of this model (the P → P kernels and the parameter symmetries) might be quite universal across markets. We can also extend this approach to two related securities that trade in different markets, by using different latencies on the cross-market kernels without changing the model.


We would like to thank Mathieu Rosenbaum (CMAP) for the guidance and Sebastian Neusuess (Eurex) for the latency data.


References

[tick] Bacry, E., Bompaire, M., Gaïffas, S. and Poulsen, S. (2017), "tick: a Python library for statistical learning, with a particular emphasis on time-dependent modeling", available at https://arxiv.org/abs/1707.03003

[numpy] Harris, C. R. et al. (2020), "Array programming with NumPy", Nature, 585, 357-362

[scipy] Virtanen, P. et al. (2020), "SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python", Nature Methods, 17(3), 261-272

[Numba] Lam, S. K., Pitrou, A. and Seibert, S. (2015), "Numba: a LLVM-based Python JIT compiler", LLVM '15: Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, Article No. 7, pages 1-6

[Eurex 2017] Eurex, "Insights into trading system dynamics", presentation

[Abergel et al 2016] Abergel, F., Anane, M., Chakraborti, A., Jedidi, A. and Toke, I. M. (2016), "Limit Order Books", Cambridge University Press, ISBN 978-1-107-16398-0

[Bacry et al 2015] Bacry, E., Mastromatteo, I. and Muzy, J.-F. (2015), "Hawkes processes in finance", Market Microstructure and Liquidity 1(1)

[Bacry et al 2016] Bacry, E., Jaisson, T. and Muzy, J.-F. (2016), "Estimation of Slowly Decreasing Hawkes Kernels: Application to High Frequency Order Book Dynamics", Quantitative Finance 16(8), 1179-1201

[Bacry et al 2019] Bacry, E., Rambaldi, M., Muzy, J.-F. and Wu, P. (2019), "Queue-reactive Hawkes models for the order flow", Working Papers hal-02409073, HAL, available at https://arxiv.org/abs/1901.08938

[Euch et al 2016] El Euch, O., Fukasawa, M. and Rosenbaum, M. (2018), "The microstructural foundations of leverage effect and rough volatility", Finance and Stochastics 22, 241-280

[Huang et al 2015] Huang, W., Lehalle, C.-A. and Rosenbaum, M. (2015), "Simulating and Analyzing Order Book Data: The Queue-Reactive Model", Journal of the American Statistical Association, 110(509), 107-122

[Kisel 2017] Kisel, R. (2017), "Quoting behavior of a market-maker under different exchange fee structures", Master's thesis

[Ogata 1981] Ogata, Y. (1981), "On Lewis' simulation method for point processes", IEEE Transactions on Information Theory 27(1), 23-31


[Toke 2011] Toke, I. M. (2011), "An Introduction to Hawkes Processes with Applications to Finance", BNP Paribas Chair Meeting, available at http://lamp.ecp.fr/MAS/fiQuant/ioane_files/HawkesCourseSlides.pdf


A Appendix

The charts and tables for the daily estimates are below. The only real worry is the instability of the parameters of the T → T kernels, although the intensities α/β are stable. The upper bound for β(T → T) was hit on day 16.

Figure 4: Daily estimates of parameters


Day  α(Ta → Ta)  α(Tb → Ta)  β(T → T)
1    2.73        0.14        4.91
2    2.80        0.18        4.88
3    3.81        0.33        8.57
4    7.12        0.62        15.64
5    36.12       1.94        106.44
6    2.96        0.17        5.15
7    4.68        0.31        10.81
8    4.19        0.21        7.75
9    7.78        0.67        18.73
10   2.55        0.29        4.68
11   22.68       1.22        73.79
12   2.28        0.15        4.13
13   3.33        0.24        6.14
14   4.04        0.27        8.46
15   4.75        0.35        10.15
16   117.41      2.28        500.00
17   3.85        0.27        7.93
18   3.63        0.22        7.54
19   3.24        0.15        6.22
20   1.76        0.04        2.68

Table 6: Daily estimates of parameters for T → T kernels
