Applications of Machine Learning in High Frequency Trading
HELLO! We are:
Aashish Jhamtani (15BM6JP01)
Ayan Sengupta (15BM6JP09)
Neeti Pokharna (15BM6JP27)
Siddhant Sanjeev (15BM6JP45)
Sujan Kumar Biswas (15BM6JP49)
What is High Frequency Trading?
Mandatory Disclaimer
All characters and events depicted in this project are entirely modeled. Any similarity to actual events or stock price movements is purely awesome.
Using computer algorithms to rapidly trade securities
● Positions are held for seconds to minutes
● Reaction times to market changes are sub-millisecond
● HFT accounts for more than 60% of all trading volume in some markets
● Due to market efficiency, it is challenging to come up with robust predictive models
● Columns in the tick data include timestamp, price, order-flag, size, etc.
Challenges and Scope
◉ Machine learning faces special challenges here because of the very fine granularity of the data.
◉ A lack of understanding of how such low-level data relates to actionable circumstances (such as profitably buying or selling shares, optimally executing a large order, etc.).
◉ No prior intuitions about how (say) the distribution of liquidity in the order book relates to future price movements, if at all.
SCOPE: A comparative study of various ML strategies and their performance on data obtained through the Bloomberg Terminal and, finally, the design of a successful strategy for operating in such a scenario.
What Does the Data Look Like?
Holt-Winters Model
Exponential smoothing applies as many as three low-pass filters with exponential window functions (level, trend and seasonality in the Holt-Winters form). It was used to predict the output for the next tick, and the MSE was found to be 253.9.
[Figure: output curve]
Feed Forward Model
A feed-forward neural network is an artificial neural network where connections between the units do not form a cycle. Here it was used to predict the output for the next tick, and the MSE was found to be 144.67.
[Figure: output curve]
News Sentiment Analysis
The Stanford Parser was used to get the dependency graph for a sentence. Sentiment was judged using SentiWordNet.
[Figure: output graph]
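Black-Scholes Model
● Stock prices follow geometric Brownian motion, i.e. log returns are normally distributed (prices are lognormal) with constant drift and volatility
● For the one-period binomial model, the call price under the risk-neutral probability measure Q in an arbitrage-free market is
[Equation: one-period binomial call price]
● The call price C(t,x) satisfies the equation
[Equation: Black-Scholes partial differential equation]
● Solving the above equation gives us
[Equation: Black-Scholes call price formula]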
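The equations on this slide did not survive in the transcript. For reference, the standard textbook forms of the three results are sketched below. The symbols (up and down factors u and d, one-period rate r in the binomial case, continuously compounded rate r otherwise, strike K, maturity T, volatility σ, normal CDF Φ) follow the usual conventions and are assumptions here, not the slide's own notation.

```latex
% One-period binomial call price under the risk-neutral measure Q
C_0 = \frac{1}{1+r}\,\bigl[\,q\,C_u + (1-q)\,C_d\,\bigr],
\qquad q = \frac{(1+r) - d}{u - d}

% Black-Scholes partial differential equation for C(t,x)
\frac{\partial C}{\partial t}
  + r x \frac{\partial C}{\partial x}
  + \frac{1}{2}\sigma^2 x^2 \frac{\partial^2 C}{\partial x^2}
  - r C = 0,
\qquad C(T,x) = (x - K)^+

% Closed-form solution
C(t,x) = x\,\Phi(d_1) - K e^{-r(T-t)}\,\Phi(d_2),
\qquad
d_{1,2} = \frac{\ln(x/K) + \bigl(r \pm \tfrac{1}{2}\sigma^2\bigr)(T-t)}{\sigma\sqrt{T-t}}
```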
Limitations of the Black-Scholes R Simulation
● Log returns of stocks do not always follow a normal distribution, as shown in the graph below.
● The assumptions of a fixed risk-free rate, no dividends and fixed volatility are not always valid.
● The market is not complete because of transaction costs, i.e. it is not always possible to choose a suitable hedging portfolio (aH, bH) in the risky and risk-free assets.
● Out-of-the-money performance of the BS model is not as good as in-the-money performance.
● The model gives a single fixed no-arbitrage price for any option on the stock.
[Figure: distribution of log returns]
Markov Diffusion Model
● The drift and volatility are finite-state Markov chains in continuous time.
● For example, assume drift and volatility are both 2-state processes. Then the stock price follows the Markov switching diffusion model
[Equation: Markov switching diffusion model]
● Then the arbitrage-free European call prices for maturity T and strike price K are given by two solutions
[Equation: call prices]
Markov Diffusion Model R Simulation
• If the drift and volatility are the same in both states, this model reduces to the Black-Scholes model.
• The market is not complete because of the state transitions.
• The output is shown below:
[Figure: simulation output]
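To make the 2-state idea concrete, here is a minimal Python sketch of one simulated price path (the project itself used R). The switching intensities `rates`, the time step `dt` and the per-step switching rule (probability of about rate times dt) are assumptions made for illustration, not the parameters behind the slide's output.

```python
import math
import random


def simulate_switching_gbm(s0, mus, sigmas, rates, n_steps, dt, seed=0):
    """Simulate one path of a 2-state Markov switching diffusion.

    mus[i], sigmas[i]: drift and volatility while in state i.
    rates[i]: assumed intensity (per unit time) of leaving state i.
    """
    rng = random.Random(seed)
    state = 0
    path = [s0]
    for _ in range(n_steps):
        # First-order approximation of the continuous-time chain:
        # leave the current state with probability rate * dt.
        if rng.random() < rates[state] * dt:
            state = 1 - state
        mu, sigma = mus[state], sigmas[state]
        z = rng.gauss(0.0, 1.0)
        # Log-normal step of geometric Brownian motion with the
        # current state's drift and volatility.
        step = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(step))
    return path


# Hypothetical parameters: a calm state and a volatile state.
path = simulate_switching_gbm(
    100.0, mus=[0.05, -0.02], sigmas=[0.1, 0.4],
    rates=[1.0, 2.0], n_steps=250, dt=1 / 250,
)
```

With equal drift and volatility in both states the switching has no effect and the path is an ordinary geometric Brownian motion, which matches the first bullet above.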
Jump Diffusion Process
• Jumps should occur in an instantaneous fashion, neglecting the possibility of a delta hedge
• The probability of any jump occurring in a particular interval of time should be approximately proportional to the length of that time interval
• All future jumps should have no "memory" of past jumps
• The extra parameters ν and m represent the standard deviation of the lognormal jump process and the scale factor for jump intensity, respectively.
Jump Diffusion Result Discussion
The RMSE is high (26.7) because the jump-diffusion model cannot incorporate a possible dependence structure among asset returns (the so-called "volatility clustering effect"), simply because the model assumes independent increments. Here is the plot for the same:
[Figure: jump diffusion output]
Support Vector Regression
• Motivation: Find a function that predicts the dependent variable (Y) from the independent variables (X) such that the deviation between the predicted and actual values is at most a tolerance (ε)
• Methodology: Kernels used: polynomial, radial basis function (RBF)
• Advantage: Traditional methods use only Stock Price, Strike Price and Time to Maturity. SVR can also capture the effect of the risk-free interest rate and volatility, which vary with time
• Conclusion: Decent RMSE (=19.56) compared to the parametric methods
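The two ingredients named above, the ε-tolerance and the RBF kernel, can be sketched in a few lines. This is only an illustration of the definitions, not the SVR fit used in the project, and the `gamma` parameter is an assumed name for the kernel width.

```python
import math


def eps_insensitive_loss(y_true, y_pred, eps):
    """Zero inside the eps-tube around the target, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


def rbf_kernel(x, z, gamma):
    """Radial basis function kernel exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)
```

An error of 0.3 with ε = 0.5 costs nothing, while an error of 1.0 costs 0.5; the kernel equals 1 for identical inputs and decays towards 0 as the inputs move apart.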
Symmetrized Nearest Neighbour
• Semi-parametric estimation of liquidity effects on option pricing
• When estimating the nonparametric volatility function in moneyness intervals where data are relatively scarce, a multivariate kernel based on a global smoothing parameter may lead to poor estimates
• When estimating at a point, we calculate the weights for the remaining observations from the distance between the values of the empirical distribution at each point rather than the distance between the points themselves
• The empirical distribution changes the random design to a uniform design with knots uniformly spaced between zero and one
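The weighting idea in the last two bullets can be sketched as follows. This is a one-dimensional illustration with a uniform (box) kernel of half-width `h`; both the kernel choice and the function names are assumptions, and the study itself works with several inputs.

```python
def empirical_cdf(sample, x):
    """Fraction of sample points less than or equal to x."""
    return sum(1 for s in sample if s <= x) / len(sample)


def snn_estimate(xs, ys, x0, h):
    """Symmetrized nearest neighbour estimate of E[y | x = x0].

    Weights depend on |F_n(x_i) - F_n(x0)|, the distance between
    empirical distribution values, not on |x_i - x0|.
    """
    f0 = empirical_cdf(xs, x0)
    num = den = 0.0
    for x, y in zip(xs, ys):
        if abs(empirical_cdf(xs, x) - f0) <= h:
            num += y
            den += 1.0
    return num / den if den else float("nan")
```

For example, with xs = [1, 2, 3, 100] the point at 100 is far from 3 in raw distance but adjacent in rank, so it still receives weight when estimating at x0 = 3.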
Symmetrized Nearest Neighbour Regression Results
Inputs:
• Moneyness
• Bid-ask spread
• Days until Expiration (T)
• The volatility calculated was given as input to the BS model
Conclusion: The in-sample performance of the model is better than that of a competing model without liquidity. However, the out-of-sample performance is quite disappointing, resulting in a high MSE.
K Nearest Neighbour Implementation and Results
• Non-parametric algorithm that can be used for either classification or regression.
• For each data point, the algorithm finds the k closest observations and assigns the data point to the majority class among them.
• The data set was encoded to fit the KNN classifier. Each tick was compared to the previous tick and encoded as 1 if the value had increased, 0 otherwise
• Accuracy was checked on the data and a corresponding confusion matrix was created
• The highest accuracy was found for K=5
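The encoding and the majority vote described above can be sketched as below. The feature vectors passed to `knn_predict` are left open here because the transcript does not say which features were used, and Euclidean distance is an assumption.

```python
from collections import Counter


def encode_ticks(prices):
    """1 if the price rose from the previous tick, else 0."""
    return [1 if cur > prev else 0 for prev, cur in zip(prices, prices[1:])]


def knn_predict(train_x, train_y, query, k=5):
    """Majority vote among the k nearest training points (Euclidean)."""
    by_distance = sorted(
        zip(train_x, train_y),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]
```

For instance, `encode_ticks([1, 2, 2, 1, 3])` gives `[1, 0, 0, 1]`: an unchanged price counts as 0, as in the rule above.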
Artificial Neural Network: Conjugate Gradient
MOTIVATION:
• Results suggest that machine learning, using several Multilayer Perceptron models, can serve as a basis for effective option investment strategies.
• Most existing models in finance for predicting the price of an option revolve around the Black-Scholes model; however, these models tend to involve highly complex mathematics and often make many assumptions about the underlying characteristics of the market.
ADVANTAGES:
• Machine Learning techniques do not make any implicit assumptions about the relationships between input variables. Using an Artificial Neural Network, we are able to let the learner discover relationships that may not be included in standard models like Black-Scholes.
Artificial Neural Network: Conjugate Gradient (Cont…)
METHODOLOGY:
From the data obtained through the Bloomberg Terminal, we extracted historical data for the following attributes:
• Strike Price (X)
• Underlying Stock Price (S)
• 10-day Historical Volatility
• 30-day Historical Volatility
• Days until Expiration (T)
• The Market Price of the Option (P)
• The 10 previous days of stock prices
• Risk-free rate
• Expiration Price of Stock (E)
Apply the ANN using these parameters.
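The historical volatility inputs in the list above are commonly computed as the annualised standard deviation of daily log returns. A minimal sketch follows, assuming 252 trading days per year; the transcript does not say how the Bloomberg figures were defined, so treat this as one plausible convention.

```python
import math
import statistics


def historical_volatility(prices, periods_per_year=252):
    """Annualised sample standard deviation of consecutive log returns.

    Pass the last 11 closes for a 10-day figure, or the last 31
    closes for a 30-day figure.
    """
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    return statistics.stdev(log_returns) * math.sqrt(periods_per_year)
```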
Artificial Neural Network: Conjugate Gradient (Cont…)
CONCLUSIONS:
Using an Artificial Neural Network on the test data for option pricing gives a more accurate result than the other base learner models.
For further improvements, we can also add the following parameters as inputs to the model:
• Measures of Volatility
• Previous Stock Prices
Although ANN training with Conjugate Gradient is a well-known and widely used machine learning method, it does not give a very accurate result here, as Conjugate Gradient does not account for damping of the least-squares error. Because of this, we switch to Levenberg-Marquardt optimization in the neural network, which takes care of this problem.
Artificial Neural Network: Levenberg-Marquardt
• Mainly used to solve non-linear least squares problems.
• Finds a local minimum, which is not necessarily the global minimum.
• The Levenberg-Marquardt algorithm (LMA) interpolates between the Gauss-Newton algorithm (GNA) and the method of gradient descent.
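To show how the damping term interpolates between Gauss-Newton and gradient descent, here is a minimal sketch of Levenberg-Marquardt on a toy one-parameter problem, fitting y = exp(b·x). The toy model, the starting values and the factor-of-ten damping schedule are all assumptions for illustration; training a neural network applies the same update to a weight vector with a full Jacobian.

```python
import math


def fit_exponent_lm(xs, ys, b0=0.0, lam=1e-2, n_iter=50):
    """Fit y = exp(b * x) for a scalar b with Levenberg-Marquardt."""

    def cost(b):
        return sum((y - math.exp(b * x)) ** 2 for x, y in zip(xs, ys))

    b = b0
    for _ in range(n_iter):
        residuals = [y - math.exp(b * x) for x, y in zip(xs, ys)]
        jac = [x * math.exp(b * x) for x in xs]  # d(model)/db per point
        jtj = sum(j * j for j in jac)
        jtr = sum(j * r for j, r in zip(jac, residuals))
        # Damped normal equation: a large lam gives a short,
        # gradient-descent-like step; a small lam gives a
        # Gauss-Newton step.
        step = jtr / (jtj + lam)
        if cost(b + step) < cost(b):
            b += step      # accept the step and trust the model more
            lam /= 10.0
        else:
            lam *= 10.0    # reject the step and damp more strongly
    return b


xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.exp(0.5 * x) for x in xs]  # noiseless data with b = 0.5
b_hat = fit_exponent_lm(xs, ys)
```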
TRADING
We have to deal with price appreciation, timing risk and market impact.
[Figure: trade-off diagram with labels "trade fast", "trade slowly", "Market risk" and "Market impact"]
We want to balance market risk and market impact.
DISCRETE TRADING MODEL
◉ Trading is possible at N discrete times
◉ No interest on cash position
◉ A trading strategy is given by (x_i), i = 0..N+1, where x_k = number of units held at t = k (i.e. we sell n_k = x_k - x_{k+1} at price S_k)
◉ Boundary conditions: x_0 = X and x_{N+1} = 0
◉ Price dynamics:
Exogenous: Arithmetic Random Walk
S_k = S_{k-1} + (ε_k + …), k = 1..N, with ε_k ~ N(0,1) i.i.d. (ε is assumed here for the noise symbol; the other symbols in this formula did not survive in the transcript)
Endogenous: Market Impact
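The bookkeeping behind a trading strategy in this model can be sketched as follows. The linear (equal-sized) liquidation schedule is only a hypothetical example of a strategy that satisfies the boundary conditions; it is not claimed to be the optimal one.

```python
def linear_schedule(total_units, n_times):
    """Holdings x_0..x_{N+1} that fall linearly from X to 0."""
    steps = n_times + 1
    return [total_units * (steps - k) / steps for k in range(steps + 1)]


def trades_from_holdings(holdings):
    """n_k = x_k - x_{k+1}: units sold at time k."""
    return [a - b for a, b in zip(holdings, holdings[1:])]


x = linear_schedule(1000, n_times=4)  # x_0 = 1000, ..., x_5 = 0
n = trades_from_holdings(x)           # equal-sized sells
```

Whatever the schedule, the sells must add up to the initial position X because x_0 = X and x_{N+1} = 0.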
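HIDDEN MARKOV MODEL
◉ Decide the "hidden" states: up trend, mean reverting, down trend
UP: μ >> 0
MEAN REVERTING: μ ≈ 0
DOWN: μ << 0
[Figure: state-transition diagram over the three states with transition probabilities p11, p12, p13, p21, p22, p23, p31, p32, p33]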
TECHNICAL INDICATORS: BOLLINGER BANDS & STOCHASTIC OSCILLATORS
Stochastic Oscillator
◉ Offers a measurement of the deviance of a currency pair's rate (price) from its normal levels
◉ Offers indications of when a currency pair is overbought/oversold
◉ Works well in markets that are not trending, but rather just fluctuating back and forth between an upper level (resistance) and a lower level (support)
Bollinger Bands
◉ Excellent range-bound indicator that measures standard deviation from the moving average
◉ Operates under the logic that a currency pair's price is most likely to gravitate towards its average, and hence when it strays too far (such as two standard deviations away) it is due to retrace back to its moving average
Plots three bands on a price chart to create two price channels. The security is said to be overbought if the price line is consistently near or breaches the upper price band. It may be oversold if the price line is consistently near or drops below the lower price band.
[Figure: EMA]
[Figure: SMA]
An overbought position is confirmed if the stochastic lines cross above 80 and, at the same time, the price line is consistently near the upper Bollinger Band. At that level, prices are expected to drop soon. The opposite is also true.
[Figure: MA Type = EMA]
[Figure: MA Type = SMA]
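The two indicators and the combined overbought check can be sketched as below. The window lengths (20 for the bands, 14 for the oscillator) are common defaults assumed here, the bands use a simple moving average, and "consistently near the upper band" is simplified to "last close at or above the upper band".

```python
import statistics


def bollinger_bands(closes, window=20, num_std=2.0):
    """Return (lower, middle, upper) from the last `window` closes."""
    recent = closes[-window:]
    middle = statistics.fmean(recent)  # simple moving average
    spread = num_std * statistics.pstdev(recent)
    return middle - spread, middle, middle + spread


def stochastic_k(highs, lows, closes, window=14):
    """%K: position of the last close in the recent high-low range."""
    highest = max(highs[-window:])
    lowest = min(lows[-window:])
    if highest == lowest:
        return 50.0  # flat range: assumed neutral value
    return 100.0 * (closes[-1] - lowest) / (highest - lowest)


def is_overbought(highs, lows, closes):
    """Simplified rule: %K above 80 and close at or above the upper band."""
    _, _, upper = bollinger_bands(closes)
    return stochastic_k(highs, lows, closes) > 80 and closes[-1] >= upper
```

The oversold check is the mirror image: %K below 20 with the close at or below the lower band.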
STRATEGY IS SIMPLE
The first condition you are looking for is a candle breaking the UPPER or LOWER Bollinger Band.
Look for stochastics to have traveled above the 80 line (for a bullish candle traveling outside the upper Bollinger Band), or below the 20 line (for a bearish candle traveling outside the lower Bollinger Band).
Wait for the next candle to form before you get into the trade.
Once the next candle has formed, the stochastic lines should have crossed and be heading back towards the white line.
If the stochastic lines have not yet crossed, or they are moving further apart, do not take the trade; wait until all of the conditions are met.
If the conditions have been met, place a trade in the opposite direction of the previous candle.
PERFORMANCE INDICATOR
Sharpe Ratio
The ratio of yearly return to yearly volatility.
SHARPE RATIO TABLE TO COMPARE ALGORITHMS
Algorithm        Value
DataSet          -0.00029266783
Black-Scholes    -0.1735110594
Jump Process     -0.173409357
SNN & SVR        -0.00012865643
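For reference, a Sharpe ratio of the kind tabulated above can be computed from a series of per-period returns as sketched below. The annualisation factor of 252 periods and the optional risk-free rate are assumptions; the transcript does not state the conventions behind the table.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free_per_period=0.0, periods_per_year=252):
    """Annualised Sharpe ratio from per-period returns."""
    excess = [r - risk_free_per_period for r in returns]
    mean = statistics.fmean(excess)
    vol = statistics.stdev(excess)  # sample standard deviation
    # Mean scales with the number of periods and volatility with its
    # square root, so the ratio scales with sqrt(periods_per_year).
    return (mean / vol) * math.sqrt(periods_per_year)
```

A negative value, as in every row of the table, means the average return was below the benchmark over the sample.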
PROFIT AND LOSS CURVE
CONCLUSIONS
• The out-of-sample performance is not comparable regardless of which option pricing model is employed in the estimation
• The Artificial Neural Network (Feed Forward) model gives the best result among the forecasting tools
• Semi-parametric implied volatility estimation is more effective than BS implied volatility
• Non-parametric methods give better accuracy than parametric methods
• SVR takes less time and gives a decent result among the non-parametric methods
• ANN is able to capture the effect of many more variables, such as dividends and historical volatilities, but depends largely on the volatility of the input data
REFERENCES
• Machine Learning for Market Microstructure and High Frequency Trading
• Quest for Efficient Option Pricing Prediction model using Machine Learning Techniques - B.V. Phani, B. Chandra, Vijay Raghav
• A Semiparametric Estimation of Liquidity Effects on Option Pricing - Eva Ferreira
• http://www.platonniaga.com/downloads/ea-documents/Lesson%205%20Stochastics%20and%20Bollinger%20Bands.pdf
THANKS!