Alexandre Boumezoued, Milliman R&D
OICA, 29 April 2020
Joint work with Caroline Hillairet (ENSAE) and Yannick Bessy-Roland (AXA GIE)
Predicting Cyber-Attacks using Hawkes Processes
Agenda
1 Introduction
2 Data breaches dataset
3 Hawkes model
4 Fitting and prediction
5 References
Introduction: Example of types of cyber attacks
Definition: The risk of an attack on digital data as well as the consequences on the information system
• Phishing: The attacker sends a document appearing reliable (mainly e-mail) in order to collect sensitive information
• Man in the middle: The attacker secretly relays and possibly alters the communications between two parties who believe they are directly communicating with each other
• Zero day: Attack that occurs on the same day a weakness is discovered in software. The weakness is exploited before a fix becomes available from the software's creator
• Denial of service: Attack in which the perpetrator seeks to make a machine or network resource unavailable to its intended users by disrupting services of a host connected to the Internet
• Malware: Software that is specifically designed to disrupt, damage, or gain unauthorized access to a computer system
Introduction: Cyber insurance covers
Cyber insurance covers three families of costs: crisis management, damage, and third party liability.
Crisis management:
• Costs of investigation
• Costs of assistance
• Other costs of crisis management
Damage:
• Data cleaning
• Data restoration
• Payment of ransom
• Operating loss
Third party liability:
• Virus transmission
• Personal liability insurance
• Denial of service
Data breaches dataset: The Privacy Rights Clearinghouse (PRC) database
Description:
A public database that contains 8871 data breaches in the US over the period 2005-2019
https://www.privacyrights.org/data-breaches
Covariates (those used in the results presented are highlighted in blue on the slide):
• Name of the covered entity
• Type of the covered entity: healthcare provider, businesses…
• Total records breached
• Date made public: day/month/year
• Type of breach: « Hacking / IT incident », « Theft », …
• Location of the breached entity: state/city
Data breaches dataset: Dataset description: types of breaches
Data breaches dataset: Dataset description: types of targets
Data breaches dataset: Descriptive statistics over 2010-2018
• A majority of Theft/Loss and Hacking/Malware
• 21% of unintended disclosure
• A majority in Healthcare/Medical
• Businesses are well represented too
Data breaches dataset: Cyber attack frequencies by type and organization
• Apparent clustering by type of attack
• Apparent clustering by type of organization attacked
• Deterministic trends or stochastic regimes?
• No clear trends
Data breaches dataset: Autocorrelation of the number of incidents
Regression of the number of events during the following month t + 1 as a function of the number of events during the current month t. For a Poisson process model to be valid, these two counts should be independent.
Autocorrelation dramatically increases when focusing on attacks and/or organizations of the same type.
Regression results for the four panels of the slide:
• R-squared: 0.726, confidence interval (95%): [0.687, 0.766]
• R-squared: 0.718, confidence interval (95%): [0.702, 0.735]
• R-squared: 0.154, confidence interval (95%): [0.030, 0.278]
• R-squared: 0.780, confidence interval (95%): [0.750, 0.810]
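The diagnostic described on this slide can be sketched in a few lines of Python. This is only an illustration on synthetic data, not the authors' computation: the helper names (`lag1_r_squared`, `simulate_counts`), the count model with a `feedback` parameter, and all numerical values are assumptions introduced here. With `feedback = 0` the monthly counts are independent Poisson draws; with `feedback > 0` each month's mean depends on the previous count, which mimics clustering.

```python
import math
import random


def lag1_r_squared(counts):
    """R-squared of the least-squares regression of counts[t+1] on counts[t]."""
    x, y = counts[:-1], counts[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy ** 2 / (sxx * syy)


def poisson_draw(rng, mean):
    """Poisson sample by Knuth's multiplication method (fine for moderate means)."""
    limit = math.exp(-mean)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def simulate_counts(rng, months, base, feedback):
    """Hypothetical monthly counts: mean of month t+1 is base + feedback * count_t."""
    counts, previous = [], 0
    for _ in range(months):
        previous = poisson_draw(rng, base + feedback * previous)
        counts.append(previous)
    return counts


if __name__ == "__main__":
    rng = random.Random(42)
    independent = simulate_counts(rng, 2000, base=5.0, feedback=0.0)
    clustered = simulate_counts(rng, 2000, base=5.0, feedback=0.8)
    print("independent counts:", round(lag1_r_squared(independent), 3))
    print("clustered counts:  ", round(lag1_r_squared(clustered), 3))
```

For independent counts the R-squared stays close to zero, whereas the self-feeding series yields a large value, which is the qualitative pattern the slide uses to argue against a plain Poisson model.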
Hawkes model: Choice of the Hawkes model
Taking into account autocorrelation:
• Cox model: Poisson model with stochastic intensity → the dynamics of the stochastic intensity are difficult to specify
• Hawkes model: self-exciting model with stochastic intensity, fully specified by the point process itself
Choice of the Hawkes model:
• Self-excitation: every event increases the probability for a new event to occur within a given group (same organization or attack type)
• Clustering: the self-exciting property makes it possible to model cluster effects (groups of attacks with the same origin)
• Inter-excitation: in the case of a multi-dimensional Hawkes process, every attack in one group increases the occurrence probability of new events in the other groups
Related references:
Peng et al. (2017), Baldwin et al. (2017)
Hawkes model: Univariate Hawkes process
A Hawkes process with exponential kernel is a counting process $N_t = \sum_{n \ge 1} \mathbf{1}_{T_n \le t}$ with intensity:
$\lambda(t) = \mu(t) + \sum_{T_n < t} \alpha \exp(-\beta (t - T_n))$
• $\mu : \mathbb{R}_+ \to \mathbb{R}_+$ is a deterministic baseline intensity
• The sum represents the impact of past events; it captures the self-excitation property
Figure: sample paths of $\lambda(t)$ and $N_t$
• Each jump represents an attack
• Clustering phenomena
• Intensity decreases exponentially between jumps
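To make the intensity formula concrete, here is a minimal Python sketch that evaluates the univariate intensity at a time t from a list of past event times. The function name `hawkes_intensity` and the numerical values in the usage example are illustrative assumptions, not values from the presentation.

```python
import math


def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Intensity mu(t) + sum over events T_n < t of alpha * exp(-beta * (t - T_n)).

    mu is a callable giving the deterministic baseline at time t.
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - s)) for s in event_times if s < t
    )
    return mu(t) + excitation


if __name__ == "__main__":
    # Hypothetical values: constant baseline 0.2, alpha 0.8, beta 1.5.
    attacks = [1.0, 1.2, 1.3, 4.0]
    for t in (0.5, 1.35, 3.0, 4.1):
        value = hawkes_intensity(t, attacks, lambda u: 0.2, 0.8, 1.5)
        print(f"t = {t:4.2f}  intensity = {value:.3f}")
```

Just after the burst of three events around t = 1.2 the intensity is well above the baseline, then it decays exponentially back towards it, which is the clustering mechanism described on the slide.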
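Hawkes model: Multivariate Hawkes process - presentation
A multivariate Hawkes process makes it possible to model interactions between types of entities/attacks/states:
• $(N_t^1)_{t \ge 0}, \dots, (N_t^K)_{t \ge 0}$: $K$ counting processes with jump times $(T_n^{(1)})_{n \ge 1}, \dots, (T_n^{(K)})_{n \ge 1}$
• The intensity process with exponential kernel of the counting process $(i)$ is defined as:
$\lambda_i(t) = \mu_i(t) + \sum_{j=1}^{K} \sum_{T_n^{(j)} < t} \alpha_{i,j} \exp(-\beta_{i,j} (t - T_n^{(j)}))$
• The term $j = i$ is the self-excitation of Group $i$ (for example, Group 1 self-excitation)
• The term $j \ne i$ is the impact of Group $j$ on Group $i$ (for example, the impact of Group 2 on Group 1)
Matrix of excitation:
$\alpha = \begin{pmatrix} \alpha_{1,1} & \alpha_{1,2} \\ \alpha_{2,1} & \alpha_{2,2} \end{pmatrix} = \begin{pmatrix} 0.0 & 0.99 \\ 0.0 & 0.90 \end{pmatrix}$
• Group 2 is purely self-excited
• Group 1 is fully influenced by Group 2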
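The two-group example can be checked numerically with a short Python sketch. The excitation matrix is the one shown on the slide; the decay rates `BETA`, the constant baselines, and the function name `multivariate_intensity` are assumptions chosen here for illustration only.

```python
import math

# Excitation matrix from the slide: row i, column j = impact of group j on group i.
ALPHA = [[0.0, 0.99],
         [0.0, 0.90]]
# Hypothetical decay rates and constant baselines (not given in the slides).
BETA = [[1.0, 1.0],
        [1.0, 1.0]]
MU = [lambda t: 0.5, lambda t: 0.5]


def multivariate_intensity(i, t, events, mu=MU, alpha=ALPHA, beta=BETA):
    """Intensity of group i at time t; events[j] lists the jump times of group j."""
    total = mu[i](t)
    for j, times in enumerate(events):
        for s in times:
            if s < t:
                total += alpha[i][j] * math.exp(-beta[i][j] * (t - s))
    return total


if __name__ == "__main__":
    only_group_1 = [[1.0], []]   # one event in group 1 (index 0)
    only_group_2 = [[], [1.0]]   # one event in group 2 (index 1)
    for label, events in (("event in group 1", only_group_1),
                          ("event in group 2", only_group_2)):
        values = [multivariate_intensity(i, 1.5, events) for i in range(2)]
        print(label, [round(v, 3) for v in values])
```

An event in group 1 leaves both intensities at their baseline, while an event in group 2 raises both, which matches the reading "Group 2 is purely self-excited, Group 1 is fully influenced by Group 2".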
Hawkes model: Multivariate Hawkes process - kernels
« Classical » exponential kernel: $\phi_{i,j}(s) = \alpha_{i,j} \exp(-\beta_{i,j} s)$
• Kernel: instantaneous excitation
• Complexity: we assume all $\beta_{i,j}$ (excitation memory of $i$ from $j$) depend on the groups and are to be calibrated
• Intensity process: the intensity process is not Markov (for dimension ≥ 2)
Kernel with delay: $\phi_{i,j}(s) = \alpha_{i,j}\, s \exp(-\beta_i s)$
• Complexity: we assume all $\beta_i$ (excitation memory of $i$ from any other group) depend on the groups and are to be calibrated
• Intensity process: the intensity process is not Markov (even in dimension 1)
Development of closed-form formulas for the expected number of claims for such multivariate Hawkes processes
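As a hedged illustration of what such a closed form looks like, consider the simplest case rather than the multivariate setting of the slide: a univariate Hawkes process with constant baseline $\mu$, exponential kernel $\phi(s) = \alpha \exp(-\beta s)$ with $\alpha < \beta$, and no events before time 0. These restrictions are assumptions made here; the authors' multivariate formulas are not reproduced in the transcript. Taking expectations in the intensity dynamics gives a linear ODE for the mean intensity, which integrates to the expected number of events:

```latex
\begin{aligned}
m(t) &:= \mathbb{E}[\lambda(t)], \qquad
m'(t) = \beta\mu - (\beta - \alpha)\, m(t), \qquad m(0) = \mu, \\
m(t) &= m_\infty + (\mu - m_\infty)\, e^{-(\beta - \alpha) t}, \qquad
m_\infty = \frac{\beta\mu}{\beta - \alpha}, \\
\mathbb{E}[N_t] &= \int_0^t m(s)\, ds
= m_\infty\, t + (\mu - m_\infty)\, \frac{1 - e^{-(\beta - \alpha) t}}{\beta - \alpha}.
\end{aligned}
```

For instance, with $\mu = 1$, $\alpha = 0.5$, $\beta = 1$ and $t = 10$, this gives $m_\infty = 2$ and $\mathbb{E}[N_{10}] = 20 - 2(1 - e^{-5}) \approx 18.0$, compared with 10 expected events for a Poisson process with the same baseline.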
Fitting and prediction: Data grouping
• Groups are formed by crossing variables: attack type, sector, state
• Groups with more than 200 attacks are retained; the remaining attacks are pooled in an OTHER group
• Total: six segments
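The grouping rule can be sketched as follows. This is a minimal illustration under stated assumptions: the record field names (`attack_type`, `sector`, `state`) and the function name `group_segments` are hypothetical, and the actual PRC column names may differ.

```python
from collections import Counter


def group_segments(records, min_count=200, other_label="OTHER"):
    """Label each record by its (attack type, sector, state) cell.

    Cells with more than min_count records keep their own label;
    all remaining records are pooled under other_label.
    """
    keys = [(r["attack_type"], r["sector"], r["state"]) for r in records]
    counts = Counter(keys)
    return [key if counts[key] > min_count else other_label for key in keys]


if __name__ == "__main__":
    # Tiny synthetic example with a threshold of 2 instead of 200.
    sample = (
        [{"attack_type": "HACK", "sector": "MED", "state": "CA"}] * 3
        + [{"attack_type": "THEFT", "sector": "BUS", "state": "NY"}] * 2
    )
    print(Counter(group_segments(sample, min_count=2)))
```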
Fitting and prediction: Model specification
Three Hawkes kernels are considered (kernels 1 to 3, tabulated below).
Baseline intensity: $\mu_i(t) = \mu_{0,i} + \gamma_i t$ to account for trends in the dataset
Possibility to add a Lasso penalty (not discussed here):
$L_{\text{penalized}}(\mu, \phi) = L(\mu, \phi) - \nu \sum_{1 \le i,j \le d} |\alpha_{i,j}|$
• Reduce complexity
• Improve prediction capacity
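To show how a likelihood of this type is evaluated, here is a minimal Python sketch for the simplest case: a univariate Hawkes process with exponential kernel and the linear baseline $\mu_0 + \gamma t$. It is not the authors' multivariate calibration code; the function names and the numerical values are assumptions. The log-likelihood is the sum of log-intensities at the event times minus the integrated intensity over the observation window, and the Lasso term subtracts $\nu$ times the sum of the absolute excitation coefficients.

```python
import math


def hawkes_log_likelihood(event_times, horizon, mu0, gamma, alpha, beta):
    """Log-likelihood on [0, horizon] for intensity
    mu0 + gamma * t + sum_{T_n < t} alpha * exp(-beta * (t - T_n)).
    event_times must be sorted in increasing order.
    """
    log_sum = 0.0
    decayed = 0.0      # sum over earlier events of exp(-beta * (t - T_m))
    previous = None
    for t in event_times:
        if previous is not None:
            decayed = math.exp(-beta * (t - previous)) * (decayed + 1.0)
        log_sum += math.log(mu0 + gamma * t + alpha * decayed)
        previous = t
    # Integrated intensity: baseline part plus one term per event.
    integral = mu0 * horizon + 0.5 * gamma * horizon ** 2
    integral += sum(
        alpha / beta * (1.0 - math.exp(-beta * (horizon - t))) for t in event_times
    )
    return log_sum - integral


def penalized_log_likelihood(log_likelihood, alpha_matrix, nu):
    """Subtract the Lasso penalty nu * sum |alpha_ij| from a log-likelihood."""
    return log_likelihood - nu * sum(abs(a) for row in alpha_matrix for a in row)


if __name__ == "__main__":
    times = [0.5, 0.7, 0.8, 3.0, 3.1]   # hypothetical event times
    ll = hawkes_log_likelihood(times, 5.0, mu0=0.5, gamma=0.0, alpha=0.6, beta=1.2)
    print(round(ll, 4), round(penalized_log_likelihood(ll, [[0.6]], nu=0.1), 4))
```

The recursion on `decayed` avoids the double loop over pairs of events, which is a standard trick for exponential kernels.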
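| | Kernel 1 (exp) | Kernel 2 (exp) | Kernel 3 (delay) |
|---|---|---|---|
| $\phi_{i,j}(s)$ | $\alpha_{i,j} \exp(-\beta_i s)$ | $\alpha_{i,j} \exp(-\beta_{i,j} s)$ | $\alpha_{i,j}\, s \exp(-\beta_i s)$ |
| Nb parameters | 54 | 84 | 54 |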
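The parameter counts in the table are consistent with a model on six groups. The short sketch below reproduces them under the assumption, made here and not stated explicitly in the slides, that each group has two baseline parameters ($\mu_{0,i}$ and $\gamma_i$), that there is one $\alpha_{i,j}$ per ordered pair of groups, and that the decay parameters are either one per group ($\beta_i$) or one per pair ($\beta_{i,j}$). The function name `count_parameters` is illustrative.

```python
def count_parameters(d, beta_per_pair):
    """Number of parameters for d groups under the assumptions stated above."""
    baseline = 2 * d                      # mu_0,i and gamma_i for each group
    alphas = d * d                        # one alpha_ij per ordered pair
    betas = d * d if beta_per_pair else d
    return baseline + alphas + betas


if __name__ == "__main__":
    print("kernels 1 and 3:", count_parameters(6, beta_per_pair=False))
    print("kernel 2:       ", count_parameters(6, beta_per_pair=True))
```

With d = 6 this gives 12 + 36 + 6 = 54 for kernels 1 and 3, and 12 + 36 + 36 = 84 for kernel 2.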
Fitting and prediction: Calibration results

| | Kernel 1 (exp) | Kernel 2 (exp) | Kernel 3 (delay) |
|---|---|---|---|
| $\phi_{i,j}(s)$ | $\alpha_{i,j} \exp(-\beta_i s)$ | $\alpha_{i,j} \exp(-\beta_{i,j} s)$ | $\alpha_{i,j}\, s \exp(-\beta_i s)$ |
| Nb parameters | 54 | 84 | 54 |
| -Likelihood (2011-2015) | 6513 | 6172 | 6153 |
| -Likelihood (2011-2016) | 7639 | 7516 | 7485 |

• Likelihood: kernel 3 with delay best fits the data
• Adequacy tests (Kolmogorov-Smirnov): adequacy is satisfactory, except for group (4)
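The slides do not detail how the Kolmogorov-Smirnov test is set up. A common approach for point processes, assumed here, relies on time rescaling: if the fitted intensity is correct, the increments of the integrated intensity between consecutive events are independent Exp(1) variables, and their empirical distribution can be compared with the Exp(1) distribution. The sketch below does this for a univariate process with constant baseline and exponential kernel; the function names and this simplified setting are illustrative assumptions.

```python
import math


def compensator(t, event_times, mu, alpha, beta):
    """Integrated intensity on [0, t] for constant baseline mu and kernel
    alpha * exp(-beta * s)."""
    total = mu * t
    for s in event_times:
        if s < t:
            total += alpha / beta * (1.0 - math.exp(-beta * (t - s)))
    return total


def rescaled_gaps(event_times, mu, alpha, beta):
    """Increments of the compensator between consecutive (sorted) event times."""
    values = [compensator(t, event_times, mu, alpha, beta) for t in event_times]
    return [b - a for a, b in zip([0.0] + values[:-1], values)]


def ks_statistic_exp1(sample):
    """Kolmogorov-Smirnov distance between the sample and the Exp(1) law."""
    xs = sorted(sample)
    n = len(xs)
    distance = 0.0
    for k, x in enumerate(xs, start=1):
        cdf = 1.0 - math.exp(-x)
        distance = max(distance, k / n - cdf, cdf - (k - 1) / n)
    return distance


if __name__ == "__main__":
    times = [0.4, 0.9, 1.0, 2.5, 4.2, 4.3, 6.0]   # hypothetical event times
    gaps = rescaled_gaps(times, mu=0.8, alpha=0.5, beta=1.0)
    print(round(ks_statistic_exp1(gaps), 3))
```

For large samples, the statistic is usually compared with the asymptotic 5% critical value of about 1.36 / sqrt(n), where n is the number of gaps.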
Fitting and prediction: Calibration analysis
Parameter estimates (kernel 3):
• Captures the baseline non-excited intensity
• Partly captures the historical trend
• Captures the major self/external interactions
• Same orders of magnitude
Matrix of ratios between maximal excitation and baseline intensity:
• Strong self-excitation in these MED groups
• Strong reciprocal self-excitation of groups (2) and (4)
• «Causal» excitation of (2) by (5)
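The slides do not spell out how the "maximal excitation" is computed. One plausible reading, assumed here, is the peak of the delay kernel $\phi_{i,j}(s) = \alpha_{i,j}\, s \exp(-\beta_i s)$, which follows from a one-line derivative:

```latex
\begin{aligned}
\phi_{i,j}'(s) &= \alpha_{i,j}\, e^{-\beta_i s}\,(1 - \beta_i s) = 0
\quad\Longleftrightarrow\quad s^\ast = \frac{1}{\beta_i}, \\
\max_{s \ge 0} \phi_{i,j}(s) &= \phi_{i,j}(s^\ast) = \frac{\alpha_{i,j}}{e\, \beta_i},
\qquad
\int_0^\infty \phi_{i,j}(s)\, ds = \frac{\alpha_{i,j}}{\beta_i^2}.
\end{aligned}
```

Under this reading, an event in group $j$ has no immediate effect on group $i$; its impact peaks after a delay of $1/\beta_i$, and the ratio to a baseline level $\mu_{0,i}$ would be $\alpha_{i,j} / (e\, \beta_i\, \mu_{0,i})$.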
Fitting and prediction: Out-of-sample prediction results for 2017 (kernel 3)
• Simulation based on the thinning algorithm for point processes
• Predictions with mean and (0.5%, 99.5%) percentiles
• Joint prediction of all groups capturing the causal and asymmetric interactions
• Parameter uncertainty can be added
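The simulation-based prediction can be sketched with a thinning algorithm in the simplest setting. This is a minimal illustration, not the authors' procedure: it assumes a univariate process with constant baseline and exponential kernel (the slides use kernel 3 and a multivariate model), starts from an empty history, and uses hypothetical parameter values. Because the intensity only decreases between events in this setting, its current value is a valid upper bound for proposing the next candidate time.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, horizon, rng):
    """Thinning simulation on [0, horizon] of a Hawkes process with intensity
    mu + sum_{T_n < t} alpha * exp(-beta * (t - T_n))."""
    events = []
    t = 0.0
    excitation = 0.0   # current value of the self-exciting sum
    while True:
        bound = mu + excitation            # intensity cannot exceed this until the next event
        wait = rng.expovariate(bound)
        t += wait
        if t > horizon:
            return events
        excitation *= math.exp(-beta * wait)
        if rng.random() * bound <= mu + excitation:   # accept the candidate
            events.append(t)
            excitation += alpha


def predict_counts(mu, alpha, beta, horizon, n_paths, rng):
    """Mean and (0.5%, 99.5%) empirical percentiles of the number of events."""
    counts = sorted(
        len(simulate_hawkes(mu, alpha, beta, horizon, rng)) for _ in range(n_paths)
    )
    mean = sum(counts) / n_paths
    low = counts[int(0.005 * n_paths)]
    high = counts[min(n_paths - 1, int(0.995 * n_paths))]
    return mean, low, high


if __name__ == "__main__":
    rng = random.Random(2017)
    print(predict_counts(mu=1.0, alpha=0.5, beta=1.0, horizon=10.0, n_paths=2000, rng=rng))
```

Parameter uncertainty could be layered on top by redrawing the parameters before each simulated path, for instance from an estimated sampling distribution.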
References
Baldwin, A., Gheyas, I., Ioannidis, C., Pym, D., & Williams, J. (2017). Contagion in cyber security attacks. Journal of the Operational Research Society, 68(7), 780-791.
Böhme, R., & Kataria, G. (2006, June). Models and Measures for Correlation in Cyber-Insurance. In WEIS.
Boumezoued, A. (2016). Population viewpoint on Hawkes processes. Advances in Applied Probability, 48(2), 463-480.
Daley, D. J., & Vere-Jones, D. (2007). An introduction to the theory of point processes: volume II: general theory and structure. Springer Science & Business Media.
Edwards, B., Hofmeyr, S., & Forrest, S. (2016). Hype and heavy tails: A closer look at data breaches. Journal of Cybersecurity, 2(1), 3-14.
Hawkes, A. G. (1971). Spectra of some self-exciting and mutually exciting point processes. Biometrika, 58(1), 83-90.
Hawkes, A. G., & Oakes, D. (1974). A cluster process representation of a self-exciting process. Journal of Applied Probability, 493-503.
Oakes, D. (1975). The Markovian self-exciting process. Journal of Applied Probability, 69-77.
Peng, C., Xu, M., Xu, S., & Hu, T. (2017). Modeling and predicting extreme cyber attack rates via marked point processes. Journal of Applied Statistics, 44(14), 2534-2563.
Xu, M., & Hua, L. (2019). Cybersecurity Insurance: Modeling and Pricing. North American Actuarial Journal, 1-30.
Disclaimer
This presentation presents information of a general nature. It is not intended to guide or determine any specific individual situation and Milliman recommends that users of this presentation seek explanation and/or amplification of any part of the presentation that they consider not to be clear. Neither the presenter nor the presenter's employer shall have any responsibility or liability to any person or entity with respect to damages alleged to have been caused directly or indirectly by the content of this presentation. All persons who choose to rely in any way on the contents of this presentation do so entirely at their own risk. The contents of this presentation are confidential and must not be modified, copied, quoted, distributed or shown to any other parties without Milliman's prior written consent. Copyright © Milliman 2020. All rights reserved.