Using Markov Process in the Analysis of Intrusion Tolerant
Systems
Quyen L. Nguyen
CS 795 – Computer Security Architectures
2
References1. Sheldon M. Ross. “Introduction to Probability Models”, Academic Press.2. Kishor Shridharbhai Trivedi. “Probability and Statistics with Reliability, Queuing, and
Computer Science Applications, 2nd Edition”. Wiley-Interscience, 2001.3. Bharat B. Madan, Katerina Goseva-Popstojanova, Kalyanaraman Vaidyanathan, and
Kishor S. Trivedi. “A Method for Modeling and Quantifying the Security Attributes of Intrusion Tolerant Systems”. Performance Evaluation 56 (2004), 167-186.
4. Khin Mi Mi Aung, Kiejin Park, and Jong Sou Park. “A Model of ITS Using Cold Standby Cluster”. ICADL 2005, LNCS 3815, pp. 1-10, 2005.
5. Alex Hai Wang, Su Yan and peng Liu. “A Semi-Markov Survivability Evaluation Model for Intrusion Tolerant Database Systems”. 2010 International Conference on Availability, Reliability and Security.
6. Quyen Nguyen and Arun Sood. “Quantitative Approach to Tuning of a Time-Based Intrusion-Tolerant System Architecture”. WRAITS 2009, Lisbon, Portugal.
Note: State Diagrams and matrix snapshots in subsequent slides are taken from [3], [4] and [5].
11/03/2010
3
Outline
Markov Chain– Semi-Markov Process (SMP)
Analysis Model of ITS– Mean Time to Security Failure (MTTSF)– Availability
SCIT Cluster ITDB
11/03/2010
4
Stochastic Process
Given that it rains today, will it rain or shine tomorrow? Given that it is sunny today, will it rain or shine tomorrow?
11/03/2010
5
Markov ProcessState space: {rainy, sunny}Parameter space: X1, X2, … Markov property: next state depends only on current statepij = p(Xn+1 = j | Xn = i, Xn-1 = in-1, …, X0 = i0)
= p(Xn+1 = j | Xn = i) Transition Probability Matrix: – P = [pij] with ∑j pij= 1 for every i
Markov Chain: finite state spaceDiscrete-time, Continuous-time
11/03/2010
bbaa
P1
1
6
Steady-state Probabilities
Stationary Process: transition probability independent of n– p(Xn+1 = j | Xn = i) = p(Xn = j | Xn-1 = i)
Chapman-Kolmogorov for n-step transition matrix– P(n) = Pn
Pn converges to steady state values, as n --> ∞ Solution of system (1) of equations:
– x.P = x– Σi xi= 1
11/03/2010
7
Semi-Markov Process
Time spent in a state i is a random variable with mean µ1
– If amount of time in each state is 1, then SMP is a Markov.
Embedded DTMC with steady-state probabilities πi
Time proportion in state i:– Pi = (πi * µi) / ∑j (πj * µj) (2)
Steps to solve an SMP:– Solve steady-probabilities of DTMC using system (1)– Use (2)
11/03/2010
8
Modeling ITS
Modeling steps:– Identify states– Identify transitions– Assign transition probabilities
11/03/2010
9
ITS State Diagram [3]
11/03/2010
10
ITS: Embedded DTMC [3]
11/03/2010
11
DTMC Transition Probability Matrix [3]
p1 = 1 - pa
p2 = 1 – pm – pu
p3 = 1 – ps - pg
11/03/2010
12
Calculating Availability [3]
A = 1 – (PFS + PF + PUC) Transition Diagram and formula depend on attack scenario
and metric to compute. Example: DoS attack, remove unused states MC and FS: A = 1 – (PF + PUC)
11/03/2010
13
Availability: Numerical Examples [3]
A is decreasing function of Pa and increasing function of hG.
11/03/2010
14
Absorbing and Transient States if pij = 0 for i ≠ j, then i is an absorbing state.
– Example: complete system failure state. Arranging Transition Probability, with Q containing
transitions between transient states only.
11/03/2010
ICQ
P0
15
Example of Absorbing State
11/03/2010
fdC
fdaa
Q
ffdd
aa
P
FailedDetectAttackedGood
10
,0010001
1000001
100001
,,,
16
Visit Times k-step transition probability matrix Pk
∑Qk = I + Q1 + Q2 + … converges to (I – Q)-1 = M = [mij] (I – Q)-1 = M ↔ M(I – Q) = I ↔ M = I + MQ Theorem: Let Xij be the visit times of state j starting from
state i before going to absorbing states: E[Xij] = mij
Starting from state 1, V = (V1, V2, …, Vn) can be solved by system of equations:– V = I + V.Q11/03/2010
ICQk
kP0
')(
17
Calculating MTTSF
Determine absorbing states: {UC, FS, GD, F}. Transient states: {G, V, A, MC, TR} Form transition matrix comprising of transient states Q. Compute visit times Vi using the equations:
– v = q + v.Q MTTSF = v.µ
11/03/2010
18
ITS: Transient States [3]
11/03/2010
19
MTTSF Numerical Examples [3]
MTTSF decreases as Pa increases
MTTSF increases as hG increases.
11/03/2010
20
Issues
Parameter Modeling– Probability Distribution: exponential, Weibull, etc.
Mean value Estimation
11/03/2010
21
SCIT Parameters
Online window Wo: server accepts requests from the network
Grace period Wg: server stops accepting new requests and tries to fulfill outstanding requests already in its queue.
Exposure window: W = Wo + Wg. Nonline: # redundant online nodes. Ntotal : total nodes in the cluster. Ntotal, W, and the cleansing-time Tcleansing are inter-related.
11/03/2010
22
SCIT: State Transition Diagram with Absorbing States
Pa: probability of successful attack Pc: probability of cleansing when in A. F: low chance of occurrence, but still possible:
–Virtual machine and/or the host machine no longer respond to the Controller.
–Controller itself fails due to a hardware fault.
G
V
A
F
CleansingG V A F
G 0 1 0 0
V 1–Pa 0 Pa 0
A Pc 0 0 1-Pc
F 0 0 0 1
11/03/2010
23
0001100
0010010
PcPcPaPa
P
SCIT: MTTSF ComputationXa and Xt are absorbing states and transient states
Xa = {F} and Xt = {G, V, A} q: probabilities that process starts at each state in Xt :
q = (1,0,0), since it starts with state G.V = (V0 V1 V2): number of visit times for each state in Xt. h: mean sojourn times in each stateSolve system of equations: V = q + VQUsing solutions for V, compute MTTSFscit = V.h
11/03/2010
Q
24
SCIT: MTTSF Expression
Pa ↓ → MTTSFscit ↑ Pc ↑ → MTTSFscit ↑How to make Pa ↓ and Pc ↑?
)P(1
hP
hh
MTTSFc
2a
1 0
SCIT
11/03/2010
)P(1P1 V
ca0
)P(1P1 V
ca1
)P(11 V
c2
25
SCIT: Relationship between Pa and W
Modeling malicious attack arrivals:– Assumption: non-staged attacks– (Attack arrivals) ̴ Poisson (λ)
Then, inter-arrival time Y between attacks is exponential distribution: – P(Y ≤ W) = 1 - e-λW
P(Y ≤ W) is also prob. that attacks occur in exposure window.
Then:– Pa ≤ P(Y ≤ W)– → Pa ≤ 1 - e-λW
11/03/2010
26
SCIT: Relationship between Pc and W
Resident time of the attack modeled as a “service” time Z with rate μ.
Assume Z exponential distribution:P(Z > W) = e-μW
probability that the service time is greater than W is limited by the fact that the system moves out of state A due to the cleansing mode: – P(Z > W) ≤ Pc ↔ Pc ≥ e-µW
System cannot “serve” more than the arriving attacks: μ ≤ λ.
Then: e-μW ≥ e-λW .
11/03/2010
27
SCIT: MTTSF and W
W ↓ → (Pa ≤ 1 - e-λW) ↓ W ↓ → (Pc ≥ e-µW) ↑ Then: W ↓ → MTTSFscit ↑ MTTSFSCIT ≥ F(W), where F(W) is a decreasing function of
W:
Significance: engineer instance of SCIT architecture by tuning W in order to increase or decrease the value of MTTSFSCIT.
)1(
h)1(
h h
F(W)W
2W10
e
e
11/03/2010
28
SCIT: MTTSF Trend
11/03/2010
29
SCIT Failure State
Is state F really absorbing?– Compromise of Controller is very minimal due to the
one-way data.– System automatically recovers back to the G state.
Use Semi-Markov Process with embedded DTMC (Discrete-Time Markov Chain) to compute the steady-state Availability (state without security faults).
11/03/2010
30
SCIT: Availability
Solve the DTMC steady-state probabilities vector y = (y0, y1, y2, y3) for all states in {G, V, A, F}:– y = y.P– Σi yi = 1.
11/03/2010
0001100
0010010
PcPcPaPa
P
31
SCIT: Availability and Exposure Window
Compute SMP stead-state probability πF for state F: – πF = y3h3/y.h, with h = (h0, h1, h2, h3) being extended to
include the mean sojourn time h3 for state F. Availability = 1 − πF
Availability monotonically decreases with Pa but increases
with Pc.
Using the same line of reasoning and the assumption of Poisson attack arrival process as for MTTSFSCIT above, we can also conclude that decreasing the exposure window will increase Availability .
3ca2a10
2a10
)hP(1PhPhhhPhh
tyAvailabili
11/03/2010
32
Rejuvenation: Single System [4]
11/03/2010
Rejuvenation: stop software, clean internal state, service restart.
Reconfiguration: patching, anti-virus, access control (IP blocking, port blocking, session drop, content filtering), traffic control by limiting bandwidth.
Both may be needed depending on the situation.
33
Rejuvenation: Transition Probability [4]
Equation System:– π = π.P and Σi πi = 1.
πi, i= (H,I,J,C,F). A = 1 – (πF + πJ + πC ) Paper uses balance
equations of probabilities leaving and entering a state.
11/03/2010
34
Rejuvenation: Cluster Analysis [4]
11/03/2010
Use SMP for modeling with State Space: Xs = {(1,1), (I,1), (J,1), (C,1), (F,1), (0,1), (0,I), (0,J), (0,C), (F,F)} d is the solution of DTMC equations: d.P and Σdi = 1 Then, the prob. for SMP is given by:
A = 1 – (πF 1 + πFF ) Deadline D of mean sojourn time (dihi). Indicator variable Y:
– Yi = 0 if dihi ≤ D and Yi = 1 if dihi > D Survavibility S =
– A – [YJ1πF 1 + YC1πC1 + Y0Jπ0J + Y0Cπ0C]
35
Rejuvenation: Numerical Results [4]
11/03/2010
As prob. for (Rj,1), (Rc,1) or (0,Rj), (0,Rc) increase, availability and survivability decrease.
36
Rejuvenation: Numerical Results [4]
11/03/2010
Changes of survability vs. changes in rejuvenation when attacked. No significant difference between deadlines when prob < .4
37
Coping Ability: Numerical Results [4]
11/03/2010
Survivability is maximized when primary-secondary servers detect abnormal behavior early.
38
Intrusion Tolerance DB [5]
11/03/2010
39
ITDB: State Transition [5]
Integrity: fraction of time when all accessible data are clean– I = πG + πQ + πR
Availability: fraction of time when all clean data are accessible– A = πG + πR
11/03/2010
40
ITDB: False Alarm Rate [5]
ITDB maintains I and A even at high FA rate.
Degradation of I and A as FA increases.
11/03/2010
41
ITDB: Detection Rate [5]
ITDB depends on detection probability.
When Pd = 0, I and A are at low level.
When Pd increases, I and A go up.
ITDB can maintain I and A at some level at low detection rate.
11/03/2010
d
42
ITDB: Attack Rate [5]
Heavy attack: hG = 5. Compare “good” and
“poor” systems in terms of Pd, Pfa, hI, hQ, hR.
When attack rate increases, observe:– I and A– Q and R
11/03/2010
d
43
Summary
What is a Markov Process? How to model an ITS using a Semi-Markov Process? How to calculate MTTSF based on the model? Application to SCIT Analysis Rejuvenation Cluster Analysis ITDB Analysis
11/03/2010