mining task precedence graphs from real-time embedded...
TRANSCRIPT
Mining Task Precedence Graphs fromReal-Time Embedded System Traces
Oleg Iegorov Sebastian Fischmeister
Department of Electrical and Computer EngineeringUniversity of Waterloo, Canada
oiegorov,[email protected]
April 13, 2018
Sources:• Charette RN. “This Car Runs on Code”. IEEE Spectrum. 2009 Feb 1;46(3):3.• Holzmann GJ. “Landing a Spacecraft on Mars”. IEEE Software. 2013 Mar;30(2):83-6.• Hofland L, van der Linden J. “Software in MRI Scanners”. IEEE Software. 2010 Jul 1;27(4):87.
Sebastian Fischmeister, [email protected] 2 / 28
Software-based Recalls in Automotive
Johnson, Taylor T., Raghunath Gannamaraju, and Sebastian Fischmeister. ”A Survey of Electrical andElectronic (E/E) Notifications for Motor Vehicles.” 24th International Technical Conference on the EnhancedSafety of Vehicles (ESV). 2015.
Sebastian Fischmeister, [email protected] 3 / 28
Busy Beaver
S(n,m): the largest number of steps taken by an n-state, m-symbol machine started on aninitially blank tape before halting.
A B0 1RB 1LA1 1LB 1RH
1:1L 1:1R
0:1L
0:1R
A B H
http://bit.ly/IJF9dO0 0 1 1 1 1 0 0 (6 steps, four ”1”s total)
Sebastian Fischmeister, [email protected] 4 / 28
Busy Beaver
S(n,m): the largest number of steps taken by an n-state, m-symbol machine started on aninitially blank tape before halting.
A B0 1RB 1LA1 1LB 1RH
1:1L 1:1R
0:1L
0:1R
A B H
http://bit.ly/IJF9dO
0 0 1 1 1 1 0 0 (6 steps, four ”1”s total)
Sebastian Fischmeister, [email protected] 4 / 28
Busy Beaver
S(n,m): the largest number of steps taken by an n-state, m-symbol machine started on aninitially blank tape before halting.
A B0 1RB 1LA1 1LB 1RH
1:1L 1:1R
0:1L
0:1R
A B H
http://bit.ly/IJF9dO0 0 1 1 1 1 0 0 (6 steps, four ”1”s total)
Sebastian Fischmeister, [email protected] 4 / 28
Busy BeaverS(5, 2) =
0:1R0:1L
1:0L
1:1R
0:1R
1:1L
1:1L
0:1R
0:1R
1:0L
B
A CE
H
D
≥ 47, 176, 870
S(2, 5) = ≥ 1.9× 10704
Sebastian Fischmeister, [email protected] 5 / 28
Busy BeaverS(5, 2) =
0:1R0:1L
1:0L
1:1R
0:1R
1:1L
1:1L
0:1R
0:1R
1:0L
B
A CE
H
D
≥ 47, 176, 870
S(2, 5) =
≥ 1.9× 10704
Sebastian Fischmeister, [email protected] 5 / 28
Busy BeaverS(5, 2) =
0:1R0:1L
1:0L
1:1R
0:1R
1:1L
1:1L
0:1R
0:1R
1:0L
B
A CE
H
D
≥ 47, 176, 870
S(2, 5) = ≥ 1.9× 10704
Sebastian Fischmeister, [email protected] 5 / 28
Consequence from Current Trends
Need better theory, methods, and tools to build and understandsystems.
→ Reverse engineering
Good news: more complexity means more data!
→ Data-driven reverse engineering
Sebastian Fischmeister, [email protected] 6 / 28
Consequence from Current Trends
Need better theory, methods, and tools to build and understandsystems.
→ Reverse engineering
Good news: more complexity means more data!
→ Data-driven reverse engineering
Sebastian Fischmeister, [email protected] 6 / 28
Consequence from Current Trends
Need better theory, methods, and tools to build and understandsystems.
→ Reverse engineering
Good news: more complexity means more data!
→ Data-driven reverse engineering
Sebastian Fischmeister, [email protected] 6 / 28
Consequence from Current Trends
Need better theory, methods, and tools to build and understandsystems.
→ Reverse engineering
Good news: more complexity means more data!
→ Data-driven reverse engineering
Sebastian Fischmeister, [email protected] 6 / 28
Reverse Engineering
“(Software) reverse engineering is the process of analyzing a subject system to createrepresentations of the system at a higher level of abstraction” [Chikofsky et al, 1990]
Traditionally used in the domain of desktop and enterprise software
Sebastian Fischmeister, [email protected] 7 / 28
Uses of Reverse Engineering in Real-time Systems
Possible applications:• legacy code maintenance• debugging• anomaly detection• testing• documentation
Timinig is neglected in standard software reverse engineering tools.
→ Tools for reverse engineering for real-time systems are needed
Sebastian Fischmeister, [email protected] 8 / 28
Uses of Reverse Engineering in Real-time Systems
Possible applications:• legacy code maintenance• debugging• anomaly detection• testing• documentation
Timinig is neglected in standard software reverse engineering tools.
→ Tools for reverse engineering for real-time systems are needed
Sebastian Fischmeister, [email protected] 8 / 28
Uses of Reverse Engineering in Real-time Systems
Possible applications:• legacy code maintenance• debugging• anomaly detection• testing• documentation
Timinig is neglected in standard software reverse engineering tools.
→ Tools for reverse engineering for real-time systems are needed
Sebastian Fischmeister, [email protected] 8 / 28
Characteristics of a Reverse Engineering Tool
• Representation of a system• Finite state machine• Petri net• Regular expressions• UML model• and others. . .
• Extraction of information from a system• Static: from source code• Dynamic: from traces• Hybrid
• Learning behaviour• Active: create new test cases and observe reaction• Passive: only observe
Sebastian Fischmeister, [email protected] 9 / 28
Characteristics of a Reverse Engineering Tool
• Representation of a system• Finite state machine• Petri net• Regular expressions• UML model• Task precedence graph ←− TPG miner• and others. . .
• Extraction of information from a system• Static: from source code• Dynamic: from traces• Hybrid
• Learning behaviour• Active: create new test cases and observe reaction• Passive: only observe
Sebastian Fischmeister, [email protected] 9 / 28
Characteristics of a Reverse Engineering Tool
• Representation of a system• Finite state machine• Petri net• Regular expressions• UML model• Task precedence graph ←− TPG miner• and others. . .
• Extraction of information from a system• Static: from source code• Dynamic: from traces ←− TPG miner• Hybrid
• Learning behaviour• Active: create new test cases and observe reaction• Passive: only observe
Sebastian Fischmeister, [email protected] 9 / 28
Characteristics of a Reverse Engineering Tool
• Representation of a system• Finite state machine• Petri net• Regular expressions• UML model• Task precedence graph ←− TPG miner• and others. . .
• Extraction of information from a system• Static: from source code• Dynamic: from traces ←− TPG miner• Hybrid
• Learning behaviour• Active: create new test cases and observe reaction• Passive: only observe ←− TPG miner
Sebastian Fischmeister, [email protected] 9 / 28
TPG miner (Task Precedence Graph miner)
Real-TimeEmbedded System
Trace
execution TPG_miner
Task Precedence Graph
τ1
τ2 τ3 τ4 τ5 τ6
τ7 τ8
τ9
τ10 τ11 τ12 τ13
Sebastian Fischmeister, [email protected] 10 / 28
Real-time System Execution Trace
task ID
event
timestamp fields
• Task ID: conjuction of fields that uniquely identify a task.• Event: tuple <timestamp, task ID>• Trace: chronologically ordered list of events
Sebastian Fischmeister, [email protected] 11 / 28
Task Precedence Graph (TPG)
τ1
τ2 τ3 τ4 τ5 τ6
τ7 τ8
τ9
τ10 τ11 τ12 τ13
TPG is a DAG:• Nodes (τi ): task IDs;• Edges (τi → τj): immediate precedence relations (τi ’s output is τj ’s input), τi τj .
Problem: How to mine immediate precedence relations from a trace?
Sebastian Fischmeister, [email protected] 12 / 28
Task Precedence Graph (TPG)
τ1
τ2 τ3 τ4 τ5 τ6
τ7 τ8
τ9
τ10 τ11 τ12 τ13
TPG is a DAG:• Nodes (τi ): task IDs;• Edges (τi → τj): immediate precedence relations (τi ’s output is τj ’s input), τi τj .
Problem: How to mine immediate precedence relations from a trace?Sebastian Fischmeister, [email protected] 12 / 28
TPG Miner Approach
1. Transactionalize trace to different events
2. Identify occurrence of events between events (→ occurrence pattern)3. Extract precedence relations4. Create DAG from relations5. Clean up the DAG
Sebastian Fischmeister, [email protected] 13 / 28
TPG Miner Approach
1. Transactionalize trace to different events2. Identify occurrence of events between events (→ occurrence pattern)
3. Extract precedence relations4. Create DAG from relations5. Clean up the DAG
Sebastian Fischmeister, [email protected] 13 / 28
TPG Miner Approach
1. Transactionalize trace to different events2. Identify occurrence of events between events (→ occurrence pattern)3. Extract precedence relations
4. Create DAG from relations5. Clean up the DAG
Sebastian Fischmeister, [email protected] 13 / 28
TPG Miner Approach
1. Transactionalize trace to different events2. Identify occurrence of events between events (→ occurrence pattern)3. Extract precedence relations4. Create DAG from relations
5. Clean up the DAG
Sebastian Fischmeister, [email protected] 13 / 28
TPG Miner Approach
1. Transactionalize trace to different events2. Identify occurrence of events between events (→ occurrence pattern)3. Extract precedence relations4. Create DAG from relations5. Clean up the DAG
Sebastian Fischmeister, [email protected] 13 / 28
Occurrence Pattern (op)
Given a string s, its occurrence pattern, op(s), is the shortest substring of s that ”covers”s:
s = 〈1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0〉
• p1 = 〈1, 0, 0, 1, 0〉 is the occurrence pattern of s, op(s) = p1• p2 = 〈3, 2〉 is not a substring of s• p3 = 〈1, 0〉 is a substring of s, doesn’t cover s• p4 = 〈1, 0, 0, 1, 0, 1, 0, 0, 1, 0〉 covers s, but not the shortest (see p1).
Given a trace T = 〈τi , τh, ..〉, op(τi , τj) is the occurrence pattern of a string s whoseelements are the numbers of occurrences of τi between consecutive occurrences of τj .
Sebastian Fischmeister, [email protected] 14 / 28
Occurrence Pattern (op)
Given a string s, its occurrence pattern, op(s), is the shortest substring of s that ”covers”s:
s = 〈 1,0,0,1,0 , 1,0,0,1,0 , 1, 0, 0〉
• p1 = 〈1, 0, 0, 1, 0〉 is the occurrence pattern of s, op(s) = p1
• p2 = 〈3, 2〉 is not a substring of s• p3 = 〈1, 0〉 is a substring of s, doesn’t cover s• p4 = 〈1, 0, 0, 1, 0, 1, 0, 0, 1, 0〉 covers s, but not the shortest (see p1).
Given a trace T = 〈τi , τh, ..〉, op(τi , τj) is the occurrence pattern of a string s whoseelements are the numbers of occurrences of τi between consecutive occurrences of τj .
Sebastian Fischmeister, [email protected] 14 / 28
Occurrence Pattern (op)
Given a string s, its occurrence pattern, op(s), is the shortest substring of s that ”covers”s:
s = 〈1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0〉
• p1 = 〈1, 0, 0, 1, 0〉 is the occurrence pattern of s, op(s) = p1• p2 = 〈3, 2〉 is not a substring of s
• p3 = 〈1, 0〉 is a substring of s, doesn’t cover s• p4 = 〈1, 0, 0, 1, 0, 1, 0, 0, 1, 0〉 covers s, but not the shortest (see p1).
Given a trace T = 〈τi , τh, ..〉, op(τi , τj) is the occurrence pattern of a string s whoseelements are the numbers of occurrences of τi between consecutive occurrences of τj .
Sebastian Fischmeister, [email protected] 14 / 28
Occurrence Pattern (op)
Given a string s, its occurrence pattern, op(s), is the shortest substring of s that ”covers”s:
s = 〈 1,0 , 0, 1,0 , 1,0 , 0, 1,0 , 1,0 , 0〉
• p1 = 〈1, 0, 0, 1, 0〉 is the occurrence pattern of s, op(s) = p1• p2 = 〈3, 2〉 is not a substring of s• p3 = 〈1, 0〉 is a substring of s, doesn’t cover s
• p4 = 〈1, 0, 0, 1, 0, 1, 0, 0, 1, 0〉 covers s, but not the shortest (see p1).
Given a trace T = 〈τi , τh, ..〉, op(τi , τj) is the occurrence pattern of a string s whoseelements are the numbers of occurrences of τi between consecutive occurrences of τj .
Sebastian Fischmeister, [email protected] 14 / 28
Occurrence Pattern (op)
Given a string s, its occurrence pattern, op(s), is the shortest substring of s that ”covers”s:
s = 〈 1,0,0,1,0,1,0,0,1,0 , 1, 0, 0〉
• p1 = 〈1, 0, 0, 1, 0〉 is the occurrence pattern of s, op(s) = p1• p2 = 〈3, 2〉 is not a substring of s• p3 = 〈1, 0〉 is a substring of s, doesn’t cover s• p4 = 〈1, 0, 0, 1, 0, 1, 0, 0, 1, 0〉 covers s, but not the shortest (see p1).
Given a trace T = 〈τi , τh, ..〉, op(τi , τj) is the occurrence pattern of a string s whoseelements are the numbers of occurrences of τi between consecutive occurrences of τj .
Sebastian Fischmeister, [email protected] 14 / 28
Occurrence Pattern (op)
Given a string s, its occurrence pattern, op(s), is the shortest substring of s that ”covers”s:
s = 〈1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0〉
• p1 = 〈1, 0, 0, 1, 0〉 is the occurrence pattern of s, op(s) = p1• p2 = 〈3, 2〉 is not a substring of s• p3 = 〈1, 0〉 is a substring of s, doesn’t cover s• p4 = 〈1, 0, 0, 1, 0, 1, 0, 0, 1, 0〉 covers s, but not the shortest (see p1).
Given a trace T = 〈τi , τh, ..〉, op(τi , τj) is the occurrence pattern of a string s whoseelements are the numbers of occurrences of τi between consecutive occurrences of τj .
T = 〈 τ1, τ2, τ3, τ4, τ5, τ6, τ7, τ3, τ4, τ7, τ5, τ4, τ7, τ2, τ1, τ5, τ7, τ6, τ7, τ3,τ2, τ5, τ4, τ7, τ7, τ7, τ5, τ2, τ7, τ1, τ7, τ2, τ7, τ5, τ1, τ7, τ3〉
op(τ2, τ7) = 〈1, 0, 0, 1, 0〉Sebastian Fischmeister, [email protected] 14 / 28
Occurrence Patterns and Immediate Precedence Relations
TheoremIf a task τi is an immediate predecessor of a task τj (τi τj) in a real-time system S, thenthere exists op(τi ,τj) in any execution trace of S captured during at least two hyperperiodsof S.
time, ms0 20 50 100 200
Visualization of occurrences of tasks τ2 and τ7 during two hyperperiods;τ2 τ7; op(τ2, τ7) = 〈1, 0, 0, 1, 0〉.
Sebastian Fischmeister, [email protected] 15 / 28
Mining Occurrence Patterns in a Trace
Function mine ops(T )Data: trace TResult: op(τi , τj ) ∀τi , τj in Tforeach τi ∈ T do
foreach τj ∈ T , i 6= j dos ← numbers of occurrences of τj between consecutive occurrences of τi in Tforeach prefix q of s, length(q)=1,2,..,length(s)/2 do
if q covers s thenop(τi , τj )← qbreak
else if length(q)==length(s)/2 thenop(τi , τj )← NULL
endreturn op
endend
s = 〈1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0〉
Sebastian Fischmeister, [email protected] 16 / 28
Mining Occurrence Patterns in a Trace
Function mine ops(T )Data: trace TResult: op(τi , τj ) ∀τi , τj in Tforeach τi ∈ T do
foreach τj ∈ T , i 6= j dos ← numbers of occurrences of τj between consecutive occurrences of τi in Tforeach prefix q of s, length(q)=1,2,..,length(s)/2 do
if q covers s thenop(τi , τj )← qbreak
else if length(q)==length(s)/2 thenop(τi , τj )← NULL
endreturn op
endend
s = 〈1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0〉q = 〈1〉
Sebastian Fischmeister, [email protected] 16 / 28
Mining Occurrence Patterns in a Trace
Function mine ops(T )Data: trace TResult: op(τi , τj ) ∀τi , τj in Tforeach τi ∈ T do
foreach τj ∈ T , i 6= j dos ← numbers of occurrences of τj between consecutive occurrences of τi in Tforeach prefix q of s, length(q)=1,2,..,length(s)/2 do
if q covers s thenop(τi , τj )← qbreak
else if length(q)==length(s)/2 thenop(τi , τj )← NULL
endreturn op
endend
s = 〈1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0〉q = 〈1, 0〉
Sebastian Fischmeister, [email protected] 16 / 28
Mining Occurrence Patterns in a Trace
Function mine ops(T )Data: trace TResult: op(τi , τj ) ∀τi , τj in Tforeach τi ∈ T do
foreach τj ∈ T , i 6= j dos ← numbers of occurrences of τj between consecutive occurrences of τi in Tforeach prefix q of s, length(q)=1,2,..,length(s)/2 do
if q covers s thenop(τi , τj )← qbreak
else if length(q)==length(s)/2 thenop(τi , τj )← NULL
endreturn op
endend
s = 〈1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0〉q = 〈1, 0, 0〉
Sebastian Fischmeister, [email protected] 16 / 28
Mining Occurrence Patterns in a Trace
Function mine ops(T )Data: trace TResult: op(τi , τj ) ∀τi , τj in Tforeach τi ∈ T do
foreach τj ∈ T , i 6= j dos ← numbers of occurrences of τj between consecutive occurrences of τi in Tforeach prefix q of s, length(q)=1,2,..,length(s)/2 do
if q covers s thenop(τi , τj )← qbreak
else if length(q)==length(s)/2 thenop(τi , τj )← NULL
endreturn op
endend
s = 〈1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0〉q = 〈1, 0, 0, 1〉
Sebastian Fischmeister, [email protected] 16 / 28
Mining Occurrence Patterns in a Trace
Function mine ops(T )Data: trace TResult: op(τi , τj ) ∀τi , τj in Tforeach τi ∈ T do
foreach τj ∈ T , i 6= j dos ← numbers of occurrences of τj between consecutive occurrences of τi in Tforeach prefix q of s, length(q)=1,2,..,length(s)/2 do
if q covers s thenop(τi , τj )← qbreak
else if length(q)==length(s)/2 thenop(τi , τj )← NULL
endreturn op
endend
s = 〈1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0〉q = 〈1, 0, 0, 1, 0〉
Sebastian Fischmeister, [email protected] 16 / 28
Mining a TPG from a Trace
Function mine TPG(T )Data: trace TResult: TPG of Tnodes← unique task IDs in Tedges← mine ops(T )TPG← (nodes, edges)TPG← transitive closure(TPG)return TPG
Sebastian Fischmeister, [email protected] 17 / 28
Mining a TPG from a Trace
Function mine TPG(T )Data: trace TResult: TPG of Tnodes← unique task IDs in Tedges← mine ops(T )TPG← (nodes, edges)TPG← transitive closure(TPG)return TPG
τ1
τ2 τ3 τ4 τ5 τ6
τ7 τ8
τ9
τ10 τ11 τ12 τ13
Sebastian Fischmeister, [email protected] 17 / 28
Mining a TPG from a Trace
Function mine TPG(T )Data: trace TResult: TPG of Tnodes← unique task IDs in Tedges← mine ops(T )TPG← (nodes, edges)TPG← transitive closure(TPG)return TPG
τ1
τ2 τ3 τ4 τ5 τ6
τ7 τ8
τ9
τ10 τ11 τ12 τ13
Sebastian Fischmeister, [email protected] 17 / 28
Mining a TPG from a Trace
Function mine TPG(T )Data: trace TResult: TPG of Tnodes← unique task IDs in Tedges← mine ops(T )TPG← (nodes, edges)TPG← transitive closure(TPG)return TPG
τ1
τ2 τ3 τ4 τ5 τ6
τ7 τ8
τ9
τ10 τ11 τ12 τ13
Sebastian Fischmeister, [email protected] 17 / 28
Anomaly-based Intrusion Detection Systems
• Intrusion detection system (IDS) monitors a system/network for malicious activity• Anomaly-based IDS identifies observations (packets, syscalls, etc.) which do not
conform to an expected pattern observed during normal operation.• IDS must be trustworthy, low false positive rate is often more important than low false
negative rate• Desired features of an IDS:
• online operation• explainable results
• TPG represents an expected pattern for an anomaly-based IDS.
Sebastian Fischmeister, [email protected] 18 / 28
TPGs for Anomaly DetectionTrain TPG on other traces and perform online anomaly detection:
Data: trace stream W, TPG GResult: trained G or anomalous event ε in Wforeach ε ∈ W in the increasing order of ε.time do
υ ← ε.task idforeach τ such that e(τ, υ) ∈ G do
u(τ, υ)← append the # of occurrences of τ since previous occurrence of υif u(τ, υ) is not a prefix of op(τ, υ) then
if training then remove e(τ, υ) from G; update Gelse return ε.time, u(τ, υ), op(τ, υ)
endif training then return Gelse return False
end
T = 〈op(τ2, τ7) = 〈1, 0, 0, 1, 0〉u(τ2, τ7) = 〈0〉
Sebastian Fischmeister, [email protected] 19 / 28
TPGs for Anomaly DetectionTrain TPG on other traces and perform online anomaly detection:
Data: trace stream W, TPG GResult: trained G or anomalous event ε in Wforeach ε ∈ W in the increasing order of ε.time do
υ ← ε.task idforeach τ such that e(τ, υ) ∈ G do
u(τ, υ)← append the # of occurrences of τ since previous occurrence of υif u(τ, υ) is not a prefix of op(τ, υ) then
if training then remove e(τ, υ) from G; update Gelse return ε.time, u(τ, υ), op(τ, υ)
endif training then return Gelse return False
end
T = 〈op(τ2, τ7) = 〈1, 0, 0, 1, 0〉u(τ2, τ7) = 〈
Sebastian Fischmeister, [email protected] 19 / 28
TPGs for Anomaly DetectionTrain TPG on other traces and perform online anomaly detection:
Data: trace stream W, TPG GResult: trained G or anomalous event ε in Wforeach ε ∈ W in the increasing order of ε.time do
υ ← ε.task idforeach τ such that e(τ, υ) ∈ G do
u(τ, υ)← append the # of occurrences of τ since previous occurrence of υif u(τ, υ) is not a prefix of op(τ, υ) then
if training then remove e(τ, υ) from G; update Gelse return ε.time, u(τ, υ), op(τ, υ)
endif training then return Gelse return False
end
T = 〈τ1, τ2,op(τ2, τ7) = 〈1, 0, 0, 1, 0〉u(τ2, τ7) = 〈1〉
Sebastian Fischmeister, [email protected] 19 / 28
TPGs for Anomaly DetectionTrain TPG on other traces and perform online anomaly detection:
Data: trace stream W, TPG GResult: trained G or anomalous event ε in Wforeach ε ∈ W in the increasing order of ε.time do
υ ← ε.task idforeach τ such that e(τ, υ) ∈ G do
u(τ, υ)← append the # of occurrences of τ since previous occurrence of υif u(τ, υ) is not a prefix of op(τ, υ) then
if training then remove e(τ, υ) from G; update Gelse return ε.time, u(τ, υ), op(τ, υ)
endif training then return Gelse return False
end
T = 〈τ1, τ2, τ3, τ4, τ5, τ6, τ7op(τ2, τ7) = 〈1, 0, 0, 1, 0〉u(τ2, τ7) = 〈1〉 u(τ2, τ7) is a prefix of op(τ2, τ7)
Sebastian Fischmeister, [email protected] 19 / 28
TPGs for Anomaly DetectionTrain TPG on other traces and perform online anomaly detection:
Data: trace stream W, TPG GResult: trained G or anomalous event ε in Wforeach ε ∈ W in the increasing order of ε.time do
υ ← ε.task idforeach τ such that e(τ, υ) ∈ G do
u(τ, υ)← append the # of occurrences of τ since previous occurrence of υif u(τ, υ) is not a prefix of op(τ, υ) then
if training then remove e(τ, υ) from G; update Gelse return ε.time, u(τ, υ), op(τ, υ)
endif training then return Gelse return False
end
T = 〈τ1, τ2, τ3, τ4, τ5, τ6, τ7, τ3, τ4, τ7op(τ2, τ7) = 〈1, 0, 0, 1, 0〉u(τ2, τ7) = 〈1, 0〉 u(τ2, τ7) is a prefix of op(τ2, τ7)
Sebastian Fischmeister, [email protected] 19 / 28
TPGs for Anomaly DetectionTrain TPG on other traces and perform online anomaly detection:
Data: trace stream W, TPG GResult: trained G or anomalous event ε in Wforeach ε ∈ W in the increasing order of ε.time do
υ ← ε.task idforeach τ such that e(τ, υ) ∈ G do
u(τ, υ)← append the # of occurrences of τ since previous occurrence of υif u(τ, υ) is not a prefix of op(τ, υ) then
if training then remove e(τ, υ) from G; update Gelse return ε.time, u(τ, υ), op(τ, υ)
endif training then return Gelse return False
end
T = 〈τ1, τ2, τ3, τ4, τ5, τ6, τ7, τ3, τ4, τ7, τ5, τ2op(τ2, τ7) = 〈1, 0, 0, 1, 0〉u(τ2, τ7) = 〈1, 0, 1〉
Sebastian Fischmeister, [email protected] 19 / 28
TPGs for Anomaly DetectionTrain TPG on other traces and perform online anomaly detection:
Data: trace stream W, TPG GResult: trained G or anomalous event ε in Wforeach ε ∈ W in the increasing order of ε.time do
υ ← ε.task idforeach τ such that e(τ, υ) ∈ G do
u(τ, υ)← append the # of occurrences of τ since previous occurrence of υif u(τ, υ) is not a prefix of op(τ, υ) then
if training then remove e(τ, υ) from G; update Gelse return ε.time, u(τ, υ), op(τ, υ)
endif training then return Gelse return False
end
T = 〈τ1, τ2, τ3, τ4, τ5, τ6, τ7, τ3, τ4, τ7, τ5, τ2, τ4, τ2op(τ2, τ7) = 〈1, 0, 0, 1, 0〉u(τ2, τ7) = 〈1, 0, 2〉
Sebastian Fischmeister, [email protected] 19 / 28
TPGs for Anomaly DetectionTrain TPG on other traces and perform online anomaly detection:
Data: trace stream W, TPG GResult: trained G or anomalous event ε in Wforeach ε ∈ W in the increasing order of ε.time do
υ ← ε.task idforeach τ such that e(τ, υ) ∈ G do
u(τ, υ)← append the # of occurrences of τ since previous occurrence of υif u(τ, υ) is not a prefix of op(τ, υ) then
if training then remove e(τ, υ) from G; update Gelse return ε.time, u(τ, υ), op(τ, υ)
endif training then return Gelse return False
end
T = 〈τ1, τ2, τ3, τ4, τ5, τ6, τ7, τ3, τ4, τ7, τ5, τ2, τ4, τ2, τ7op(τ2, τ7) = 〈1, 0, 0, 1, 0〉u(τ2, τ7) = 〈1, 0, 2〉 u(τ2, τ7) is not a prefix of op(τ2, τ7)
Sebastian Fischmeister, [email protected] 19 / 28
Evaluation in Automotive Context
Modern vehicles:• becoming connected (internet, bluetooth, radars, etc.);• becoming autonomous (automatic parking, autopilot).
Security threats:1. Remotely hack one or several ECUs, run malicious code on them.2. Spoof the identity of the hacked ECUs.3. Take over the control of the vehicle.
Examples:• Miller and Valasek on Jeep Cherokee (2015) 1
• Tencent Keen Lab on Tesla Model S (2016, 2017) 2
1https://www.wired.com/2015/07/hackers-remotely-kill-jeep-highway/2https://www.pcmag.com/news/355281/tesla-model-s-hackers-return-for-encore-attack
Sebastian Fischmeister, [email protected] 20 / 28
Evaluation in Automotive Context
Modern vehicles:• becoming connected (internet, bluetooth, radars, etc.);• becoming autonomous (automatic parking, autopilot).
Security threats:1. Remotely hack one or several ECUs, run malicious code on them.2. Spoof the identity of the hacked ECUs.3. Take over the control of the vehicle.
Examples:• Miller and Valasek on Jeep Cherokee (2015) 1
• Tencent Keen Lab on Tesla Model S (2016, 2017) 2
1https://www.wired.com/2015/07/hackers-remotely-kill-jeep-highway/2https://www.pcmag.com/news/355281/tesla-model-s-hackers-return-for-encore-attack
Sebastian Fischmeister, [email protected] 20 / 28
Evaluation in Automotive Context (cont.)Controller Area Network (CAN) bus:• de-facto standard for interconnecting vehicle’s ECUs;• ECUs broadcast messages;• no authentification;
CAN bus trace:
time message ID bytes b1 b2 b3 b4 b5 b6 b7 b8.... .... .... .... .... .... .... .... .... .... ....1478193190.235495 0545 8 d8 5f 00 8b 00 00 00 001478193190.235633 05f0 2 01 001478193190.236088 0130 8 0b 80 00 ff 0f 80 0c ea1478193190.236324 0131 8 f2 7f 00 00 15 7f 0c 351478193190.236565 0140 8 00 00 00 00 1e 0d 2c 3b1478193190.236815 02c0 8 15 00 00 00 00 00 00 00.... .... .... .... .... .... .... .... .... .... ....
Sebastian Fischmeister, [email protected] 21 / 28
Case Studies
CAN bus traces:Study 1 Hyundai YF Sonata with injected and annotated spoofed messages (HCRL lab
at Korea University 3)
Study 2 Production vehicle exercised on a test track (traces provided by our partner).3http://ocslab.hksecurity.net/Datasets/CAN-intrusion-dataset
Sebastian Fischmeister, [email protected] 22 / 28
Case Study I
• 4 traces with annotated spoofed messages.• spoofed messages have unique message IDs in two traces (DoS.csv and Fuzzy.csv):
• use these traces to train a TPG;• train on ”normal” parts: before the first spoofed message, 1000 events in each part.
0316
0329
1
018f
0260
1
02a0
1
1
0545
0002
1
0153
1
0430
2
02c0
1
0130
1
0350
1 1 11
0440
1
0131
1
0140
1
1
043f
1
0370
1
1 1
04f0
1
04b1
1
01f1
1
05f0 00a0
00a1
1
0690
1
The trained TPG
Sebastian Fischmeister, [email protected] 23 / 28
Case Study I (cont.)
Anomaly detection in the remaining 2 traces (gear.csv and RPM.csv):• attacks were detected prior to the first spoofed message – very good:
• 29 events before the first spoofed message in gear.csv• 7 events before the first spoofed message in RPM.csv• detected anomalies are probably caused by the activation of an external board used to
inject spoofed messages.
• false positive rates:• gear.csv: 0.019 (19 out of 1000 events)• RPM.csv: 0.007 (7 out of 1000 events)• (almost) all false positives are returned in the beginning of traces.
Observation: synchronization between analyzer and runtime needed
Sebastian Fischmeister, [email protected] 24 / 28
Case Study I (cont.)
Anomaly detection in the remaining 2 traces (gear.csv and RPM.csv):• attacks were detected prior to the first spoofed message – very good:
• 29 events before the first spoofed message in gear.csv• 7 events before the first spoofed message in RPM.csv• detected anomalies are probably caused by the activation of an external board used to
inject spoofed messages.• false positive rates:
• gear.csv: 0.019 (19 out of 1000 events)• RPM.csv: 0.007 (7 out of 1000 events)• (almost) all false positives are returned in the beginning of traces.
Observation: synchronization between analyzer and runtime needed
Sebastian Fischmeister, [email protected] 24 / 28
Case Study II
• Traces capture normal behavior of vehicle’s ECUs, no anomalies• The goal: evaluation of false positive rate• 15 traces in total:
Trace Duration (s) # events # tasks1 312 793,653 2522 245.5 653,932 2403 361.3 907,038 2244 390 1,044,558 2515 352.6 891,195 2356 287 727,508 2437 288.7 767,148 2398 307.2 1,260,299 2439 600 1,425,667 21910 343.2 866,187 22911 319 847,882 24112 249.9 635,463 23513 342.1 991,105 27614 314.3 834,942 25715 283.8 754,786 239
Sebastian Fischmeister, [email protected] 25 / 28
Case Study II (cont.)Evaluate the number of false positives (FPs):
1) build a TPG G using trace #1;2) detect anomalies in trace #2 using G; detected anomalies are FPs.3) refine G with trace #2;4) detect anomalies in trace #3 using G; detected anomalies are FPs.5) refine G with trace #3;6) apply the same procedure on traces ##4-15.
The trained TPG. Only nodes with I/O edges shown
Trace # FPs1 —2 1403 2204 115 16 387 28 99 010 011 012 013 014 015 0
Sebastian Fischmeister, [email protected] 26 / 28
Many Other Applications
• Insight and understanding for developers
• Documentation of (legacy) systems• Debugging• Safety assessment• Runtime monitoring• Model generation for simulation and optimization• . . .
Sebastian Fischmeister, [email protected] 27 / 28
Many Other Applications
• Insight and understanding for developers• Documentation of (legacy) systems
• Debugging• Safety assessment• Runtime monitoring• Model generation for simulation and optimization• . . .
Sebastian Fischmeister, [email protected] 27 / 28
Many Other Applications
• Insight and understanding for developers• Documentation of (legacy) systems• Debugging
• Safety assessment• Runtime monitoring• Model generation for simulation and optimization• . . .
Sebastian Fischmeister, [email protected] 27 / 28
Many Other Applications
• Insight and understanding for developers• Documentation of (legacy) systems• Debugging• Safety assessment
• Runtime monitoring• Model generation for simulation and optimization• . . .
Sebastian Fischmeister, [email protected] 27 / 28
Many Other Applications
• Insight and understanding for developers• Documentation of (legacy) systems• Debugging• Safety assessment• Runtime monitoring
• Model generation for simulation and optimization• . . .
Sebastian Fischmeister, [email protected] 27 / 28
Many Other Applications
• Insight and understanding for developers• Documentation of (legacy) systems• Debugging• Safety assessment• Runtime monitoring• Model generation for simulation and optimization
• . . .
Sebastian Fischmeister, [email protected] 27 / 28
Many Other Applications
• Insight and understanding for developers• Documentation of (legacy) systems• Debugging• Safety assessment• Runtime monitoring• Model generation for simulation and optimization• . . .
Sebastian Fischmeister, [email protected] 27 / 28
Conclusion
• TPG miner: an approach to reverse-engineer real-time embedded systems via miningtask precedence graphs from system traces
• Evaluation in the context of automotive security
• can be used as an anomaly-based intrusion detection system• straightforward application on CAN bus traces
• Detects anomalies in streaming data• Explains which expected system behavior has been violated• Requires little training to achieve low false positive rate
• Implemented in R: https://bitbucket.org/oiegorov/tpg_miner
• the README file explains how to reproduce results of Case Study I.
Sebastian Fischmeister, [email protected] 28 / 28
Conclusion
• TPG miner: an approach to reverse-engineer real-time embedded systems via miningtask precedence graphs from system traces
• Evaluation in the context of automotive security
• can be used as an anomaly-based intrusion detection system• straightforward application on CAN bus traces
• Detects anomalies in streaming data• Explains which expected system behavior has been violated• Requires little training to achieve low false positive rate
• Implemented in R: https://bitbucket.org/oiegorov/tpg_miner
• the README file explains how to reproduce results of Case Study I.
Sebastian Fischmeister, [email protected] 28 / 28
Conclusion
• TPG miner: an approach to reverse-engineer real-time embedded systems via miningtask precedence graphs from system traces
• Evaluation in the context of automotive security• can be used as an anomaly-based intrusion detection system
• straightforward application on CAN bus traces• Detects anomalies in streaming data• Explains which expected system behavior has been violated• Requires little training to achieve low false positive rate
• Implemented in R: https://bitbucket.org/oiegorov/tpg_miner
• the README file explains how to reproduce results of Case Study I.
Sebastian Fischmeister, [email protected] 28 / 28
Conclusion
• TPG miner: an approach to reverse-engineer real-time embedded systems via miningtask precedence graphs from system traces
• Evaluation in the context of automotive security• can be used as an anomaly-based intrusion detection system• straightforward application on CAN bus traces
• Detects anomalies in streaming data• Explains which expected system behavior has been violated• Requires little training to achieve low false positive rate
• Implemented in R: https://bitbucket.org/oiegorov/tpg_miner
• the README file explains how to reproduce results of Case Study I.
Sebastian Fischmeister, [email protected] 28 / 28
Conclusion
• TPG miner: an approach to reverse-engineer real-time embedded systems via miningtask precedence graphs from system traces
• Evaluation in the context of automotive security• can be used as an anomaly-based intrusion detection system• straightforward application on CAN bus traces
• Detects anomalies in streaming data
• Explains which expected system behavior has been violated• Requires little training to achieve low false positive rate
• Implemented in R: https://bitbucket.org/oiegorov/tpg_miner
• the README file explains how to reproduce results of Case Study I.
Sebastian Fischmeister, [email protected] 28 / 28
Conclusion
• TPG miner: an approach to reverse-engineer real-time embedded systems via miningtask precedence graphs from system traces
• Evaluation in the context of automotive security• can be used as an anomaly-based intrusion detection system• straightforward application on CAN bus traces
• Detects anomalies in streaming data• Explains which expected system behavior has been violated
• Requires little training to achieve low false positive rate
• Implemented in R: https://bitbucket.org/oiegorov/tpg_miner
• the README file explains how to reproduce results of Case Study I.
Sebastian Fischmeister, [email protected] 28 / 28
Conclusion
• TPG miner: an approach to reverse-engineer real-time embedded systems via miningtask precedence graphs from system traces
• Evaluation in the context of automotive security• can be used as an anomaly-based intrusion detection system• straightforward application on CAN bus traces
• Detects anomalies in streaming data• Explains which expected system behavior has been violated• Requires little training to achieve low false positive rate
• Implemented in R: https://bitbucket.org/oiegorov/tpg_miner
• the README file explains how to reproduce results of Case Study I.
Sebastian Fischmeister, [email protected] 28 / 28
Conclusion
• TPG miner: an approach to reverse-engineer real-time embedded systems via miningtask precedence graphs from system traces
• Evaluation in the context of automotive security• can be used as an anomaly-based intrusion detection system• straightforward application on CAN bus traces
• Detects anomalies in streaming data• Explains which expected system behavior has been violated• Requires little training to achieve low false positive rate
• Implemented in R: https://bitbucket.org/oiegorov/tpg_miner
• the README file explains how to reproduce results of Case Study I.
Sebastian Fischmeister, [email protected] 28 / 28
Conclusion
• TPG miner: an approach to reverse-engineer real-time embedded systems via miningtask precedence graphs from system traces
• Evaluation in the context of automotive security• can be used as an anomaly-based intrusion detection system• straightforward application on CAN bus traces
• Detects anomalies in streaming data• Explains which expected system behavior has been violated• Requires little training to achieve low false positive rate
• Implemented in R: https://bitbucket.org/oiegorov/tpg_miner• the README file explains how to reproduce results of Case Study I.
Sebastian Fischmeister, [email protected] 28 / 28