
Page 1: Outline

Automatic Trust Management for Adaptive Survivable Systems

Howard Shrobe, MIT AI Lab

Computational Vulnerability Analysis for Model Based Diagnosis

July 2001 PI Meeting, Santa Fe

Page 2: Outline

Outline

• Overall Framework

• Review of Diagnostic Process

• Computational Vulnerability Analysis

Page 3: Outline

Adaptive Survivable Systems

• Techniques that enable self-monitoring and diagnosis
– Driven by representations of structure and purpose
– The application knows the purposes of its components
– The application checks that these purposes are achieved
– If they are not achieved, the application localizes and characterizes the failure

• Techniques that enable application adaptation
– The application achieves its purpose as well as possible within the available infrastructure by choosing alternatives
– Driven by models of trust (informed by diagnosis and monitoring)
– Driven by models of computational alternatives
– It must have more than one way to effect each critical computation
– It should choose an alternative approach if the first one fails
– It should make its initial choices in light of the trust model

Page 4: Outline

The Active Trust Management Architecture

[Architecture diagram: Self Adaptive Survivable Systems; Perpetual Analytical Monitoring; Trust Model (Trustworthiness, Compromises, Attacks); Rational Decision Making; Rational Resource Allocation; Other Information Sources (Intrusion Detectors); Trend Templates; System Models & Domain Architecture.]

Page 5: Outline

Motivating Example

[Diagram of the motivating example: a speech-understanding pipeline — Voice Capture (utterance), Speech Processing with a Grammar from the Grammar Center (text), an Omnibase query, and a Display Generator producing GUI directives for the Display — running on hosts Sleepy, Grumpy, Doc, and Dopey, with performance expectations and integrity constraints attached to the computation.]

Page 6: Outline

Diagnosis as Likely Mode Identification

• Single-Level, Single-Model Model Based Diagnosis
– Tells you which components aren't working as expected

• Multi-Mode Diagnosis
– Tells you in what way they aren't working as expected

• Multi-Level, Multi-Mode Diagnosis
– Tells you how the misbehaviors are coupled through common-mode failures (or compromises) and ranks the failures by their probabilities

• Attack Models
– Tell you how the common-mode failures (or compromised modes of the resources) are in turn coupled to common attacks exploiting vulnerabilities of the resources

Page 7: Outline

Model Based Diagnosis with Multiple Faults

• Each component is modeled by multi-directional constraints representing the normal behavior

• As a value is propagated through a component model, it is labeled with the assumption that this component works

• A conflict is detected at any place to which inconsistent values are propagated

• A conflict set is the set of all labels attached to the conflicting values

• A diagnosis is a set of assumptions that forms a covering set of all conflict sets

• The goal is to find all minimal diagnoses
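The covering-set step can be sketched in a few lines of Python; the conflict sets and component names below are hypothetical, and the constraint-propagation machinery that produces the conflicts is omitted:

```python
from itertools import combinations

def minimal_diagnoses(conflict_sets, components):
    """Enumerate minimal diagnoses: minimal sets of 'broken' assumptions
    that cover (intersect) every conflict set, smallest first."""
    diagnoses = []
    for size in range(1, len(components) + 1):
        for candidate in combinations(components, size):
            cand = set(candidate)
            # Skip candidates that contain an already-found, smaller diagnosis
            if any(d <= cand for d in diagnoses):
                continue
            # A diagnosis must hit every conflict set
            if all(cand & conflict for conflict in conflict_sets):
                diagnoses.append(cand)
    return diagnoses

# Hypothetical conflicts over five components
conflicts = [{"A", "B", "C"}, {"A", "C", "D"}]
print(minimal_diagnoses(conflicts, ["A", "B", "C", "D", "E"]))
# -> [{'A'}, {'C'}, {'B', 'D'}]
```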

Page 8: Outline

Model Based Troubleshooting: GDE

[Figure: the classic GDE example — three multipliers (Times) feeding two adders (Plus), with inputs 3, 5, 3, 5, 5; one output is observed as 35 where 40 is predicted.

Diagnoses:
– Blue or Violet broken
– Green broken, Red with compensating fault
– Green broken, Yellow with masking fault]

Page 9: Outline

Multi-Mode Diagnosis

Consistent Diagnoses:

  A        B        C         MID Low   MID High   Prob     Explanation
  Normal   Normal   Slow          3         3      .04410   C is delayed
  Slow     Fast     Normal        7        12      .00640   A Slow, B Masks; runs negative!
  Fast     Normal   Slow          1         2      .00630   A Fast, C Slower
  Normal   Fast     Slow          4         6      .00196   B not too fast, C slow
  Fast     Slow     Slow        -30         0      .00042   A Fast, B Masks, C slow
  Slow     Fast     Fast         13        30      .00024   A Slow, B Masks, C not masking fast

[Figure: components A, B, and C, each with per-mode delay ranges and prior probabilities (Low, High, P), e.g., Normal: 3–6, p = .7; Fast: −30–2, p = .1; Slow: 7–30, p = .2. Predicted vs. observed values: MID predicted Low = 3, High = 6; OUT1 observed 5, predicted Low = 5, High = 10; OUT2 observed 17, predicted Low = 8, High = 16.]

Page 10: Outline

Multi-Mode Multi-Tiered Diagnosis

• The model is augmented with another level of detail showing the dependence of computations on underlying resources
• Each resource has models of its states of compromise
• The modes of the resource models are linked to the modes of the computational models by conditional probabilities
• The model forms a Bayesian network

[Figure: Component 1 is located on resource Node17. Node17 has compromise modes (Normal: probability 90%; Parasite: 9%; Other: 1%); Component 1 has behavioral modes (Normal: delay 2–4; Delayed: delay 4–+inf; Accelerated: delay −inf–2). Conditional probabilities (e.g., .2, .3, .4) link the resource modes to the computational modes.]
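As a minimal sketch (not the deck's actual representation), the two tiers and the conditional probabilities linking them can be tabulated and marginalized like this; the mode names and numbers are illustrative:

```python
# Hypothetical two-tier model: resource compromise modes, computation modes,
# and P(computation mode | resource mode) linking the tiers.
resource_modes = {"node17": {"normal": 0.90, "parasite": 0.09, "other": 0.01}}

conditional = {
    ("component1", "node17"): {
        "normal":   {"normal": 0.90, "delayed": 0.07, "accelerated": 0.03},
        "parasite": {"normal": 0.20, "delayed": 0.70, "accelerated": 0.10},
        "other":    {"normal": 0.40, "delayed": 0.40, "accelerated": 0.20},
    }
}

def prior_component_mode(component, resource):
    """Marginalize over the resource's compromise modes to get the
    prior probability of each behavioral mode of the computation."""
    result = {}
    for r_mode, r_p in resource_modes[resource].items():
        for c_mode, c_p in conditional[(component, resource)][r_mode].items():
            result[c_mode] = result.get(c_mode, 0.0) + r_p * c_p
    return result

print(prior_component_mode("component1", "node17"))
```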

Page 11: Outline

An Example System Description

[Figure: computations A–E located on Host1–Host4. Each host has a prior over compromise modes (Normal/Hacked: .9/.1, .85/.15, .8/.2, .7/.3), and each computation has a table of behavioral-mode probabilities conditioned on whether its host is Normal (N) or Hacked (H), e.g., Normal .50/.05, Fast .25/.45, Slow .25/.50.]

Page 12: Outline

The System Description includes a Bayesian Network

• The model can be viewed as a two-tiered Bayesian network
– Resources with modes
– Computations with modes
– Conditional probabilities linking the modes

[Figure: the example system description from the previous slide, viewed as a two-tiered Bayesian network — host compromise modes in one tier, computation modes in the other, linked by conditional probabilities.]

Page 13: Outline

The system description includes a behavioral model

• The model can also be viewed as a behavioral model with multiple modes per device
– Each mode has a behavioral description
• The modes have posterior probabilities, linked by conditional probabilities to the probabilities of the modes of the resources

[Figure: the behavioral view of the same example — computations A–E, each with a table of mode probabilities conditioned on the state of its host (N/H).]

Page 14: Outline

Integrating model based and Bayesian reasoning

• Start with each behavioral model in the "normal" state
• Repeat: check the current set of models for consistency
• If inconsistent:
– Add a new node to the Bayesian network
• This node represents the logical-and of the nodes in the conflict
• Its truth-value is pinned at FALSE
– Prune out all possible solutions which are a superset of the conflict set
– Pick another set of models from the remaining solutions
• If consistent, add to the set of possible diagnoses
• Continue until all inconsistent sets of models are found
• Solve the Bayesian network

[Figure: on the example system, a discrepancy observed at one output yields the conflict {A = NORMAL, B = NORMAL, C = NORMAL}; the least likely member of the conflict is retracted and its most likely alternative, SLOW, is tried next.]
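A compressed sketch of this loop is below; the consistency check and Bayesian solver are left as stubs, and the pruning is done by discarding any candidate that contains a known conflict. The names are illustrative, not the actual implementation:

```python
from itertools import product

def diagnose(components, modes, is_consistent, solve_bayes_net):
    """Conflict-directed search over mode assignments.
    is_consistent(assignment) -> (True, None) or (False, conflict),
    where conflict is the sub-assignment that clashed."""
    conflicts = []          # each conflict becomes an AND node pinned FALSE
    diagnoses = []
    candidates = [dict(zip(components, combo))
                  for combo in product(*(modes[c] for c in components))]
    for assignment in candidates:
        # Prune any candidate that is a superset of a known conflict
        if any(all(assignment[c] == m for c, m in conflict.items())
               for conflict in conflicts):
            continue
        ok, conflict = is_consistent(assignment)
        if ok:
            diagnoses.append(assignment)
        else:
            conflicts.append(conflict)
    # Pin each conflict node FALSE and compute posteriors over all modes
    return diagnoses, solve_bayes_net(conflicts)
```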

Page 15: Outline

Adding Attack Models

• An Attack Model specifies the set of attacks that are believed to be possible in the environment
• Each resource has a set of vulnerabilities
– Vulnerabilities enable attacks on that resource
• A successful attack exploits a vulnerability, putting the resource into a non-normal behavioral mode
• This is given as a set of conditional probabilities
– If the attack succeeded on a resource of this type, then the likelihood that the resource is in mode-x is P
– This now forms a three-tiered Bayesian network

[Figure: Host1 is of resource-type Unix-Family and has-vulnerability Buffer-Overflow, which enables an Overflow-Attack; the attack causes resource modes (Normal, Slow) with conditional probabilities such as .5 and .7.]
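One way the attack tier might be folded into the resource priors, as a rough illustrative sketch (the vulnerability names, attack prior, and probabilities are invented):

```python
# Hypothetical third tier: attacks enabled by a resource's vulnerabilities,
# and P(resource mode | attack succeeded) linking attacks to resource modes.
vulnerabilities = {"host1": ["buffer-overflow"]}
enabled_attacks = {"buffer-overflow": ["overflow-attack"]}
attack_prior = {"overflow-attack": 0.05}     # a priori chance the attack is tried and succeeds

# P(resource mode | overflow-attack succeeded on this resource type)
mode_given_attack = {"overflow-attack": {"normal": 0.3, "hacked": 0.7}}

def resource_mode_prior(resource, baseline):
    """Combine the baseline compromise prior with the attack tier."""
    result = dict(baseline)                   # e.g. {"normal": .9, "hacked": .1}
    for vuln in vulnerabilities.get(resource, []):
        for attack in enabled_attacks[vuln]:
            p_att = attack_prior[attack]
            for mode in result:
                result[mode] = ((1 - p_att) * result[mode]
                                + p_att * mode_given_attack[attack].get(mode, 0.0))
    return result

print(resource_mode_prior("host1", {"normal": 0.9, "hacked": 0.1}))
# -> {'normal': 0.87, 'hacked': 0.13}
```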

Page 16: Outline

Three Tiered Model

Page 17: Outline

What the diagnostic process tells us

• All non-conflicting combinations of models are possible diagnoses
• The posterior probabilities tell you how likely each diagnosis is
– This guides recovery processing
• Each mode of each resource has a posterior probability
– This guides resource selection in the future
• The attack models couple the resource models, giving a system-wide view
– This informs the trust model
– This couples to long-term monitoring, which looks for complex multi-stage attacks

Page 18: Outline

Computational Vulnerability Analysis

• Grounding the attack model in systematic analysis

• Ontology of:
– System Properties
– System Types
– System Structure
– Control and Dependencies

Page 19: Outline

Generating Attack Models Through Vulnerability Analysis

• The problem: where do the attack model and its links to behavioral modes come from?
– So far, by hand crafting
• Vulnerability analysis supplants this with a systematic analysis:
– Forming an ontology of how computer systems are structured
– Building models of the environment
• Network topology: nodes, routers, switches, filters, firewalls
• System types: hardware, operating systems
• Server and user suites: which servers and users run where
– Analyzing how properties depend on resources
– Analyzing the vulnerabilities of the resources

Page 20: Outline

Modeling System Structure

[Ontology diagram: Hardware with parts Processor, Memory, and Device Controllers (controlling Devices); the Operating System with parts Logon Controller, Scheduler, Scheduler Policy, Device Drivers, Job Admitter, File System, and Access Controller; plus User Set, Workload, resources, and files — all linked by part-of, resides-in, controls, and input-to relations (e.g., the Workload and Scheduler Policy are inputs to the Scheduler; the Access Controller controls files).]
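As an illustrative sketch, such an ontology could be stored as a set of typed relations that the rule engine queries; the relation and object names below are simplified stand-ins for the diagram's vocabulary, not the actual representation:

```python
from collections import defaultdict

class StructureModel:
    """Toy store of structural relations (part-of, input-to, controls, resides-in).
    Illustrative only; names mirror the diagram, not the real system."""
    def __init__(self):
        self._facts = defaultdict(set)            # (relation, subject) -> objects

    def add(self, relation, subject, obj):
        self._facts[(relation, subject)].add(obj)

    def query(self, relation, subject):
        return self._facts[(relation, subject)]

m = StructureModel()
m.add("part-of", "scheduler", "operating-system")          # the scheduler is part of the OS
m.add("part-of", "access-controller", "operating-system")
m.add("input-to", "workload", "scheduler")                 # the workload is an input to the scheduler
m.add("input-to", "scheduler-policy", "scheduler")
m.add("controls", "job-admitter", "workload")
m.add("resides-in", "operating-system", "main-memory")

print(m.query("input-to", "workload"))      # -> {'scheduler'}
print(m.query("part-of", "scheduler"))      # -> {'operating-system'}
```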

Page 21: Outline

Modeling the topology

[Topology diagram with per-node annotations, e.g.:
– Machine name: sleepy; OS type: Windows-NT; server suite: IIS…; user authentication pool: Dwarfs…
– Router: enclave restrictions…
– Switch: subnet restrictions…]

Topology tells you:
– who can share (and sniff) which packets
– who can affect what types of connections to whom

Page 22: Outline

Modeling Dependencies

• Start with the desirable properties of systems:
– Reliable performance
– Privacy of communications
– Integrity and/or privacy of data

• Analyze which system components impact those properties:
– Performance: the scheduler
– Privacy: the access-controller

• To affect a desirable property, control a component that contributes to the delivery of that property

Page 23: Outline

Controlling components (1)

• One way to gain control of a component is to directly exploit a known vulnerability
– One way to control a Microsoft IIS web server is to use a buffer-overflow attack on it

[Diagram: the IIS web server is vulnerable to a buffer-overflow attack; the buffer-overflow attack takes control of the IIS web server process.]

Page 24: Outline

Controlling components (2)

• Another way to control a component is to find an input to the component and then find a way to modify the input
– Modify the scheduler policy parameters

[Diagram: the scheduler policy parameters are an input to the scheduler; control the scheduler by a modification-action on the scheduler policy parameters.]

Page 25: Outline

Controlling components (3)

• Another way to control a component is to find one of its sub-components and then find a way to gain control of that sub-component

[Diagram: the user job admitter is a component of the job admitter; control the job admitter by a control-action on the user job admitter.]

Page 26: Outline

Modifying Inputs (1)

• One way to modify an input is to find a component which controls the input and then find a way to gain control of that component

[Diagram: the workload is an input of the scheduler; the job admitter controls the workload, so attack the job admitter to control the workload.]

Page 27: Outline

Modifying Inputs (2)

• One way to modify an input is to find a component of the input and then to find a way to modify the component

[Diagram: the workload is an input of the scheduler; the user workload is a component of the workload, so modify the user workload in order to modify the workload.]
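Taken together, these slides describe what amounts to a small set of mutually recursive planning rules. A self-contained, illustrative sketch of those rules follows; the relation names, the toy knowledge base, and the vulnerability table are all assumptions, and cycle detection is omitted for brevity:

```python
# Illustrative only: a tiny knowledge base of structural relations and a pair of
# mutually recursive rules for "control a component" / "modify an input".
KB = {
    ("has-input", "scheduler"):    ["workload", "scheduler-policy"],
    ("controlled-by", "workload"): ["job-admitter"],
    ("has-part", "workload"):      ["user-workload"],
}
VULNERABLE_TO = {"job-admitter": ["buffer-overflow-attack"]}   # hypothetical

def rel(relation, subject):
    return KB.get((relation, subject), [])

def ways_to_control(component):
    """Plans that gain control of a component."""
    plans = [[("exploit", attack, component)]                    # 1. exploit a vulnerability
             for attack in VULNERABLE_TO.get(component, [])]
    for inp in rel("has-input", component):                      # 2. modify one of its inputs
        plans += [p + [("control-via-input", inp, component)]
                  for p in ways_to_modify(inp)]
    for part in rel("has-part", component):                      # 3. control a sub-component
        plans += [p + [("control-via-part", part, component)]
                  for p in ways_to_control(part)]
    return plans

def ways_to_modify(thing):
    """Plans that modify something, e.g. an input."""
    plans = []
    for controller in rel("controlled-by", thing):                # 1. control its controller
        plans += [p + [("modify-via-controller", controller, thing)]
                  for p in ways_to_control(controller)]
    for part in rel("has-part", thing):                           # 2. modify one of its parts
        plans += [p + [("modify-via-part", part, thing)]
                  for p in ways_to_modify(part)]
    return plans

print(ways_to_control("scheduler"))
# -> one plan: exploit job-admitter, modify workload via it, control scheduler via its input
```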

Page 28: Outline

Access Rights

• Each object specifies a set of capabilities required for each operation on that object
– Capabilities are organized in a DAG
– This generalizes the access mechanisms of all operating systems

• Each actor (user or process) possesses certain capabilities

• An actor can perform an operation on an object only if it possesses a capability at least as strong as that required for the operation
– This is a generalization of the access mechanisms in all current operating systems

• An access pool is a set of machines that shares resources, passwords, and access-right descriptions
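A minimal sketch of the "at least as strong" test over a capability DAG; the capability names and dominance edges are invented for illustration:

```python
# Hypothetical capability DAG: an edge means the capability on the left
# dominates (is at least as strong as) the capabilities on the right.
dominates_edges = {
    "root":        ["admin-write", "user-write"],
    "admin-write": ["user-write"],
    "user-write":  ["user-read"],
}

def at_least_as_strong(held, required):
    """True if `held` dominates `required` in the capability DAG."""
    if held == required:
        return True
    return any(at_least_as_strong(weaker, required)
               for weaker in dominates_edges.get(held, []))

def can_perform(actor_capabilities, required_capability):
    """An actor may perform the operation if any capability it possesses
    is at least as strong as the one the object requires."""
    return any(at_least_as_strong(c, required_capability)
               for c in actor_capabilities)

print(can_perform({"admin-write"}, "user-read"))   # True
print(can_perform({"user-read"}, "user-write"))    # False
```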

Page 29: Outline

The AI Lab Topology (partial)

[Topology diagram: the router Netchex filters out Telnet and connects a server switch and floor switches (8th-Floor-1, 8th-Floor-2, 7th-Floor-1). Machines — Life, Kenmore, Maytag, Doc, Dopey, Sleepy, Sneezy, Sakharov, Truman, Quincy-Adams, Jefferson, Wilson, CreepyCrawler — are grouped into Router, Server, Dwarf, Lisp, and General access pools.]

Page 30: Outline

Obtaining Access (1)

• One way to gain access to an operation on an object is to find a process with an adequate capability and take control of the process

[Diagram: reading a Typical User File requires the User-Read capability; a Typical User Process possesses that capability, so to read the file, take control of that process via a control-action.]

Page 31: Outline

Obtaining Access (2)

• Another way to gain access to an operation on an object is to find a user with an adequate capability and find a way to log in as that user and launch a process with the user’s capabilities

[Diagram: reading a Typical User File requires the User-Read capability; a Typical User possesses that capability, so log on as that user and launch a user process that carries the capability.]

Page 32: Outline

Logging On

• Logging on requires obtaining knowledge of a password

• To gain knowledge of a password:
– Guess it, using guessing attacks
– Sniff it
• By placing a parasitic virus on the user's machine
• By monitoring network traffic
– Hack the password file

Page 33: Outline

Monitoring and Changing Network Traffic

• Networks are broken down into subnet segments
• Segments are connected by routers
– Routers can monitor traffic on any connected segment
• Each segment may be:
– Shared media
• Coaxial Ethernet
• Wireless Ethernet
• Any connected computer can monitor traffic
– Switched media
• 10 (100, 1000) Base-T
• Only the switch (or reflected ports) can monitor traffic
• Switches and routers are computers
– They can be controlled
– But they may be members of special access pools
• To gain knowledge of some information, gain the ability to monitor network traffic

Page 34: Outline

Residences

• Components reside in several places:
– Main memory
– Boot files
– Paging files

• They migrate between residences:
– Through local peripheral controllers
– Through networks

• To modify/observe a component, find a residence of the component and modify/observe it in that residence

• To modify/observe a component, find a migration path and modify/observe it during the transmission

Page 35: Outline

Formats and Transformations

• Components live in several different formats:
– Source code
– Compiled binary code
– Linked executable images

• Processes transform one format into another:
– Compilation
– Linking

• To modify a component, change an upstream format and cause the transformations to happen

• To modify a component, gain control of the processes that perform the transformations

Page 36: Outline

Modification during Transmission

• To control traffic on a network segment, launch a "man in the middle" attack
– Get control of a machine and redirect traffic to it
• To observe network traffic, get control of a switch/router and a user machine and then reflect traffic to the user machine
• To modify network traffic, launch an "inserted packet" attack
– Get control of a machine
– Send a packet from the controlled machine with the correct serial number but wrong data, before the sender sends the real packet

Page 37: Outline

An Example

• Affecting reliable performance:
– Control the scheduler
• The scheduler is a component that impacts performance
– By modifying the scheduler's policy parameters
• The policy parameters are inputs to the scheduler
– By gaining root access
• The policy parameters require root access for writing
– By using a buffer-overflow attack on the web server
• The web-server process possesses root capabilities
• The web-server process is vulnerable to a buffer-overflow attack

• For this attack to impact performance, all of the actions must succeed
– Each has an a priori probability based on its inherent difficulty and on current evidence suggesting that it occurred
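As a rough illustration of that last point, the plan's overall likelihood is simply the product of its steps' success probabilities when the steps are treated as independent; the step names and numbers below are invented:

```python
from math import prod

# Hypothetical attack plan: every step must succeed for performance to be affected.
plan = [
    ("buffer-overflow on web server",       0.30),  # a priori difficulty
    ("use web server's root capability",    0.90),
    ("rewrite scheduler policy parameters", 0.80),
    ("scheduler degrades performance",      0.95),
]

# With independent steps, the plan succeeds only if every step does.
p_plan = prod(p for _, p in plan)
print(f"P(attack plan succeeds) = {p_plan:.3f}")   # 0.30 * 0.90 * 0.80 * 0.95 ≈ 0.205
```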

Page 38: Outline

Affecting Data Privacy (1)

Page 39: Outline

Affecting Data Privacy (2)

Page 40: Outline

Affecting Data Privacy (3)

Page 41: Outline

Affecting Performance (1)

Page 42: Outline

Affecting Performance (2)

Page 43: Outline

Using Attack Scenarios

• This information is captured in an object-oriented knowledge representation and rule-based system that reasons with it

• The inference process develops multi-stage attack scenarios

• The scenarios are transformed into trend templates for recognition purposes

• The scenarios are transformed into Bayesian network fragments for diagnostic purposes

Page 44: Outline

Integration Opportunities

• Projects that provide self-monitoring capabilities
– We depend on self-monitoring
– We typically assume coarse grain (e.g., method wrapping)
– Could use lower-level tools as well

• Projects that provide policy enforcement
– Attempted violations of policies should trigger diagnostic activity

• Projects that provide recovery capabilities

• Participation in framework development