
Page 1: A Specification Based Approach for Building Survivable …seclab.cs.sunysb.edu/sekar/papers/ycaith.doc  · Web view2014-07-10 · This thesis presents a specification-based approach

A specification-based approach for intrusion detection

by

Yong Cai

A thesis submitted to the graduate faculty

in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

Major: Computer Science

Major Professor: R. C. Sekar

Iowa State University

Ames, Iowa

1999


Graduate College

Iowa State University

This is to certify that the Master’s thesis of

Yong Cai

has met the thesis requirements of Iowa State University

________________________________

Major Professor

________________________________

For the Major Program

________________________________

For the Graduate College


TABLE OF CONTENTS

ABSTRACT

CHAPTER 1. INTRODUCTION
1.1 Computer Security and Intrusion Detection
1.2 Our Approach
1.3 Key Contributions
1.4 Thesis Organization

CHAPTER 2. OVERALL APPROACH AND RELATED WORK
2.1 Behavioral Specifications Model
2.2 Detection System Model
2.3 Related Work
  2.3.1 Misuse Intrusion Detection
  2.3.2 Anomaly Intrusion Detection
  2.3.3 Specification Based Monitoring
2.4 Benefits of Our Approach

CHAPTER 3. ASL SPECIFICATION LANGUAGE
3.1 External Functions
3.2 Events
3.3 Patterns
  3.3.1 Atomic Patterns
  3.3.2 Primitive Patterns
  3.3.3 General Event Patterns
3.4 Rules
3.5 Event Abstractions
3.6 Example Specifications

CHAPTER 4. ATOMIC EXECUTION
4.1 Atomic Execution
4.2 Defining Readset/Writeset
4.3 Implementation Approach
4.4 Some Examples of the Algorithm
4.5 Discussion of Correctness

CHAPTER 5. TRANSLATION FROM ASL INTO AUTOMATON
5.1 Translation Algorithm
5.2 Illustration of Automata Construction
5.3 Code Generation
5.4 Examples of ASL Specifications Translated into C++ Class Definitions

CHAPTER 6. SUMMARY AND CONCLUSION
6.1 Effectiveness
6.2 Summary
6.3 Conclusion and Future Work

APPENDIX A. CLASSIFICATION OF SYSTEM CALLS IN RED HAT LINUX

APPENDIX B. COMPILED CODE FOR ASL SPECIFICATION FOR RACE VULNERABILITY

BIBLIOGRAPHY

ACKNOWLEDGEMENTS


ABSTRACT

Computer security is receiving increasing attention as computers play ever more important roles in our society. Many critical services depend heavily on computers, so it is essential to make computer systems highly robust and reliable. Intrusion detection is a technique for enhancing computer security: it can detect successful breaches of security as well as monitor attempts to breach it. This thesis presents a specification-based approach for intrusion detection.

Traditional intrusion detection methods have some problems. Anomaly intrusion detection detects intrusions based on the anomalous behavior of a process, but it is difficult to set the anomaly thresholds that define which behavior should be considered an intrusion. Misuse intrusion detection looks for intrusions that follow well-defined patterns of attack. It cannot detect previously unknown attacks, since it is impossible to write a pattern for an unknown vulnerability.

Our approach uses high-level specifications to describe the security-related behaviors of processes. These specifications are intended to capture the normal or intended behaviors of processes, so deviations from them are indicative of intrusions. Thus, attacks can be detected even though they may not have been encountered previously. We design a high-level language called Auditing Specification Language (ASL) to specify security-related behavior. This language is powerful enough to express a range of integrity constraints and behaviors over time. Specifications in ASL are compiled into optimized C++ programs for efficient detection of deviations from these specifications. Our compiler translates ASL specifications into an Extended Finite-State Automaton (EFSA). An EFSA is similar to a finite-state automaton, augmented with a set of state variables. The EFSA can be simulated at runtime to detect intrusions efficiently.

Based on the security problems published by CERT over the past five years, we believe that our technique can capture most intrusions.


CHAPTER 1. INTRODUCTION

The growing use of computers makes information and networking technologies play an increasingly important role in our society. Many critical services, such as telecommunication, commerce and banking, and transportation, depend heavily on computers. Along with the benefits brought about by this change, several new dangers arise. Individuals and organizations can wreak havoc on these critical services by attacking their underlying computing and networking infrastructures. Consequently, it becomes very important to build computer systems that are highly robust and resilient, so that they can continue to perform their critical functions even in the face of large-scale failures and malicious attacks.

1.1 Computer Security and Intrusion Detection

The terms security and intrusion have been defined in many ways. One broad definition of a secure computer system is given by Garfinkel and Spafford [GS91] as one that can be depended upon to behave as it is expected to. The expected behavior is formalized into the security policy of the computer system and governs the goals that the system must meet. A narrower definition of computer security is based on the realization of confidentiality, integrity, and availability in a computer system [RS91]. Confidentiality requires that information be accessible only to those authorized for it; integrity requires that information remain unaltered by accidents or malicious attempts to change it; and availability means that the computer system remains working without degradation of access and provides resources to authorized users when they need them.


Intrusion is defined by Heady et al. [HLMM91] as a set of actions that attempt to compromise the integrity, confidentiality, or availability of a resource. In other words, an intrusion is a violation of the security policy of the system.

Most computer systems provide an access control mechanism as their first line of

defense [Lam69]. Access control is often presented as the enforcement of policy described

by an access matrix. The columns of the matrix represent the operations that may be

performed on the protected object or service, and each row represents a client. For each

client there is therefore a complete description of which operations that client may undertake,

and likewise, for each operation we may enumerate all clients who have access. The access

matrix is conceptually simple and the majority of access control models are based on it.

However, access control cannot prevent unauthorized information flow through the system

because such flow can take place with authorized accesses to the objects.
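As a minimal illustration of the access matrix described above, the following Python sketch stores one row per client and one column per operation. The client names, the operation names, and the helper functions are hypothetical and were chosen only for this example.

```python
# Hypothetical access matrix: rows are clients, columns are operations.
OPERATIONS = ["read", "write", "execute"]

ACCESS_MATRIX = {
    "alice": {"read": True, "write": True, "execute": False},
    "bob": {"read": True, "write": False, "execute": False},
}


def is_allowed(client, operation):
    """Return True if the matrix grants `operation` to `client`."""
    return ACCESS_MATRIX.get(client, {}).get(operation, False)


def operations_of(client):
    """Row view: every operation the client may undertake."""
    return [op for op in OPERATIONS if is_allowed(client, op)]


def clients_with(operation):
    """Column view: every client that has access to the operation."""
    return sorted(c for c in ACCESS_MATRIX if is_allowed(c, operation))
```

Each check looks only at a single requested access, which is why the matrix by itself cannot stop information from flowing through a series of individually authorized accesses.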

Information flow can be controlled to enhance security by applying models such as the Bell and LaPadula model [BL73] to provide secrecy, or the Biba model [Bib77] to provide integrity. However, security comes at the cost of convenience. Both models restrict read and write operations to ensure security, to the point that a completely secure system may not be very useful.
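The read and write restrictions imposed by the two models can be sketched as follows. The integer levels and the function names are assumptions made for this illustration; the rules shown are the commonly stated "no read up, no write down" properties of the Bell and LaPadula model and their duals in the strict integrity policy of the Biba model.

```python
def blp_can_read(subject_level, object_level):
    # Bell-LaPadula simple security property: no read up.
    return subject_level >= object_level


def blp_can_write(subject_level, object_level):
    # Bell-LaPadula *-property: no write down.
    return subject_level <= object_level


def biba_can_read(subject_level, object_level):
    # Biba strict integrity: no read down.
    return subject_level <= object_level


def biba_can_write(subject_level, object_level):
    # Biba strict integrity: no write up.
    return subject_level >= object_level


def allowed_under_both(subject_level, object_level):
    """Operations permitted when one level serves (by assumption)
    as both the secrecy level and the integrity level."""
    ops = set()
    if blp_can_read(subject_level, object_level) and \
            biba_can_read(subject_level, object_level):
        ops.add("read")
    if blp_can_write(subject_level, object_level) and \
            biba_can_write(subject_level, object_level):
        ops.add("write")
    return ops
```

Under the simplifying assumption that the same level is used for secrecy and integrity, the combined rules allow access only between equal levels, which gives a feel for how restrictive such a system can become.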

Moreover, neither access control nor protection models help against insider threats or a compromise of the authentication module. For example, if a password is weak and is compromised, access control cannot prevent the loss of information that the compromised user was authorized to access. More generally, faults in system software that arise from poor design or inadequate testing often manifest themselves as security weaknesses.

Intrusion detection systems fill this gap. They are useful not only in detecting successful breaches of security, but also in monitoring attempts to breach security, which provides important information for timely countermeasures. For example, where access control offers no protection once a weak password exists, an intrusion detection system can monitor attackers' attempts to guess the password, so that an appropriate response can be launched quickly. Intrusion detection is usually considered the last line of defense against attacks.


1.2 Our Approach

Intrusion detection has traditionally been divided into two categories: anomaly intrusion detection and misuse intrusion detection. The first refers to intrusions that can be detected based on anomalous behavior and use of computer resources. For example, if a user only uses the computer during office hours, activity on his account at midnight might be an intrusion. Anomaly detection attempts to identify typical usage patterns and flag other activities as potential intrusions. Anomaly detection has the advantage that no specific knowledge about security flaws is required to detect intrusions. On the other hand, it is difficult to set the anomaly thresholds that define which behavior should be considered an intrusion. Sometimes an abnormal behavior is not an intrusion but simply a situation that has not been seen before. At other times, an attacker is skillful enough that the intrusive behavior differs only slightly from normal behavior.

Misuse intrusion detection targets intrusions that follow well-defined patterns of attack that exploit weaknesses in system and application software. Such patterns can be precisely written in advance. This technique can guarantee the detection of an intrusion if a signature of the intrusion is described to the intrusion detection system in advance. However, it cannot detect previously unknown attacks, since it is impossible to write a pattern for an unknown attack.
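The limitation of signature matching can be seen in a small sketch. The Python code below treats a signature as an ordered list of system calls and reports a match when those calls appear in the observed trace in that order. The signature name, its contents, and the function names are invented for this illustration and do not correspond to any real attack database.

```python
# Hypothetical signature database: attack name -> ordered system calls.
SIGNATURES = {
    "example-attack": ["open", "write", "chmod"],
}


def occurs_in_order(signature, trace):
    """True if the calls of `signature` appear in `trace` in this order."""
    remaining = iter(trace)
    # `call in remaining` advances the iterator past the first match,
    # so later calls are only searched for after earlier ones.
    return all(call in remaining for call in signature)


def detect(trace):
    """Return the names of all known signatures found in the trace."""
    return [name for name, sig in SIGNATURES.items()
            if occurs_in_order(sig, trace)]
```

A trace produced by an attack that has no entry in the database yields an empty result, however damaging the attack may be.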

Ours is a specification-based approach. We use high-level specifications to describe

the security-related behaviors of processes. These specifications are intended to capture

normal or intended behaviors of processes. Deviations from these specifications are

indicative of intrusions. Thus, attacks can be detected even though they may not have been

encountered previously.

We notice that damage must eventually be effected via the system calls made by the

attacked process to its operating-system environment. In particular, operations for

manipulating files or network connections are all administered through system calls. So

security-related behaviors can be represented in terms of the system calls made by each

process running on the host. We detect deviations from expected behaviors by intercepting


and validating the system calls at runtime. Note that this approach gives us the ability to

detect problems before they cause damage, and can thus be preventive.

We design a high-level language called Auditing Specification Language (ASL) to

specify security-related behavior. This language is powerful enough to express a range of

integrity constraints and behaviors over time. The constraints can span multiple processes.

Specifications in ASL are compiled into optimized C++ programs for efficient detection of

deviations from these specifications.

We employ runtime enforcement techniques to ensure that a process satisfies the

behavior specified in ASL. When deviations from specified behavior are detected, some

automatic actions to isolate the damage can be initiated. These corrective actions are also

described within ASL.

1.3 Key Contributions

The key innovations of our approach include:

A high-level language that simplifies specification of normal behaviors of processes. ASL is intended to simplify the specification of relationships and constraints that must hold in a correctly operating system, without being concerned with the details of how these conditions can be verified.

Efficient techniques for runtime intrusion detection. Our compiler translates ASL specifications into an Extended Finite-State Automaton (EFSA). An EFSA is similar to a finite-state automaton, augmented with a set of state variables. The EFSA can be used to detect intrusions efficiently at runtime.

Atomic execution to detect race condition attacks. We introduce atomic execution into ASL specifications to deal with race condition attacks. Atomic execution provides data integrity in a multi-process system. This feature lets a user describe the normal behavior of a program that needs atomic execution, or detect unknown intrusions involving race conditions or similar synchronization errors.


1.4 Thesis Organization

The rest of this thesis is organized as follows. Chapter 2 gives an overview of our approach and discusses its relationship to previous research. The ASL language is described in Chapter 3. Chapter 4 describes our approach to atomic execution. Chapter 5 gives a detailed discussion of the ASL compiler, together with some sample C++ programs generated by it. Finally, concluding remarks appear in Chapter 6.


CHAPTER 2. OVERALL APPROACH AND RELATED WORK

We model the system to be protected as a distributed system consisting of many hosts interconnected by a network. The network and the hosts are assumed to be physically secure, but the network is connected to the public Internet. Since attackers do not have physical access to the hosts that they are attacking, all attacks must be launched remotely from the public network.

2.1 Behavioral Specifications Model

Our detection system detects attacks on individual processes based on events that are

observable at a per-process level. The specific choice of events used in our behavioral model

is influenced by the following considerations. We are interested in identifying and observing

events that impact the security-related behavior of processes. If all programs were designed

with intrusion detection in mind, they would internally notice and report security-related

events to an external security system. However, most existing programs are not designed in

this manner. Therefore, we need to use other methods to extract security-related events. Our

approach is to:

identify the well-defined interfaces used by all processes,

treat interactions on these interfaces as events,

develop behavioral specifications describing permissible event sequences, and

intercept and verify actual event sequences occurring at runtime against the behavioral specifications.

Initially, we focus on the interfaces between processes and their host operating systems. We choose the process/operating system interface because in well-designed operating systems, all operations made by processes that impact security, such as file and network access operations, are administered through well-defined system calls. By describing behaviors as sequences of system calls that a process is allowed to make, and intercepting and validating these calls at runtime, we can often detect attacks early enough to prevent any damaging consequences.

Our per-process, system-call-based behavioral specifications are decomposed into

two types:

local correctness specifications that involve reasoning about the actions of the process in

isolation, and

non-interference specifications, which ensure that the concurrent actions of other

processes do not interfere with the correct operation of the process.

Local correctness specifications are patterns over the sequences of system calls, and their arguments, that are made by a process. Non-interference specifications use the concept of atomic sequences of system calls made by a process, which we discuss in detail in Chapter 4.

The concept of local correctness specifications is easy to understand and can be clarified with a simple example. Consider a program, V, with a simple error that allows an attacker to open and write to the password file, /etc/passwd. For this program we write a specification, M, that looks for a system call sequence consisting of open() followed by write(), such that the file acted on by both system calls is /etc/passwd.
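The specification M for this example can be pictured as a small monitor that watches the system call stream of one process. The sketch below is written in Python rather than ASL, and the event representation (path and file descriptor arguments), the class name, and the method names are assumptions made for illustration only. The path constant assumes the conventional UNIX location of the password file.

```python
PROTECTED_FILE = "/etc/passwd"  # assumed path of the password file


class OpenWriteMonitor:
    """Flags open() followed by write() on the protected file."""

    def __init__(self, protected=PROTECTED_FILE):
        self.protected = protected
        self.watched_fds = set()  # descriptors returned by open() on the file
        self.alerts = []

    def on_open(self, path, fd):
        # Remember descriptors that refer to the protected file.
        if path == self.protected and fd >= 0:
            self.watched_fds.add(fd)

    def on_write(self, fd):
        # A write through a remembered descriptor completes the pattern.
        if fd in self.watched_fds:
            self.alerts.append(("open-then-write", self.protected, fd))

    def on_close(self, fd):
        # Once closed, the descriptor no longer refers to the file.
        self.watched_fds.discard(fd)
```

Tracking the descriptor returned by open() is what ties the two system calls to the same file; a write to any other descriptor is ignored.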

To illustrate the concept of non-interference, we consider attacks that exploit race conditions in a program. Suppose that a privileged program, V, running as process P, allows creation of a log file whose name is specified by the user. P must ensure that the user has write permission to the log file, but since the effective user of P is root, the open() system call does not perform the desired permission checking. (This is because open() checks file permissions against the effective user, not the real user.) To compensate for this, V uses the access() system call prior to the open() to verify that the real user has access to the file. If access() indicates that the real user has write permission, then V’s logic concludes that it is safe to issue the open(). This logic is correct from the standpoint of non-interleaved operation, but incorrect when another process changes the object referred to by the log file name between the two operations. Thus, for correct operation, we need to ensure that the two system calls are executed without interference by other processes, which requires that any data read by the access() issued by P is not modified by another process before the completion of the open() issued by P. To specify this behavior, M places the data read by access() and open() in an atomic sequence. Once M has done so, if another process issues system calls that modify the data in the atomic sequence, M is informed of the impending corruption so that it can launch an appropriate, application-specific response.
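The atomic sequence in this example can be sketched as a read set that the monitor registers when access() is issued and releases when open() completes. The following Python sketch is an illustration under assumed names (`AtomicSequenceRegistry`, `begin`, `end`, `on_modify`); it is not the implementation of the thesis, which is the subject of Chapter 4.

```python
class AtomicSequenceRegistry:
    """Tracks data placed in atomic sequences by monitored processes."""

    def __init__(self):
        self.readsets = {}    # pid -> set of paths read inside the sequence
        self.violations = []  # (owner pid, interfering pid, path)

    def begin(self, pid, paths):
        # Called when the monitor sees access(): protect the data it read.
        self.readsets[pid] = set(paths)

    def end(self, pid):
        # Called when open() completes: the sequence is over.
        self.readsets.pop(pid, None)

    def on_modify(self, pid, path):
        # Called for any system call that modifies `path`
        # (for example unlink, rename or symlink).
        for owner, paths in self.readsets.items():
            if owner != pid and path in paths:
                self.violations.append((owner, pid, path))
```

A modification by another process is reported only while the sequence is open; the owner's own modifications and modifications after `end` are not treated as interference.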

The decomposition of behavioral specifications into local and non-interference specifications means that we can have an independent system call monitoring object Oj for each process Pj to enforce the local security behavior of the process. Detection of interference requires coordination among the detection engines for different processes. Given our current architecture, we propose to implement the non-interference component in a decentralized fashion within the runtime infrastructure for intercepting system calls.

2.2 Detection System Model

The detection system consists of an offline and a runtime system. The offline system

is concerned with the generation of detection engines based on the ASL behavioral

specifications, whereas the runtime system is concerned with the execution of the generated

engines.

Figure 1 shows the offline production of the system call detection engine. The user of ASL is a system security administrator who is familiar with the functionality of various system components, as well as known system vulnerabilities. The system security administrator determines the security-related behavior of a certain program P using three kinds of sources: the source code of the program, the manuals or documents that describe the intended behavior of the program, and attack advisories. These behaviors (or vulnerabilities) are captured using ASL specifications at the system call level. The ASL specification is called the monitor M of the program P. The ASL compiler translates M into a C++ class definition, called C. C is then compiled by the C++ compiler and linked with the runtime infrastructure to produce a detection engine. The runtime infrastructure provides all of the support functions pertaining to the interface being monitored by the specification. For instance, the system call runtime infrastructure provides the mechanism for intercepting system calls, delivers them to the detection engine, and provides functions that the detection engine can use to take responsive actions.

[Figure 1 - Offline system for production of detection engines. Components shown: system security administrator; manual or document that describes the intended behavior; attack advisories or mailing lists; M (ASL specification for monitoring P); ASL compiler; C (C++ class definition of M); system call detection engine infrastructure; C++ compiler; system call detection engine.]

Figure 2 shows how the system call detection engine generated by the offline process is used at runtime. There is one instance of the system call detection engine per process, whereas the system call interceptor is a single entity. Figure 2 shows a process Pi (Qi) that is being monitored using the object Mp (Mq) generated from the monitoring specification for the program P (Q). For simplicity, we assume i is the process ID. System calls made by process Pi (Qi) are first intercepted by the system call interceptor. The interceptor intercepts system calls as they cross the boundary between the process and the operating system kernel in both directions, that is, immediately before the system call is sent to the kernel and immediately after the kernel returns its result. The interceptor enables the system call detection engine's infrastructure and Mp (Mq) to detect sequences of system calls requested by Pi (Qi) that deviate from expectation, and to modify Pi's (Qi's) use of system calls to prevent detected deviations from causing damage.
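The runtime arrangement can be sketched as a single interceptor that hands each system call to the detection engine of the calling process twice, once before the call reaches the kernel and once after it returns. All class, method, and parameter names below are hypothetical, and the kernel is simulated by a plain Python callable.

```python
class DetectionEngine:
    """Per-process engine; this toy version denies one named call."""

    def __init__(self, denied_call):
        self.denied_call = denied_call
        self.log = []

    def pre_syscall(self, name, args):
        # Return False to block the call before it reaches the kernel.
        self.log.append(("pre", name))
        return name != self.denied_call

    def post_syscall(self, name, args, result):
        # Observe the value returned by the kernel.
        self.log.append(("post", name, result))


class SystemCallInterceptor:
    """Single entity shared by all monitored processes."""

    def __init__(self, kernel):
        self.kernel = kernel   # callable simulating the operating system
        self.engines = {}      # pid -> DetectionEngine

    def attach(self, pid, engine):
        self.engines[pid] = engine

    def syscall(self, pid, name, *args):
        engine = self.engines.get(pid)
        if engine is not None and not engine.pre_syscall(name, args):
            return -1          # deviation detected: call is not forwarded
        result = self.kernel(name, args)
        if engine is not None:
            engine.post_syscall(name, args, result)
        return result
```

Because the engine is consulted before the call is forwarded, a detected deviation can be stopped rather than merely reported, which matches the preventive role described above. Processes with no attached engine pass through unchanged.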

[Figure 2 - Runtime system for execution of detection engines. Components shown: Pi (process running P); Qi (process running Q); system call interceptor; system call detection engine (Mp, Mq); operating system.]

2.3 Related Work

Intrusion detection techniques can be broadly divided into two categories: misuse intrusion detection and anomaly intrusion detection. We first briefly introduce these two methods. Misuse intrusion detection attempts to identify known patterns of intrusion, known as intrusion signatures, when they occur. The problem of misuse intrusion detection was introduced by Porras and Ilgun [Porras92, Ilgun93] and Kumar [Kumar94]. Porras and Ilgun use an approach named State Transition Analysis to represent computer penetrations. In their system, a penetration is identified as a sequence of state changes that take the computer system from some initial state to a target compromised state. Kumar's work applies pattern matching to intrusion detection. It classifies a large subset of currently known computer security exploitations in a simple classification scheme based on the time complexity of detecting the vulnerabilities. He uses a single computational model to represent and monitor exploitations using pattern matching techniques.

Anomaly detection assumes that attacks will result in behavior different from that normally observed in a system, and that they can be detected by comparing the current behavior with the pre-established normal behavior. We use Forrest's work [Forrest97] to illustrate anomaly intrusion detection. It defines normal behavior for a UNIX process in terms of the sequences of system calls that are made by the process in the course of normal operation. An intrusion is detected when “foreign” system call sequences are observed, that is, sequences that have not been observed under normal operation.
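The idea of “foreign” sequences can be illustrated with a sliding window over system call traces. This Python sketch is a simplification written for this discussion: the window length of 3, the function names, and the sample traces are assumptions, not details taken from [Forrest97].

```python
def windows(trace, k=3):
    """Return the set of all length-k windows of consecutive calls."""
    return {tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)}


def build_profile(normal_traces, k=3):
    """Collect every window seen during normal operation."""
    profile = set()
    for trace in normal_traces:
        profile |= windows(trace, k)
    return profile


def foreign_windows(trace, profile, k=3):
    """Windows of `trace` that never occurred in the normal profile."""
    return sorted(windows(trace, k) - profile)
```

How many foreign windows should count as an intrusion is exactly the threshold question that makes anomaly detection hard to tune.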

Our method follows a third paradigm for intrusion detection called specification-based detection, originally proposed by Ko [Ko96]. Specification-based detection relies on program specifications that describe the intended behavior of programs. The monitoring of executing programs involves detecting deviations of their behavior from these specifications, rather than detecting the occurrence of specific attack patterns. This method has advantages over both misuse detection and anomaly detection. Compared to misuse detection, it does not require the attack pattern, so it can detect unknown attacks. Compared to anomaly detection, it provides a definite threshold for determining which behavior is an intrusion.


2.3.1 Misuse Intrusion Detection

Misuse detection targets intrusions that follow well-defined patterns of attack exploiting weaknesses in the system and application software. Such patterns can be precisely written in advance. The premise is that there are attacks that can be precisely encoded in a manner that captures the rearrangements and variations of activities that exploit the same vulnerability. An intrusion can be flagged as soon as these events occur. Techniques for misuse detection have been based on state-transition systems [Porras92, Ilgun93] and pattern matching [Kumar94]. While it is relatively easy to deal with known vulnerabilities using misuse detection, it is difficult to cope with unknown vulnerabilities.

Ilgun[Ilgun93] developed a real-time intrusion detection tool, USTAT, a State

Transition Analysis Tool for UNIX. In USTAT, a penetration is identified as a sequence of

state changes that take the computer system from some initial state to a target compromised

state. They use the state transition diagrams to represent penetrations. A state transition

diagram is the graphical representation of the penetration scenario. There are two major

components of a state transition diagram: nodes that represent the states and arcs that

represent the actions. Our method improves upon their work. We use an extended finite-state

automaton (EFSA) to representation the specifications. An EFSA is similar to a finite-state

automaton with a fixed set of state variables. EFSA makes transitions based on events, event

arguments, conditions on event arguments, and state variables. So an EFSA is more powerful

than the state transition diagram used in Ilgun's system. Moreover, USTAT analyzes the

audit data by keeping track of the state changes on the system, whereas our work intercepts the

actual system call sequence. USTAT thus analyzes static data, while our system analyzes

dynamic data. As a result, USTAT can be used only for intrusion detection, while our system can

also perform some prevention.
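To make the EFSA idea concrete, the following Python sketch shows an automaton whose transitions are guarded by conditions on event arguments and state variables. It is only an illustration under our own assumptions: the class name EFSA, the helper build_demo, and the sample rule (an open must follow an access on the same path) are hypothetical and are not the compiled form produced by the ASL compiler.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A transition fires on a named event when its guard accepts the event
# arguments and the current state variables; its action may update them.
Guard = Callable[[tuple, dict], bool]
Action = Callable[[tuple, dict], None]


@dataclass
class EFSA:
    state: str
    variables: Dict[str, object] = field(default_factory=dict)
    # (source state, event name) -> list of (guard, action, target state)
    transitions: Dict[Tuple[str, str], List[Tuple[Guard, Action, str]]] = field(
        default_factory=dict
    )

    def add(self, src, event, guard, action, dst):
        self.transitions.setdefault((src, event), []).append((guard, action, dst))

    def step(self, event, *args):
        """Consume one event; return True if some transition fired."""
        for guard, action, dst in self.transitions.get((self.state, event), []):
            if guard(args, self.variables):
                action(args, self.variables)
                self.state = dst
                return True
        return False


def build_demo():
    # Hypothetical rule: open(path) is accepted only after access(path)
    # was seen on the same path, which is remembered in a state variable.
    m = EFSA(state="start")
    m.add("start", "access", lambda a, v: True,
          lambda a, v: v.update(checked=a[0]), "checked")
    m.add("checked", "open", lambda a, v: a[0] == v["checked"],
          lambda a, v: None, "opened")
    return m
```

The state variable `checked` is what distinguishes this from a plain finite-state automaton: the guard on the open transition compares an event argument against a value stored by an earlier transition.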

Kumar [Kumar94] proposed another approach for misuse intrusion detection.

Kumar's approach encodes intrusion signatures as a formal, structured representation of low-

level system events that constitute the exploitation of the attack. He classifies intrusions on

the basis of structural interrelationships among observable system events. The classification

formalizes detection of specific exploitations by examining their effects in the system event


trace. The benefit of this classification is that they can talk about intrusion signatures belonging to

particular categories in the classification instead of vulnerabilities that result in intrusions.

Then they develop computational models to detect intrusions based on the

classification. In each category, they exploit the common structural interrelationships of

events comprising the signatures in that category. Then, they can look at signatures of

interest that can be matched efficiently. They define and justify a computational model in

which intrusions from their classification can be represented and matched.

Patterns can be specified by defining what needs to be matched instead of how it is to

be matched. The benefit is the clean separation of the matching algorithm from the

specification of what needs to be matched. Our method is similar to Kumar's in the aspect that

we too use patterns to specify the security-related behavior of the system. But we use the

patterns to describe the normal behavior instead of the attack, so we can detect attacks which

are unknown.

2.3.2 Anomaly Intrusion Detection

Anomaly detection attempts to quantify the usual or acceptable behavior and flag

other irregular behavior as potentially intrusive. It first creates a profile that defines normal

behaviors and then detects deviations from this profile. Several techniques have been

developed based on statistical methods, expert systems, neural networks, or a combination of

these methods [Fox90, Lunt88, Lunt92, Anderson95]. The premise here is that the intrusive

activity is a subset of anomalous activity. The main advantage of this approach is that the

system can automatically detect when observed behavior deviates significantly from its

normal behavior. So it can detect unknown intrusions. The downside is that an attacker can

evade detection by changing behavior slowly over time.

Forrest et al [Forrest97, Kosoresow97] have developed an intrusion detection

technique inspired by immune systems in animals. They characterize “self” for a UNIX

process in terms of (short) sequences of system calls that are made by the process in the course

of normal operation. An intrusion is detected when we observe “foreign” system call sequences

that have not been observed under normal operation. The overall idea is to build up a

separate database of normal behavior for each process of interest. Once a stable database is


constructed for a given process, the database can be used to monitor the process' ongoing

behavior. The sequences of system calls form the set of normal patterns for the database, and

abnormal sequences indicate anomalies in the running process.
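The sequence database described above can be sketched in a few lines of Python. This is an illustrative reading of the idea rather than the implementation of [Forrest97]: the window length k = 3, the function names, and the sample traces are all assumptions made for the example.

```python
def windows(trace, k):
    """Return all contiguous length-k subsequences of a system call trace."""
    return [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]


def build_database(normal_traces, k=3):
    """Collect every window seen during normal operation ("self")."""
    db = set()
    for trace in normal_traces:
        db.update(windows(trace, k))
    return db


def foreign_sequences(trace, db, k=3):
    """Return the windows of an observed trace that are not in the database."""
    return [w for w in windows(trace, k) if w not in db]


# Hypothetical traces: one normal run, and one run containing an unexpected execve.
normal = [["open", "read", "mmap", "read", "close"]]
db = build_database(normal)
observed = ["open", "read", "execve", "read", "close"]
anomalies = foreign_sequences(observed, db)
```

Every window of the observed trace that contains the unexpected call is reported as foreign, while replaying the normal trace reports nothing.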

Their research results are largely complementary to ours, in that their main focus is

on learning normal behaviors of processes, whereas our focus is on specifying and enforcing

these behaviors efficiently. In particular, the finite-state automaton learnt by the technique of

[Kosoresow97] could be fed as input to our runtime monitoring system.

2.3.3 Specification Based Monitoring

A specification-based approach, first proposed by Ko et al [Ko94, Ko96], is aimed at

overcoming the drawbacks of misuse detection by describing intended behaviors of

programs, which does not require us to be aware of all the vulnerabilities in the program that

could be misused. The idea is to write security specifications for privileged programs that

describe their desired behavior; each specification is driven by the functionality of the

program and the system security policy.

Ko introduces a formal intrusion-detection model that makes use of traces, which are

ordered sequences of execution events, for specifying the intended behavior of programs. A

formal specification language expresses the set of valid operation sequences of such

programs in a general and efficient manner. According to the model, a specification describes

valid operation sequences of the execution of one or more programs called a (monitored)

subject. A sequence of operations performed by the subject that does not conform to the

specification is considered a security violation. Such a specification is called a trace policy.

They use a class of grammars called parallel environment grammars (PE-grammars) for

specifying trace policies. PE-grammars can parse the audit trails easily and describe many

different classes of trace policies that are important to security. Parsing of audit trails thus

becomes the detection mechanism in a specification-based detection system. It detects

operations performed by subjects that are in violation of the trace policies.

An important improvement in our approach is that we can enforce the specified

behaviors at runtime so as to prevent attacks, whereas Ko's approach uses offline analysis of

audit logs. Another important distinction arises in terms of the specification language used.


[Ko96] uses a specification language based on context-free grammars augmented with state

variables, while our specification language is closer to regular expressions augmented with

state variables. Use of regular expressions affords the ability to compile the specifications

into an extended finite-state automaton (EFSA) which is a finite-state machine that is

augmented with state variables. While Ko's work needs one PE-parser for each specification,

all of our specifications are in the form of regular expressions, so they can be matched using

a single EFSA. Such an EFSA would enable very efficient runtime checking. These factors

are particularly important in the context of an online approach such as ours. Another

important improvement is that our approach provides an atomic execution mechanism to

deal with race condition attacks. Ko's work needs to know what happens in the race condition

attack so that they can write a PE-grammar to detect this kind of intrusion. In this case, their

description is equivalent to a misuse signature that can only detect well-known attacks. Using

our atomic execution specification, we can describe the normal behavior of a program that

needs atomic execution and detect unknown intrusions.

2.4 Benefits of Our Approach

Our approach improves on previous work greatly with respect to the following three

aspects:

1. A high-level language that simplifies specification of normal behaviors of processes. ASL

is intended to simplify the specification of relationships and constraints that must hold in

a correct system, without being concerned about the details of how these conditions can be

verified. When users write specifications, they can be concerned only about the correct

condition instead of the implementation. This feature greatly simplifies the user's work.

Compared with the PE-grammar used in Ko's work, use of regular grammar implies that

we can potentially translate all the specifications into one single EFSA.

2. Efficient techniques for runtime intrusion detection. Our compiler translates ASL

specifications into an Extended Finite-State Automaton (EFSA). An EFSA is similar to a

finite-state automaton with a set of state variables. The EFSA can be simulated at runtime

to detect intrusions efficiently.


3. Atomic execution to detect race condition attacks. We introduce atomic execution in our

ASL specification to deal with race condition attacks. Atomic execution provides data

integrity in a multi-process system. This feature lets the user describe the normal behavior

of a program that needs atomic execution. Compared with Ko's work on how to detect

race condition attacks, we can write the specification about the normal behavior of the

program while Ko's work needs some intrusion signatures. So we can detect unknown

intrusions. This is a significant advantage of the specification-based method over the

misuse method.


CHAPTER 3. ASL SPECIFICATION LANGUAGE

We propose to specify security-related behaviors declaratively in a high-level

language called auditing specification language (ASL). ASL is designed to be powerful

enough to express a wide range of integrity constraints and behaviors over time. The code

for auditing is generated by a compiler from these specifications.

3.1 External Functions

External functions are functions that are defined outside of the detection engines, but

which can be accessed from the detection engines. The primary purpose of external functions

is to invoke support functions needed by the detection engine or reaction operations provided

by the system call and packet interceptors. For instance, when an event for opening a file is

received by a detection engine, it may need to resolve the symbolic links and references to

“.” and “..” in the file name to obtain a canonical name for file. It may make use of a support

function declared as follows to accomplish this:

string realpath(CString s);

The detection engine may also need to check access permissions associated with the

file, which may be done using a support function declared as follows:

@stat(const CString s, StatBuf b);

We remark that in ASL, system call references occur in two different contexts. The

first context is an event, and the second context is the use of a system call by an ASL

specification. To differentiate between these contexts, we use the convention of preceding

system calls with an @-symbol to denote the second context.


3.2 Events

We model the behavior of the protected system as a sequence of events. At the

process level, events are the invocation of system calls and the return from system calls.

Events have the form $e(a_1, \ldots, a_n)$, where $e$ denotes the name of the event and $a_1, \ldots, a_n$ denote

the arguments to the event.

We associate one event with the entry to the system call and one with the exit from

the system call. An example declaration of a system call entry event is:

event stat(CString s, StatBuf b);

The exit from this system call is denoted by:

event $stat(CString s, StatBuf b);

We use the convention of using the system call name for entry events and prefixing the

system call name with $-symbol for exit events. Observe that this approach provides no

direct mechanism for accessing the return value from the completed system call or the value

of errno. (Recall that errno is the global variable in UNIX-based systems that stores the

specific error code corresponding to the most recent error.) A suitable convention would be

to have two external functions

int rv() const;

int errno() const;

in the interface to access these values.

Note that in our current implementation approach, there is one monitoring object Mp,

per process Pi. For local correctness specifications, it is sufficient for us to deliver the system

calls made by Pi, to Mp. Event delivery is effected by invoking a method on the object Mp

that has the same name as the event. However, for interference specifications it is necessary

for us to deliver the system calls made by every process executing on the host to an

interference detector. The interference detector is part of the runtime infrastructure for

system call detection.
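The delivery mechanism described above, in which an event is delivered by invoking a method of the same name on the per-process monitoring object, can be sketched as follows. The Python class Monitor, the function deliver, and the 'exit_' prefix used for exit events are assumptions of this sketch ('$' is not legal in a Python identifier); they are not names from our runtime infrastructure.

```python
class Monitor:
    """Per-process monitoring object; each handled event is a method."""

    def __init__(self, pid):
        self.pid = pid
        self.log = []

    def stat(self, path, buf):
        # Entry event for the stat system call.
        self.log.append(("stat", path))

    def exit_stat(self, path, buf):
        # Exit event, written $stat in ASL.
        self.log.append(("$stat", path))


def deliver(monitors, pid, event, *args):
    """Deliver an event to the monitor of process pid.

    Returns False when the monitor declares no handler for the event.
    """
    name = "exit_" + event[1:] if event.startswith("$") else event
    handler = getattr(monitors[pid], name, None)
    if not callable(handler):
        return False
    handler(*args)
    return True
```

A call such as `deliver(monitors, 7, "$stat", "/etc/hosts", None)` would reach `exit_stat` on the monitor registered for process 7, while an event with no matching method is simply reported as unhandled.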


3.3 Patterns

ASL general event patterns are used to specify valid or invalid behaviors. Event

patterns consist of primitive patterns composed using temporal operators. Primitive patterns

are built from atomic patterns. Primitive patterns describe specific events of interest, while

the temporal operators capture the sequencing and timing relationships that must hold

between primitive events.

3.3.1 Atomic Patterns

An atomic pattern is of the form $e(a_1, \ldots, a_n) \mid C$, where $e$ denotes an event and $C$ is a

boolean-valued expression on $a_1, \ldots, a_n$. $C$ may contain standard arithmetic, comparison and

logical operations. It may also contain comparisons of the form x = expr where x is a new

variable. The semantics of such comparisons is to bind the value of expr to x.

3.3.2 Primitive Patterns

A primitive pattern is obtained by combining atomic patterns with the disjunction

operator ||, and possibly preceding the entire expression with the complement operator ‘!’.

Both operators have the obvious meaning. As an example of a primitive pattern, consider the

following pattern:

execve(f,x,y) | realpath(f) != “/usr/ucb/finger”

The example captures all invocations of the execve() system call where the program

being executed is other than /usr/ucb/finger. In this pattern, realpath is an external function

that resolves all links (hard or symbolic) and occurrences of “.” and “..” in the filename

argument and returns an absolute path name. Such a pattern may be used to capture the

"Internet worm" attack that exploited fingerd vulnerabilities [Spafford91]. Another example

of a primitive pattern is

!((open(f)|realpath(f)=“/home/*/.plan”)||(close(f))||(exit(f)))


This example captures all system calls other than those for opening “.plan” files,

closing files or terminating processes. Patterns such as these may be used to capture

disallowed system calls for many processes.
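The two example patterns can be mimicked with ordinary predicates. The sketch below is a hypothetical Python rendering, not ASL semantics as implemented by our compiler: an atomic pattern becomes a function of the event name and arguments, and the operators || and ! become combinators. To keep the example independent of the file system, posixpath.normpath stands in for realpath; unlike realpath it does not resolve links.

```python
import fnmatch
import posixpath


def canon(path):
    # Stand-in for the external function realpath (assumption of this sketch).
    return posixpath.normpath(path)


def atomic(event_name, condition=lambda *args: True):
    """Atomic pattern e(a1, ..., an) | C as a predicate over (name, args)."""
    return lambda name, args: name == event_name and condition(*args)


def any_of(*patterns):
    """The disjunction operator ||."""
    return lambda name, args: any(p(name, args) for p in patterns)


def complement(pattern):
    """The complement operator !."""
    return lambda name, args: not pattern(name, args)


# execve(f,x,y) | realpath(f) != "/usr/ucb/finger"
not_finger = atomic("execve", lambda f, x, y: canon(f) != "/usr/ucb/finger")

# !((open(f) | realpath(f) = "/home/*/.plan") || close(f) || exit(f))
# Note that fnmatch lets '*' match across '/' as well.
allowed = any_of(
    atomic("open", lambda f, *rest: fnmatch.fnmatchcase(canon(f), "/home/*/.plan")),
    atomic("close"),
    atomic("exit"),
)
disallowed = complement(allowed)
```

Because the event name is tested before the condition, a condition is only ever applied to the arguments of the event it was written for.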

3.3.3 General Event Patterns

To capture sequencing or timing relationships among events, ASL uses several

temporal operators to compose primitive event patterns into more complex general event

patterns. The syntax of the composition operators is:

Sequential composition: $p_1 ; p_2$ denotes the event pattern $p_1$ followed by the event pattern

$p_2$.

Alternation: $p_1 \mathbin{||} p_2$ denotes the occurrence of either $p_1$ or $p_2$.

Repetition: p* denotes zero or more occurrences of p.

Atomicity: nonatomic (d , p) corresponds to an occurrence of pattern p within which the

data item d is not accessed atomically.

For convenience, we define the operator “..” that can be applied only to primitive

patterns. $p_1$ .. $p_2$ denotes $p_1$ followed by $p_2$, with possibly other

events occurring in between.

To avoid excessive use of parentheses, we define the following associativity and

precedence for the temporal operators. The operators “;” and “||” associate to the left, while

“..” is non-associative. The operator “!” has the highest precedence, “*” has the next lower

precedence, “;” has the next lower precedence and “||” has the lowest precedence.
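As a small illustration of the “..” operator, the Python sketch below scans an event trace for an occurrence of one primitive pattern followed later by another, allowing other events in between. The function follows and the sample trace are hypothetical; the sketch matches over a recorded list, whereas our system matches events as they occur.

```python
def follows(trace, p1, p2):
    """Find occurrences of p1 .. p2 in a trace.

    Returns index pairs (i, j) with i < j, where trace[i] matches p1 and
    trace[j] is the first later event matching p2.
    """
    matches = []
    for i, event in enumerate(trace):
        if p1(event):
            for j in range(i + 1, len(trace)):
                if p2(trace[j]):
                    matches.append((i, j))
                    break
    return matches


# Hypothetical trace of (event name, argument) pairs.
trace = [("access", "/tmp/log"), ("stat", "/tmp/log"),
         ("open", "/tmp/log"), ("close", 3)]


def is_access(event):
    return event[0] == "access"


def is_open(event):
    return event[0] == "open"
```

Here `follows(trace, is_access, is_open)` finds the access at position 0 and the open at position 2, with the intervening stat ignored.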

3.4 Rules

A rule is of the form pat -> reaction, where pat is a pattern of the form described

above, and reaction is a sequence of responsive steps to be initiated when the pattern occurs.

Actions may be empty, variable assignment, or external function invocation. Empty actions

cause no action to be taken. Assignment actions cause an assignment to a variable defined

within a module. External function invocation causes the specified function to be executed by


the runtime infrastructure. They may be used by the detection engine for such purposes as

reading or writing data in the monitored process, or executing arbitrary system calls in the

monitored process.

3.5 Event Abstractions

Event abstraction is a convenience mechanism allowing programmer definition of

abstract events denoting arbitrary event patterns. Event abstraction allows the programmer to

name and treat complex event patterns as if they were primitive events. To illustrate the use

of event abstractions, note that many UNIX system calls have overlapping functionality.

When we write behavioral specifications, it is cumbersome to have to write several variants

of the specification based on the exact system calls that are used by a particular program. For

convenience, we group similar system calls so that all of the calls in one group can be viewed

as implementations of a higher level abstract system call. For instance, the creat() and open()

system calls can both be used to open new files, so we define the abstract event writeOpen

which captures this commonality. Then, a single behavioral specification using writeOpen

can be used to monitor processes that open new files using either creat() or open().

Code Example 1 - Definition of writeOpen() Abstract Class

event writeOpen(path) =

open(path, flags)|(flags & (O_WRONLY | O_APPEND | O_TRUNC))
|| open(path, flags, mode)|(flags & (O_WRONLY | O_APPEND | O_TRUNC))
|| creat(path, mode);

Different levels of abstraction may be desired in different contexts, and hence there

may be overlaps among different user-defined events. For instance, we may have an abstract

call that corresponds to readOpen, and another that captures any open, regardless of whether

it is for reading or writing. In this case, we have a hierarchy with individual system calls at

the lowest level, readOpen and writeOpen at the next level, and then Open_all() at the next

higher level. We may also have other levels in the hierarchy that may allow us to treat file

opens in the same way as socket opens or connections. A complete list of event abstractions

for Red Hat Linux system calls is given in Appendix A.

Code Example 2 - Definition of readOpen() and Open_all() Abstract Classes


event readOpen(path) =
open(path, flags) | (flags & O_RDONLY)
|| open(path, flags, mode) | (flags & O_RDONLY);
event Open_all(path) = readOpen(path) || writeOpen(path);
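The hierarchy of abstract events can be illustrated outside ASL as a set of classification functions over concrete system calls. The Python sketch below is an approximation under our own assumptions: the function names are hypothetical, and because O_RDONLY is 0 on Linux the read case is tested by comparing the access-mode bits rather than by masking with O_RDONLY.

```python
import os

WRITE_FLAGS = os.O_WRONLY | os.O_APPEND | os.O_TRUNC
ACCESS_MODE_MASK = os.O_RDONLY | os.O_WRONLY | os.O_RDWR


def is_write_open(name, args):
    """Abstract event writeOpen: creat(), or open() with a write-related flag."""
    if name == "creat":
        return True
    if name == "open":
        flags = args[1]
        return bool(flags & WRITE_FLAGS)
    return False


def is_read_open(name, args):
    """Abstract event readOpen: open() whose access mode is read-only."""
    if name != "open":
        return False
    flags = args[1]
    return (flags & ACCESS_MODE_MASK) == os.O_RDONLY and not (flags & WRITE_FLAGS)


def is_open_all(name, args):
    """Abstract event Open_all: the next level of the hierarchy."""
    return is_read_open(name, args) or is_write_open(name, args)
```

A specification written against `is_write_open` then covers both creat() and the two forms of open() without naming them individually. As in Code Example 1, an open() with O_RDWR alone is not classified as a write open in this sketch.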

3.6 Example Specifications

First, consider the following specification in Code Example 3 for the cat program,

that is intended to prevent the program from being used to read the password file. The

offending system call is not allowed to execute, and an error code is returned.

Code Example 3 - ASL Monitoring Specification Preventing cat From Opening /etc/passwd

class CString {
  String get() const;
  void set(String s);
}
string realpath(const CString s);
event open(const CString f, int flags, mode_t mode);
main() {
  open(f, fl, m)|realpath(f) = “/etc/passwd” -> fail(-1,ENOPERM);
}

The next example shows the defense against race condition vulnerability. We first

give a brief description of a program, R, which has a race vulnerability. R is a setuid to root

program, so when R executes the effective user ID of R is set to be the root. For most system

calls, permission to use system resources is governed by the permissions of the effective user, so

R in general has the privilege of root, even when R’s real user (the user who invoked R) is

not the root. Suppose R allows the user to name a log file, L, into which R writes

information as R executes. R needs to ensure that L is writeable by R’s real user, not R’s

effective user. Without this distinction, the user could specify L to be any protected file,

such as /etc/passwd, into which only root can write. The open() system call bases its

decision to open a file on the effective user’s permissions, so open() by itself

cannot correctly decide whether the real user can write to L. R therefore uses the access() system call

prior to open(), since access() checks file permission based on the real user and not the

effective user. If access() indicates that the real user has write permission for L, R continues,

otherwise R aborts. If R continues then sometime later it will use open() to open L, under the

assumption that R would not reach open() if access() had not verified the real user’s permission


to write to L. In isolation R is correct, but external actions affecting L in between the time of

access() and the time of open() can change the conditions such that open() is fooled into

opening a file for which the real user does not have write permission. More specifically, if at

the time of access(), L is a file which the real user can write, but by the time of open(), L has

become a file which only the root can write, then the open() will succeed and R will write

data into a file which the real user is prohibited from writing. The transformation of L is as

simple as removing L, and then creating a symbolic link with the same name as L to the

protected file. If the attacker can accomplish these two operations anytime after R does

access(), but before R does open(), then the attacker has exploited the race vulnerability.
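The window between access() and open() can be observed harmlessly inside a temporary directory. The Python sketch below is only an illustration on a POSIX system: the file names are hypothetical stand-ins (protected plays the role of the protected file and log the role of L), and nothing outside the temporary directory is touched.

```python
import os
import tempfile


def demo_race():
    """Show that the file named L can change between the check and the use."""
    with tempfile.TemporaryDirectory() as d:
        protected = os.path.join(d, "protected")  # stand-in for a protected file
        log = os.path.join(d, "log")              # the user-named log file L
        for path in (protected, log):
            with open(path, "w") as f:
                f.write("")

        # Time of access(): L is an ordinary file the user may write.
        ok = os.access(log, os.W_OK)
        checked = os.path.realpath(log)

        # The two operations performed between access() and open():
        os.remove(log)
        os.symlink(protected, log)

        # Time of open(): the same name now resolves to the protected file.
        opened = os.path.realpath(log)
        return ok, checked, opened, os.path.realpath(protected)
```

The check succeeds on one file while the later use resolves to another, which is why comparing the canonical name at both points, or running the open with the real user ID, closes the window.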

There are several ways to protect against such attacks in ASL. The first approach is

captured by the following specification in Code Example 4.

Code Example 4 - ASL Monitoring Specification rProg1 for Race Vulnerabilities

string realpath(const CString s);
event open(const CString f, int flags, mode_t mode);
event access(CString f, mode_t mode);
event $open(CString f, int flags, mode_t mode);
event $access(CString f, mode_t mode);
int @getuid();
int @geteuid();
void @setreuid(int a, int b);
main() {
  int savedEuid;
  int changedEuid;
  /* access followed by open on the same file: switch to the real user ID */
  (access(name, mode) | (ruid == @getuid()) && (rn = realpath(name)))
    .. (open(name1, flags, mode1) | (rn == realpath(name1)))
    -> { changedEuid = 1; savedEuid = @geteuid(); @setreuid(-1, ruid); }
  /* open has returned: restore the saved effective user ID */
  $open(f, flag, mode) | (changedEuid == 1)
    -> { changedEuid = 0; @setreuid(-1, savedEuid); }
}

The way the specification works is as follows. Whenever the monitored program

performs an open system call following an access system call on the same file, we set the

effective user ID of the calling process temporarily to the real user ID before the open call is

executed by the OS kernel. Before doing this, we save the current value of the effective user

ID in the state variable savedEuid, and set a flag changedEuid to remember that we have

temporarily reset the effective user ID. When the open system call completes, we use the

values stored in these state variables to restore the original effective user ID.


The second defense against the race vulnerability uses the concept of atomic

sequences, which we will discuss in detail in the next chapter. The ASL specification can be written as:

nonatomic(f.target, (access(f, md) .. writeOpen(f))) -> fail(-1,EACCESS)


CHAPTER 4. ATOMIC EXECUTION

Many UNIX applications are vulnerable to race condition attacks. In general, the

vulnerability is particularly important for set-user-id-to-root applications. The set-user-id-to-

root applications require super-user privilege for some of their functions. To obtain super

user privilege when invoked by a normal user, these applications are installed with root as

their owner, and when they execute, their effective user id is changed to the file owner (i.e.,

root). Thus, when a set-user-id-to-root application is executed by a normal user, the

application attains super-user privilege. The goal of race-condition-based attacks is for

the invoking user to access system resources (e.g., files) that only the super-user is able to

access.

The race condition exists because applications sometimes make a decision to execute

a system call that accesses secure resources based on data collected from earlier system calls.

The vulnerability lies in the time interval between the data-collecting system call and the subsequent

system call that accesses the secure resource.

Consider the following prototype of a set-user-id-to-root application, called P. P's

main functionality requires super-user privilege, but P has a logging option that allows the

user to specify a log file into which P writes an activity log. For security, P must determine

whether or not the invoking user has write permission for the log file. P determines the

invoking user's permissions with respect to the log file by using the access() system call. If

the access() system call indicates that the invoking user has the write permission, P

continues, otherwise P aborts. If P continues, it opens the log file for writing. Since the open

is done with super-user privilege, it can succeed for all files the super-user has write

permission for, regardless of the invoking user's permissions.


The decision to open the log file is based on the permission information obtained by

the earlier access() system call. To exploit the time interval between collection and action, all

the user must do is change the log file from an actual file to a link to another file, such as the

password file, to which only super-user can write, during the interval. P is thereby fooled into

opening and writing log information into an important, supposedly secure file, thereby

corrupting the file.

4.1 Atomic Execution

One way to deal with the race condition problem is to execute system calls

atomically. The atomicity that we ensure is not true atomicity, which requires no interleaved

system calls, but rather apparent atomicity, which allows interleaved system calls as long as

the final result is the same as if no interleaved system calls had been allowed. Apparent

atomicity is achieved if the data read by all system calls in an atomic sequence are unchanged

by any system call not in the same atomic sequence from the time the data is read, until the

time the sequence is ended. In other words the readset of the atomic sequence must not be

corrupted externally. A straight-forward way to ensure that the readset of an atomic sequence

is not corrupted is for the operating system to remember the readset for each atomic

sequence, and to intersect the writeset of all system calls (except those in the same atomic

sequence) with the readsets. If the intersection is not empty, it implies that the system call

could corrupt the readset. In order to implement a readset corruption detector, for each

system call, it is necessary to identify the readset and writeset.
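The readset check described above amounts to a set intersection. The following Python sketch shows one way it could look, assuming data items are represented as (kind, path) pairs; the class AtomicSequence and the function external_call are hypothetical names for illustration and do not describe our implementation, which is presented later in this chapter.

```python
class AtomicSequence:
    """Remembers the data read so far by the system calls of one atomic sequence."""

    def __init__(self):
        self.readset = set()
        self.corrupted = False

    def read(self, *items):
        self.readset.update(items)


def external_call(writeset, sequences):
    """Record a system call made outside the given atomic sequences.

    Any sequence whose readset intersects the call's writeset is marked
    as corrupted, i.e. apparent atomicity no longer holds for it.
    """
    written = set(writeset)
    for seq in sequences:
        if seq.readset & written:
            seq.corrupted = True
```

For example, a sequence that has read the item ("target", "/tmp/log") stays intact when another process writes ("content", "/tmp/other"), but is marked corrupted by a call whose writeset contains ("target", "/tmp/log").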

4.2 Defining Readset/Writeset

In our specification language, we take the approach that the data to be protected

is specified explicitly. As the number of data types is much smaller than the number of system

calls, we define the system calls that read and write each data type instead of defining the

readset and the writeset for each system call. The readset/writeset of a system call contains


the data while the readCalls/writeCalls of a data type contains the system calls. The common

data types include file's content, file's ownership and file's permission, etc.

4.3 Implementation Approach

The entity to be accessed atomically is referred to as an object. For example, a file's

permission is one object, while its content is another object. Suppose we want to make sure

that the file f's content is not changed within the pattern E. We write the specification:

nonatomic(f.target, E) -> fail.

Here, the object is the file's content. Suppose the process is P1 and the monitor is M1, then

there are two ways to break the atomic execution of f.target within the pattern E. One way is

that before P1 reaches E, some other process has already opened the file f for writing. The other is that,

after P1 has entered E and before it leaves, some other process modifies the file f. To detect these

two situations, we associate two data structures with an instance of the file f's content. One is

a counter to record the number of incomplete write operations on this instance. The other is a

lock of the form (SN, Brokenflag). SN is a sequence number that identifies the lock uniquely

in the system. The Brokenflag is a boolean value which is true when the lock has been

broken. When process P1 meets the first event in the pattern E, it sets a lock on the file f.

When any process wants to write the file f, the lock has to be broken and the Brokenflag is set

to true which means that the atomicity requirement on the file f has been broken.

To detect a process's intent to write an object, we define a set of

IncompleteWriteOperations and a set of CompleteWriteOperations corresponding to that

object. An IncompleteWriteOperation indicates that the object will be written by this process

in future and any lock will be broken. A CompleteWriteOperation indicates that the object

has already completed the write operation and any future lock will be safe. For example, if

the object is file's content, the IncompleteWriteOperation includes a WriteOpen() system call

and CompleteWriteOperation includes a close() system call. If the object is a file's

permission, the IncompleteWriteOperation is null because the writeCall of the file's

permission is chmod() system call and a chmod() is completed by itself.
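The counter and lock described above can be sketched as a small Python class. This is a single-process illustration under our own assumptions: the class ProtectedObject and its method names are hypothetical, and the messages exchanged between monitors are left out.

```python
class ProtectedObject:
    """One instance of an object that may be accessed atomically."""

    def __init__(self):
        self.counter = 0  # number of incomplete write operations
        self.locks = {}   # sequence number SN -> Brokenflag

    def begin(self, sn):
        # A lock set while a write is already in progress starts out broken.
        self.locks[sn] = self.counter != 0

    def end(self, sn):
        """Release lock sn and report whether it was broken."""
        return self.locks.pop(sn)

    def incomplete_write(self):
        # E.g. writeOpen() on a file's content: breaks every current lock.
        self.counter += 1
        for sn in self.locks:
            self.locks[sn] = True

    def complete_write(self):
        # E.g. close(): locks set from now on are safe from this writer.
        if self.counter > 0:
            self.counter -= 1
```

A lock that sees no write between begin and end is released unbroken; a write during that interval, or one still in progress when the lock is set, leaves the Brokenflag true.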


Each object is also associated with a writeCall which contains all the system calls that

modify the object. For example, if the object is a file's content, the writeCall is

{WriteOpen(), close(), symlink()}. If the object is a file's permission, the writeCall is

{chmod(), chown() }.

We notice that since a readset cannot conflict with another readset, all conflicts must involve a writeset, so we focus our attention on the writeset. The idea is to attach a lock to an object we want to protect. When a process P1 wants to protect an object within a pattern E=(e1...en), the corresponding monitor begins checking when the first event e1 occurs. If the object is already being written by another process, the monitor returns 'fail'. Otherwise, the monitor puts a lock on the object and sends this information to all other monitors in the system. When the last event en occurs, the monitor removes the lock and sends the removal information to all the other monitors. Within this period, if any process tries to write the object, the lock is broken and a 'Broken' message is returned. The detailed algorithm is shown below:

Suppose there are n processes P1, P2… Pn and n corresponding monitors M1, M2…

Mn in our system. M1 has the specification nonatomic(Object, P)fail.

A fixed data structure associated with an object contains:

Object.IncompleteWriteOperation;

Object.CompleteWriteOperation;

Object.writeCall;

A data structure associated with an object's instance contains:

Object1.counter=0 // how many IncompleteWriteOperations on the object

Object1.lock_list=null // list of locks

Code Example 5. Algorithm for Atomic Execution

For M1:

Case 1: M1 sees the first event e1 in the pattern P
    broadcast begin(Object1, SN) to all Mi;
    when M1 receives all ACKs or times out, M1 sends e1 to the kernel
    /* begin(Object1, SN) sets a lock on Object1; the sequence number of this lock is SN */

Case 2: M1 sees the last event en in the pattern P, or M1 cannot match pattern P
    broadcast end(Object1, SN);
    if (M1 has matched the pattern && receives one Broken(Object1)) then
        (atomic requirement has been violated) take reactions
    // end(Object1, SN) means M1 wants to release the lock SN on Object1

For Mi:

Case 1: receive begin(Object1, SN)
    if (Object1 exists) then {
        if (Object1.counter != 0) then
            Object1.lock_list.insert((SN, true));    // lock broken
        else
            Object1.lock_list.insert((SN, false));   // not broken
    } else {
        create Object1;
        Object1.lock_list.insert((SN, false));
    }
    send ACK to M1

Case 2: receive end(Object1, SN)
    if ((SN, true) in Object1.lock_list) then
        send Broken(Object1) to M1;   // lock for Object1 has been broken
    Object1.lock_list.delete((SN, *));
    if (Original(Object1)) then destroy Object1;

Case 3: Pi has an IncompleteWriteOperation(Object1) system call trapped by Mi
    if (Object1 exists) then {
        Object1.counter++;
        for (every lock in Object1.lock_list)
            lock.Brokenflag = true;
    } else {
        create Object1;
        Object1.counter = 1;
    }

Case 4: Pi has a CompleteWriteOperation(Object1) system call trapped by Mi
    Object1.counter--;
    if (Original(Object1)) then destroy Object1;
    /* Function Original(Object) returns true when Object has its initial value, where counter==0 and lock_list==null */

Case 5: Pi has a writeCall(Object1) system call trapped by Mi
    if ((Object1 exists) && (Object1.lock_list != null))
        for (every lock in Object1.lock_list)
            lock.Brokenflag = true;

4.4 Some Examples of the Algorithm

We will use an example to explain the algorithm. Suppose we want to make sure that

the file f's content will not be changed between two system calls access(f) and open(f). Let

the process be P1 and the monitor be M1. We can write the specification nonatomic(f.target,

access(f)...open(f)) fail. Here, the object is the file's target.

The fixed data structure attached to a file's content contains:

Object.IncompleteWriteOperation = {open(f)}

Object.CompleteWriteOperation = {close(f)}

Object.writeCall = {symlink(f)}

The only other process is P2, with the monitor M2. Let the system call sequences in M1 and M2 be:

M1:    access(f)                 open(f)
M2:                 symlink(f)
Time:  t1           t2           t3

When M1 gets access(f) at t1, because of case 1, it broadcasts a begin(f, SN) message

to M2. M2 creates the object f and inserts the lock (SN, false) in f.lock_list. When M2 gets

symlink(f) at t2, because of case 5, M2 sets (SN, true) in f.lock_list. When M1 gets open(f) at

t3, M1 sends end(f, SN) in case 2. When M2 receives the end(f, SN), it checks its lock_list

based on case 2. It finds (SN, true) in the lock_list and sends Broken(f) to M1. As M1 receives

this message, it knows that the atomic requirement has been broken.

If the system call sequences in M1 and M2 are:

M1:                 access(f)                 open(f)
M2:    open(f)                   close(f)
Time:  t1           t2           t3           t4

When M2 gets open(f) at t1, it creates an object f and sets f.counter to 1. When M1 gets access(f) at t2, it sends begin(f, SN) to M2. After M2 receives this message, it finds that the object f already exists with a nonzero counter, so it inserts the lock (SN, true) into f.lock_list. When M2 gets close(f) at t3, f.counter is decremented and hence becomes zero. Since there is still a lock in the lock_list, object f is not destroyed. When M1 gets open(f) at t4, it sends end(f, SN) to M2. When M2 receives the end(f, SN), it checks its lock_list, finds the lock (SN, true), and sends Broken(f) to M1. So M1 knows that the atomic requirement has been broken.
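The bookkeeping that each monitor Mi performs in Code Example 5 can be sketched in C++ as follows. This is a minimal sketch under stated assumptions, not the implementation used in this work: the names ObjectState, MonitorState, and the on... member functions are hypothetical, the exchange of messages between monitors is omitted, and onEnd simply returns whether a Broken message would be sent to M1. The guard against a negative counter in onCompleteWrite is also an assumption that the pseudocode does not state.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical per-object record kept by one monitor Mi.
struct ObjectState {
    int counter = 0;                // number of incomplete write operations
    std::map<int, bool> lock_list;  // SN -> Brokenflag
    // Original(Object): true when the record holds its initial value.
    bool original() const { return counter == 0 && lock_list.empty(); }
};

class MonitorState {
public:
    // Case 1: begin(Object, SN) received from M1.
    void onBegin(const std::string& obj, int sn) {
        ObjectState& o = objects_[obj];      // creates the record if absent
        o.lock_list[sn] = (o.counter != 0);  // broken at once if a write is pending
    }
    // Case 2: end(Object, SN) received. Returns true if Broken would be sent.
    bool onEnd(const std::string& obj, int sn) {
        auto it = objects_.find(obj);
        if (it == objects_.end()) return false;
        ObjectState& o = it->second;
        auto lk = o.lock_list.find(sn);
        bool broken = (lk != o.lock_list.end()) && lk->second;
        if (lk != o.lock_list.end()) o.lock_list.erase(lk);
        if (o.original()) objects_.erase(it);
        return broken;
    }
    // Case 3: the local process issues an IncompleteWriteOperation.
    void onIncompleteWrite(const std::string& obj) {
        ObjectState& o = objects_[obj];
        ++o.counter;
        breakAll(o);
    }
    // Case 4: the local process issues a CompleteWriteOperation.
    void onCompleteWrite(const std::string& obj) {
        auto it = objects_.find(obj);
        if (it == objects_.end()) return;
        if (it->second.counter > 0) --it->second.counter;  // assumed guard
        if (it->second.original()) objects_.erase(it);
    }
    // Case 5: the local process issues a writeCall.
    void onWriteCall(const std::string& obj) {
        auto it = objects_.find(obj);
        if (it != objects_.end()) breakAll(it->second);
    }
    // True while the monitor still keeps a record for obj.
    bool tracks(const std::string& obj) const { return objects_.count(obj) != 0; }

private:
    static void breakAll(ObjectState& o) {
        for (auto& lock : o.lock_list) lock.second = true;
    }
    std::map<std::string, ObjectState> objects_;
};
```

Replaying the two traces of this section against a MonitorState that plays the role of M2 gives the outcomes described above: in the first trace, onBegin, onWriteCall, and onEnd report a broken lock; in the second, onIncompleteWrite, onBegin, onCompleteWrite, and onEnd also report a broken lock, and the record for f is kept until the lock is released.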

4.5 Discussion of Correctness

We discuss the correctness of the above algorithm under the following assumption. Let the process be P1 and its monitor be M1. There are only two ways to break the atomic execution of an object O within the pattern E. One is that, before P1 reaches E, some other process has an IncompleteWriteOperation on O. The other is that, while P1 is inside E, some other process tries to modify the object O.

We argue correctness by considering the following two cases:

Case 1: If there is an atomicity violation, then M1 gets a 'Broken' message.

By contradiction, let's assume that there is an atomicity violation and that there is no

'Broken' message. Suppose process P1 has a data object O1 which is to be unchanged in

pattern (e1, ...,en) but the process P2 changes O1 within the pattern. Based on our assumption,

P2 can change O1 in either of the following two ways:

The first way is that P2 executes a system call belonging to the writeCall of O1 within

the time of pattern (e1, ..., en). For M2, because of case 5, the lock's Brokenflag is set to true

and a broken message will be sent in case 2.

The second way is that P2 executes an IncompleteWriteOperation on object O1 before P1 reaches the pattern and completes it within or after the pattern. In this case, due to case 3, M2 has created the object O1 and set its counter to 1 before P1 reaches the pattern. When M1 gets e1, it sends the 'begin' message in case 1, and M2 inserts the lock with its Brokenflag set to true, which causes a 'Broken' message to be sent in case 2.

So, in both cases, a 'Broken' message is sent, which contradicts our assumption. Hence every atomicity violation produces a 'Broken' message.

Case 2: If M1 gets a 'Broken' message and a pattern has been matched, then there must be an

atomicity violation.

The 'Broken' message can be sent only by Mi in case 2, which means that a lock has been broken. The lock's Brokenflag is set to true in one of the following three places:

In case 1, an IncompleteWriteOperation happens on the data object first, and the lock is set afterwards. The data can then be modified within or after the pattern. This is an atomicity violation.

In case 3, the lock exists first, and a subsequent IncompleteWriteOperation breaks the lock. This is an atomicity violation.

In case 5, a write operation happens within the pattern. This is an atomicity violation.


CHAPTER 5. TRANSLATION FROM ASL INTO AUTOMATON

An ASL specification consists of a list of patterns. The main task in translating ASL

into a C++ class definition is to translate the patterns into an extended finite-state automaton

(EFSA). An EFSA is similar to a finite-state automaton, with the following differences:

In addition to the control state of an FSA, an EFSA makes use of a fixed set of state variables.

An EFSA makes transitions based on events, event arguments, and conditions on event arguments and state variables. In addition, the transitions may assign new values to state variables.

5.1 Translation Algorithm

An EFSA may be deterministic (DEFSA) or nondeterministic (NEFSA). For efficiency, we always prefer to generate a DEFSA rather than an NEFSA. However, this is not always possible, as converting an NEFSA into a DEFSA can cause an unacceptable explosion in space requirements.

For traditional FSA, every nondeterministic automaton can be converted into an equivalent deterministic automaton with at most an exponential increase in the number of (control) states. For performance-critical applications (e.g., the lexical analysis phase of a compiler), this increase in state space is acceptable, especially since the worst-case behavior is unusual. For EFSA, the explosion in size is exponential in the product of the ranges of values that can be assumed by each of the auxiliary state variables. For instance, a deterministic EFSA that is equivalent to a nondeterministic EFSA with one integer (32-bit) state variable may need at least $2^{2^{32}} - 1$ states!


Consider the following example. For the regular expression (a|b)*a(a|b)*b, the NFA has three states, as shown in Figure 3, and a DFA also has three states, as shown in Figure 4. For comparison, consider the extended regular expression (a|b)*a(x)(a|b)*b(x). The variable x has 32 bits, so x has $2^{32}$ possible values. The nondeterministic EFSA has three control states, as shown in Figure 5. If we want to convert it into a deterministic EFSA, it needs at least $2^{2^{32}} - 1$ states. This is because, to convert it into a DEFSA, we must remember every possible value of x in the prefix (a|b)*a(x)(a|b)*, so that when we meet b(x), we can determine whether we can go to the terminal state or not. There are $2^{2^{32}} - 1$ distinct states in the DEFSA for recognizing the prefix (a|b)*a(x)(a|b)*, because this prefix can accept all possible combinations of a(1) to a($2^{32}$), of any length from 1 to $2^{32}$. We use C(N, k) to represent the number of ways of choosing k distinct objects from N objects. If the string's length is 1, the possible values of a(x) range from a(1) to a($2^{32}$), so the total number is $C(2^{32}, 1)$. If the string's length is 2, the number of possible values of a(x)a(x) is $C(2^{32}, 2)$. More generally, if the string's length is K, the number of possible strings is $C(2^{32}, K)$. So the total number of distinct states is

$C(2^{32}, 1) + C(2^{32}, 2) + \dots + C(2^{32}, 2^{32}) = 2^{2^{32}} - 1$

Figure 3. An NFA for (a|b)*a(a|b)*b (states S0, S1, S2)

Figure 4. A DFA for (a|b)*a(a|b)*b (states S0, S1, S2)

This problem leaves us with two choices:

restrict the class of ASL patterns so that they can be compiled into a DEFSA

do not convert the NEFSA into a DEFSA, and instead simulate the NEFSA at runtime

We use the NEFSA runtime simulation approach. When an NEFSA needs to make a nondeterministic transition, we replicate its current state into a new instance of the NEFSA. When an instance cannot make any further transition, it dies and releases all the resources it holds. While a DEFSA must be prepared for every possible input string that may appear, the NEFSA simulation approach deals only with strings that actually appear at runtime, so it can potentially reduce the space requirements.

Figure 5. An NEFSA for (a|b)*a(x)(a|b)*b(x) (states S0, S1, S2)

The approach is described below. At runtime, the transition relation of an EFSA is represented in code, whereas its current state (which includes the current control state and the state variables) is stored in data structures. Since we plan to combine all patterns in one ASL specification into a single EFSA, there will be only one instance of the transition relation at runtime. To support nondeterminism, we permit multiple instances of the dynamic state of the EFSA. These multiple instances capture all the states the NEFSA could have reached after examining its input up to this point.

Suppose that an EFSA needs to make a two-way nondeterministic transition on an event e. We then perform a “fork” operation on the EFSA, i.e., we replicate its current state. The new instance follows one of the nondeterministic choices, while the parent follows the other one.
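The fork-based simulation can be illustrated on the pattern (a|b)*a(x)(a|b)*b(x) of Figure 5. The following C++ fragment is a minimal sketch under stated assumptions rather than the code produced by our compiler: the types Event and Instance and the function simulate are hypothetical names, events carry a single 32-bit argument, and duplicate instances are not merged.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// One event: a symbol ('a' or 'b') with a single argument.
struct Event { char name; std::uint32_t arg; };

// Dynamic state of one NEFSA instance: control state plus state variable x.
struct Instance { int cs; std::uint32_t x; };

// Simulates the NEFSA for (a|b)*a(x)(a|b)*b(x). Returns true if some
// instance is in the final state S2 after the whole input is consumed.
bool simulate(const std::vector<Event>& input) {
    std::vector<Instance> active;
    active.push_back(Instance{0, 0});              // start in S0, x unbound
    for (const Event& e : input) {
        const bool ab = (e.name == 'a' || e.name == 'b');
        std::vector<Instance> next;
        for (const Instance& inst : active) {
            switch (inst.cs) {
            case 0:
                if (ab) next.push_back(inst);      // self-loop on S0
                if (e.name == 'a')                 // fork: new instance binds x
                    next.push_back(Instance{1, e.arg});
                break;
            case 1:
                if (ab) next.push_back(inst);      // self-loop on S1
                if (e.name == 'b' && e.arg == inst.x)
                    next.push_back(Instance{2, inst.x});
                break;
            default:
                break;  // S2 has no outgoing transition: the instance dies
            }
        }
        active = std::move(next);
    }
    for (const Instance& inst : active)
        if (inst.cs == 2) return true;
    return false;
}
```

Only the values of x that actually occur in the input are ever stored, which is the space advantage over the DEFSA discussed above. A production simulator would probably also merge instances with identical control state and bindings.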

The starting point for our algorithm for generating EFSA from ASL patterns is the seminal work of Brzozowski [Brzozowski64] and Berry & Sethi [Berry86]. However, these papers deal only with regular expressions and classical FSA, whereas we have to deal with conditions on event arguments and state variables that can be complex data structures. By combining and extending these two techniques, we have developed an algorithm for generating EFSA from a restricted class of ASL specifications.

We first introduce Brzozowski's work. The syntax of regular expressions over a set $\Sigma$ of input symbols is:

E ::= 0 | 1 | a | E+E | EE | E*

where a is a typical symbol in $\Sigma$. '0' is a regular expression that corresponds to the empty language; in other words, L(0) is the empty set. '1' is a regular expression that corresponds to the language containing just the empty string; L(1) is the set consisting of the empty string $\varepsilon$. L(E) denotes the language generated by a regular expression E. We write E=F iff L(E)=L(F). Using Brzozowski's notation, $\delta(E)$ stands for 1 if L(E) contains the empty string; otherwise, $\delta(E)$ stands for 0. Thus, $\delta(E)F$ equals F if the empty string is in L(E); otherwise, $\delta(E)F$ equals 0.

Brzozowski's algorithm is based on the notion of the 'derivative' of a regular expression E with respect to a symbol a, written as D(a, E). The derivative of E by a is another regular expression E' such that $as \in L(E)$ if and only if $s \in L(E')$. For example, the derivative of aba+bb by a is ba. More formally, given a regular expression E and a symbol a, the derivative of E by a is defined by

$D(a, 0) = 0$

$D(a, 1) = 0$

$D(a, a) = 1$

$D(a, b) = 0$ if $a \neq b$

$D(a, E+F) = D(a, E) + D(a, F)$

$D(a, EF) = D(a, E)F + \delta(E)D(a, F)$

$D(a, E^*) = D(a, E)E^*$

Automata are constructed by successively computing derivatives until no new state is produced. For example, an automaton accepting (ab+b)*ba is shown in Figure 6. The only two input symbols are 'a' and 'b'. At the beginning, the initial state is (ab+b)*ba. Computing the derivatives gives D(a, (ab+b)*ba) = b(ab+b)*ba and D(b, (ab+b)*ba) = (ab+b)*ba+a, so we get two new states. Continuing this computation: D(a, b(ab+b)*ba) = 0, D(b, b(ab+b)*ba) = (ab+b)*ba, D(a, (ab+b)*ba+a) = b(ab+b)*ba+1 and D(b, (ab+b)*ba+a) = (ab+b)*ba+a. We get one new state, b(ab+b)*ba+1. Computing the derivatives of this state gives D(a, b(ab+b)*ba+1) = 0 and D(b, b(ab+b)*ba+1) = (ab+b)*ba. No new state is produced, so the algorithm terminates.
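The derivative rules can be exercised directly. The following C++ fragment is a minimal sketch of derivative-based matching under stated assumptions: it decides membership by repeatedly taking derivatives and testing $\delta$ at the end, and it does not build the automaton or compare states for equivalence. All names (Re, mk, alt, cat, star, nullable, deriv, matches) are hypothetical, and the constructors apply only the simplifications 0+E=E, 0E=0 and 1E=E.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Regular expressions E ::= 0 | 1 | a | E+E | EE | E*
struct Re;
using ReP = std::shared_ptr<const Re>;
enum class Kind { Zero, One, Sym, Alt, Cat, Star };
struct Re { Kind k; char c; ReP l, r; };

ReP mk(Kind k, char c = 0, ReP l = nullptr, ReP r = nullptr) {
    return std::make_shared<Re>(Re{k, c, std::move(l), std::move(r)});
}
ReP zero() { return mk(Kind::Zero); }
ReP one() { return mk(Kind::One); }
ReP sym(char c) { return mk(Kind::Sym, c); }
ReP alt(ReP a, ReP b) {               // E+F, with 0+E = E+0 = E
    if (a->k == Kind::Zero) return b;
    if (b->k == Kind::Zero) return a;
    return mk(Kind::Alt, 0, a, b);
}
ReP cat(ReP a, ReP b) {               // EF, with 0E = E0 = 0 and 1E = E1 = E
    if (a->k == Kind::Zero || b->k == Kind::Zero) return zero();
    if (a->k == Kind::One) return b;
    if (b->k == Kind::One) return a;
    return mk(Kind::Cat, 0, a, b);
}
ReP star(ReP a) { return mk(Kind::Star, 0, a); }

// delta(E): true iff L(E) contains the empty string.
bool nullable(const ReP& e) {
    switch (e->k) {
    case Kind::Zero: case Kind::Sym: return false;
    case Kind::One: case Kind::Star: return true;
    case Kind::Alt: return nullable(e->l) || nullable(e->r);
    case Kind::Cat: return nullable(e->l) && nullable(e->r);
    }
    return false;
}

// D(a, E), following the rules listed above.
ReP deriv(char a, const ReP& e) {
    switch (e->k) {
    case Kind::Zero: case Kind::One: return zero();
    case Kind::Sym: return e->c == a ? one() : zero();
    case Kind::Alt: return alt(deriv(a, e->l), deriv(a, e->r));
    case Kind::Cat: {
        ReP left = cat(deriv(a, e->l), e->r);
        return nullable(e->l) ? alt(left, deriv(a, e->r)) : left;
    }
    case Kind::Star: return cat(deriv(a, e->l), e);
    }
    return zero();
}

// w is in L(E) iff the derivative of E by all of w contains the empty string.
bool matches(ReP e, const std::string& w) {
    for (char a : w) e = deriv(a, e);
    return nullable(e);
}
```

For the expression (ab+b)*ba, built as cat(star(alt(cat(sym('a'), sym('b')), sym('b'))), cat(sym('b'), sym('a'))), this sketch accepts ba, abba and bba and rejects ab, which agrees with the derivatives computed above.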

Figure 6. Automaton accepting (ab+b)*ba

Brzozowski proved the following proposition: the set of derivatives of a regular expression is finite, modulo associativity, commutativity, and idempotence of +; that is, the set $\{F \mid \exists w : F = D(w, E)\}$ has a finite number of equivalence classes. This proposition guarantees that the algorithm terminates, because the algorithm stops when no new states can be generated.

Berry and Sethi developed a faster algorithm to create an automaton from a regular expression with distinct symbols. They mark all the input symbols in a regular expression to make them distinct. The marks are written as subscripts; for example, a marked version of (ab+b)*ba is $(a_1b_2+b_3)^*b_4a_5$. Notice that $a_1$ and $a_5$ are treated as different symbols. Based on Berry and Sethi's method, a deterministic finite-state automaton for the regular expression $(a_1b_2+b_3)^*b_4a_5$ is shown in Figure 7, where $C_0=(a_1b_2+b_3)^*b_4a_5$, $C_1=b_2(a_1b_2+b_3)^*b_4a_5$, $C_2=(a_1b_2+b_3)^*b_4a_5$, $C_3=(a_1b_2+b_3)^*b_4a_5$, $C_4=a_5$, and $C_5=1$. If we unmark the symbols, the result is a nondeterministic finite automaton for the original regular expression (ab+b)*ba.

Figure 7. Automaton for $(a_1b_2+b_3)^*b_4a_5$ (states C0 to C5)

The basic pattern in our system is the system call combined with its condition, not the

system call itself. We evaluate the condition at run time and treat the system call with its

condition as if it were a single symbol.

Our main problem is to remember the system call's argument values in the pattern when the system call matches at run time. We call this problem variable binding. For example, if the rule is a(x);b(y);c(z) (x+y+z) and the input system call sequence is a(1) b(2) c(3), the variable binding is x=1, y=2, z=3. Now consider a more complex pattern (a+b)*a(x)a(y)b(z) and the input sequence a(1), a(2), a(3), b(1). To match the pattern, we bind the variables as x=2, y=3, z=1. In this case, when the event a(2) arrives, we need to remember that a(2) is matched with a(x), not a(y). For this purpose, we need to distinguish the first event a(x) from the second event a(y), so we consider them two different events. We therefore adopt the idea of distinct symbols from Berry & Sethi and mark the symbols in the pattern to make them distinct. The marked version of (a+b)*a(x)a(y)b(z) is $(a_1+b_2)^*a_3(x)a_4(y)b_5(z)$. Using this method, we can distinguish the variables and do the variable binding correctly.

Berry & Sethi's algorithm creates the automaton quickly, but the automaton it creates is not efficient. In the example of Figure 7, C0, C2 and C3 are the same state. Since we simulate the automaton at run time, it needs to be efficient. So, in our algorithm, after marking the symbols, we still use Brzozowski's algorithm and extend it to do the variable binding.

Algorithm: We can intercept the system call at any time, so the initial state S0 can start at any system call. We use a special system call named any() to stand for any system call. For the pattern P, the initial state S0 corresponds to any()*;P. From state S0, there is always a transition any()* that returns to S0 itself in every NEFSA. We treat this transition as a default for all NEFSA and do not discuss it in the algorithm below.

Compared with Brzozowski's algorithm, the input symbols here are events, or more specifically, system calls. Each state is an extended regular expression. We use three functions in the algorithm. Function 'FirstSet(state)' returns the set of events that can be the first input symbol of 'state'. Function 'FollowSet(event, state)' is similar to the derivative function D(a, E) in Brzozowski's algorithm; it computes the derivative of 'state' by 'event'. Function 'Bind(FSM, event)' adds the variables in 'event' to the data structure 'FSM'. We use the structure 'FSM' to store all the bound variables.

Code Example 6 Algorithm for Translating the ASL into EFSA

The function FirstSet(S) is defined as:
    FirstSet(a) = {a};
    FirstSet(P*) = FirstSet(P);
    FirstSet(P||Q) = FirstSet(P) ∪ FirstSet(Q);
    FirstSet(P;Q) = FirstSet(P) ∪ FirstSet(Q)   (if P can be empty)
                  = FirstSet(P)                 (otherwise);

The function FollowSet(e, S) is defined as:
    FollowSet(e, e) = 1;
    FollowSet(e, a) = 0 if e ≠ a;
    FollowSet(e, P*) = FollowSet(e, P);P*;
    FollowSet(e, P;Q) = (FollowSet(e, P);Q) || FollowSet(e, Q)   (if P can be empty)
                      = FollowSet(e, P);Q                        (otherwise);
    FollowSet(e, P||Q) = FollowSet(e, P) || FollowSet(e, Q);

Function Bind(FSM, e) {
    for (every variable v in e)
        insert v in FSM;
}

The algorithm is as follows:
 0. mark all the system calls in the pattern so that they are distinct;
 1. put S0 into a set states_set;
 2. for each element S in the states_set {
 3.     events_set = FirstSet(S);
 4.     for each element e in the events_set {
 5.         Bind(FSM, e);
 6.         S1 = FollowSet(e, S);
 7.         insert S1 into states_set;
 8.         create edge t = (S, e, S1);   // record the transition from current state S to next state S1 on event e
 9.         insert t into edges_set;
10.     }
11. }
12. unmark all the system calls

To establish the correctness of this algorithm, we show that it terminates, which means that the set of states we generate is finite. Comparing our algorithm with Brzozowski's algorithm, we find that the differences occur in line 0, line 5 and line 12. In line 0 and line 12, marking and unmarking system calls cannot create more states. In line 5, binding the variables into a structure also cannot create any more states. So, for the same reason as in Brzozowski's algorithm, the set $\{F \mid \exists w : F = D(w, E)\}$ has a finite number of equivalence classes modulo the associativity, commutativity, and idempotence of the operation '||'. This property guarantees that our algorithm terminates.


5.2 Illustration of Automata Construction

Consider Code Example 4 in section 3.6. There are four system calls in the whole system: access(CString filename, int flag), open(CString filename, int flag, mode_t mode), $access(CString filename, int flag) and $open(CString filename, int flag, mode_t mode). To make things clearer, we ignore all the conditions here. The initial state S0 is:

(access(name, mode);(access()||$open())*;open(name1, flags, mode1)) || ($open(f, f1, mode))

The FirstSet of S0 is {access(name, mode), $open(f, f1, mode)}, so the events_set is currently {access(name, mode), $open(f, f1, mode)}.

For the event access(name, mode), we first add its variables to FSM. After Bind(FSM, access(name, mode)), FSM is:

struct FSM {
    CString access_name;
    mode_t access_mode;
}

Then we get its next state by computing FollowSet(access(name, mode), S0):

FollowSet(access(name, mode), S0) = (access()||$open())*;(open(name1, flags, mode1))

This is a new state; we insert it into states_set and call it S1. The edge t here is (S0, access(name, mode), S1). All the computations for the first event access(name, mode) are done.

For the second event $open(f, f1, mode) in the events_set, we add its variables to FSM. After Bind(FSM, $open(f, flag, mode)), FSM is:

struct FSM {
    CString access_name;
    mode_t access_mode;
    CString $open_f;
    int $open_flag;
    mode_t $open_mode;
}

We get the next state by computing FollowSet($open(f, flag, mode), S0):

FollowSet($open(f, flag, mode), S0) = 1

This is a new state; we insert it into states_set and call it S2. The edge t here is (S0, $open(f, flag, mode), S2). All the computations for the second event $open(f, flag, mode) are done.

All the computations for S0 are finished. We repeat the steps on states S1 and S2 and get the following results:

The FirstSet of S1 is {access(), $open(), open(name1, flags, mode1)}.

After Bind(FSM, open(name1, flags, mode1)), FSM is:

struct FSM {
    CString access_name;
    mode_t access_mode;
    CString $open_f;
    int $open_flag;
    mode_t $open_mode;
    CString open_name1;
    int open_flags;
    mode_t open_mode1;
}

FollowSet(access(), S1) = S1
Edge (S1, access(), S1) is inserted into edges_set.

FollowSet($open(), S1) = S1
Edge (S1, $open(), S1) is inserted into edges_set.

FollowSet(open(name1, flags, mode1), S1) = S2
Edge (S1, open(name1, flags, mode1), S2) is inserted into edges_set.

The FirstSet of S2 is empty.

No new state is created, so the algorithm terminates. The NEFSA is shown in Figure 8.

5.3 Code Generation

At code generation time, the EFSA generated from ASL specifications is turned into a C++ class. Specifically, one class is generated from each ASL specification. This class has one member function for each event, and each member function has the same number and types of arguments as its event. At runtime, the system call interceptor delivers events to the system call detection engine, which in turn invokes the member function corresponding to the event.

A list of active EFSA instances is created at runtime. When an event is delivered, we

go through the list of EFSA instances and for each of them, make a transition based on its

current state and the newly delivered event. If there is no such transition for an EFSA

instance, then it is “killed.”

Figure 8. The constructed NEFSA from Code Example 4 (states S0, S1, S2)

5.4 Examples of ASL Specifications Translated into C++ Class Definitions

In this section we provide some C++ code that would be produced by compiling the specifications. First, consider the specification in Code Example 3 in section 3.6 for the cat program, which is intended to prevent the program from being used to read the password file.

The generated code is shown below. This simple program intercepts all system calls from cat. For open() calls, it checks the name of the file being opened, and if the name is /etc/passwd, it simulates a failure of the open(). For other system calls, the program allows them to go through without any change.

Code Example 7 Preventing the cat Program from Reading the /etc/passwd File

class MonitorProg {
private:
    struct FSM {
        int cs_;             // current control state
        FSM* next_;          // next instance in the list
        CString Rule_0_f_;   // bound variable f of rule 0
        void clone(FSM* p2) { Rule_0_f_ = p2->Rule_0_f_; }
    };
    FSM* stack1_;            // list of active NEFSA instances
public:
    MonitorProg() {
        FSM* start = new FSM;
        start->cs_ = 0;
        start->next_ = NULL;
        stack1_ = start;
    }
    int open_entry(CString pathname, int flags, mode_t mode) {
        FSM* stack2_ = NULL;
        FSM* tmp = stack1_;
        while (tmp != NULL) {       // perform variable binding
            tmp->Rule_0_f_ = pathname;
            tmp = tmp->next_;
        }
        while (stack1_ != NULL) {   // simulate the NEFSA
            FSM* cp = stack1_;
            stack1_ = stack1_->next_;
            switch (cp->cs_) {
            case 0: {
                if (realpath(cp->Rule_0_f_) == "/etc/passwd") {
                    cp->cs_ = 1;
                    fakedRC(-1);    // simulate a failure of open()
                    cp->next_ = stack2_;
                    stack2_ = cp;
                } else {
                    delete cp;      // no transition: discard this instance
                }
                break;
            }
            default: { delete cp; break; }
            }
        }
        FSM* start = new FSM;       // always put an initial state
        start->cs_ = 0;
        start->next_ = stack2_;
        stack2_ = start;
        stack1_ = stack2_;
    }
};

As the second example, we consider the race condition vulnerability described in Code Example 4 in section 3.6. The code is shown in Appendix B.


CHAPTER 6. SUMMARY AND CONCLUSION

6.1 Effectiveness

In this section we present an evaluation of the effectiveness of our specification-based technique in terms of its ability to capture several known computer intrusions. For this purpose, we studied the advisories put out by the CERT Coordination Center over the 5-year period from 1993 through 1997. We focused only on intrusions on UNIX machines, since that is the main target of our work. Our results are shown in Table 1.

Table 1 Detectable intrusions

Category                                                          Total number   Detectable number   Detectable percent
Trojan Horses in privileged programs                              3              1                   33
Design weaknesses in protocols, authentication, encryption etc.   17             2                   12
Configuration or environmental errors                             10             4                   40
Other program errors                                              52             46                  89
Total number                                                      82             53                  65

CERT issued over 110 advisories during this period, and for about 9 of them we could not identify the underlying vulnerability. Other advisories relate to non-UNIX systems, have been superseded, or repeat or summarize vulnerabilities mentioned in earlier advisories. Eliminating all of these, we had about 82 incidents. We divided these 82 into four categories. The first three categories consist of vulnerabilities that fall largely outside the scope of our technique. For instance, exploitation of design flaws such as inadequate authentication (e.g., NFS and rlogin), weaknesses in encryption schemes (e.g., password stealing using network tapping), erroneous system configuration, etc., does not typically alter the behavior of the processes involved, and hence is difficult to identify in our scheme. Nevertheless, to the extent that they cause unusual behavior, they can be detected using an appropriate ASL specification. The last category consists of the class of errors that our technique is targeted at, namely, vulnerabilities that exploit flaws in programs. Our technique is very effective for this category of vulnerabilities, and is able to capture about 90% of the intrusions in this category, most of which can actually be prevented using appropriate ASL specifications.

6.2 Summary

In this thesis, we use high-level specifications to describe the security-related behavior of processes. These specifications are intended to capture the normal or intended behavior of processes. Deviations from these specifications indicate intrusions. Thus, attacks can be detected even though they may not have been encountered previously. Previous research in intrusion detection focused exclusively on specifying either misuse or intended behavior. In contrast, we develop an approach in which one can specify misuse, intended behavior, and the actions to be taken in response to intrusion attempts.

We notice that damage must eventually be effected via the system calls made by the

attacked process to its operating-system environment. In particular, operations for

manipulating files or network connections are all administered through system calls. So

security-related behavior can be represented in terms of the system calls made by each

process running on the host. We detect deviations from the expected behavior by intercepting

and validating the system calls at runtime. Note that this approach gives us the ability to

detect problems before they cause damage, and can thus be preventive.
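
As a rough illustration of validating a system call before it takes effect, the following C++ sketch checks an intercepted call against a small policy. The names SyscallEvent and permitted, and the allowlist policy itself, are hypothetical and are not taken from our implementation; a real monitor would receive the call name and arguments from the interception layer.

```cpp
#include <set>
#include <string>

// Hypothetical view of an intercepted call: its name and its path argument.
struct SyscallEvent {
    std::string name;
    std::string path;
};

// Returns true if the call may proceed. Only execve is restricted here:
// the monitored process may execute only programs on the allowlist.
inline bool permitted(const SyscallEvent& ev,
                      const std::set<std::string>& allowed_programs) {
    if (ev.name != "execve") return true;
    return allowed_programs.count(ev.path) > 0;
}
```

Because the check runs before the call is carried out, a denied execve never starts the unexpected program, which is the preventive property described above.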

We designed a high-level language called Auditing Specification Language (ASL) to

specify security-related behavior. This language is powerful enough to express a range of

integrity constraints and behaviors over time. Specifications in ASL are compiled into

optimized C++ programs for efficient detection of deviations from these specifications. ASL

is intended to simplify the specification of relationships and constraints that must hold in a

correct system, without being concerned about the details on how these conditions can be

verified. This feature greatly simplifies the user's work.

Our compiler will translate ASL specifications into an Extended Finite-State

Automaton (EFSA). An EFSA is similar to a finite-state automaton with a set of state

variables. The EFSA can be simulated at runtime to detect intrusions efficiently.
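
The idea of a finite-state automaton extended with state variables can be sketched in C++ as follows. This is a hand-written illustration under our own assumptions, not output of the ASL compiler: the Event type, the class name AccessThenOpen, and the pattern it recognizes (an access on a path followed by an open on the same path) are hypothetical.

```cpp
#include <string>

// Hypothetical event type: a system call name and its path argument.
struct Event {
    std::string syscall;
    std::string path;
};

// A tiny extended finite-state automaton: control states plus one state variable.
class AccessThenOpen {
public:
    // Feeds one event; returns true when the monitored pattern completes.
    bool step(const Event& e) {
        switch (state_) {
        case State::Start:
            if (e.syscall == "access") {
                checked_path_ = e.path;   // bind the state variable
                state_ = State::Checked;
            }
            return false;
        case State::Checked:
            if (e.syscall == "open" && e.path == checked_path_) {
                state_ = State::Start;
                return true;              // access(p) followed by open(p)
            }
            if (e.syscall == "access") checked_path_ = e.path;  // rebind to the newer path
            return false;
        }
        return false;
    }

private:
    enum class State { Start, Checked };
    State state_ = State::Start;
    std::string checked_path_;
};
```

The transition out of the Checked state depends on the stored path, which is what distinguishes an EFSA from a plain finite-state automaton.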

We introduce the notion of atomic execution in our ASL specification to deal with

attacks based on race conditions and other errors involving interference among multiple

processes. This feature lets the user describe the normal/intended behavior of programs

containing synchronization or other concurrent access errors that would otherwise need to be

captured in terms of known patterns of misuse.

6.3 Conclusion and Future Work

There are many ways to build a reliable system for intrusion detection. As we

observed, all intrusions eventually damage a system through system calls, which provide the

interface between application software and OS kernel. One of the key components in our

approach is a specification language ASL, which is used to describe the patterns for the

system calls to be captured. Building an efficient EFSA is crucial to the overall system

performance. The primary goal is to minimize pattern matching time and the size of the

automaton. Atomic execution provides data integrity in a multi-process system.

Currently, we have finished the ASL design. A parser has been built to translate the

ASL specification into an EFSA and to generate C++ code from the EFSA. The atomic

execution design is completed, but is not yet implemented. We also need to develop a

simple characterization of ASL patterns that can be compiled into DEFSA. A DEFSA can be

more efficient than the NEFSA that we are currently using.

Currently, only an experienced administrator can write the specifications. We may adopt

the results of some misuse or anomaly detection systems and develop an expert

system to help the user write specifications. We may also develop a graphical interface

so that the user can see visually how a specification is translated into an automaton.

APPENDIX A CLASSIFICATION OF SYSTEM CALLS IN RED HAT LINUX

We define all the abstract events in Red Hat Linux here. We classify the system calls

into eight categories and some subcategories. For each subcategory, we list the abstract events

followed by the original system calls.

1. File Access

Setup

WriteOpen (path) - open and possibly create a file for write = {open(path, flags) | (flags & (O_WRONLY | O_APPEND | O_TRUNC)), open(path, flags, mode) | (flags & (O_WRONLY | O_APPEND | O_TRUNC)), creat(path, mode); }

ReadOpen(path) - open a file for read = { open(path, flags) | (flags & O_RDONLY), open(path, flags, mode) | (flags & O_RDONLY) }

Open_all(path) - open a file = { ReadOpen(path), WriteOpen(path), }

truncate_2(path, len) - truncate a file to a specified length = { truncate(path, len), ftruncate(fd, len) | path = fdToName(fd) }
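
The flag tests used by WriteOpen and ReadOpen can be sketched in C++ as follows. This is only an illustration: the names OpenEvent and classify_open are hypothetical, and treating O_RDWR as a write open is our assumption. On Linux, O_RDONLY is defined as 0, so the sketch compares the access mode selected by O_ACCMODE instead of testing that flag with a bitwise AND.

```cpp
#include <fcntl.h>

enum class OpenEvent { ReadOpen, WriteOpen };

// Classifies open() flags into the abstract events ReadOpen and WriteOpen.
inline OpenEvent classify_open(int flags) {
    int mode = flags & O_ACCMODE;  // isolates O_RDONLY, O_WRONLY, or O_RDWR
    bool writes = mode == O_WRONLY || mode == O_RDWR ||
                  (flags & (O_APPEND | O_TRUNC)) != 0;
    return writes ? OpenEvent::WriteOpen : OpenEvent::ReadOpen;
}
```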

creat - create a file (3)
int creat(const char *pathname, mode_t mode);

open - open and possibly create a file or device
int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);

truncate, ftruncate - truncate a file to a specified length (11)
int truncate(const char *path, size_t length);
int ftruncate(int fd, size_t length);

pipe - create pipe
int pipe(int filedes[2]);

File Attributes

filePermCheck(path) - check file permission = { stat(path, buf), fstat(fd, buf) | path = fdToName(fd), lstat(path, buf), access(path, mode) }

fileAttrCheck(path) - check any file attribute = { stat(path, buf), fstat(fd, buf) | path = fdToName(fd), lstat(path, buf) }

filePermChange(path) - change file permissions = { chmod_2(path, mode), chown_2(path, owner, group) }

fileAttrChange(path) - change file attributes = {filePermChange(path)

stat_2(path, buf) - get file status = { stat(path, buf), fstat(fd, buf) | path = fdToName(fd) }

chmod_2(path, mode) - change permissions of a file = {chmod(path, mode), fchmod(fd, mode) | path = fdToName(fd); }

chown_2(path, owner, group) - change ownership of a file = { chown(path, owner, group), fchown(fd, owner, group) | path = fdToName(fd) }

link_2 (oldpath, newpath) - make a new name for a file = {link(oldpath, newpath),symlink(topath, frompath) }

access - check user's permissions for a file
int access(const char *pathname, int mode);

stat, fstat, lstat - get file status
int stat(const char *file_name, struct stat *buf);
int fstat(int filedes, struct stat *buf);
int lstat(const char *file_name, struct stat *buf);

umask - set file creation mask
int umask(int mask);

utime, utimes - change access and/or modification times of an inode (32)
int utime(const char *filename, struct utimbuf *buf);
int utimes(char *filename, struct timeval *tvp);

chmod, fchmod - change permissions of a file
int chmod(const char *path, mode_t mode);
int fchmod(int fildes, mode_t mode);

chown, fchown - change ownership of a file
int chown(const char *path, uid_t owner, gid_t group);
int fchown(int fd, uid_t owner, gid_t group);

link - make a new name for a file
int link(const char *oldpath, const char *newpath);

symlink - make a new name for a file
int symlink(const char *topath, const char *frompath);

rename - change the name or location of a file
int rename(const char *oldpath, const char *newpath);

unlink - delete a name and possibly the file it refers to
int unlink(const char *pathname);

Read/Write

llseek, lseek - reposition read/write file offset
int _llseek(unsigned int fd, unsigned long offset_high, unsigned long offset_low, loff_t *result, unsigned int whence);
off_t lseek(int fildes, off_t offset, int whence);

lseek_2(fd, offset, whence) - reposition read/write file offset = { _llseek(fd, offset_high, offset_low, * , whence) | offset = (offset_high<<32) | offset_low, lseek(fd, offset, whence) }
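
The offset computation in lseek_2 can be sketched in C++ as follows. The function name combine_offset is hypothetical. The sketch assumes the two halves are 32-bit values, as on the i386, and widens the high half to 64 bits before shifting, because shifting a 32-bit value by 32 bits is undefined in C++.

```cpp
#include <cstdint>

// Combines the two 32-bit halves passed to _llseek into one 64-bit offset.
inline std::uint64_t combine_offset(std::uint32_t offset_high,
                                    std::uint32_t offset_low) {
    return (static_cast<std::uint64_t>(offset_high) << 32) | offset_low;
}
```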

readlink - read value of a symbolic link
int readlink(const char *path, char *buf, size_t bufsiz);

Directory Operations

mkdir - create a directory
int mkdir(const char *pathname, mode_t mode);

mknod - create a special or ordinary file
int mknod(const char *pathname, mode_t mode, dev_t dev);

mkdir_2(path, mode) - create a directory = { mkdir(path, mode), mknod(path, mode, *) }

rmdir - delete a directory
int rmdir(const char *pathname);

getdents - get directory entries
int getdents(unsigned int fd, struct dirent *dirp, unsigned int count);

readdir - read directory entry
int readdir(unsigned int fd, struct dirent *dirp, unsigned int count);

Miscellaneous

fdatasync - synchronize a file's in-core data with that on disk
int fdatasync(int fd);

fsync - synchronize a file's complete in-core state with that on disk
int fsync(int fd);

msync - synchronize a file with a memory map
int msync(const void *start, size_t length, int flags);

chroot - change root directory
int chroot(const char *path);

chdir, fchdir - change working directory (2)
int chdir(const char *path);
int fchdir(int fd);

chdir_2(path)- change working directory = { chdir(path), fchdir(fd) | path = fdToName(fd) }
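
Several abstract events in this category rely on fdToName(fd) to recover the path behind a file descriptor. One possible way to support it, shown here as a C++ sketch, is for the monitor to record the path at each open and discard it at close. The class FdTable and its member names are hypothetical; this appendix does not prescribe how fdToName is implemented.

```cpp
#include <map>
#include <string>

// Hypothetical bookkeeping behind fdToName: the monitor records the path
// seen at each successful open and forgets it when the descriptor is closed.
class FdTable {
public:
    void on_open(int fd, const std::string& path) { names_[fd] = path; }
    void on_close(int fd) { names_.erase(fd); }

    // dup/dup2 make newfd refer to the same file as oldfd.
    void on_dup(int oldfd, int newfd) {
        auto it = names_.find(oldfd);
        if (it != names_.end()) {
            std::string path = it->second;  // copy before inserting newfd
            names_[newfd] = path;
        }
    }

    // Returns the recorded path, or an empty string for an unknown descriptor.
    std::string fd_to_name(int fd) const {
        auto it = names_.find(fd);
        return it == names_.end() ? std::string() : it->second;
    }

private:
    std::map<int, std::string> names_;
};
```

With such a table, an event like fchdir(fd) can be mapped to chdir_2(path) by looking up the path recorded for fd.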

2. Network Access

Setup

socket - create an endpoint for communication
int socket(int domain, int type, int protocol);

socketpair - create a pair of connected sockets
int socketpair(int d, int type, int protocol, int sv[2]);

getsockopt - get options on sockets
int getsockopt(int s, int level, int optname, void *optval, int *optlen);

setsockopt - set options on sockets
int setsockopt(int s, int level, int optname, const void *optval, int optlen);

bind - bind a name to a socket
int bind(int sockfd, struct sockaddr *my_addr, int addrlen);

getsockname - get socket name
int getsockname(int s, struct sockaddr *name, int *namelen);

listen - listen for connections on a socket
int listen(int s, int backlog);

accept - accept a connection on a socket
int accept(int s, struct sockaddr *addr, int *addrlen);

connect - initiate a connection on a socket
int connect(int sockfd, struct sockaddr *serv_addr, int addrlen);

shutdown - shut down part of a full-duplex connection
int shutdown(int s, int how);

Send/Receive

recv, recvfrom, recvmsg - receive a message from a socket
int recv(int s, void *buf, int len, unsigned int flags);
int recvfrom(int s, void *buf, int len, unsigned int flags, struct sockaddr *from, int *fromlen);
int recvmsg(int s, struct msghdr *msg, unsigned int flags);

recv_2 (s, buf, len, flag) - receive a message from a socket = { recv(s, buf, len, flag), recvfrom( s, buf, len, flag, *), recvmsg(s, msg, flag) | buf = FUN get_buf(*msg) len = FUN get_len(*msg) }

send, sendto, sendmsg - send a message from a socket (25)
int send(int s, const void *msg, int len, unsigned int flags);
int sendto(int s, const void *msg, int len, unsigned int flags, const struct sockaddr *to, int tolen);
int sendmsg(int s, const struct msghdr *msg, unsigned int flags);

send_2 (s, msg, len, flag) - send a message to a socket = { send(s, msg, len, flag), sendto( s, msg, len, flag, *), sendmsg(s, msg, flag) | len = FUN get_len(*msg) }
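
The helper get_len used in recv_2 and send_2 is not defined in this appendix. A plausible reading, sketched here in C++ under that assumption, is that it returns the total number of payload bytes described by the iovec array of a msghdr.

```cpp
#include <sys/socket.h>
#include <sys/uio.h>
#include <cstddef>

// Hypothetical get_len: total payload bytes described by a msghdr's iovec array.
inline std::size_t get_len(const msghdr& msg) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < static_cast<std::size_t>(msg.msg_iovlen); ++i) {
        total += msg.msg_iov[i].iov_len;
    }
    return total;
}
```

This lets sendmsg and recvmsg be compared with send and recv on the same len argument, even though the message is spread over several buffers.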

Naming

getdomainname - get domain name
int getdomainname(char *name, size_t len);

setdomainname - set domain name
int setdomainname(const char *name, size_t len);

gethostid - get the unique identifier of the current host
long int gethostid(void);

gethostname - get host name
int gethostname(char *name, size_t len);

gethostid_2() - get the unique identifier of the current host = { gethostid(void), gethostname(name, len) | return_value = NameToId(name) }

sethostid - set the unique identifier of the current host
int sethostid(long int hostid);

sethostname - set host name (28)
int sethostname(const char *name, size_t len);

sethostid_2() - set the unique identifier of the current host = { sethostid(void), sethostname(name, len) | return_value = NameToId(name) }

getpeername - get name of connected peer
int getpeername(int s, struct sockaddr *name, int *namelen);

3. Message Queues

msgctl - message control operations
int msgctl(int msqid, int cmd, struct msqid_ds *buf);

msgget - get a message queue identifier
int msgget(key_t key, int msgflg);

msgsnd - send message
int msgsnd(int msqid, struct msgbuf *msgp, int msgsz, int msgflg);

msgrcv - receive message
int msgrcv(int msqid, struct msgbuf *msgp, int msgsz, long msgtyp, int msgflg);

4. Shared Memory

shmctl - shared memory control
int shmctl(int shmid, int cmd, struct shmid_ds *buf);

shmat - shared memory operations
char *shmat(int shmid, char *shmaddr, int shmflg);

shmdt - shared memory operations
int shmdt(char *shmaddr);

shmget - allocates a shared memory segment
int shmget(key_t key, int size, int shmflg);

5. File Descriptor Operations

Setup

close - close a file descriptor
int close(int fd);

mmap - map files or devices into memory
void *mmap(void *start, size_t length, int prot, int flags, int fd, off_t offset);

munmap - unmap files or devices into memory
int munmap(void *start, size_t length);

getdtablesize - get descriptor table size
int getdtablesize(void);

dup, dup2 - duplicate a file descriptor
int dup(int oldfd);
int dup2(int oldfd, int newfd);

dup_2(fd) - duplicate a file descriptor = { dup(fd), dup2(oldfd, newfd) | return_value = newfd, }

Read/Write

read - read from a file descriptor
ssize_t read(int fd, void *buf, size_t count);

readv - read a vector
int readv(int fd, const struct iovec *vector, size_t count);

read_2(fd,buf,count) = { read(fd, buf,count), readv(fd, vector, count) |buf = FUN get_buf(vector) }

read_3(fd,buf, count) = { read_2(fd, buf,count), recv_2 (fd, buf, count, *) }

write - write to a file descriptor
ssize_t write(int fd, const void *buf, size_t count);
int writev(int fd, const struct iovec *vector, size_t count);

write_2(fd,buf,len) = { write(fd, buf,count), writev(fd, vector, count) |buf = FUN get_buf(vector) }

write_3(fd,buf,len) = { write_2(fd, buf, len), send_2(fd, buf, len, *)}

File Descriptor Control

flock - apply or remove an advisory lock on an open file
int flock(int fd, int operation);

fcntl - manipulate file descriptor
int fcntl(int fd, int cmd);
int fcntl(int fd, int cmd, long arg);

fcntl_2(fd, cmd) - manipulate file descriptor = { fcntl(fd, cmd), fcntl(fd, cmd, arg) }

ioctl - control device
int ioctl(int d, int request, ...);

6. Time-Related

nanosleep - pause execution for a specified time
int nanosleep(const struct timespec *req, struct timespec *rem);

alarm - set an alarm clock for delivery of a signal
unsigned int alarm(unsigned int seconds);

getitimer - get value of an interval timer
int getitimer(int which, struct itimerval *value);

setitimer - set value of an interval timer
int setitimer(int which, const struct itimerval *value, struct itimerval *ovalue);

gettimeofday - get time
int gettimeofday(struct timeval *tv, struct timezone *tz);

settimeofday - set time
int settimeofday(const struct timeval *tv, const struct timezone *tz);

time - get time in seconds
time_t time(time_t *t);

times - get process times
clock_t times(struct tms *buf);

7. Process Control

Process Creation and Termination

_exit - terminate the current process
void _exit(int status);

clone - create a child process
pid_t clone(void *sp, unsigned long flags);

execve - execute program
int execve(const char *filename, const char *argv[], const char *envp[]);

fork, vfork - create a child process
pid_t fork(void);
pid_t vfork(void);

fork_2() - create a child process = { fork(void), vfork(void); }

wait, waitpid - wait for process termination (34)
pid_t wait(int *status);
pid_t waitpid(pid_t pid, int *status, int options);

wait_2 (status) = { wait (status), waitpid(pid, status, *) | pid = 0; }

wait3, wait4 - wait for process termination, BSD style (35)
pid_t wait3(int *status, int options, struct rusage *rusage);
pid_t wait4(pid_t pid, int *status, int options, struct rusage *rusage);

getpid - get current process identification
pid_t getpid(void);

getppid - get parent process identification
pid_t getppid(void);

Signals

kill - send signal to a process (16)
int kill(pid_t pid, int sig);

killpg - send signal to a process group
int killpg(int pgrp, int sig);

kill_2(pid,sig) - send signal to a process = { kill(pid, sig), killpg(pgrp, sig) | pid = pgrp }

sigblock - manipulate the signal mask
int sigblock(int mask);

sigmask - manipulate the signal mask
int sigmask(int signum);

siggetmask - manipulate the signal mask
int siggetmask(void);

sigsetmask - manipulate the signal mask
int sigsetmask(int mask);

signal - ANSI C signal handling
void (*signal(int signum, void (*handler)(int)))(int);

sigvec - BSD software signal facilities
int sigvec(int sig, struct sigvec *vec, struct sigvec *ovec);

sigaction - POSIX signal handling functions.
int sigaction(int signum, const struct sigaction *act, struct sigaction *oldact);

sigprocmask - POSIX signal handling functions.
int sigprocmask(int how, const sigset_t *set, sigset_t *oldset);

sigpending - POSIX signal handling functions.
int sigpending(sigset_t *set);

sigsuspend - POSIX signal handling functions.
int sigsuspend(const sigset_t *mask);

sigpause - atomically release blocked signals and wait for interrupt
int sigpause(int sigmask);

pause - wait for signal
int pause(void);

sigreturn - return from signal handler and cleanup stack frame
int sigreturn(unsigned long __unused);

Synchronization

poll - wait for some event on a file descriptor
int poll(struct pollfd *ufds, unsigned int nfds, int timeout);

select - synchronous I/O multiplexing
int select(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);

semctl - semaphore control operations
int semctl(int semid, int semnum, int cmd, union semun arg);

semget - get a semaphore set identifier
int semget(key_t key, int nsems, int semflg);

semop - semaphore operations
int semop(int semid, struct sembuf *sops, unsigned nsops);

User/Group Id

getuid - get user real ID
uid_t getuid(void);

getgid - returns the real group ID of the current process.
gid_t getgid(void);

getegid - returns the effective group ID of the current process.
gid_t getegid(void);

geteuid - returns the effective user ID of the current process.
uid_t geteuid(void);

getresuid - get real, effective and saved user ID (14)
int getresuid(uid_t *ruid, uid_t *euid, uid_t *suid);

getuid_2() - get real user ID = { getuid(void), getresuid(ruid, euid, suid) | return_value = ruid }

geteuid_2() - get effective user ID = { geteuid(void), getresuid(ruid, euid, suid) | return_value = euid }

getgid_2() - get real group ID = { getgid(), getresgid(rgid, egid, sgid) | return_value = rgid }

getegid_2() - get effective group ID = { getegid(), getresgid(rgid, egid, sgid) | return_value = egid }

getresgid - get real, effective and saved group ID (15)
int getresgid(gid_t *rgid, gid_t *egid, gid_t *sgid);

getsid - get session ID
pid_t getsid(void);

getpgid - get process group ID
pid_t getpgid(pid_t pid);

getpgrp - get process group ID
pid_t getpgrp(void);
getpgrp is equivalent to getpgid(0).

getpgid_2 (pid) - get process group ID = { getpgid(pid), getpgrp(void) | pid = 0 }

getgroups - get group access list
int getgroups(int size, gid_t list[]);

setegid - set effective group ID
int setegid(gid_t egid);

seteuid - set effective user ID
int seteuid(uid_t euid);

setfsgid - set group identity used for file system checks
int setfsgid(uid_t fsgid);

setfsuid - set user identity used for file system checks
int setfsuid(uid_t fsuid);

setgid - set group identity
int setgid(gid_t gid);

setregid - set real and / or effective group ID
int setregid(gid_t rgid, gid_t egid);

setreuid - set real and / or effective user ID
int setreuid(uid_t ruid, uid_t euid);

setresgid - set real, effective and saved group ID (30)
int setresgid(gid_t rgid, gid_t egid, gid_t sgid);

setresuid - set real, effective and saved user ID (29)
int setresuid(uid_t ruid, uid_t euid, uid_t suid);

setuid_2() - set real user ID = { setuid(void), setresuid(ruid, euid, suid) | return_value = ruid }

seteuid_2() - set effective user ID = { seteuid(void), setresuid(ruid, euid, suid) | return_value = euid }

setgid_2() - set real group ID = { setgid(), setresgid(rgid, egid, sgid) | return_value = rgid }

setegid_2() - set effective group ID = { setegid(), setresgid(rgid, egid, sgid) | return_value = egid }

setgroups - set group access list
int setgroups(size_t size, const gid_t *list);

setsid - creates a session and sets the process group ID
pid_t setsid(void);

setuid - set user identity
int setuid(uid_t uid);

Resource Control

getrlimit - get resource limit
int getrlimit(int resource, struct rlimit *rlim);

setrlimit - set resource limits
int setrlimit(int resource, const struct rlimit *rlim);

getrusage - get resource usage
int getrusage(int who, struct rusage *usage);

getpriority - get program scheduling priority
int getpriority(int which, int who);

nice - change process priority
int nice(int inc);

setpriority - set program scheduling priority
int setpriority(int which, int who, int prio);

Virtual Memory

brk, sbrk - change data segment size
int brk(void *end_data_segment);
void *sbrk(ptrdiff_t increment);

mlock - disable paging for some parts of memory (20)
int mlock(const void *addr, size_t len);

mlockall - disable paging for calling process
int mlockall(int flags);

munlock - reenable paging for some parts of memory (21)
int munlock(const void *addr, size_t len);

munlockall - reenable paging for calling process
int munlockall(void);

mprotect - control allowable accesses to a region of memory
int mprotect(const void *addr, size_t len, int prot);

mremap - re-map a virtual memory address
void *mremap(void *old_address, size_t old_size, size_t new_size, unsigned long flags);

modify_ldt - get or set local descriptor table, a per-process memory management table used by the i386 processor
int modify_ldt(int func, void *ptr, unsigned long bytecount);

Miscellaneous

uselib - select shared library
int uselib(const char *library);

profil - execution time profile
int profil(char *buf, int bufsiz, int offset, int scale);

ptrace - process trace
int ptrace(int request, int pid, int addr, int data);

8. System-Wide

Unprivileged -- Filesystem

sync - commit buffer cache to disk.
int sync(void);

ustat - get file system statistics
int ustat(dev_t dev, struct ustat *ubuf);

statfs, fstatfs - get file system statistics (10)
int statfs(const char *path, struct statfs *buf);
int fstatfs(int fd, struct statfs *buf);

sysfs - get file system type information (31)
int sysfs(int option, const char *fsname);
int sysfs(int option, unsigned int fs_index, char *buf);
int sysfs(int option);

Unprivileged -- Miscellaneous

getpagesize - get system page size
size_t getpagesize(void);

sysinfo - returns information on overall system statistics
int sysinfo(struct sysinfo *info);

uname - get name and information about current kernel
int uname(struct utsname *buf);

Privileged -- Filesystem

setup - setup devices and file systems, mount root file system
int setup(void);

swapoff - stop swapping to file/device
int swapoff(const char *path);

swapon - start swapping to file/device
int swapon(const char *path, int swapflags);

nfsservctl - syscall interface to kernel nfs daemon
nfsservctl(int cmd, struct nfsctl_arg *argp, union nfsctl_res *resp);

mount - mount filesystem
int mount(const char *specialfile, const char *dir, const char *filesystemtype, unsigned long rwflag, const void *data);

umount - unmount filesystems.
int umount(const char *specialfile);
int umount(const char *dir);

Process Scheduling

sched_get_priority_max - get the max static priority
int sched_get_priority_max(int policy);

sched_get_priority_min - get the min static priority
int sched_get_priority_min(int policy);

sched_getparam - get scheduling parameters
int sched_getparam(pid_t pid, struct sched_param *p);

sched_setparam - set scheduling parameters
int sched_setparam(pid_t pid, const struct sched_param *p);

sched_getscheduler - get scheduling algorithm/parameters
int sched_getscheduler(pid_t pid);

sched_setscheduler - set scheduling algorithm/parameters
int sched_setscheduler(pid_t pid, int policy, const struct sched_param *p);

sched_rr_get_interval - get the SCHED_RR interval for the named process
int sched_rr_get_interval(pid_t pid, struct timespec *tp);

sched_yield - yield the processor
int sched_yield(void);

Privileged -- Time

stime - set time
int stime(time_t *t);

adjtimex - tune kernel clock
int adjtimex(struct timex *buf);

Loadable Modules

create_module - create a loadable module entry
caddr_t create_module(const char *name, size_t size);

delete_module - delete a loadable module entry
int delete_module(const char *name);

init_module - initialize a loadable module entry
int init_module(const char *name, struct module *image);

query_module - query the kernel for various bits pertaining to modules.
int query_module(const char *name, int which, void *buf, size_t bufsize, size_t *ret);

get_kernel_syms - retrieve exported kernel and module symbols
int get_kernel_syms(struct kernel_sym *table);

Accounting and Quota

acct - switch process accounting on or off
int acct(const char *filename);

quotactl - manipulate disk quotas
int quotactl(cmd, special, uid, addr);

Privileged -- Miscellaneous

sysctl - read/write system parameters
int _sysctl(struct __sysctl_args *args);

syslog - read and/or clear kernel message ring buffer
int syslog(int type, char *bufp, int len);

idle - make process 0 idle
void idle(void);

reboot - reboot or disable Ctrl-Alt-Del
int reboot(int magic, int magic_too, int flag);

ioperm - set port input/output permissions
int ioperm(unsigned long from, unsigned long num, int turn_on);

iopl - change I/O privilege level
int iopl(int level);

bdflush - start, flush, or tune buffer-dirty-flush daemon (1)
int bdflush(int func, long *address);
int bdflush(int func, long data);

cacheflush - flush contents of instruction and/or data cache
int cacheflush(char *addr, int nbytes, int cache);

ipc - System V IPC system calls
int ipc(unsigned int call, int first, int second, int third, void *ptr, long fifth);

socketcall - socket system calls
int socketcall(int call, unsigned long *args);

personality - set the process execution domain
int personality(unsigned long persona);

vhangup - virtually hangup the current tty
int vhangup(void);

vm86old, vm86 - enter virtual 8086 mode
int vm86old(struct vm86_struct *info);
int vm86(unsigned long fn, struct vm86plus_struct *v86);

APPENDIX B COMPILED CODE FOR ASL SPECIFICATION FOR RACE VULNERABILITY

This C++ code is generated from the ASL Code Example 4 in section 3.6.

class MonitorProg {
private:
    // One live instance of the automaton: control state plus state variables.
    struct FSM {
        int cs_;              // current control state
        FSM* next_;           // next instance in the list of live instances
        int Rule_0_ruid_;
        String Rule_0_rn_;
        int Rule_0_flags_;
        int Rule_1_flag_;
        void clone(FSM* p2) {
            Rule_0_ruid_=p2->Rule_0_ruid_;
            Rule_0_flags_=p2->Rule_0_flags_;
            Rule_0_rn_=p2->Rule_0_rn_;
            Rule_1_flag_=p2->Rule_1_flag_;
        };
    };

    FSM* stack1_;
    int savedEuid;
    int changedEuid;

public:
    MonitorProg(){
        FSM* start=new FSM;
        start->cs_=0;
        start->next_=NULL;
        stack1_=start;
        savedEuid=0;
        changedEuid=0;
    };

    int access_entry (CString name, mode_t mode){
        FSM* stack2_;
        stack2_=NULL;
        FSM* tmp=stack1_;
        while(tmp!=NULL) {
            tmp=tmp->next_;
        }
        while (stack1_!=NULL) {
            FSM* cp=stack1_;
            stack1_=stack1_->next_;
            switch (cp->cs_) {
            case 0:{
                // Record the real user ID and the resolved file name.
                cp->Rule_0_ruid_=getuid();
                cp->Rule_0_rn_=realpath(name);
                FSM* np=new FSM;
                np->cs_=1;
                np->clone(cp);
                delete cp;
                np->next_=stack2_;
                stack2_=np;
                break;
            }
            default: { delete cp;break;}
            }
        }
        FSM* start=new FSM;
        start->cs_=0;
        start->next_=stack2_;
        stack2_=start;
        stack1_=stack2_;
        return 0;
    }

    int open_entry (CString name1, int flags, mode_t mode1){
        FSM* stack2_;
        stack2_=NULL;
        FSM* tmp=stack1_;
        while(tmp!=NULL) {
            tmp->Rule_0_flags_=flags;
            tmp=tmp->next_;
        }
        while (stack1_!=NULL) {
            FSM* cp=stack1_;
            stack1_=stack1_->next_;
            switch (cp->cs_) {
            case 1:{
                if ( (cp->Rule_0_rn_==realpath(name1) )) {
                    FSM* np=new FSM;
                    np->cs_=2;
                    np->clone(cp);
                    delete cp;
                    {
                        // Switch the effective user ID to the real user ID
                        // recorded at access(), remembering the old value.
                        changedEuid = 1;
                        savedEuid = geteuid();
                        setreuid((-1), np->Rule_0_ruid_);
                    }
                    np->next_=stack2_;
                    stack2_=np;
                }
                break;
            }
            default: { delete cp;break;}
            }
        }
        FSM* start=new FSM;
        start->cs_=0;
        start->next_=stack2_;
        stack2_=start;
        stack1_=stack2_;
        return 0;
    }

    int access_exit (CString pathname, mode_t mode){
        FSM* stack2_;
        stack2_=NULL;
        FSM* tmp=stack1_;
        while(tmp!=NULL) {
            tmp=tmp->next_;
        }
        while (stack1_!=NULL) {
            FSM* cp=stack1_;
            stack1_=stack1_->next_;
            switch (cp->cs_) {
            case 1:{
                FSM* np=new FSM;
                np->cs_=1;
                np->clone(cp);
                delete cp;
                np->next_=stack2_;
                stack2_=np;
                break;
            }
            default: { delete cp;break;}
            }
        }
        FSM* start=new FSM;
        start->cs_=0;
        start->next_=stack2_;
        stack2_=start;
        stack1_=stack2_;
        return 0;
    };

    int open_exit (CString f, int flag, mode_t mode ){
        FSM* stack2_;
        stack2_=NULL;
        FSM* tmp=stack1_;
        while(tmp!=NULL) {
            tmp->Rule_1_flag_=flag;
            tmp=tmp->next_;
        }
        while (stack1_!=NULL) {
            FSM* cp=stack1_;
            stack1_=stack1_->next_;
            switch (cp->cs_) {
            case 0:{
                if ( (changedEuid==1)) {
                    FSM* np=new FSM;
                    np->cs_=2;
                    np->clone(cp);
                    delete cp;
                    {
                        // Restore the effective user ID saved in open_entry.
                        changedEuid = 0;
                        setreuid((-1), savedEuid);
                    }
                    np->next_=stack2_;
                    stack2_=np;
                }
                break;
            }
            case 1:{
                FSM* np=new FSM;
                np->cs_=1;
                np->clone(cp);
                delete cp;
                np->next_=stack2_;
                stack2_=np;
                break;
            }
            default: { delete cp;break;}
            }
        }
        FSM* start=new FSM;
        start->cs_=0;
        start->next_=stack2_;
        stack2_=start;
        stack1_=stack2_;
        return 0;
    }
};

BIBLIOGRAPHY

[Anderson95] D. Anderson, T. Lunt, H. Javitz, A. Tamaru, and A. Valdes, Next-generation Intrusion Detection Expert System (NIDES): A Summary, SRI-CSL-95-07, SRI International, 1995.

[Berry86] G. Berry and R. Sethi, From Regular Expressions to Deterministic Automata, Theoretical Computer Science, Vol. 48, pp. 117-126, 1986.

[Bib77] K. J. Biba. Integrity Considerations for Secure Computer Systems. Technical Report ESD-TR-76-372, USAF Electronic Systems Division, Bedford, Massachusetts, April 1977.

[BL73] D. E. Bell and L. J. LaPadula. Secure Computer Systems: Mathematical Foundations and Model. Technical Report M74-244, The MITRE Corporation, Bedford, Massachusetts, May 1973.

[Brzozowski64] J. A. Brzozowski, Derivatives of Regular Expressions, Journal of the ACM, Vol. 11, No. 4, pp. 481-494, 1964.

[Forrest97] S. Forrest, S. Hofmeyr and A. Somayaji, Computer Immunology, Communications of the ACM, Vol. 40, No. 10, 1997.

[Fox90] K. Fox, R. Henning, J. Reed and R. Simonian, A Neural Network Approach Towards Intrusion Detection, National Computer Security Conference, 1990.

[GS91] Simson Garfinkel and Gene Spafford. Practical Unix Security. O’Reilly and Associates, Sebastopol, California, 1991.

[HLMM91] R. Heady, G. Luger, A. Maccabe, and B. Mukherjee. A Method to Detect Intrusive Activity in a Networked Environment. In Proceedings of the 14th National Computer Security Conference, pages 362-371, October 1991.

[Ilgun93] K. Ilgun, A real-time intrusion detection system for UNIX, IEEE Symp. on Security and Privacy, 1993.

[Ko96] C. Ko, Execution Monitoring of Security-Critical Programs in a Distributed System: A Specification-Based Approach, Ph.D. Thesis, University of California at Davis, 1996.

[Ko94] C. Ko, G. Fink and K. Levitt, Automated detection of vulnerabilities in privileged programs by execution monitoring, Computer Security Application Conference, 1994.

[Kosoresow97] A. Kosoresow and S. Hofmeyr, Intrusion detection via system call traces, IEEE Software Conference 1997.

[Kumar94] S. Kumar and E. Spafford, A Pattern-Matching Model for Intrusion Detection, National Computer Security Conference, 1994.

[Lam69] B. W. Lampson. Dynamic Protection Structures. In Proceedings of the AFIPS Fall Joint Computer Conference, pages 27-38, 1969.

[Lunt88] T. Lunt and R. Jagannathan, A prototype real-time intrusion detection system, IEEE Symp. on Computer Security and Privacy, 1988.

[Lunt92] T. Lunt et al., A Real-Time Intrusion Detection Expert System (IDES) - Final Report, SRI-CSL-92-05, SRI International, 1992.

[Porras92] P. Porras and R. Kemmerer, Penetration State Transition Analysis - A Rule Based Intrusion Detection Approach, Computer Security Applications Conference, 1992.

[RS91] Deborah Russell and G. T. Gangemi Sr. Computer Security Basics. O’Reilly and Associates, Sebastopol, California, December 1991.

ACKNOWLEDGEMENTS

I wish to express my sincere appreciation to Dr. R. C. Sekar, my major professor, for his full support, valuable advice, and assistance in carrying out and completing this research.

I would like to thank Dr. Johnny Wong for his help in my graduate study. His encouragement and support made my two years of study at Iowa State University the most rewarding time in my life.

I would like to thank Dr. Gary Leavens for his kind advice on my thesis work.

The contribution of Dr. Prasant Mohapatra as committee member is gratefully acknowledged.

Many thanks to Premchand Uppuluri, Ravi Vankamamidi, Guang Yang, Pradeep Bollineni, and Shobhit Verma for their help with my research and study.

Finally, I thank my parents and my wife, Di Wu, for their love. Without their support, I would not have completed this research.

This project is supported by the Defense Advanced Research Projects Agency's Information Technology Office (DARPA-ITO) under the Information System Survivability program, under contract number F30602-97-C-0244.
