data warehousing to counter cyber crime

Upload: roli2712

Post on 30-May-2018


  • 8/9/2019 Data Warehousing to Counter Cyber Crime

    1/24

    Data warehousing to counter cybercrime

    Data Warehousing to Counter Cybercrime

    Roli Agarwal 08BM8082

    Upma 08BM8018

    Abstract

    This paper introduces the use and application of data warehousing in
    detecting and fighting cybercrime. A brief overview of data mining
    techniques and of the different types of cybercrime is presented. The
    paper then describes the basic mechanism behind Intrusion Detection
    Systems, which are useful for improving cyber security and detecting
    threats. The paper also highlights some specific Intrusion Detection
    Systems that are in use.

    Survey of Literature

    The report by Mary DeRosa [1] provides a basic description of the
    working and application of data-mining techniques and the privacy
    implications involved. The data-mining process is described as
    discovering useful, previously unknown knowledge by analyzing large and
    complex data sets. The report also explains the basis for the models
    used in automated data analysis, which is the process of applying those
    patterns to analyze data and make predictions. The importance of
    automated data analysis has also been explained by Timothy [8] as a
    means of identifying criminals within the larger data sets of urban
    areas.


    Data mining has also been described by Thuraisingham [2] as the process
    of posing queries and extracting useful patterns or trends, often
    previously unknown, from large amounts of data using techniques such as
    those from pattern recognition and machine learning. The same report
    classifies threats into malicious threats due to terror attacks and
    non-malicious threats due to inadvertent errors.

    Chen et al. [3], in a report on crime data mining, emphasize that
    traditional data mining techniques such as association analysis,
    classification and prediction, cluster analysis, and outlier analysis
    identify patterns in structured data only, whereas newer techniques
    identify patterns in both structured and unstructured data. The report
    also presents an overview of various mining techniques.

    Singla et al. [4] explain the use of link analysis in data mining to
    help detect crime patterns and to speed up the process of solving
    crimes. Their paper describes how the profile of an offender is learned
    and analyzed and how future movements are detected.

    The article by Wenke Lee et al. [7] categorizes intrusion detection
    techniques into anomaly detection and misuse detection. Because the
    process of developing and upgrading an IDS is inadequate and expensive,
    current IDSs have limited extensibility and adaptability. The paper
    highlights the application of data mining programs to gathered audit
    data in order to compute models that accurately capture the actual
    behavior (patterns) of intrusions and normal activities, thus
    facilitating the construction of adaptive IDSs.


    The functionality of an IDS has been illustrated by Siddiqui et al. [6]
    through the example of extracting variable-length instruction sequences
    that can distinguish worms from clean programs using data mining
    techniques.

    Tim Bass [5], in his article on intrusion detection, introduces the
    concept of multisensor data fusion, which can provide a functional
    framework for building next-generation Intrusion Detection Systems
    (IDSs) and increase cyberspace situational awareness. Multisensor data
    fusion is a new engineering discipline used to combine data from
    multiple, diverse sensors and sources in order to make inferences about
    events, activities, and situations. The article also describes the
    generic sensor characteristics of this fusion as explained by Waltz.

    Thuraisingham's report also presents the challenges of using data
    mining for counterterrorism: mining multimedia data, graph mining,
    building models in real time, knowledge-directed data mining to
    eliminate false positives and false negatives, web mining, and
    privacy-sensitive data mining.

    Weissman J.B. et al. [9] focus on a data mining application, the
    Minnesota Intrusion Detection System (MINDS), which has many modules
    working together to perform anomaly detection and to detect distributed
    attacks. Researchers at the University of Minnesota developed a
    framework that leverages Grid technology, thereby demonstrating support
    for distributed data mining for network intrusion detection. The paper
    describes a prototype of the MINDS Grid service, whose performance was
    measured on a test bed and then evaluated further. MINDS showed great
    operational success in detecting network intrusions when it was
    deployed in various scenarios.

    Barbará D. et al. [10] discuss the Audit Data Analysis and Mining (ADAM)
    system, which uses data mining techniques for intrusion detection in
    addition to the traditional matching of attack characteristics. The
    paper describes two main kinds of intrusion detection system: one uses
    signatures to detect attacks, and the other uses some kind of
    statistical or data mining analysis. It then describes a few types of
    detection systems under each kind. ADAM is a testbed that uses
    association rules to detect intrusions. The paper also describes ADAM's
    ability to drill down to the raw data in the audit trail for further
    analysis.

    Marcos M. Campos and Boriana L. Milenova present a data-centric
    architecture named DAID (Database-centric Architecture for Intrusion
    Detection) that uses data mining within the Oracle RDBMS to address
    challenges that exist in the design and implementation of
    production-quality modern intrusion detection systems (IDSs). The paper
    discusses the relevance of IDSs now that huge amounts of sensitive data
    are stored on networks. In DAID, all the major operations take place in
    the database itself. The paper describes DAID components such as
    sensors; extraction, transformation and load (ETL); centralized data
    warehousing; automated model generation; automated model distribution;
    real-time and offline detection; reporting and analysis; and automated
    alerts. The authors state that database-centric IDSs generally offer
    many advantages over alternative systems.


    Sologar A. and Moll J. discuss Southern Company, one of the largest
    utilities in North America, which required a complete substation cyber
    security solution along with a substation data management system to
    overcome a few challenges it was facing. The paper describes the
    solution design of such a data management system and its future
    benefits.

    Bloedorn E. et al. [13] explain what an intrusion detection system is,
    how intrusion detection was carried out before data mining, and what
    data mining is. The paper states that a capable staff and investment in
    adequate infrastructure are required to carry out intrusion detection
    using data mining. For data mining, appropriate attributes must be
    designed, computed, and stored in the data records. The paper discusses
    installing data filters to filter traffic and then refining the overall
    architecture for intrusion detection. It further discusses the use of
    data mining techniques such as classification rules and clustering.

    Lappas T. and Pelechrinis K. focus on the detection of intrusions by
    means of data mining techniques. The paper describes how intrusion
    detection systems work and gives a taxonomy of IDSs covering anomaly
    detection and misuse (signature) detection. It then presents the
    drawbacks of standard IDSs along with a brief introduction to data
    mining and real-time IDSs, and discusses data mining techniques such as
    feature selection and learning techniques. A description of existing
    intrusion detection systems that use data mining techniques, such as
    ISOA and DIDS, is provided. Finally, the authors give their own
    proposal on how data mining can be used to aid IDSs.

    Singhal A. discusses the author's experience in designing a data
    warehousing system for intrusion detection. The author notes that
    Intrusion Detection Systems generate a large number of alerts. Methods
    and tools are needed that let the system security analyst understand
    the massive amount of data collected by an IDS, analyze and summarize
    it, and determine the importance of an alert. The paper presents a few
    data modeling, data visualization and data warehousing techniques that
    can significantly improve the performance and usability of Intrusion
    Detection Systems. It also reviews research carried out in this area,
    such as ADAM and MINDS, and gives a software architecture and data
    model for intrusion detection.

    Caesar M. and Han J. discuss botnets, which are used for credit card
    fraud, identity theft, spamming, phishing, and other attacks. A botnet
    is a collection of software agents, or robots, that run autonomously
    and automatically. Botnets are a serious threat because the owner of a
    bot-infected machine often has no way of knowing that the machine is
    infected. The paper illustrates two ways to counteract this threat:
    isolating bot behavior from normal behavior using data mining, and
    enabling cooperation among autonomous participants by sharing
    observations. The paper then describes various data mining methods,
    lists challenges in tackling botnets, such as sharing information
    across domains and quickly mining very large data sets, and outlines
    approaches to solving these challenges.


    Section 1: Introduction to Data Warehousing and Data Mining

    Data warehousing enables the delivery of effective, integrated
    information by collecting data from various distributed information
    repositories (operational and administrative information systems
    already in operation, as well as external data sources) into a single
    repository using algorithms and tools. The data from these sources are
    then integrated, cleaned, transformed, and stored for the purpose of
    data analysis.

    Broadly, data warehousing comprises three different kinds of
    applications:

    1. Data Mining: Applied in knowledge discovery processes to extract
       hidden patterns and associations in large and complex data sets.
    2. Information Processing: Used for querying, basic statistical
       analysis and reporting.
    3. Analytical Processing: Used for multidimensional data analysis using
       techniques such as slice-and-dice and drill-down operations.

    This process of knowledge discovery can be summarized in the table
    given below:

    Process of Knowledge Discovery

    No.  Step                         Description
    1.   Data Cleaning                Removing irrational data
    2.   Data Integration             Integrating data from various
                                      heterogeneous sources
    3.   Data Transformation          Making data more appropriate for
                                      carrying out data analysis using
                                      methods like aggregation,
                                      generalization, normalization, data
                                      reduction and data refreshing
    4.   Evaluation and Presentation  Interpretation of transformed data
                                      and effective presentation
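
    The four steps of the knowledge discovery process can be sketched in a
    few lines of Python. This is only an illustrative sketch: the record
    fields (src, bytes), the two sample sources and every function name are
    assumptions made for the example and do not come from the paper.

```python
from collections import defaultdict

def clean(records):
    """Data cleaning: drop records with a missing address or an impossible byte count."""
    return [r for r in records
            if r.get("src") and r.get("bytes") is not None and r["bytes"] >= 0]

def integrate(**sources):
    """Data integration: merge records from several sources, tagging their origin."""
    return [{**r, "source": name} for name, records in sources.items() for r in records]

def transform(records):
    """Data transformation: aggregate bytes per address, then min-max normalize."""
    totals = defaultdict(int)
    for r in records:
        totals[r["src"]] += r["bytes"]
    lo, hi = min(totals.values()), max(totals.values())
    span = (hi - lo) or 1          # avoid division by zero when all totals are equal
    return {src: (total - lo) / span for src, total in totals.items()}

def present(scores):
    """Evaluation and presentation: rank addresses by normalized traffic volume."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical sample data from two heterogeneous sources.
firewall = [{"src": "10.0.0.5", "bytes": 1200}, {"src": "10.0.0.7", "bytes": -1}]
proxy = [{"src": "10.0.0.5", "bytes": 800}, {"src": "10.0.0.9", "bytes": 500},
         {"src": None, "bytes": 40}]

merged = integrate(firewall=clean(firewall), proxy=clean(proxy))
report = present(transform(merged))
print(report)
```

    In a real warehouse these steps would run inside ETL tooling rather
    than in application code; the sketch only shows the order of the steps
    and what each one contributes.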

    With the increasing scope, coverage and volume of digital
    multidimensional data flowing in from various heterogeneous sources in
    recent years, the need for data analysis systems that analyze,
    summarize and predict future trends has become critical. Besides
    applications in other important areas, data mining techniques are
    especially relevant to crime analysis, because accurate and efficient
    analysis of growing volumes of crime data is a major challenge for all
    law-enforcement and intelligence-gathering organizations. Accurate and
    efficient analysis using crime data mining techniques can act as a
    catalyst for detecting and resolving complex conspiracies. It also
    helps in detecting cybercrimes, where attributing the cause is all the
    more difficult because high network traffic and massive numbers of
    online transactions produce large amounts of data, of which only a
    minuscule percentage correlates to illegal activities.

    To automate the process of analyzing data and predicting data patterns,
    the following query-based approaches are generally employed:

    Subject-Based Queries: These queries start from a specific, known
    subject and search for information about it. The search results present
    a more comprehensive understanding of the subject.

    Pattern-Based Queries: These queries are run to identify a predictive
    model or recognize a pattern in behavior. The pattern is then used to
    search the data sets.
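
    The difference between the two query types can be illustrated with a
    small sketch. The transaction records, the field names and the
    "large night transfer" pattern are hypothetical and chosen only for
    illustration.

```python
# Hypothetical transaction log.
transactions = [
    {"account": "A17", "amount": 120.0, "hour": 14},
    {"account": "B02", "amount": 9800.0, "hour": 3},
    {"account": "A17", "amount": 75.5, "hour": 16},
    {"account": "C44", "amount": 9900.0, "hour": 2},
]

def subject_query(records, account):
    """Subject-based: start from a known subject and collect everything about it."""
    return [r for r in records if r["account"] == account]

def pattern_query(records, pattern):
    """Pattern-based: apply a behavioral pattern (a predicate) to the whole data set."""
    return [r for r in records if pattern(r)]

def large_night_transfer(record):
    # Assumed pattern: transfers above 9000 made before 5 a.m.
    return record["amount"] > 9000 and record["hour"] < 5

about_a17 = subject_query(transactions, "A17")
suspicious = pattern_query(transactions, large_night_transfer)
print(about_a17)
print(suspicious)
```

    The subject-based query needs a name to start from, whereas the
    pattern-based query can surface accounts that nobody suspected
    beforehand.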

    Data mining has applications in both structured and unstructured data
    sets. For example, data mining techniques such as association analysis,
    classification and prediction, cluster analysis, and outlier analysis
    identify patterns in structured data, whereas newer techniques can
    identify patterns in both structured and unstructured data.

    Some of the techniques of data mining are elaborated

    below:

    1. Link Analysis: This technique uses graph-theoretic methods to find
       correlations between a suspect, an address or other relevant
       information and other people, places or things. In web structure
       mining, the idea is to establish the links and extract patterns and
       structures about the web. The technique looks for correlations in
       the data and then determines how to reduce the graphs for better
       analysis. For example, search engines such as Google use some form
       of link analysis for displaying the results of a search.

    2. Entity extraction: In this technique, data such as text, images or
       audio material are used to identify particular patterns. It has been
       useful in automatically identifying persons, for example by grouping
       programs written by hackers that follow a similar pattern and
       tracing their behavior. The performance of this technique depends on
       the availability and precision of the input data.


    3. Clustering techniques: This technique groups objects with similar
       properties or characteristics together within a cluster. Clustering
       is based on the principle of maximizing intra-class similarity and
       minimizing inter-class similarity. Although clustering techniques
       can be very effective in automating a major part of crime analysis,
       their application is limited by their high computational cost.

    4. Association rule mining: This technique identifies the most
       frequently occurring item sets in a database. In network intrusion
       detection, for example, it can be used to derive association rules
       from users' interaction history. To detect potential future network
       attacks, the technique can be used to build and understand
       intruders' profiles so that sufficient safeguards can be taken to
       prevent them.

    5. Sequential pattern mining: This technique works similarly to
       association rule mining. It detects sequences that occur frequently
       in a set of transactions that took place at different times. When
       data is time-stamped, this technique can be used to identify
       intrusion patterns.

    6. Deviation detection: In general, criminal activities can be seen as
       activities that differ considerably from normal ones. Planting a
       bomb, for example, is an activity that is not prevalent among other
       members of a population. Analysis of activities that vary
       significantly from normal activity can be used for applications such
       as fraud detection and network intrusion detection. Data points that
       deviate markedly from the rest of the data and do not comply with
       the general model are called outliers. The main drawback of this
       analysis is that abnormal activities can sometimes appear normal,
       which makes the analysis error-prone and complex.

    7. Classification and Prediction: In this technique, a homogeneous
       pattern or similar characteristics are identified among different
       crime entities, which are then organized into predefined classes.
       Models are used to describe the different classes of objects, and
       can also be used to predict the class of an object whose class is
       unknown. For example, when a new crime is recorded, the model can
       assign it to the appropriate class based on the model parameters.
       Classification techniques lead to early identification of crime
       entities and are often used to predict crime trends. Data
       classification can be done using decision tree induction, Bayesian
       classification and neural network techniques.

    8. String comparator: A string comparator is a basic technique for
       checking the similarity between two textual fields of database
       records. Each database entry is compared pairwise with the rest of
       the entries to detect similarity. This helps in finding duplicate
       data and in detecting deceptive information such as names,
       addresses, and Social Security numbers in criminal records. Finding
       duplicates requires extensive computation (databases of criminal
       records contain a huge number of data points), which makes the
       analysis slow and time-consuming.


    9. Social network analysis: Social network analysis describes the roles
       of nodes and the interactions among these nodes in a conceptual
       network, which helps in visualizing criminal networks. The technique
       can be used to prepare a flow diagram of criminal activities and to
       find associations between different entities. A network model can be
       prepared that specifies each criminal's role, the entities connected
       to that criminal, the information and data flow among these
       entities, and the degree of linkage between them. More in-depth
       analysis can detect subgroups in the network, the entity with the
       highest criticality, and vulnerabilities inside the network.
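
    As an illustration of one of the techniques listed above, association
    rule mining (item 4), the following sketch derives rules between pairs
    of events from a set of sessions. It is a deliberately simplified
    version that only considers item pairs, not a full Apriori
    implementation; the event names, the thresholds and the function name
    pair_rules are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

def pair_rules(sessions, min_support=0.5, min_confidence=0.8):
    """Return (lhs, rhs, support, confidence) rules over pairs of events."""
    n = len(sessions)
    item_counts = Counter()
    pair_counts = Counter()
    for events in sessions:
        unique = sorted(set(events))           # count each event once per session
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n                    # share of sessions containing both
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]   # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)

# Hypothetical per-session event lists taken from a network audit log.
sessions = [
    ["login_fail", "port_scan", "ftp"],
    ["login_fail", "port_scan"],
    ["http", "ftp"],
    ["login_fail", "port_scan", "http"],
]
rules = pair_rules(sessions)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}  support={support:.2f}  confidence={confidence:.2f}")
```

    Here failed logins and port scans occur together in three of the four
    sessions, so the rule linking them passes both thresholds, while the
    pairs that occur only once are discarded.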

    Section 2: Network and System Security

    With digital workplaces taking over the conventional working culture,
    network and system security has become an issue of paramount importance
    to all sectors, from the banking industry to multinational corporations
    to government organizations. To be effective, a secured network or
    system needs to ensure proper authentication, authorization, integrity,
    confidentiality and availability of computing resources.

    Security in the digital world has to be addressed at all three levels:
    hardware, software and communication links.

    The different kinds of threats related to computer

    security are:


    1. Virus and Related Threats: A virus is a program that, when executed,
       infects other programs by either altering them or including a copy
       of itself in them. Computer viruses are classified as Boot Sector,
       Stealth, Memory Resident, Parasitic and Polymorphic. Viruses are
       commonly sent via email and use various features of mail to spread.

    2. Network Threats: Large interconnected computer networks, catalyzed
       by the rise of the internet, have brought the issue of network
       security to center stage. Different kinds of network threats
       include:

       a. Spoofing: The network authentication details of an entity are
          obtained and used to send communication under the name of that
          entity.
       b. Masquerade: Unauthorized users act as if they are legitimate
          users.
       c. Phishing: A website is set up to imitate a legitimate,
          pre-existing website. The unique id and password of a user are
          captured in this way and then used on the legitimate website for
          illegal purposes.
       d. Session Hijacking: As the name suggests, the user's session is
          intercepted so that the attacker can carry on the session that
          was started by the authorized user.
       e. Man-in-the-middle attack: This is related to session hijacking,
          with the main difference that the attacker is involved from the
          start of the session.
       f. Website Defacement: The attacker alters the content of a website
          without authorization. Because the code of a website can be
          downloaded in full and studied, attackers can find weaknesses in
          it, for example by feeding a program more data than it expects.
       g. Message Confidentiality Threats: Due to malfunctioning network
          software or hardware, messages are delivered to unauthorized
          recipients.

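    3. Denial of Service Attacks: In these attacks, authorized users are
       denied access to a service. Some types of these attacks are as
       follows:

       a. Connection Flooding: Large amounts of data are sent to the
          system to overload it and prevent it from receiving data from
          any other source.
       b. Ping of Death: The attacker sends a malformed ping packet that
          exceeds the maximum size allowed for an IP packet, which can
          crash or hang a vulnerable system. (Overwhelming a system with a
          large number of ordinary pings is a ping flood.)
       c. Smurf: The attacker sends pings to the broadcast address of a
          network with the victim's address forged as the source, so that
          every host on that network replies to the victim.
       d. SYN Flood: The attacker denies service by continuously sending
          SYN requests and never replying with the ACKs that would
          complete the handshakes.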
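
    The SYN flood described above leaves many half-open connections on the
    target, which suggests a simple detection heuristic. The sketch below
    counts, per source address, handshakes that were started with a SYN
    but never completed with an ACK. The packet tuple layout, the
    threshold and the function names are assumptions made for
    illustration; real floods usually forge source addresses, so
    per-source counting is a simplification.

```python
from collections import Counter

def half_open_counts(packets):
    """Count, per source address, handshakes with a SYN but no completing ACK."""
    pending = set()                       # (src_addr, src_port) awaiting the final ACK
    for src_addr, src_port, flag in packets:
        key = (src_addr, src_port)
        if flag == "SYN":
            pending.add(key)
        elif flag == "ACK":
            pending.discard(key)          # handshake completed
    return Counter(src_addr for src_addr, _ in pending)

def flag_syn_flood(packets, threshold=3):
    """Return source addresses with at least `threshold` half-open handshakes."""
    counts = half_open_counts(packets)
    return sorted(addr for addr, n in counts.items() if n >= threshold)

# Hypothetical capture: one normal client and one host that never sends ACKs.
packets = [("10.0.0.8", 40001, "SYN"), ("10.0.0.8", 40001, "ACK")]
packets += [("203.0.113.9", 50000 + i, "SYN") for i in range(5)]

print(flag_syn_flood(packets))
```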

    Section 3: Intrusion Detection System

    Intrusion detection systems (IDSs) are designed for detecting, blocking
    and reporting unauthorized activity in computer networks [17]. A simple
    process model for intrusion detection is shown in Fig. 1 below.

    Fig. 1. Simple process model for ID (Source: Ville Jussila, 2003)

    Available intrusion detection systems either use "signatures" to detect
    attacks whose behavior is already known, or make use of statistical or
    data mining techniques. Various tools use both kinds of engines, which
    considerably increases the chances of capturing attacks. Signature-based
    IDSs include P-Best, USTAT and NSTAT. Statistical and data mining-based
    IDSs include IDES, NIDES, EMERALD, Haystack, and JAM. These IDSs are
    described briefly below.

    Signature-based Intrusion Detection Systems (IDSs)

    P-Best: The Production-Based Expert System Toolset (P-Best) is a
    rule-based, forward-chaining expert system used for signature-based
    intrusion detection. It has been used for many years. The
    characteristics of malicious behavior are specified first; the stream
    of events generated by system activity is then monitored so that
    whenever an intrusion happens its "signature" can be identified. This
    kind of detection system has been shown to perform well in real-time
    detection scenarios.

    Such a system is easy to use and to integrate into existing OS
    environments. An expert system production rule consists of a predicate
    expression (the rule antecedent) over a well-defined set of facts, and
    a consequent, which specifies which other facts are derived when the
    antecedent is true. When facts are asserted that match the arguments of
    a rule antecedent, the predicate expression is evaluated. If the
    predicate expression evaluates to true (the rule "fires"), the
    consequent is executed, potentially resulting in other facts being
    asserted. This process may create a chain of rule firings that yield
    new deductions about the state of the system. In the context of
    intrusion detection, facts are generally system events, with a type
    such as "login attempt" and additional context attributes, e.g.
    "return-code" with value "bad-password". These attribute values can be
    used as arguments in the rules' antecedents. [10]
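
    The chain of rule firings described above can be sketched as a minimal
    forward-chaining loop. This is not P-Best's rule language or API; the
    fact layout, the two rules and the threshold of three failed logins
    are hypothetical and only illustrate how an antecedent over facts
    leads to new facts being asserted.

```python
def forward_chain(facts, rules):
    """Fire rules repeatedly until no rule derives a new fact."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            if not antecedent(facts):
                continue
            for new_fact in consequent(facts):    # consequent returns a fresh set
                if new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts

def users_with_failures(facts, limit=3):
    """Users with at least `limit` login attempts whose return code is bad-password."""
    counts = {}
    for fact in facts:
        if fact[0] == "login_attempt" and fact[3] == "bad-password":
            counts[fact[2]] = counts.get(fact[2], 0) + 1
    return {user for user, n in counts.items() if n >= limit}

# Each rule is a pair (antecedent, consequent) of functions over the fact set.
rules = [
    (lambda f: bool(users_with_failures(f)),
     lambda f: {("repeated_failure", user) for user in users_with_failures(f)}),
    (lambda f: any(fact[0] == "repeated_failure" for fact in f),
     lambda f: {("alert", fact[1]) for fact in f if fact[0] == "repeated_failure"}),
]

# Facts are system events: (type, sequence number, user, return code).
events = [
    ("login_attempt", 1, "alice", "bad-password"),
    ("login_attempt", 2, "alice", "bad-password"),
    ("login_attempt", 3, "bob", "ok"),
    ("login_attempt", 4, "alice", "bad-password"),
]
derived = forward_chain(events, rules)
print(sorted(fact for fact in derived if fact[0] == "alert"))
```

    The second rule can only fire after the first has asserted a
    repeated_failure fact, which is the chaining behavior the text
    describes.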

    USTAT

    USTAT (State Transition Analysis Tool for UNIX) is a real-time
    intrusion detection system developed for UNIX. STAT employs rule-based
    analysis of the audit trails of multi-user computer systems. In STAT,
    an intrusion is identified as a sequence of state changes that lead the
    computer system from some initial state to a target compromised state.
    USTAT makes use of the audit trails collected by the C2 Basic Security
    Module of SunOS and keeps track of only those critical actions that
    must occur for the penetration to complete successfully. This approach
    differs from other rule-based penetration identification tools, which
    pattern-match sequences of audit records. [10]
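
    The idea of tracking only the critical actions that move a system
    from an initial state to a compromised state can be sketched as a
    small state machine. The scenario below (copying a shell, setting its
    setuid bit, executing it) and all state and action names are
    hypothetical; they are not taken from USTAT's actual rule base.

```python
# Hypothetical attack scenario: state -> (critical action, next state).
TRANSITIONS = {
    "start": ("copy_shell", "shell_copied"),
    "shell_copied": ("chmod_setuid", "setuid_set"),
    "setuid_set": ("exec_shell", "compromised"),
}

def track(audit_trail):
    """Follow each user's state and return the users who reach 'compromised'."""
    state = {}
    compromised = []
    for user, action in audit_trail:
        current = state.get(user, "start")
        step = TRANSITIONS.get(current)
        if step and action == step[0]:
            state[user] = step[1]
            if step[1] == "compromised":
                compromised.append(user)
        # Any other audit record is ignored: only critical actions are tracked.
    return compromised

# Hypothetical audit trail with irrelevant records interleaved.
audit_trail = [
    ("eve", "copy_shell"),
    ("bob", "ls"),
    ("eve", "ls"),
    ("eve", "chmod_setuid"),
    ("bob", "chmod_setuid"),
    ("eve", "exec_shell"),
]
print(track(audit_trail))
```

    Because unrelated records are skipped rather than matched, the
    detector does not depend on the exact sequence of audit records,
    which is the difference from pattern matching that the text points
    out.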

    NSTAT

    NSTAT (Network State Transition Analysis Tool) performs real-time
    network-based intrusion detection. It applies the state transition
    analysis technique to the networked environment. The monitored system
    is a complex network made up of a number of sub-networks. State
    transition diagrams are used to represent network attacks. One
    advantage of these diagrams is that the data to be collected for
    intrusion analysis can be determined automatically, which in turn
    allows the network probes to be lightweight and scalable. [10]

    IDSs based on Statistical and Data Mining Techniques

    IDES

    IDES is a real-time intrusion-detection expert system. It is basically
    a stand-alone system that examines user actions on one or more
    monitored computer systems and flags suspicious events. IDES monitors
    the activities of individual users, groups, remote hosts and complete
    systems, and detects suspected security violations by both insiders and
    outsiders. IDES adaptively learns users' behavior patterns over a
    period of time and detects and analyzes behavior that deviates from
    these patterns. In addition, a rule-based component encodes information
    about known system vulnerabilities and intrusion scenarios. The two
    approaches are combined to make a complete IDS for detecting intrusions
    and misuse. After improvements were made to it, IDES runs under GLU
    (Granular Lucid), a platform for distributed, parallel computation,
    which enhances the configuration flexibility and fault tolerance of the
    system. [18]

    NIDES

    NIDES is an intrusion-detection system that performs real-time user
    activity monitoring on multiple target systems connected via Ethernet.
    NIDES runs on its own workstation (the NIDES host) and analyzes audit
    data collected from various interconnected systems. It searches for
    activities that may indicate unusual and/or malicious user behavior.
    Two complementary detection units perform the analysis: a rule-based
    signature analysis subsystem and a statistical profile-based
    anomaly-detection subsystem. The NIDES rule base employs expert rules
    to characterize known intrusive activity represented in activity logs,
    and raises alarms as matches are identified between the observed
    activity logs and the rule encodings. The statistical subsystem
    maintains historical profiles of usage per user and raises an alarm
    when observed activity departs from established patterns of usage for
    an individual. The alarms generated by the two analysis units are
    screened by a resolver component, which filters and displays warnings
    as necessary through the NIDES host X-window interface. [18]
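
    The interplay of the two detection units and the resolver can be
    sketched as follows. This is a simplified illustration, not NIDES
    code: the rule base, the single profiled measure (CPU seconds per
    command), the z-score limit and all names are assumptions. The actual
    NIDES statistical component is considerably more elaborate.

```python
from statistics import mean, stdev

# Hypothetical rule base of commands known to indicate intrusive activity.
KNOWN_BAD = {"rm -rf /var/log", "cat /etc/shadow"}

def rule_alarm(command):
    """Signature unit: alarm when the command matches a rule encoding."""
    return command in KNOWN_BAD

def stat_alarm(history, observed, z_limit=3.0):
    """Statistical unit: alarm when the observation is far from the user's profile."""
    if len(history) < 2:
        return False                      # not enough history to build a profile
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return observed != mu
    return abs(observed - mu) / sigma > z_limit

def resolve(user, command, cpu_seconds, profiles):
    """Resolver: combine the alarms of both units into one list of reasons."""
    reasons = []
    if rule_alarm(command):
        reasons.append("signature")
    if stat_alarm(profiles.get(user, []), cpu_seconds):
        reasons.append("anomaly")
    return reasons

# Hypothetical historical profile: CPU seconds used by past commands.
profiles = {"alice": [1.0, 1.2, 0.9, 1.1, 1.0]}

print(resolve("alice", "ls", 1.1, profiles))
print(resolve("alice", "cat /etc/shadow", 9.5, profiles))
```

    An empty list means neither unit raised an alarm; the resolver in
    this sketch simply reports which units fired, whereas a real resolver
    would also filter duplicates and decide what to display.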


    EMERALD

    EMERALD is Event Monitoring Enabling Responses toAnomalous Live Disturbances. Detection methods used

    in EMRERALD usually use anomaly detection involving

    recognition of deviations from expected normal behavior

    and secondly misuse detection that involves the

    detection of various types of misuse. The system targets

    both external and internal threats that attempt to

    misuse the system. It generally combines signature-

    based and statistical analysis components with a

    resolver that interprets the analysis results. EMERALD

    has a recursive framework for gathering data from the

    distributed monitors to provide a global detection and

    response capability that can counter attacks occurring

    across an entire network. It detects patterns of malicious

    activity in network operations in real time and responds

    to that activity through automated countermeasures.

    Analysis units for EMERALD include profiler engines,

    signature engines and a resolver. Profiler engines perform

    statistical profile-based anomaly detection given a

    generalized event stream of an analysis target.

    Signature engines require minimal state management

    and employ a rule-coding scheme to provide a distributed

    signature-analysis model. The resolver coordinates the

    monitor's external reporting and implements the response

    policy. EMERALD takes a hierarchically layered approach

    with three layers: service analysis, domain-wide analysis

    and enterprise-wide analysis. Service analysis checks for

    misuse of individual components and network services

    within the boundary of a single domain. Domain-wide

    analysis checks for misuse that is visible across multiple

    services and components. Enterprise-wide analysis

    checks for coordinated misuse across multiple domains. [19]
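
    The three analysis layers can be pictured as successive correlation steps. The Python sketch below is only an illustration under stated assumptions: the event dictionaries, the `Report` fields, the `is_misuse` predicate (standing in for a service monitor's profiler and signature engines) and the `min_services` and `min_domains` thresholds are all hypothetical and do not come from EMERALD itself.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    domain: str
    service: str
    source: str  # hypothetical: the origin that triggered the report
    kind: str    # hypothetical: e.g. "failed-login"


def service_layer(events, is_misuse):
    """Service analysis: each event is judged within its own service."""
    return [Report(e["domain"], e["service"], e["source"], e["kind"])
            for e in events if is_misuse(e)]


def domain_layer(reports, min_services=2):
    """Domain-wide analysis: a source reported by several services of one domain."""
    services = defaultdict(set)
    for r in reports:
        services[(r.domain, r.source)].add(r.service)
    return {key for key, seen in services.items() if len(seen) >= min_services}


def enterprise_layer(domain_findings, min_domains=2):
    """Enterprise-wide analysis: a source flagged in several domains."""
    domains = defaultdict(set)
    for domain, source in domain_findings:
        domains[source].add(domain)
    return {src for src, seen in domains.items() if len(seen) >= min_domains}
```

    Each layer consumes only the output of the layer below it, which mirrors the recursive gathering of results from distributed monitors described above.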

    Haystack

    To detect intrusions, the Haystack system employs two

    methods of detection: anomaly detection and

    signature-based detection. Anomaly detection is organized

    around two concepts: per-user models of how users have

    behaved in the past, and pre-specified generic user group models that specify generic acceptable behavior

    for a particular group of users. The combination of these

    two methods solves many of the problems associated

    with the application of any one of them in intrusion

    detection systems [20]. The system works as shown in

    Fig. 2.

    Fig. 2. Haystack's role in investigation (Source: Stephen

    E. Smaha)
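
    The two anomaly-detection concepts, a per-user model of past behavior and a pre-specified group model of acceptable behavior, can be sketched in Python as follows. This is a simplified illustration, not Haystack's actual statistics: the session features, the `GROUP_LIMITS` table, the `clerk` group and the three-standard-deviation threshold are assumptions made for the example.

```python
from statistics import mean, pstdev

# Hypothetical generic group model: acceptable (low, high) range per session feature.
GROUP_LIMITS = {
    "clerk": {"files_read": (0, 50), "login_hour": (8, 18)},
}


def group_violations(group, session):
    """Features of the session that fall outside the group's acceptable range."""
    limits = GROUP_LIMITS[group]
    return sorted(name for name, (low, high) in limits.items()
                  if not low <= session[name] <= high)


def user_deviations(past_sessions, session, threshold=3.0):
    """Features that differ strongly from this user's own past sessions."""
    flagged = []
    for name, value in session.items():
        values = [s[name] for s in past_sessions]
        mu, sigma = mean(values), pstdev(values)
        if sigma > 0 and abs(value - mu) / sigma > threshold:
            flagged.append(name)
    return sorted(flagged)


def assess(group, past_sessions, session):
    """Apply both models and report what each one flags."""
    return {
        "group": group_violations(group, session),
        "user": user_deviations(past_sessions, session),
    }
```

    A session can be acceptable for the group yet unusual for the individual user, or the reverse, which is why the two models are applied together.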

    JAM

    JAM is a distributed, scalable and portable agent-based

    data mining system. It employs meta-learning, a general

    approach to scaling data mining applications, to learn

    models of fraud and intrusive behavior.

    JAM provides a set of learning programs, implemented

    either as Java applets or applications, that compute

    models over data stored locally at a site. JAM also

    provides a set of meta-learning agents for combining

    multiple models that were learned (perhaps) at different sites.

    It employs a special distribution mechanism which allows

    the migration of the derived models or classifier agents

    to other remote sites. [21]
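
    The meta-learning idea can be illustrated with a small Python sketch: a base classifier is learned at each site from local data only, and a combiner is then learned over the base classifiers' predictions. JAM itself is written in Java and moves classifier agents between sites; here the classifiers are plain functions, and the threshold learner, the single `amount` feature and the function names are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict


def train_stump(rows):
    """Local learner: pick the amount threshold that best separates
    fraud (label 1) from normal activity (label 0) on local data."""
    best = None
    for threshold in sorted({amount for amount, _ in rows}):
        correct = sum((amount >= threshold) == bool(label)
                      for amount, label in rows)
        if best is None or correct > best[0]:
            best = (correct, threshold)
    threshold = best[1]
    return lambda amount: int(amount >= threshold)


def train_combiner(classifiers, validation_rows):
    """Meta-learner: map each tuple of base predictions to the most
    common true label seen for that tuple in the validation data."""
    votes = defaultdict(Counter)
    for amount, label in validation_rows:
        key = tuple(clf(amount) for clf in classifiers)
        votes[key][label] += 1
    table = {key: counts.most_common(1)[0][0] for key, counts in votes.items()}

    def predict(amount):
        key = tuple(clf(amount) for clf in classifiers)
        # Fall back to a majority vote for prediction patterns not seen before.
        return table.get(key, int(sum(key) * 2 > len(key)))

    return predict
```

    Only the learned classifiers cross site boundaries in this arrangement; the raw records used to train each base classifier stay where they are stored.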

    References

    1. Mary DeRosa, Data Mining and Data Analysis

    for Counterterrorism, CSIS, March 2004

    2. Bhavani Thuraisingham, Data Mining for

    Counter-Terrorism, The MITRE Corporation

    3. Hsinchun Chen, Wingyan Chung, Jennifer Jie Xu,

    Gang Wang, Yi Qin, Michael Chau, Crime Data

    Mining: A General Framework and Some

    Examples

    4. C. Ram Singla, Deepak Dembla, Yogesh Chaba

    and Kevika Singla, An Optimal KD Model for

    Crime Pattern Detection Based on Semantic

    Link Analysis-A Data Mining Tool, http://www.jiit.ac.in/jiit/ic3/IC3_2008/IC3-2008/APP4_34.pdf

    5. Tim Bass, Intrusion Detection Systems and Multisensor Data Fusion, Communications of

    the ACM, Volume 43, Issue 4 (April 2000)

    6. Muazzam Siddiqui, Morgan C. Wang, Joohan Lee,

    Detecting Internet Worms Using Data Mining

    Techniques

    7. Wenke Lee, Salvatore J. Stolfo, Kui W. Mok, A

    Data Mining Framework for Building

    Intrusion Detection Models, Computer Science

    Department, Columbia University

    8. Timothy C. O'Shea, Pattern recognition: An

    exploration of the information processing

    operations of detectives, The Justice

    Professional, 1999, Vol. 11, pp. 425-438

    9. Jon B. Weissman, Vipin Kumar, Varun Chandola,

    Eric Eilertson, Levent Ertoz, Gyorgy Simon,

    Seonho Kim, Jinoh Kim, DDDAS/ITR: A Data

    Mining and Exploration Middleware for Grid

    and Distributed Computing, Dept. of Computer

    Science and Engineering, University of Minnesota

    10. Barbará D., Julia Couto, Sushil Jajodia, Ningning Wu,

    ADAM: A Testbed for Exploring the Use of Data Mining

    in Intrusion Detection, George Mason University Center

    for Secure Information Systems, Fairfax, VA 22303,

    October 12, 2001

    11. Marcos M. Campos, Boriana L. Milenova, Creation

    and Deployment of Data Mining-Based

    Intrusion Detection Systems in Oracle Database 10g, Oracle Data Mining Technologies

    12. Sologar A. and Jerrod Moll, Developing a

    Comprehensive Substation Cyber Security

    and Data Management Solution, Members,

    IEEE

    13. Bloedorn E., Alan D. Christiansen, William Hill,

    Clement Skorupka, Lisa M. Talbot, Jonathan Tivel,

    Data Mining for Network Intrusion

    Detection: How to Get Started, The MITRE

    Corporation

    14. Lappas T. and Pelechrinis K., Data Mining

    Techniques for (Network) Intrusion

    Detection Systems, Department of Computer

    Science and Engineering UC Riverside

    15. Singhal A., Data Modeling and Data

    Warehousing Techniques to improve

    Intrusion Detection

    16. Caesar M. and Han J., Leveraging Data Mining

    to Improve Internet Security, Department of

    Computer Science University of Illinois at Urbana-

    Champaign

    17. Ville Jussila, Intrusion Detection Systems-

    Principles, Architecture and

    Measurements, HUT - Networking Laboratory, 2003

    18. http://www.csl.sri.com/

    19. Leckie T., http://www.cs.fsu.edu/~yasinsac/group/slides/leckie.pdf

    20. Stephen E. Smaha, Haystack: An Intrusion

    Detection System, Tracor Applied Sciences, Inc.

    21. Stolfo S., Prodromidis A.L., Tselepis S., Lee

    W. and Fan W., JAM: Java Agents for Meta-Learning

    over Distributed Databases, 1997
