Date: 05/08/2009
Wei-Yu Chen, Yao-Tsung Wang
National Center for High-Performance Computing, Taiwan
{waue,jazz}@nchc.org.tw


  • Building ICAS with Hadoop and HBase: Improving the Security Events Center on a Cloud Platform

  • [**] [1:538:15] NETBIOS SMB IPC$ unicode share access [**]
    [Classification: Generic Protocol Command Decode] [Priority: 3]
    09/04-17:53:56.363811 168.150.177.165:1051 -> 168.150.177.166:139
    TCP TTL:128 TOS:0x0 ID:4000 IpLen:20 DgmLen:138 DF
    ***AP*** Seq: 0x2E589B8 Ack: 0x642D47F9 Win: 0x4241 TcpLen: 20

    [**] [1:1917:6] SCAN UPnP service discover attempt [**]
    [Classification: Detection of a Network Scan] [Priority: 3]
    09/04-17:53:56.385573 168.150.177.164:1032 -> 239.255.255.250:1900
    UDP TTL:1 TOS:0x0 ID:80 IpLen:20 DgmLen:161
    Len: 133

    [**] [1:1917:6] SCAN UPnP service discover attempt [**]
    [Classification: Detection of a Network Scan] [Priority: 3]
    09/04-17:53:56.386910 168.150.177.164:1032 -> 239.255.255.250:1900
    UDP TTL:1 TOS:0x0 ID:82 IpLen:20 DgmLen:161
    Len: 133

    [**] [1:1917:6] SCAN UPnP service discover attempt [**]
    [Classification: Detection of a Network Scan] [Priority: 3]
    09/04-17:53:56.388244 168.150.177.164:1032 -> 239.255.255.250:1900
    UDP TTL:1 TOS:0x0 ID:84 IpLen:20 DgmLen:161
    Len: 133

    [**] [1:538:15] NETBIOS SMB IPC$ unicode share access [**]
    [Classification: Generic Protocol Command Decode] [Priority: 3]
    09/04-17:53:56.405923 168.150.177.164:1035 -> 168.150.177.166:139
    TCP TTL:128 TOS:0x0 ID:94 IpLen:20 DgmLen:138 DF
    ***AP*** Seq: 0x82073DFF Ack: 0x2468EB82 Win: 0x4241 TcpLen: 20

    [**] [1:1917:6] SCAN UPnP service discover attempt [**]
    [Classification: Detection of a Network Scan] [Priority: 3]
    09/04-17:53:56.417045 168.150.177.164:45461 -> 168.150.177.1:1900
    UDP TTL:1 TOS:0x0 ID:105 IpLen:20 DgmLen:161
    Len: 133

    [**] [1:1917:6] SCAN UPnP service discover attempt [**]
    [Classification: Detection of a Network Scan] [Priority: 3]
    09/04-17:53:56.420759 168.150.177.164:45461 -> 168.150.177.1:1900
    UDP TTL:1 TOS:0x0 ID:117 IpLen:20 DgmLen:160
    Len: 132

    [**] [1:1917:6] SCAN UPnP service discover attempt [**]
    [Classification: Detection of a Network Scan] [Priority: 3]
    09/04-17:53:56.422095 168.150.177.164:45461 -> 168.150.177.1:1900
    UDP TTL:1 TOS:0x0 ID:118 IpLen:20 DgmLen:161
    Len: 133

    [**] [1:2351:10] NETBIOS DCERPC ISystemActivator path overflow attempt little endian unicode [**]
    [Classification: Attempted Administrator Privilege Gain] [Priority: 1]
    09/04-17:53:56.442445 198.8.16.1:10179 -> 168.150.177.164:135
    TCP TTL:105 TOS:0x0 ID:49809 IpLen:20 DgmLen:1420 DF
    ***A**** Seq: 0xF9589BBF Ack: 0x82CCF5B7 Win: 0xFFFF TcpLen: 20
    [Xref => http://www.microsoft.com/technet/security/bulletin/MS03-026.mspx]
    [Xref => http://cgi.nessus.org/plugins/dump.php3?id=11808]
    [Xref => http://cve.mitre.org/cgi-bin/cvename.cgi?name=2003-0352]
    [Xref => http://www.securityfocus.com/bid/8205]

    [**] [122:3:0] (portscan) TCP Portsweep [**]
    [Priority: 3]
    09/04-17:53:56.499016 198.8.16.1 -> 168.150.177.166
    PROTO:255 TTL:0 TOS:0x0 ID:1750 IpLen:20 DgmLen:168

  • Internet Security

  • Network IDS Interface

  • Difficult to grasp the overall picture of incidents. Crucial information is easily ignored! These events are an MIS nightmare!

  • The Security Events Center
    A platform whose purpose is to provide detection and reaction services for security incidents.
    Main functions:
    Collects all information from both security and non-security products.
    Carries out unified, automatic event evaluation to tell whether the events comply with the policy.

  • SEC Overview

  • The SEC Component
    SEC DB
    Security Events Center

    Security Operation Center

    Sensor 1

    Administrator Interface

    SOC DB

    Core Procedure Unit

    Format Transform Unit

    Alert Generator

    System Operation Unit

    Event Reaction

    Sensor 2

    Format Transform Unit

    Alert Generator

  • Alert Merge Example

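    Raw alerts:

    Destination IP  Attack Signature  Source IP  Destination Port  Source Port  Packet Protocol  Timestamp
    Host_1          Trojan            Sip1       80                4077         tcp              T1
    Host_1          Trojan            Sip2       80                4077         tcp              T2
    Host_1          Trojan            Sip1       443               5002         tcp              T3
    Host_2          Trojan            Sip1       443               5002         tcp              T4
    Host_3          D.D.O.S.          Sip3       53                6007         udp              T5
    Host_3          D.D.O.S.          Sip4       53                6008         tcp              T5
    Host_3          D.D.O.S.          Sip5       53                6007         udp              T5
    Host_3          D.D.O.S.          Sip6       53                6008         tcp              T5

    Merged alerts (Key: Destination IP and Attack Signature; Values: the remaining columns):

    Destination IP  Attack Signature  Source IP            Destination Port  Source Port  Packet Protocol  Timestamp
    Host_1          Trojan            Sip1,Sip2            80,443            4077,5002    tcp              T1,T2,T3
    Host_2          Trojan            Sip1                 443               5002         tcp              T4
    Host_3          D.D.O.S.          Sip3,Sip4,Sip5,Sip6  53                6007,6008    tcp, udp         T5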
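The merge in this example can be sketched in a few lines of Python. This is only an illustration of the idea, not the ICAS implementation: it assumes the key is the pair (destination IP, attack signature) and that every other column is collapsed into a list of distinct values. The names `RAW_ALERTS` and `merge_alerts` are hypothetical.

```python
# Raw alerts from the example table, in column order:
# (destination IP, attack signature, source IP,
#  destination port, source port, packet protocol, timestamp)
RAW_ALERTS = [
    ("Host_1", "Trojan", "Sip1", 80, 4077, "tcp", "T1"),
    ("Host_1", "Trojan", "Sip2", 80, 4077, "tcp", "T2"),
    ("Host_1", "Trojan", "Sip1", 443, 5002, "tcp", "T3"),
    ("Host_2", "Trojan", "Sip1", 443, 5002, "tcp", "T4"),
    ("Host_3", "D.D.O.S.", "Sip3", 53, 6007, "udp", "T5"),
    ("Host_3", "D.D.O.S.", "Sip4", 53, 6008, "tcp", "T5"),
    ("Host_3", "D.D.O.S.", "Sip5", 53, 6007, "udp", "T5"),
    ("Host_3", "D.D.O.S.", "Sip6", 53, 6008, "tcp", "T5"),
]


def merge_alerts(alerts):
    """Group alerts by (destination IP, signature) and keep the distinct
    values of every remaining column, in order of first appearance."""
    merged = {}
    for dst, signature, *rest in alerts:
        # One list per value column (assumption: duplicates are dropped).
        columns = merged.setdefault((dst, signature), [[] for _ in rest])
        for seen, value in zip(columns, rest):
            if value not in seen:
                seen.append(value)
    return merged


if __name__ == "__main__":
    for key, columns in merge_alerts(RAW_ALERTS).items():
        print(key, columns)
```

Eight raw alerts collapse into three merged rows. The values inside each column appear in the order they were first seen, so the protocol column for Host_3 comes out as udp, tcp rather than tcp, udp.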

  • What are the problems with the SEC?
    Enormous amounts of data make it less efficient.
    Everything is lost if the database crashes.
    Memory and CPU are exhausted while the system is running.

  • ICAS
    ICAS, IDS Cloud Analysis System
    Applying cloud computing techniques
    Higher capability
    Fault tolerance
    An alert algorithm to generate a clear report
    Reducing redundancy
    Merging related alerts

  • ICAS Overview

  • System Architecture: ICAS Component Overview

  • Program Procedure

  • Change SEC to ICAS

    Security Operation Center

    Sensor 1

    Administrator Interface

    SOC DB

    Core Procedure Unit

    Format Transform Unit

    Alert Generator

    System Operation Unit

    Event Reaction

    Sensor 2

    Format Transform Unit

    Alert Generator

  • Change SEC to ICAS
    SEC: MySQL, Core Procedure, Single Machine, Linux
    ICAS: HBase, Map-Reduce, Multiple Machines, Hadoop + Linux

  • MySQL to HBase: sec_event

  • MySQL to HBase

    Row Key        Time Stamp  Column "contents:"  Column "anchor:"             Column "mime:"
    com.cnn.www    t5                              "anchor:cnnsi.com" CNN
                   t4                              "anchor:my.look.ca" CNN.com
                   t3          ...                                              text/html
                   t2          ...
                   t1          ...
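One way to read this table is as a sparse, versioned map: row key, then column name, then timestamp, then value. The sketch below models that with plain Python dictionaries. It is a conceptual illustration only and does not use the HBase client API. The integer timestamps 1 to 5 stand in for t1 to t5, the "..." values are the elided contents from the slide, and `get_cell` is a hypothetical helper.

```python
# Conceptual model only (plain dictionaries, not the HBase API):
# row key -> column name -> {timestamp: value}
table = {
    "com.cnn.www": {
        "anchor:cnnsi.com": {5: "CNN"},
        "anchor:my.look.ca": {4: "CNN.com"},
        "contents:": {3: "...", 2: "...", 1: "..."},  # contents elided on the slide
        "mime:": {3: "text/html"},
    }
}


def get_cell(table, row, column, timestamp=None):
    """Return the newest value of a cell, or the newest value at or before
    `timestamp` when one is given. Missing cells yield None, which mirrors
    the empty cells of the sparse table."""
    versions = table.get(row, {}).get(column, {})
    candidates = [t for t in versions if timestamp is None or t <= timestamp]
    if not candidates:
        return None
    return versions[max(candidates)]


if __name__ == "__main__":
    print(get_cell(table, "com.cnn.www", "anchor:cnnsi.com"))
    print(get_cell(table, "com.cnn.www", "mime:", timestamp=2))
```

Empty cells cost nothing in this layout, because a column that was never written simply has no entry for that row.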

  • Experiment
    Machine: CPU: Intel quad-core; Memory: 2 GB; OS: Linux, Ubuntu 8.04 Server
    Software versions: Hadoop 0.16.4; HBase 0.1.3; Java 6
    Alert data sets:
    MIT Lincoln Laboratory, Lincoln Lab Data Sets
    Computer Security group at UC Davis, tcpdump file

  • Experimental Result: Time Consumed for Each Data Set Size

    Experiment Result (analysis time in seconds)

    Data set     Alerts  Traditional  1 node  2 nodes  4 nodes  6 nodes  Results  Reduction  Source
    alert_286       286        1.068   4.087    4.869    4.864    5.077       30     89.51%  com~5febat2035 (U.C. Davis, Felix Wu)
    alert_380       380        1.333    4.94    5.069    5.067    5.097       11     97.11%  sp0~2005at4pm (U.C. Davis, Felix Wu)
    alert_434       434         1.76    4.61    5.066    5.068     5.09        9     97.93%  outside (MIT/LL 1999)
    alert_754       754        3.145   5.066    5.079    5.038    5.096       16     97.88%  sp1st~1231pm (U.C. Davis, Felix Wu)
    alert_1174     1174         4.73   6.066    5.093    5.089    5.097       33     97.19%  com~5febat2035 *4 (U.C. Davis, Felix Wu)
    alert_1668     1668        7.909    6.07     6.56    6.071    5.082       16     99.04%  sp2sta~16jan~1234pm (U.C. Davis, Felix Wu)
    alert_2182     2182       14.949   6.671     6.95    5.166    5.088       16     99.27%  sp1pa~2128 (U.C. Davis, Felix Wu)
    alert_3396     3396       19.901   7.053    6.654    5.076    5.091       68     98.00%  combi~1707 (U.C. Davis, Felix Wu)
    alert_5816     5816      374.374   9.081    9.076     9.07    7.076       66     98.87%  com~1350 (U.C. Davis, Felix Wu)
    alert_6344     6344       383.82    9.68    9.872    7.069    6.069       72     98.87%  combi~13a (U.C. Davis, Felix Wu)
    alert_12698   12698      801.346  13.096   12.367   11.367    9.083       36     99.72%  com~27febat1349 (U.C. Davis, Felix Wu)
    alert_10514   10514      151.033  12.086   11.071   11.067    9.082       49        n/a  inside all (MIT/LL 1999)

    [Graph: Analysis Time (sec) vs. Alerts; series: Traditional, 1 node, 2 nodes, 4 nodes, 6 nodes]

    [Graph: Alerts, Results, and Reduction (%) for each data set]
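The reduction rates listed in the results are consistent with (original alerts - merged results) / original alerts. The short Python check below reproduces three of them and, as an extra derived figure that the slides do not state, the ratio of traditional to 6-node analysis time for the largest data set. The function names are hypothetical.

```python
def reduction_rate(alerts, results):
    """Percentage of the original alerts removed by merging."""
    return (alerts - results) / alerts * 100.0


def speedup(traditional_sec, cluster_sec):
    """How many times faster the cluster run is than the traditional run."""
    return traditional_sec / cluster_sec


if __name__ == "__main__":
    # (original alerts, merged results) taken from the experiment results
    for alerts, results in [(286, 30), (380, 11), (12698, 36)]:
        print(f"{alerts} alerts: {reduction_rate(alerts, results):.2f}% reduction")
    # 12698 alerts: 801.346 s traditional vs. 9.083 s on 6 nodes
    print(f"speedup: {speedup(801.346, 9.083):.1f}x")
```

For the smallest data sets the same ratio is below 1 (for example 1.068 s against 5.077 s for 286 alerts), so with these figures the cluster only pays off once the alert count grows.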

  • Experimental Result Throughput Data Overall

  • ICAS: HBase latency

  • ICAS II: ICAS Advanced Version

  • ICAS II: Hadoop v0.20, Map/Reduce

    Snort Logs -> Source Regulation -> Integrate Alert -> Merge Extension -> Correlation Record -> Final Report

  • Live Demo

  • Source Regulation
    Input:
    [**] [gid:sid:cid] alert name [**]
    [Classification: Class] [Priority: priority]
    09/04-17:53:56.363811 source_ip:port -> destination:port
    TCP TTL:128 TOS:0x0 ID:4000 IpLen:20 DgmLen:138 DF
    ***AP*** Seq: 0x2E589B8 Ack: 0x642D47F9 Win: 0x4241 TcpLen: 20
    Output:
    gid; sid; version; alert name; class; priority; month; day; hour; min; second; source; destination; type;
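A minimal Python sketch of this regulation step follows. It is not the ICAS parser. It assumes that "source" and "destination" mean the IP addresses without ports, that "type" is the protocol token that follows the addresses, and that seconds keep their fractional part. The names `ALERT_RE`, `FIELDS` and `regulate` are hypothetical.

```python
import re

# Whitespace between parts is optional, so the pattern accepts both the
# multi-line Snort layout and text whose line breaks were lost.
ALERT_RE = re.compile(
    r"\[\*\*\]\s*\[(?P<gid>\d+):(?P<sid>\d+):(?P<version>\d+)\]\s*"
    r"(?P<name>.*?)\s*\[\*\*\]\s*"
    r"(?:\[Classification:\s*(?P<cls>[^\]]*)\]\s*)?"  # absent for portscan alerts
    r"\[Priority:\s*(?P<priority>\d+)\]\s*"
    r"(?P<month>\d\d)/(?P<day>\d\d)-"
    r"(?P<hour>\d\d):(?P<minute>\d\d):(?P<second>\d\d\.\d+)\s+"
    r"(?P<src>\d+(?:\.\d+){3})(?::\d+)?\s*->\s*"
    r"(?P<dst>\d+(?:\.\d+){3})(?::\d+)?\s*"
    r"(?P<type>[A-Za-z]+)"
)

# Output order taken from the slide's field list.
FIELDS = ["gid", "sid", "version", "name", "cls", "priority", "month",
          "day", "hour", "minute", "second", "src", "dst", "type"]

SAMPLE_LOG = """\
[**] [1:538:15] NETBIOS SMB IPC$ unicode share access [**]
[Classification: Generic Protocol Command Decode] [Priority: 3]
09/04-17:53:56.363811 168.150.177.165:1051 -> 168.150.177.166:139
TCP TTL:128 TOS:0x0 ID:4000 IpLen:20 DgmLen:138 DF
***AP*** Seq: 0x2E589B8 Ack: 0x642D47F9 Win: 0x4241 TcpLen: 20

[**] [122:3:0] (portscan) TCP Portsweep [**]
[Priority: 3]
09/04-17:53:56.499016 198.8.16.1 -> 168.150.177.166
PROTO:255 TTL:0 TOS:0x0 ID:1750 IpLen:20 DgmLen:168
"""


def regulate(log_text):
    """Return one semicolon-separated record per alert found in log_text.
    Fields missing from an alert (e.g. the class) become empty strings."""
    return [
        ";".join(match.group(field) or "" for field in FIELDS)
        for match in ALERT_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    for record in regulate(SAMPLE_LOG):
        print(record)
```

Keeping one flat record per alert means the later map step only has to split a line on ";".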

  • Integrate Alert
    Map output
    key: dst_ip | classify_id
    values: timestamp1 | src_ip | sid | priority
    Reduce output
    key: total_count | src_ip_count | sid_count
    values: dst_ip | priority | t1-tn | s1,s2,sn | sid | class_id
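The map and reduce outputs above can be imitated in plain Python, with a dictionary standing in for the grouping that Hadoop performs between the two phases. This is a sketch under assumptions, not the ICAS job: "t1-tn" is read as the first and last timestamp, the kept priority is the lowest number in the group, and the class identifiers in the sample data are made up for illustration.

```python
from collections import defaultdict


def map_alert(alert):
    """Map: key = (dst_ip, class_id); value = (timestamp, src_ip, sid, priority)."""
    key = (alert["dst_ip"], alert["class_id"])
    value = (alert["timestamp"], alert["src_ip"], alert["sid"], alert["priority"])
    return key, value


def reduce_alerts(key, values):
    """Reduce: merge all values that share one (dst_ip, class_id) key."""
    dst_ip, class_id = key
    timestamps = sorted(v[0] for v in values)
    src_ips = sorted({v[1] for v in values})
    sids = sorted({v[2] for v in values})
    priority = min(v[3] for v in values)  # assumption: keep the most severe
    out_key = (len(values), len(src_ips), len(sids))
    out_value = (dst_ip, priority, (timestamps[0], timestamps[-1]),
                 src_ips, sids, class_id)
    return out_key, out_value


def integrate(alerts):
    groups = defaultdict(list)  # stands in for the framework's shuffle step
    for alert in alerts:
        key, value = map_alert(alert)
        groups[key].append(value)
    return [reduce_alerts(key, values) for key, values in groups.items()]


# Sample input built from the Snort alerts shown earlier; the class_id
# strings are hypothetical placeholders.
ALERTS = [
    {"dst_ip": "239.255.255.250", "class_id": "network-scan", "sid": 1917,
     "priority": 3, "src_ip": "168.150.177.164", "timestamp": "09/04-17:53:56.385573"},
    {"dst_ip": "239.255.255.250", "class_id": "network-scan", "sid": 1917,
     "priority": 3, "src_ip": "168.150.177.164", "timestamp": "09/04-17:53:56.386910"},
    {"dst_ip": "239.255.255.250", "class_id": "network-scan", "sid": 1917,
     "priority": 3, "src_ip": "168.150.177.164", "timestamp": "09/04-17:53:56.388244"},
    {"dst_ip": "168.150.177.166", "class_id": "protocol-decode", "sid": 538,
     "priority": 3, "src_ip": "168.150.177.165", "timestamp": "09/04-17:53:56.363811"},
    {"dst_ip": "168.150.177.166", "class_id": "protocol-decode", "sid": 538,
     "priority": 3, "src_ip": "168.150.177.164", "timestamp": "09/04-17:53:56.405923"},
]

if __name__ == "__main__":
    for out_key, out_value in integrate(ALERTS):
        print(out_key, out_value)
```

Five input alerts become two integrated records, one per destination and class. The counts in the output key make it easy to sort the report by alert volume.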

  • Merge Extension
    Map:
    key: src_ip
    values: total_count | srcIP_count | sid_count | des_ip | priority | t1-tn | sid | class_id
    Reduce:
    key: total_count | dst_count | src_ip_count | sid_count
    values: d1,d2,dn | priority | t1-tn | s1,s2,sn | sid | class_id

  • Correlation Record: Generate dot graph format
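As an illustration of what a dot-format output could look like, the Python sketch below writes a Graphviz DOT digraph with one edge per source and destination pair. The slides do not show the actual ICAS output, so the edge labels, the graph name and the function `to_dot` are assumptions.

```python
def to_dot(edges, graph_name="alerts"):
    """Render (src_ip, dst_ip, label) triples as a Graphviz DOT digraph.
    Labels are assumed to contain no double quotes."""
    lines = [f"digraph {graph_name} {{"]
    for src, dst, label in edges:
        lines.append(f'    "{src}" -> "{dst}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


# Example edges taken from the sample Snort alerts (sid used as the label).
EDGES = [
    ("198.8.16.1", "168.150.177.164", "sid 2351"),
    ("198.8.16.1", "168.150.177.166", "portscan"),
    ("168.150.177.164", "239.255.255.250", "sid 1917"),
]

if __name__ == "__main__":
    print(to_dot(EDGES))
```

The resulting text can be saved to a file and rendered with the Graphviz `dot` tool, for example `dot -Tpng alerts.dot -o alerts.png`.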

  • Thank You! Questions?

    Hello everybody. I am Wei-Yu Chen, and my co-author is Yao-Tsung Wang. We are from NCHC in beautiful Taiwan. I am very glad to present this talk, Building a Cloud Computing Analysis System for Intrusion Detection Systems, which is about using cloud computing techniques to solve security issues. Because this is the last session, I will go through the talk as quickly as possible.

    But there are several problems no matter where the alerts are stored. First, it takes a lot of time to understand what has happened when we face a mass of alerts. Second, as alert data grows day by day, the sheer amount of data makes the database less efficient. Third, it is easy to overlook crucial information in a large number of alerts. Moreover, if the database crashes, all of the alerts are lost.


    Our idea is a system named ICAS; the full name is IDS Cloud Analysis System. Although cloud computing is a growing field of research, its applications are still few and are supported mainly by big companies such as Google, Yahoo, and Amazon. So our goal is to apply an innovative method that integrates cloud computing techniques into the security domain. We want to improve performance when the system analyzes alerts on a cloud computing platform. The analysis method covers how to integrate redundant alerts and how to merge related alerts.

    ICAS is based on three tools: Snort, Hadoop, and HBase. We designed four components in it: Regular Parser, Analysis Procedure, Data Mapper, and Data Reducer.

    Regular Parser normalizes the raw IDS log into a regular form. Each alert in the IDS log file contains many statements describing an incident, but ICAS extracts only several important fields.

    Analysis Procedure consists of Data Mapper and Data Reducer. It gathers the output of Data Reducer and inserts the result into the HBase database. It also dispatches an alert-analysis job when new alerts have been gathered in the pool.

    Data Mapper is applied in parallel to every item in the input data set. Each call produces a list of (key, value) pairs. The cloud platform gathers all pairs with an identical key from all lists, so that the pairs form one group for each distinct generated key.

    Data Reducer is applied in parallel to merge the data from Data Mapper. It merges the values of data that share the same key.

    ICAS has several benefits.
    Legible: ICAS integrates alerts by merging and reducing, so the administrator reads fewer but critical pieces of information and has more time to respond appropriately.
    Efficient: Because the cloud platform distributes the data, ICAS can process alerts in parallel on the nodes where the data is located. This makes processing extremely fast.
    Scalable: ICAS can reliably store and process petabytes of data.
    Economical: The cloud platform distributes the data and processing across clusters of commonly available computers. These clusters can number into the thousands of nodes.
    Reliable: ICAS automatically maintains multiple copies of data and automatically redeploys computing tasks after failures.

    Of course, ICAS has several drawbacks. ICAS is not a real-time system: it needs time to parse and analyze alerts. It is like the Google search engine, where we query cached data and are then directed to the current web site. When you want to show all of the information in the HBase database, the latency is relatively higher than with MySQL. Because HBase is a distributed database, it needs time to gather data from other machines. ICAS was developed in a few months, so it is just an experimental tool.