[ieee 2011 first syssec workshop - amsterdam, netherlands (2011.07.6-2011.07.6)] 2011 first syssec...

Research Roadmap on Security Measurements

Xenofontas DimitropoulosETH Zurich

[email protected]

Abstract—In the context of the SysSec Network of Excellencecall for consolidating the European (and international) systemssecurity research community, this position paper aims at sum-marizing the current research activities in the CommunicationSystems Group (CSG) of ETH Zurich relating to networksecurity. Our research is aligned along three projects: 1) identi-fying, validating, and characterizing computer infections fromintrusion alerts; 2) building privacy-preserving collaborativenetwork security mechanisms based on efficient secure multi-party computation (MPC) primitives; and 3) dissecting Internetbackground radiation towards live networks with one-way flowclassification. In addition, we highlight important directions forfuture research.

I. INTRODUCTION

Our work is motivated by a number of inter-relatedproblems. As modern malware rely increasingly on socialengineering techniques, its becoming more important todevelop reliable methods to detect with a small number offalse positives systems within an organization that have beeninfected. Furthermore, to build better defenses its importantto collaborate and share data about suspected attackers. De-spite, its clear advantages, collaboration is presently largelyavoided due to privacy concerns.

Motivated by these problems, in the following sections wediscuss our current and future research on security measure-ments. In Section II we first describe security datasets wehave accumulated and use in our studies. Next, in Section IIIwe describe our inference, validation, and characterization ofcomputer infections. Sections IV and V outline our researchon aggregating sensitive data using MPC and on classify-ing one-way flows, respectively. Finally, in Section VI weoutline future directions and we conclude in Section VII.

II. COLLECTING AND CORRELATING DIVERSE TYPES OFDATA ABOUT A TARGET NETWORK

A fundamental problem with understanding security fail-ures and with evaluating proposed defenses is the lack ofsecurity datasets from real-world environments. Datasets,like intrusion detection alerts and traffic traces, are vital forscientific research. In CSG, we have been collecting un-sampled NetFlow data from the border routers of SWITCHsince 2003 and have accumulated a rich archive of morethan 70 Gbytes of compressed flow records. In addition,since 2009 we have been collecting Snort alerts from asensor next to the edge router of the ETH campus. Using

the Snort signature ruleset and the Emerging Threats (ET)ruleset, we observe on average three million alerts per dayand have accumulated a rich archive of several millionSnort alerts. The collection of different types of data from aproduction network is essential for building a more completepicture of security incidents and for correlating footprintsfrom different observation instruments. Members of CSG areactively working on extending our archives with additionaltypes of data. In particular, on-going efforts examine thepossibility of collecting vulnerability reports from end-hosts,reports from a Deep Packet Inspection (DPI) system, andDNS data.

III. IDENTIFYING, VALIDATING, AND CHARACTERIZINGCOMPUTER INFECTIONS FROM IDS ALERTS

In collaboration with Elias Raftopoulos we are workingon identifying, validating, and characterizing computer in-fections in a large academic network infrastructure from in-trusion alerts. In particular, our work addresses the followingproblems (more information can be found in our technicalreport [1]):

Extrusion Detection from IDS Alerts: A fundamentalproblem of network intrusion detection systems is that theygenerate a large number of false positives. For example, ourIDS sensor generates on average three million alerts perday. Given an archive of IDS alerts, an important problemis how an administrator can filter out the noise to identifyactual security incidents. In this work, we are particularlyinterested in identifying computer infection within a mon-itored infrastructure, i.e., extrusion detection. Annotatinga rich trace of IDS alerts with inferred security incidentsis useful both for forensics investigations as well as forevaluating network defenses with realistic data. To addressthis problem, in our current research we have developeda novel alert correlation technique tailored for extrusiondetection.

Validating Inferred Infections in a Production Net-work: Having inferred a number of suspected infections, itis very useful to thoroughly validate the incidents. This isparticularly challenging in production environments, wherevalidation might interfere with regular employee work. Inthis work, we are interested in the challenging problemof remotely validating suspected infections on un-managedhosts within a production network. In our current research,we are conducting a complex experiment in which an expert

2011 First SysSec Workshop

978-0-7695-4530-1/11 $26.00 © 2011 IEEE

DOI 10.1109/SysSec.2011.20

83

is assigned to manually validate live infections by collectingand analyzing data about the suspected host from a numberof independent sources, including intrusion alerts, black-lists, vulnerability reports, host scanning, and search enginequeries. The goal of manual assessment is to “connect thedots”, i.e., correlate collected evidence and decide if theyagree with the background knowledge about the suspectedmalware.

Characterizing Computer Infections: Infections areamongst the most critical events for computer administrators.Using our heuristic, we have identified several thousands in-fections that infected over a period of 9 months more than 11thousand distinct hosts with static IP addresses. Given dataabout a large number of computer infections, it is importantto systematically analyze infected hosts and derive insightsfor building better defenses. In our current research, wehave characterized a number of different aspects of computerinfections including infection times, types of infections,infected hosts, and observed spatio-temporal correlations.Among various interesting findings, we observe that thevolume of inbound attacks to infected hosts increases rapidlyafter their infection and that infections exhibit significantspatial correlations, i.e., a new infection is more likely tooccur close to already infected hosts.

IV. BUILDING PRIVACY-PRESERVING COLLABORATIVENETWORK SECURITY MECHANISMS BASED ON SECURE

MULTI-PARTY COMPUTATION

Internet security suffers from a fundamental imbalance:although attackers are globally spread and well coordinated,individual network domains are isolated to analyzing onlylocal data, when in need to diagnose global security prob-lems. If independent network domains collaborated, then itwould be possible to design much more effective networkmonitoring and security mechanisms. For example, severalcollaborating networks can identify and blacklist spammersmuch faster and with higher accuracy than any singlenetwork alone. Despite its obvious advantages, collaborationis presently largely avoided because privacy laws, securitypolicies, and competition prevent sharing sensitive data. Tomitigate this problem we are working with Martin Burkharton building efficient MPC primitives suited for collaborativenetwork monitoring and security applications.

MPC appears as an ideal solution to the privacy-utilitytrade-off. On the one hand, any function can be turned intoan MPC protocol and on the other hand the computationprocess provides strong privacy guarantees. Applying MPCon practical scenarios involving aggregation of networksecurity data introduces the challenge of having to buildvery efficient protocols that can deal with voluminous inputdata. Our research spans a path starting from MPC theory,going to system design, performance evaluation, and endingwith measurement. Along this path a key contribution of ourwork is that we make an effort to bridge two very disparate

worlds: MPC theory and network monitoring and securitypractices.

In the theory front, we have designed optimized MPCcomparison operations based on the observation that the per-formance of data-intensive MPC operations can be improvedby not enforcing the widely-used constant-round paradigm.We have learned that constant-round protocols (at the costof many multiplications) are not a panacea in MPC protocoldesign: allowing many parallel invocations and removingthe constant-round constraint enabled us to design protocolsthat substantially reduce the total run-time.

In addition, we have designed four MPC protocols,namely the entropy, distinct count, event correlation, andtop-k protocols. Our protocols have been inspired fromspecific network monitoring and security applications, butat the same time they are also general and can be usedfor other applications. For example, in our top-k protocolwe have used sketches, an approximation data structure, toreduce expensive MPC comparison operations to much moreefficient MPC additions and multiplications at the cost of amanageable approximation error [2].

We have implemented our basic operations and MPCprotocols in the SEPIA library [3], [4], which we have madepublicly-available. Our extensive performance evaluationshows that SEPIA operations are between 35 and severalhundred times faster than those of existing comparable MPCframeworks. In addition, we have used our protocols withreal-world traces from 17 customer networks of SWITCHto investigate the practical applicability of collaborativenetwork monitoring and security based on MPC. We haveinvestigated different ways the networks could have collab-orated to troubleshoot an actual global network anomaly.This is the first work to apply MPC on real traffic tracesand to demonstrate that collaborative network monitoringand security based on MPC is both computationally feasibleand useful for addressing real network problems.

This work has led to one recent publication [3], whilefurther research is carried out in the context of two projects.In the DEMONS FP7 collaborative European project, weare further optimizing the performance and extending thefunctionality of SEPIA, while in collaboration with IBMResearch we have acquired support from the Swiss NationalScience Foundation to transfer the technology of the multi-party computation library to a privacy-preserving multido-main traffic flow analyzer.

V. BEYOND NETWORK TELESCOPES: ONE-WAY FLOWCLASSIFICATION

Network telescopes extract Internet background radiationto unused IP address blocks and have been very useful incharacterizing malicious patterns and events. An alternativeway to study background radiation in edge networks is tofind one-way flows, i.e., flows that never receive a reply.One-way flows are an important fraction of Internet traffic.

84

In our traces from SWITCH, in 2004 two out of every threeflows were one-way, while in 2008 one out of every threeflows were one-way. One-way flows provide a number ofadvantages for monitoring Internet background radiation.They can be easily extracted from live networks, can be usedeven when large unused IP address blocks are not available,and require fewer resources for instrumentation. To enableadministrators and researchers to extract and analyze back-ground radiation from one-way flows, we have been workingwith Eduard Glatz on novel techniques for classifying one-way flows in a set of malicious and benign classes. Ourclassification associates each one-way flows with up to 17different signs derived from flow-level data and uses a setof expert rules to determine the appropriate class of an one-way flow. It does not require training (plug & play), is easilyextensible, and is based on comprehensible classificationrules. We have used our classification to analyze a massivedataset of flow records summarizing 7.73 petabytes of trafficbetween 2004 and 2010 that crossed the border routersof SWITCH. In addition, we work on characterizing howthe volume of background radiation changed over time andwhat is the persistence of local port numbers as top targets.Among our findings, we observe that the relative volume ofbackground radiation towards the target network decreasedsharply between 2004 and 2007 by 73% and remains almostconstant since then. In addition, filtering the top-5 or 10port numbers, reduces the volume of background radiationby 35.6% and 45.0%, respectively. We believe that one-way flow classification opens a number of new directions onexploiting background radiation for building better intrusiondetection systems. More information can be found in ourtechnical report [5].

VI. FUTURE DIRECTIONS

Based on our experience with our current research, webelieve that the following open problems deserve furtherattention in the future:

1) Assessing the privacy risk of the MPC output: Al-though MPC based on Shamir’s secret sharing schemeguarantees that the computation provides informationtheoretic privacy, the output of the computation maystill pose privacy risks especially if an adversary cancorrelate it with background knowledge. Intuitively,aggregating data from multiple parties obscures thesensitive input of any individual party. However, it isstill an open question how to assess the privacy riskof the output of an evaluated MPC function and ifthe output needs to be perturbed before publication toprovide some rigorous privacy guarantees.

2) Automating security assessment: Our experiencewith validating a number of inferred computer in-fections taught us that security assessment in largeorganizations is a magic art rather than a scientific

process. It requires collecting relevant evidence, lever-aging the background knowledge of domain expertsabout the infrastructure and the suspected malware,and relying on an ad-hoc cognitive process to diagnosea suspected security incident. In the future, the stepsof a security assessment process need to be supportedby automated techniques that expedite or even replacethe manual work of a security administrator.

3) Correlating flow data with intrusion alerts: Pastresearch has extensively studied how an administratorcan use intrusion alerts or flow data to detect at-tacks and other security incidents. Although these datasources have been studied in isolation, an importantquestion is how to best combine flow data with IDSalerts to detect security incidents more accurately.Flow data provide more information about the normalbehavior of a host, while intrusion alerts providemore details about a suspected incident. Their propercombination can lead to reducing the false positive rateof IDSs and to enriching flow-based anomaly detectorswith more information about a suspected anomaly.

VII. CONCLUSIONS

This short paper aims at summarizing the current researchof part of the Communication Systems Group of ETHZurich that relates to security measurements. In addition,we pinpoint important future research directions. Our focusareas are on: 1) identifying, validating, and characterizingcomputer infections from intrusion alerts; 2) MPC-basedcollaborative network monitoring and security applications;and 3) one-way flow classification. More details about thesummarized works can be found in [1], [4], [3], [5].

REFERENCES

[1] E. Raftopoulos and X. Dimitropoulos, “Detecting, validatingand characterizing computer infections from IDS alerts,” ETHZurich, TIK-Report 337, June 2011.

[2] M. Burkhart and X. Dimitropoulos, “Fast privacy-preservingtop-k queries using secret sharing,” in International Conferenceon Computer Communications and Networks (ICCCN), 2010.

[3] M. Burkhart, M. Strasser, D. Many, and X. Dimitropoulos,“SEPIA: Privacy-Preserving Aggregation of Multi-DomainNetwork Events and Statistics,” in USENIX Security Sympo-sium, 2010.

[4] M. Burkhart and X. Dimitropoulos, “Privacy-preserving dis-tributed network troubleshooting – bridging the gap betweentheory and practice,” ACM Transactions on Information andSystems Security (under submission), 2011.

[5] E. Glatz and X. Dimitropoulos, “Beyond network telescopes:One-way traffic classification,” ETH Zurich, TIK-Report 336,June 2011.

85

[ieee 2011 first syssec workshop - amsterdam, netherlands (2011.07.6-2011.07.6)] 2011 first syssec...

Documents