Practical Machine Learning for Cloud Intrusion Detection: Challenges and the Way Forward

Ram Shankar Siva Kumar
Microsoft
ABSTRACT

Operationalizing machine learning based security detections is extremely challenging, especially in a continuously evolving cloud environment. Conventional anomaly detection does not produce satisfactory results for analysts that are investigating security incidents in the cloud. Model evaluation alone presents its own set of problems due to a lack of benchmark datasets. When deploying these detections, we must deal with model compliance, localization, and data silo issues, among many others. We pose the problem of attack disruption as a way forward in the security data science space. In this paper, we describe the framework, challenges, and open questions surrounding the successful operationalization of machine learning based security detections in a cloud environment and provide some insights on how we have addressed them.
KEYWORDS

machine learning, security, intrusion detection, cloud

ACM Reference format:
Ram Shankar Siva Kumar, Andrew Wicker, and Matt Swann. 2017. Practical Machine Learning for Cloud Intrusion Detection. In Proceedings of AISec'17, Dallas, TX, USA, November 3, 2017, 10 pages.
DOI: 10.1145/3128572.3140445
1 INTRODUCTION

The increasing prevalence of cybersecurity attacks has created an imperative for companies to invest in effective tools and techniques for detecting such attacks. The intrusion detection system market is expected to grow to USD 5.93 billion by 2021 at a compound annual growth rate of 12%.
Academia [8, 17, 29] and industry have long focused on building security detection systems (shortened hereafter as detection) for traditional, static, on-premise networks (also called bare metal), while research in employing machine learning in the cloud setting is more nascent [20, 24, 26]. Whether the detection systems are for bare metal or for the cloud, the emphasis is almost always on the algorithmic machinery. This paper takes a different approach: instead of detailing a single algorithm or technique that may or may not be applicable depending on factors like volume of data, velocity of operation (batch, near real time, real time), and availability of labels, we document the challenges and open questions in building machine learning based detection systems for the cloud. In this
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
AISec'17, Dallas, TX, USA © 2017 Copyright held by the owner/author(s). 978-x-xxxx-xxxx-x/YY/MM...$15.00
DOI: 10.1145/3128572.3140445
spirit, this paper is more closely related to [?] but is very specific to building monitoring systems for the cloud's backend infrastructure.
We report the lessons learned in securing Microsoft Azure, which depends on more than 300 different backend infrastructure services to ensure correct functionality. These 300+ services support all flavors of cloud offerings: public cloud (accessible by all customers) and private cloud (an implementation of cloud technology within an organization). Within these cloud offerings, the backend services also support different customer needs like Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). The Azure backend infrastructure generates tens of petabytes of log data per year, which has a direct impact on building machine learning based intrusion detection systems. In this setting, seemingly simple tasks such as detecting login anomalies can be difficult when one has to wrestle with 450 billion login events yearly.
There are other problems besides scalability. Firstly, the cloud environment is constantly shifting: virtual machines are constantly deployed and decommissioned based on demand and usage, and developers continuously push out code to support new features, which inherently changes the data distributions and the assumptions made during model building. Secondly, each backend service functions differently. For instance, the backend service that orchestrates Azure's storage solution is architected differently from the backend service that allocates computation power. Hence, to continue with the login anomaly example, one must account for different architectures and data distributions and analyze each service separately. Furthermore, the cloud, unlike traditional systems, is geo-distributed. For instance, Azure has 36 data centers across the world, including China, Europe, and the Americas, and hence must respect the privacy and compliance laws of the individual regions. This poses novel challenges in operationalizing security data science solutions. For instance, compliance restrictions that dictate data cannot be exported from specific geographic locations (a security constraint) have a downstream effect on model design, deployment, evaluation, and management strategies (a data science constraint).
This paper focuses on the practical hurdles in building machine learning systems for intrusion detection in a cloud environment for securing the backend infrastructure, as opposed to offering frontend security solutions to external customers. Hence, the alerts produced by the detection systems discussed in this paper are consumed by in-house Microsoft security analysts as opposed to paying customers who buy Azure services. Though not discussed in this paper, we would like to highlight that the frontend monitoring solutions built for external customers are considerably different from backend solutions, as the threat landscape differs based on the customer's cloud offering selection. For instance, if a customer chooses IaaS, important security tasks such as firewall configuration, patching, and management are the customer's responsibility, as opposed to PaaS, where most of the security tasks are the cloud provider's responsibility. In practice, the difference between PaaS and IaaS dictates different security monitoring solutions.
This paper is not about fraud, malware, spam, or specific algorithms or techniques. Instead, we share several open questions related to model compliance, generating attack data for model training, siloed detections, and automation for attack disruption, all in the context of monitoring internal cloud infrastructure.
We begin with a discussion about building models (or systems) that distinguish between statistical anomalies and security-interesting events using domain knowledge. This is followed by a discussion of techniques for evaluating security detections. We then describe issues surrounding model deployment, such as privacy and localization, and present some approaches to address these issues. We move on to discuss issues with siloed data and models. We conclude with some ways to move from attack detection to attack disruption.
2 EVOLUTION TO SECURITY INTERESTING ALERTS
Here is a typical industry scenario: an organization invests in log collection and monitoring systems, then hires data scientists to build advanced security detections, only to find that the team of security analysts is unhappy with the results. Disgruntled analysts are not the only thing at stake here: a recent study by the Ponemon Institute showed that organizations spend, on average, nearly 21,000 hours each year analyzing false positive security alerts, wasting roughly $1.3 million yearly. To address this issue, it can be appealing to invest in a more complex algorithm that presumably can reduce the false positive rate and surface better anomalies. However, as we describe below, blind adherence to this strategy tends not to yield the desired results.
As mentioned earlier, Azure has hundreds of backend services that are all architected differently. On the one hand, it is impossible to have a single generic anomaly detection that captures the nuances of each service. On the other hand, it is cumbersome to build bespoke machine learning detections for each service. In this section, we describe strategies to combine the regular anomaly detection setting with domain knowledge from the service and security experts, in the form of rules, to lower false positive rates.
We have established the following criteria for security alerts to help maximize their usefulness to security analysts: Explainable, Credible, and Actionable. Unfortunately, anomaly detection in an industry setting rarely satisfies these criteria. This is because anomalous events are present in any organization, but not all of these anomalies are security interesting, which is what the security analysts care about.
As an example, we encountered the following issue when building an anomalous executable detection. We collaborated with our security investigation team to better understand how attackers masquerade their tools to match common executables. For instance, attackers would name their tool ccalc.exe to be deceptively similar to the Microsoft Windows Calculator program calc.exe. We sought to develop an anomaly detection for finding abnormal executables based on the executable name and metadata.
When we ran this new detection, security experts found most of the alerts were false positives despite conforming to their definition
Figure 1: A Venn diagram depicting the intersection of security interesting alerts
of attacker activity. For instance, the detection system found an executable named psping.exe that closely resembles ping.exe, but the investigation team found that the service engineers were using a popular system utility tool. This soon became a recurring theme: the alert appeared worthy of investigation at first glance, but after spending considerable resources on the investigation, we would conclude that the alert was a false positive.
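The name-similarity core of such a detection can be sketched with a string-similarity measure. The whitelist, threshold, and helper names below are illustrative assumptions, not the paper's actual implementation, which also used executable metadata:

```python
from difflib import SequenceMatcher

# Illustrative whitelist of known-good executable names (assumption).
KNOWN_EXECUTABLES = {"calc.exe", "ping.exe", "notepad.exe"}

def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1]; 1.0 means identical strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def masquerade_candidates(name: str, threshold: float = 0.8):
    """Flag names close to, but not exactly equal to, a known executable."""
    return [known for known in KNOWN_EXECUTABLES
            if name.lower() != known and similarity(name, known) >= threshold]

# "ccalc.exe" is suspiciously close to "calc.exe" ...
print(masquerade_candidates("ccalc.exe"))   # -> ['calc.exe']
# ... but so is the legitimate utility "psping.exe" to "ping.exe",
# which is exactly the false-positive problem described above.
print(masquerade_candidates("psping.exe"))  # -> ['ping.exe']
```

Note that the sketch reproduces the failure mode from the text: a purely statistical notion of "abnormal name" cannot distinguish ccalc.exe (malicious) from psping.exe (benign).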
In order to generate useful results, we moved away from simple anomaly detection and focused our efforts on systems that produce security interesting alerts. We define such a system as one that captures an adversary's tools, tactics, and procedures from the gathered event data while ignoring expected activity. We show later in the section how rules and domain knowledge can help in these aspects.
As a first step, we recommend that machine learning engineers consult with security domain experts to see if there is any overlap between the attacker activity that we seek to detect and expected activity. If there is some overlap, then this is a hygiene issue and must be addressed. For instance, attackers often elevate privileges using the Run as Administrator functionality when compromising infrastructure machines, which can be tracked easily in security event logs. It is standard operating procedure that service engineers must never elevate to admin privileges without requesting elevated privileges through a just-in-time access system. This way, the service engineer's high-privileged activity is monitored and, more importantly, is scoped to a short period of time. However, service engineers often disregard this rule when they are debugging. This creates a problem in which regular service engineer activity is almost indistinguishable from attacker activity, which we refer to as poor hygiene (see Figure 1). Specifying and strictly enforcing operating procedures to correct poor hygiene is the first step in reducing the false positive rate of the system.
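The just-in-time (JIT) hygiene check described above can be sketched as a predicate over elevation events: an elevation is suspicious unless it falls inside an active, matching JIT grant. The field names and the grant window below are illustrative assumptions, not Azure's actual schema:

```python
from datetime import datetime, timedelta

def is_hygiene_violation(elevation, jit_grants, max_grant_hours=4):
    """True if an admin elevation has no matching, active JIT grant.

    `elevation` and each grant are dicts; the field names and the
    4-hour grant window are hypothetical, for illustration only.
    """
    for grant in jit_grants:
        same_scope = (grant["user"] == elevation["user"]
                      and grant["machine"] == elevation["machine"])
        expiry = grant["start"] + timedelta(hours=max_grant_hours)
        if same_scope and grant["start"] <= elevation["time"] <= expiry:
            return False  # elevation is covered by a JIT grant
    return True

grants = [{"user": "alice", "machine": "web-01",
           "start": datetime(2017, 5, 1, 9, 0)}]

# Covered: alice elevates on web-01 one hour into her grant.
ok = {"user": "alice", "machine": "web-01",
      "time": datetime(2017, 5, 1, 10, 0)}
# Not covered: bob elevates with no grant at all (poor hygiene).
bad = {"user": "bob", "machine": "web-01",
       "time": datetime(2017, 5, 1, 10, 0)}

print(is_hygiene_violation(ok, grants))   # -> False
print(is_hygiene_violation(bad, grants))  # -> True
```

Enforcing such a check turns "Run as Administrator" from an ambiguous signal into a high-precision one: only elevations outside a grant are worth an analyst's attention.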
Once the hygiene issues are resolved and a well-defined security scenario is in place, the stage is set for incorporating domain knowledge.
Figure 2: Sophistication of anomaly detection techniques
2.1 Strategies to Incorporate Domain Knowledge
Domain knowledge is critical when developing security detections, and how it is leveraged goes well beyond simple feature engineering. In this section, we discuss the different strategies that we have successfully employed to utilize domain knowledge in the form of rules. Other ways to incorporate domain knowledge, not discussed in this paper, are feedback on alerts from security analysts and consuming threat models.
2.1.1 Incorporating Rules (end consumer + security experts). Rules are an attractive means to incorporate domain knowledge for the following reasons:
- They are a direct embodiment of domain knowledge. Most organizations have a corpus of firewall rules (e.g., limiting traffic from Remote Desktop Protocol ports), web attack detection rules (e.g., detecting xp_cmdshell in SQL logs is strong evidence of compromise), or even direct embodiments of goodness (like whitelists) and maliciousness (such as blacklists). Security analysts embrace rules because they allow them to easily express their domain knowledge in simple conditionals. If we define rules as atomic first-order logic statements, then we can expand to a wider set:
  - Indicators of Compromise (file hashes, network connections, registry key values, specific user agent strings) that are commonly sourced from commercial vendors;
  - Threat intelligence feeds (domain reputation, IP reputation, file reputation, application reputation);
  - Evidence/telemetry generated by adversary tools, tactics, and procedures that have been observed beforehand.
- Rules have the highest precision. Every time a scoped rule fires, it is malicious by construction.
Figure 3: Rules can be applied as filters after the machine learning system. The machine learning system produces anomalies, and the business heuristics help to winnow the security interesting alerts.
- Rules have the highest recall. Whenever a scoped rule fires, it detects all known instances of maliciousness that are observed for that rule.
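The view of rules as atomic first-order predicates can be made concrete as a small rule set evaluated over event records. All field names and indicator values here are illustrative assumptions, not real intelligence or the paper's production schema:

```python
# Each rule is an atomic predicate over an event dict; matching any
# rule marks the event as known-malicious by construction.
IOC_FILE_HASHES = {"e3b0c44298fc1c14"}  # e.g., vendor-sourced IoCs
BAD_REPUTATION_IPS = {"203.0.113.7"}    # e.g., a threat-intel feed

RULES = [
    ("ioc_file_hash", lambda e: e.get("file_hash") in IOC_FILE_HASHES),
    ("bad_ip_reputation", lambda e: e.get("remote_ip") in BAD_REPUTATION_IPS),
    ("sql_xp_cmdshell", lambda e: "xp_cmdshell" in e.get("sql_text", "").lower()),
]

def matched_rules(event):
    """Names of every rule whose predicate fires on this event."""
    return [name for name, predicate in RULES if predicate(event)]

event = {"remote_ip": "203.0.113.7",
         "sql_text": "EXEC xp_cmdshell 'whoami'"}
print(matched_rules(event))  # -> ['bad_ip_reputation', 'sql_xp_cmdshell']
```

Because each predicate is a simple conditional, analysts can read, audit, and extend the rule corpus directly, which is precisely why rules remain favored despite the maintenance burden discussed next.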
We also acknowledge the biggest disadvantage of rules: care must be taken to maintain the corpus of rules, since stale ones can spike the false positive rate. However, even machine learning models require babysitting and have their own complications. For instance, if we use a model that has been trained on data that no longer reflects the state of the environment, the model can drift and produce unexpected results. Given that rules encode domain knowledge, are readily available, and are favored by security analysts, we present three strategies to incorporate them alongside a machine learning system.
As filters. Rules not only catch known malicious activity, but can also be applied as filters on the output of the machine learning system to sift out the expected activity (see Figure 3). In this architecture, the machine learning system produces anomalies, and the rules/business heuristics help to pick out the security interesting alerts. We used this framework to detect logins from unusual geographic locations. In this scenario, if a user who always logs in from New York attempts to log in from Sydney, then the user must be prompted for multifactor authentication. Our initial implementation of the detection logic had a false positive rate of 28%, and at cloud scale, that translated to 280 million suspicious logins. To improve our false positive rate, we supplemented the system with custom rules to identify company proxi...