on load shedding in complex event processingpages.cs.wisc.edu/~sid/papers/icdt14.pdfimizing load...

19
On Load Shedding in Complex Event Processing Yeye He Microsoft Research Redmond, WA, 98052 [email protected] Siddharth Barman California Institute of Technology Pasadena, CA, 91106 [email protected] Jeffrey F. Naughton University of Wisconsin-Madison Madison, WI, 53705 [email protected] ABSTRACT Complex Event Processing (CEP) is a stream processing model that focuses on detecting event patterns in continuous event streams. While the CEP model has gained popularity in the research communities and commercial technologies, the problem of gracefully degrading performance under heavy load in the presence of resource constraints, or load shed- ding, has been largely overlooked. CEP is similar to “classi- cal” stream data management, but addresses a substantially different class of queries. This unfortunately renders the load shedding algorithms developed for stream data process- ing inapplicable. In this paper we study CEP load shed- ding under various resource constraints. We formalize broad classes of CEP load-shedding scenarios as different opti- mization problems. We demonstrate an array of complex- ity results that reveal the hardness of these problems and construct shedding algorithms with performance guarantees. Our results shed some light on the difficulty of developing load-shedding algorithms that maximize utility. 1. INTRODUCTION The complex event processing or CEP model has received significant attention from the research community [6, 29, 37, 38, 51, 53], and has been adopted by a number of commer- cial systems including Microsoft StreamInsight [1], Sybase Aleri [3], and StreamBase [2]. A wide range of applica- tions, including inventory management [4], behavior moni- toring [48], financial trading [5], and fraud detection [8, 52], are now powered by CEP technologies. A reader who is familiar with the extensive stream data processing literature may wonder if there is anything new here, or if CEP is just another name for stream data manage- ment. While both kinds of system evaluate queries over data streams, the important difference is the class of queries upon which each system focuses. In the traditional stream pro- cessing literature, the focus is almost exclusively on aggre- gate queries or binary equi-joins. By contrast, in CEP, the fo- cus is on detecting certain patterns, which can be viewed as multi-relational non-equi-joins on the time dimension, pos- sibly with temporal ordering constraints. The class of queries addressed by CEP systems requires different evaluation al- gorithms and different load-shedding algorithms than the class previously considered in the context of stream data manage- ment. As an example of a CEP-powered system, consider the health care monitoring system, HyReminder [48], currently deployed at the University of Massachusetts Memorial Hos- pital. The HyReminder system tracks and monitors the hy- giene compliance of health-care workers. In this hospital setting, each doctor wears an RFID badge that can be read by sensors installed throughout the hospital. As doctors walk around the hospital, the RFID badges they wear trigger “event” data, which is transmitted to a central CEP engine. The CEP engine in turn looks for patterns to check for hygiene com- pliance. As one example, according to the US Center for Disease Control (CDC) [13], a doctor who enters or exits a patient room (which is captured by sensors installed in the doorway and encoded as an Enter-Patient-Room event or Exit-Patient-Room event) should cleanse her hands (en- coded by a Sanitize event) within a short period of time. This hygiene regulation can be tracked and enforced using the following CEP queries. Q1: SEQ(Sanitize, Enter-Patient-Room) within 1 min Q2: SEQ(Exit-Patient-Room, Sanitize) within 1 min In the HyReminder system, these two CEP queries mon- itor the event sequence to track sanitization behavior and ensure hygiene compliance. As another example consider CIMS [4], which is a system also powered by CEP and de- ployed in the same University of Massachusetts hospital. CIMS is used for inventory management and asset track- ing purposes. It captures RFID events triggered by tags at- tached to medical equipment and expensive medicines, and uses CEP technology to track supply usage and reduce in- ventory cost [15]. While the emergence of CEP model has spawned a wide variety of applications, so far research efforts have focused almost exclusively on improving CEP query join efficiency [6, 29, 37, 38, 51, 53]. Load shedding, an important issue that has been extensively studied in traditional stream process- ing [10, 20, 26, 27, 44, 45, 49, 50, 54], has been largely overlooked in the new context of CEP. Like other stream systems, CEP systems often face bursty input data. Since over-provisioning the system to the point where it can handle any such burst may be uneconomical 1

Upload: others

Post on 06-Feb-2021

2 views

Category:

Documents


0 download

TRANSCRIPT

  • On Load Shedding in Complex Event Processing

    Yeye HeMicrosoft Research

    Redmond, WA, [email protected]

    Siddharth BarmanCalifornia Institute of Technology

    Pasadena, CA, [email protected]

    Jeffrey F. NaughtonUniversity of Wisconsin-Madison

    Madison, WI, [email protected]

    ABSTRACTComplex Event Processing (CEP) is a stream processing modelthat focuses on detecting event patterns in continuous eventstreams. While the CEP model has gained popularity inthe research communities and commercial technologies, theproblem of gracefully degrading performance under heavyload in the presence of resource constraints, or load shed-ding, has been largely overlooked. CEP is similar to “classi-cal” stream data management, but addresses a substantiallydifferent class of queries. This unfortunately renders theload shedding algorithms developed for stream data process-ing inapplicable. In this paper we study CEP load shed-ding under various resource constraints. We formalize broadclasses of CEP load-shedding scenarios as different opti-mization problems. We demonstrate an array of complex-ity results that reveal the hardness of these problems andconstruct shedding algorithms with performance guarantees.Our results shed some light on the difficulty of developingload-shedding algorithms that maximize utility.

    1. INTRODUCTIONThe complex event processing or CEP model has received

    significant attention from the research community [6, 29, 37,38, 51, 53], and has been adopted by a number of commer-cial systems including Microsoft StreamInsight [1], SybaseAleri [3], and StreamBase [2]. A wide range of applica-tions, including inventory management [4], behavior moni-toring [48], financial trading [5], and fraud detection [8, 52],are now powered by CEP technologies.

    A reader who is familiar with the extensive stream dataprocessing literature may wonder if there is anything newhere, or if CEP is just another name for stream data manage-ment. While both kinds of system evaluate queries over datastreams, the important difference is the class of queries uponwhich each system focuses. In the traditional stream pro-cessing literature, the focus is almost exclusively on aggre-gate queries or binary equi-joins. By contrast, in CEP, the fo-cus is on detecting certain patterns, which can be viewed asmulti-relational non-equi-joins on the time dimension, pos-sibly with temporal ordering constraints. The class of queriesaddressed by CEP systems requires different evaluation al-gorithms and different load-shedding algorithms than the class

    previously considered in the context of stream data manage-ment.

    As an example of a CEP-powered system, consider thehealth care monitoring system, HyReminder [48], currentlydeployed at the University of Massachusetts Memorial Hos-pital. The HyReminder system tracks and monitors the hy-giene compliance of health-care workers. In this hospitalsetting, each doctor wears an RFID badge that can be readby sensors installed throughout the hospital. As doctors walkaround the hospital, the RFID badges they wear trigger “event”data, which is transmitted to a central CEP engine. The CEPengine in turn looks for patterns to check for hygiene com-pliance. As one example, according to the US Center forDisease Control (CDC) [13], a doctor who enters or exits apatient room (which is captured by sensors installed in thedoorway and encoded as an Enter-Patient-Room event orExit-Patient-Room event) should cleanse her hands (en-coded by a Sanitize event) within a short period of time.This hygiene regulation can be tracked and enforced usingthe following CEP queries.Q1: SEQ(Sanitize, Enter-Patient-Room) within 1 min

    Q2: SEQ(Exit-Patient-Room, Sanitize) within 1 min

    In the HyReminder system, these two CEP queries mon-itor the event sequence to track sanitization behavior andensure hygiene compliance. As another example considerCIMS [4], which is a system also powered by CEP and de-ployed in the same University of Massachusetts hospital.CIMS is used for inventory management and asset track-ing purposes. It captures RFID events triggered by tags at-tached to medical equipment and expensive medicines, anduses CEP technology to track supply usage and reduce in-ventory cost [15].

    While the emergence of CEP model has spawned a widevariety of applications, so far research efforts have focusedalmost exclusively on improving CEP query join efficiency [6,29, 37, 38, 51, 53]. Load shedding, an important issue thathas been extensively studied in traditional stream process-ing [10, 20, 26, 27, 44, 45, 49, 50, 54], has been largelyoverlooked in the new context of CEP.

    Like other stream systems, CEP systems often face burstyinput data. Since over-provisioning the system to the pointwhere it can handle any such burst may be uneconomical

    1

  • or impossible, during peak loads a CEP system may needto “shed” portions of the load. The key technical challengeherein is to selectively shed work so as to eliminate the lessimportant query results, thereby preserve the more usefulquery results as defined by some utility function.

    More specifically, the problem we consider is the follow-ing. Consider a CEP system that has a number of patternqueries, each of which consists of a number of events and isassociated with a utility function. During peak loads, mem-ory and/or CPU resources may not be sufficient. A utility-maximizing load shedding scheme should then determinewhich events should be preserved in memory and which queryresults should be processed by the CPU, so that not only areresource constraints respected, but also the overall utility ofthe query results generated is maximized.

    We note that, in addition to utility-maximization in a sin-gle CEP application, the load-shedding framework may alsobe relevant to cloud operators that need to optimize acrossmultiple CEP applications. Specifically, stream processingapplications are gradually shifting to the cloud [35]. In cloudscenarios, the cloud operator in general cannot afford to pro-vision for the aggregate peak load across all tenants, whichwould defeat the purpose of consolidation. Consequently,when the load exceeds capacity, cloud operators are forcedto shed load. They have a financial incentive to judiciouslyshed work from queries that are associated with a low penaltycost as specified in Service Level Agreements (SLAs), sothat their profits can be maximized (similar problems havebeen called “profit maximization in a cloud” and have beenconsidered in the Database-as-a-Service literature [18, 19]).Note that this problem is naturally a utility-maximizing CEPload shedding problem, where utility essentially becomes thefinancial rewards and penalties specified in SLAs.

    While load shedding has been extensively studied in thecontext of general stream processing [10, 20, 26, 27, 44,45, 49, 50, 54], the focus there is aggregate queries or two-relation equi-join queries, which are important for traditionalstream joins. The emerging CEP model, however, demandsmulti-relational joins that predominantly use non-equi-joinpredicates on timestamp. As we will discuss in more detailin Section 4, the CEP load shedding problem is significantlydifferent and considerably harder than the problems previ-ously studied in the context of general stream load shedding.

    We show in this work that variants of the utility max-imizing load shedding problems can be abstracted as dif-ferent optimization problems. For example, depending onwhich resource is constrained, we can have three problemvariants, namely CPU-bound load shedding, memory-boundload shedding, and dual-bound load shedding (with bothCPU- and memory-bound). In addition we can have integralload shedding, where event instances of each type are eitherall preserved in memory or all shedded; and fractional loadshedding, where a sampling operator exists such that a frac-tion of event instances of each type can be sampled accord-ing to a predetermined sampling ratio. Table 1 summarizes

    Memory-bound CPU-bound Dual-boundIntegral IMLS ICLS IDLS

    Fractional FMLS FCLS FDLS

    Table 1: Problem variants for CEP load shedding

    the six variants of CEP load shedding studied in this paper:IMLS (integral memory-bound load shedding), FMLS (frac-tional memory-bound load shedding), ICLS (integral CPU-bound load shedding), FCLS (fractional CPU-bound loadshedding), IDLS (integral dual-bound load shedding), andFDLS (fractional dual-bound load shedding).

    We analyze the hardness of these six variants, and studyefficient algorithms with performance guarantees. We demon-strate an array of complexity results. In particular, we showthat CPU-bound load shedding is the easiest to solve: FCLSis solvable in polynomial time, while ICLS admits a FPTASapproximation. For memory-bound problems, we show thatIMLS is in general NP-hard and hard to approximate. Wethen identify two special cases in which IMLS can be ef-ficiently approximated or even solved exactly, and describea general rounding algorithm that achieves a bi-criteria ap-proximation. As for the fractional FMLS, we show it is hardto solve in general, but has approximable special cases. Fi-nally, for dual-bound load shedding IDLS and FDLS, weshow that they generalize memory-bound problems, and thehardness results from memory-bound problems naturally hold.On the positive side, we describe a tri-criteria approximationalgorithm and an approximable special case for the IDLSproblem.

    The rest of the paper is organized as follows. We firstdescribe necessary background of CEP in Section 2, and in-troduce the load shedding problem in Section 3. We describerelated work in Section 4. In Section 5, Section 6 and Sec-tion 7 we discuss the memory-bound, CPU-bound and dual-bound load-shedding problems, respectively. We concludein Section 8.

    2. BACKGROUND: COMPLEX EVENT PRO-CESSING

    The CEP model has been proposed and developed by anumber of seminal papers (see [6] and [53] as examples). Tomake this paper self-contained, we briefly describe the CEPmodel and its query language in this section.

    2.1 The data modelLet the domain of possible event types be the alphabet of

    fixed size ⌃ = {Ei

    }, where Ei

    represents a type of event.The event stream is modeled as an event sequence as follows.

    DEFINITION 1. An event sequence is a sequence S =(e

    1

    , e2

    , ..., eN

    ), where each event instance ej

    belongs to anevent type E

    i

    2 ⌃, and has a unique time stamp tj

    . Thesequence S is temporally ordered, that is t

    j

    < tk

    , 8j < k.EXAMPLE 1. Suppose there are five event types denoted

    by upper-case characters ⌃ = {A,B,C,D,E}. Each char-acter represents a certain type of event. For example, for the

    2

  • hospital hygiene monitoring application HyReminder, eventtype A represents all event instances where doctors enterICU, B denotes the type of events where doctors washinghands at sanitization desks, etc.

    A possible event sequence is S = (A1

    , B2

    , C3

    , D4

    , E5

    ,A

    6

    , B7

    , C8

    , D9

    , E10

    ), where each character is an instanceof the corresponding event type occurring at time stamp givenby the subscript. So A

    1

    denotes that at time-stamp 1, a doc-tor enters ICU. B

    2

    shows that at time 2, the doctor sanitizeshis hands. At time-stamp 6, there is another enter-ICU eventA

    6

    , so on and so forth.

    Following the standard practice of the CEP literature [6,7, 38, 53], we assume events are temporally ordered by theirtimestamps. Out-of-order events can be handled using tech-niques from [16, 37].

    DEFINITION 2. Given an event sequence S = (e1

    , e2

    , ...,eN

    ), a sequence S0 = (ei1 , ei2 , ..., eim) is a subsequence

    of S, if 1 i1

    < i2

    ... < im

    N .Note that the temporal ordering in the original sequence is

    preserved in subsequences. Also observe that a subsequencedoes not have to be a contiguous subpart of a sequence.

    2.2 The query modelUnlike relational databases, where queries are typically

    ad-hoc and constructed by users at query time, CEP sys-tems are more like other stream processing systems, wherequeries are submitted ahead of time and run for an extendedperiod of time (thus are also known as long-standing queries).The fact that CEP queries are known a priori is a key prop-erty that allows queries to be analyzed and optimized forproblems like utility maximizing load shedding.

    Denote by Q = {Qi

    } the set of CEP queries, where eachquery Q

    i

    is a sequence query defined as follows.

    DEFINITION 3. A CEP sequence query Q is of the formQ = SEQ(q

    1

    , q2

    , ...qn

    ), where qk

    2 ⌃ are event types.Each query Q is associated with a time based sliding win-dow of size T (Q) 2 R+, over which Q will be evaluated.

    We then define query match under the semantics skip-till-any-match.

    DEFINITION 4. In skip-till-any-match, a subsequence S0= (e

    i1 , ei2 , . . . , ein) of S is considered a query match ofQ = (q

    1

    , q2

    , . . . , qn

    ) over time window T (Q) if:(1) Pattern matches: Event e

    il in S0 is of type ql

    for alll 2 [1, n],

    (2) Within window: tin � ti1 T (Q).

    We illustrate query matches using Example 2.

    EXAMPLE 2. We continue with Example 1, where the eventsequence S = (A

    1

    , B2

    , C3

    , D4

    , E5

    , A6

    , B7

    , C8

    , D9

    , E10

    ).Suppose there are a total of three queries: Q

    1

    = SEQ(A,C),Q

    2

    = SEQ(C,E), Q3

    = SEQ(A,B,C,D), all having thesame window size T (Q

    1

    ) = T (Q2

    ) =T (Q3

    ) = 5.

    Both sequences (A1

    , C3

    ) and (A6

    , C8

    ) are matches forQ

    1

    , because they match patterns specified in Q1

    , and arewithin the time window 5. However, (A

    1

    , C8

    ) is not a matcheven though it matches the pattern in Q

    1

    , because the timedifference between C

    8

    and A1

    exceeds the window limit 5.Similarly, (C

    3

    , E5

    ) and (C8

    , E10

    ) are matches of Q2

    ; (A1

    ,B

    2

    , C3

    , D4

    ) and (A6

    , B7

    , C8

    , D9

    ) are matches of Q3

    .A query with the skip-till-any-match semantics essen-

    tially looks for the conjunction of occurrences of event typesin a specified order within a time window. Observe that inskip-till-any-match a subsequence does not have to be con-tiguous in the original sequence to be considered as a match(thus the word skip in its name). Such queries are widelystudied [2, 6, 29, 37, 38, 51, 53] and used in CEP systems.

    We note that there are three additional join semantics de-fined in [6], namely, skip-till-next-match, partition-contiguityand contiguity. In the interest of space and also to better fo-cus on the topic of load shedding, the details of these joinsemantics are described in Appendix A. The load sheddingproblem formulated in this work is agnostic of the join se-mantics used.

    We also observe that although there are CEP language ex-tensions like negation [29] and Kleene closure [23], in thiswork we only focus on the core language constructs that useconjunctive positive event occurrences. Studying query ex-tensions for load shedding is an interesting area for futureresearch.

    3. CEP LOAD SHEDDINGIt is well known that continuously arriving stream data is

    often bursty [20, 44, 50]. During times of peak loads notall data items can be processed in a timely manner under re-source constraints. As a result, part of the input load mayhave to be discarded (shedded), resulting in the system re-taining only a subset of data items. The question that natu-rally arises is which queries should be preserved while othersshedded? To answer this question, we introduce the notionof utility to quantify the “usefulness” of different queries.

    3.1 A definition of utilityIn CEP systems, different queries naturally have differ-

    ent real-world importance, where some query output may bemore important than others. For example, in the inventorymanagement application [4, 15], it is naturally more impor-tant to produce real-time query results that track expensivemedicine/equipment than less expensive ones. Similarly, inthe hospital hygiene compliance application [48], it is moreimportant to produce real-time query results reporting a se-rious hygiene violation with grave health consequences thanthe ones merely reporting routine hygiene compliance.

    We define utility weight for each query to measure its im-portance.

    DEFINITION 5. Let Q = {Qi

    } be the set of queries. De-fine the utility weight of query Q

    i

    , denoted by wi

    2 R+, asthe perceived usefulness of reporting one instance of matchof Q

    i

    .

    3

  • A user or an administrator familiar with the applicationcan typically determine utility weights. Alternatively, in acloud environment, operators of multi-tenancy clouds mayresort to service-level-agreements (SLAs) to determine util-ity weights. In this work we simply treat utility weights asknown constants. We note that the notion of query-levelweights has been used in other parts of data managementliterature (e.g., query scheduling [40]).

    The total utility of a system is then defined as follows.

    DEFINITION 6. Let C(Qi

    , S) be the number of distinctmatches for query Q

    i

    in S. The utility generated for queryQ

    i

    isU(Q

    i

    , S) = wi

    · C(Qi

    , S) (1)

    The sum of the utility generated over Q = {Qi

    } is

    U(Q, S) =X

    Qi2QU(Q

    i

    , S). (2)

    Our definition of utility generalizes previous metrics likethe max-subset [20] used in traditional stream load sheddingliterature. Max-subset aims to maximize the number of out-put tuples, and thus can be viewed as a special case of ourdefinition in which each query has unit-weight.

    We also note that although it is natural to define utility as alinear function of query matches for many CEP applications(e.g., [4, 48]), there may exist applications where utility canbe best defined differently (e.g. a submodular function tomodel diminishing returns). Considering alternative utilityfunctions for CEP load shedding is an area for future work.

    EXAMPLE 3. We continue with Example 2 using the eventsequence S and queries Q

    1

    , Q2

    and Q3

    to illustrate utility.Suppose the utility weight w

    1

    for Q1

    is 1, w2

    is 2, and w3

    is 3. Since there are 2 matches of Q1

    , Q2

    and Q3

    , respec-tively, in S, the total utility is 2 ⇥ 1 + 2 ⇥ 2 + 2 ⇥ 3 = 12.

    3.2 Types of resource constraintsIn this work, we study constraints on two common types

    of computing resources: CPU and memory.Memory-bound load shedding. In this first scenario,

    memory is the limiting resource. Normally, arriving eventdata are kept in main memory for windowed joins until time-stamp expiry (i.e., when they are out of active windows).During peak loads, however, event arrival rates may be sohigh that the amount of memory needed to store all eventdata might exceed the available capacity. In such a case notevery arriving event can be held in memory for join process-ing and some events may have to be discarded.

    EXAMPLE 4. In our running example the event sequenceS = (A

    1

    , B2

    , C3

    , D4

    , E5

    , A6

    , B7

    , C8

    , D9

    , E10

    ...), with queriesQ

    1

    = SEQ(A,C), Q2

    = SEQ(C,E) and Q3

    = SEQ(A,B,C, D). Because the sliding window of each query is 5, weknow each event needs to be kept in memory for 5 units oftime. Given that one event arrives in each time unit, a totalof 5 events need to be simultaneously kept in memory.

    Suppose a memory-constrained system only has memorycapacity for 3 events. In this case “shedding” all eventsof type B and D will sacrifice the results of Q

    3

    but pre-serves A, C and E and meets the memory constraint. In ad-dition, results for Q

    1

    and Q2

    can be produced using avail-able events in memory, which amounts to a total utility of2⇥ 1 + 2⇥ 2 = 6. This maximizes utility, for shedding anyother two event types yields lower utility.

    CPU-bound load shedding. In the second scenario, mem-ory may be abundant, but CPU becomes the bottleneck. As aresult, again only a subset of query results can be processed.

    EXAMPLE 5. We revisit Example 4, but now suppose wehave a CPU constrained system. Assume for simplicity thatproducing each query match costs 1 unit of CPU. Supposethere are 2 unit of CPU available per 5 units of time, so only2 matches can be produced every 5 time units.

    In this setup, producing results for Q2

    and Q3

    while shed-ding others yields a utility of 2 ⇥ 2 + 2 ⇥ 3 = 10 given theevents in S. This is utility maximizing because Q

    2

    and Q3

    have the highest utility weights.

    Dual-bound load shedding. Suppose now the system isboth CPU bound and memory bound (dual-bound).

    EXAMPLE 6. We continue with Example 5. Suppose nowdue to memory constraints 3 events can be kept in memoryper 5 time units, and in addition 2 query matches can beproduced every 5 time units due to CPU constraints.

    As can be verified, the optimal decision is to keep eventsA, C and E while producing results for Q

    1

    and Q2

    , whichyields a utility of 2 ⇥ 1 + 2 ⇥ 2 = 6 given the events in S.Note that results for Q

    3

    cannot be produced because it needsfour events in memory while only three can fit in memorysimultaneously.

    3.3 Types of shedding mechanismsIn this paper, we consider two types of shedding mecha-

    nisms, an integral load shedding, in which certain types ofevents or query matches are discarded altogether; and a frac-tional load shedding, in which a uniform sampling is used,such that a portion of event types or query matches is ran-domly selected and preserved.

    Note that both mechanisms we consider are online loadshedding. That is, a shedding decision is made for the cur-rent event before the next arriving event is processed. Thisis in contrast to offline load shedding, where decisions aremade after the whole event sequence has arrived.

    In this paper we only focus on online CEP load shedding,given the fact that most stream applications demand real-time responsiveness. In addition, offline load shedding re-quires the entire event sequence to be buffered, which is notlikely to be practical in real systems.

    Performance of online algorithms is oftentimes measuredagainst their offline counterparts to develop quality guaran-tees like competitive ratios [12]. However, we show in the

    4

  • following that meaningful competitive ratios cannot be de-rived for any online CEP load shedding algorithms.

    PROPOSITION 1. No online CEP load shedding algorithm,deterministic or randomized, can have competitive ratio bet-ter than ⌦(n), where n is the length of the event sequence.

    The full proof of this proposition can be found in Ap-pendix B. Intuitively, to see why it is hard to bound the com-petitive ratio, consider the following adversarial scenario.Suppose we have a universe of 2m event types, ⌃ = {E

    i

    }[{E0

    i

    }, i 2 [1,m]. Let there be m queries SEQ(Ei

    , E0i

    ),8i 2 [1,m], all with unit utility weight. The stream is knownto be (e

    1

    , e2

    , ..., em

    , X), where ei

    is of type Ei

    , and X isdrawn from a uniform distribution on {E0

    i

    : i 2 [1,m]}.Lastly, suppose the system only has enough memory to holdtwo events. The optimal offline algorithm can look at thetype of event X , denoted by E0

    l

    , and keep the correspond-ing event e

    l

    of type El

    that arrived previously, to produce amatch (E

    l

    , E0l

    ) of utility of 1. In comparison, an online algo-rithm needs to select one event into memory before the eventtype of X is revealed. Thus, the probability of producing amatch is 1

    m

    , and the expected utility is also 1m

    .This result essentially suggests that we cannot hope to de-

    vise online algorithms with good competitive ratios. Conse-quently, in what follows, we will focus on optimizing the ex-pected utility of online algorithms without further discussingcompetitive ratio bounds.

    3.4 Modeling CEP systemsAt a high level, the decision of which event or query to

    shed should depend on a number of factors, including utilityweights, memory/CPU costs, and event arrival rates. Intu-itively, the more important a query is, the more desirable itis to keep constituent events in memory and produce resultsof this query. Similarly the higher the cost is to keep an eventor to produce a query match, the less desirable it is to keepthat event or produce that result. The rate at which eventsarrive is also important, as it determines CPU/memory costsof a query as well as utility it can produce.

    In order to study these trade-offs in a principled way, weconsider the following factors in a CEP system. First, weassume that the utility weight, w

    i

    , which measures the im-portance of query Q

    i

    installed in a CEP system, is providedas a constant. We also assume that the CPU cost of produc-ing each result of Q

    i

    is a known constant ci

    , and the mem-ory cost of storing each event instance of type E

    j

    is alsoa fixed constant m

    j

    . Note that we do not assume uniformmemory/CPU costs across different events/queries, becausein practice event tuples can be of different sizes. Further-more, the arrival rate of each event type E

    j

    , denoted by �j

    , isassumed to be known. This is typically obtained by samplingthe arriving stream [17, 42]. Note that characteristics of theunderlying stream may change periodically, so the samplingprocedure may be invoked at regular intervals to obtain anup-to-date estimate of event arrival rates.

    ⌃ The set of all possible event typesEj Event of type j�j The number of arrived events Ej in a unit time (event arrival rate)mj The memory cost of keeping each event of type EjQ The set of query patternsQi Query pattern i|Qi| The number of event types in Qiwi The utility weight associated with Qini The number of matches of Qi in a unit timeci The CPU cost of producing each result for QiC The total CPU budgetM The total memory budgetxj The binary selection decision of event Ejxj The fractional sampling decision of event Ejyi The selection decision of query Qiyi The fractional sampling decision of query Qip The max number of queries that one event type participates inf The fraction of memory budget that the largest query consumesd The maximum number of event types in any one query

    Table 2: Summary of the symbols usedApproximation Ratio

    IMLS p/(1� f) [Theorem 4]IMLSm

    (loss minimization)( 1⌧ ,

    11�⌧ ) bi-criteria approximation, for any ⌧ 2

    (0, 1) [Theorem 3]ICLS 1 + ✏, for ✏ > 0 [Theorem 10]IDLS p/(1� f) [Theorem 12]

    IDLSm(loss minimization)

    ( 1⌧ ,1

    1�⌧ ,1

    1�⌧ ) tri-criteria approximation, forany ⌧ 2 (0, 1) [Theorem 11]

    Relative Approximation Ratio (see Definition 8)

    FMLS1 � O

    |⌃|�d�22 (t2 + 1)�

    d2

    where t =

    min

    minEj {�jmjM },

    1p|⌃|

    [Theorem 7]

    FMLS(under someassumptions)

    O⇣

    1� k!(k�d)!kd

    where k > d [Theorem 8]

    Absolute Approximation Ratio (see Definition 7)

    FMLS

    (1 � �( k!(k�d)!kd ))-approximation, where � =

    min⇣

    minjn

    �jmjM

    o

    , 1⌘d

    , and k > d controlsapproximation accuracy [Theorem 9].

    Table 3: Summary of approximation results

    Furthermore, we assume that the “expected” number ofmatches of Q

    i

    over a unit time, denoted by ni

    , can also beestimated. A simple but inefficient way to estimate n

    i

    is tosample the arriving event stream and count the number ofmatches of Q

    i

    in a fixed time period. Less expensive al-ternatives also exist. For example, in Appendix C, we dis-cuss an analytical way to estimate n

    i

    , assuming an indepen-dent Poisson arrival process, which is a standard assumptionin the performance modeling literature [36]. In this workwe will simply treat n

    i

    as known constants without furtherstudying the orthogonal issue of estimating n

    i

    .Lastly, note that since n

    i

    here is the expected number ofquery matches, the utility we maximize is also optimizedin an expected sense. In particular, it is not optimized un-der arbitrary arrival event strings (e.g., an adversarial input).While considering load shedding in such settings is interest-ing, Proposition 1 already shows that we cannot hope to getany meaningful bounds against certain adversarial inputs.

    The symbols used in this paper are summarized in Table 2,and our main approximation results are listed in Table 3.

    5

  • 4. RELATED WORKLoad shedding has been recognized as an important prob-

    lem, and a large body of work in the stream processing liter-ature (e.g., [9, 10, 20, 27, 33, 44, 45, 49, 50]) has been de-voted to this problem. However, existing work in the contextof traditional stream processing predominantly considers theequi-join of two streaming relations. This is not directly ap-plicable to CEP joins, where each join operator typically in-volves multi-relational non-equi-join (on time-stamps). Forexample, the authors in [33] are among the first to study loadshedding for equi-joins operators. They proposed strategiesto allocate memory and CPU resources to two joining re-lations based on arrival rates, so that the number of outputtuples produced can be maximized. Similarly, the work [20]also studies the problem of load shedding while maximiz-ing the number of output tuples. It utilizes value distribu-tion of the join columns from the two relations to produceoptimized shedding decisions for tuples with different joinattribute values.

    However, the canonical two-relation equi-join studied intraditional stream systems is only a special case of the multi-relational, non-equi-join that dominates CEP systems. Inparticular, if we view all tuples from R (resp. S) that havethe same join-attribute value v

    i

    as a virtual CEP event typeR

    i

    (resp. Si

    ), then the traditional stream load shedding prob-lem is captured as a very special case of CEP load shed-ding we consider, where each “query” has exactly two “eventtypes” (R

    i

    and Si

    ), and there are no overlapping “event types”between “queries”. Because of this equi-join nature, shed-ding one event has limited ramification and is intuitivelyeasy to solve (in fact, it is shown to be solvable in [20]).In CEP queries, however, each event type can join with anarbitrary number of other events, and different queries useoverlapping events. This significantly complicates the opti-mization problem and makes CEP load shedding hard.

    In [10], sampling mechanisms are proposed to implementload shedding for aggregate stream queries (e.g., SUM), wherethe key technical challenge is to determine, in a given oper-ator tree, where to place sampling operators and what sam-pling rates to use, so that query accuracy can be maximized.The work [45] studies the similar problem of strategicallyplacing drop operator in the operator tree to optimize utilityas defined by QoS graphs. The authors in [44] also considerload shedding by random sampling, and propose techniquesto allocate memory among multiple operators.

    The works described above study load shedding in tradi-tional stream systems. The growing popularity of the newCEP model that focuses on multi-relational non-equi-joincalls for another careful look at the load-shedding problemin the new context of CEP.

    5. MEMORY-BOUND LOAD SHEDDINGRecall that in the memory-bound load shedding, we are

    given a fixed memory budget M , which may be insufficientto hold all data items in memory. The problem is to selecta subset of events to keep in memory, such that the overall

    utility can be maximized.

    5.1 The integral variant (IMLS)In the integral variant of the memory-bound load shedding

    problem, a binary decision, denoted by xj

    , is made for eachevent type E

    j

    , such that event instances of type Ej

    are eitherall selected and kept in memory (x

    j

    = 1), or all discarded(x

    j

    = 0). The event selection decisions in turn determinewhether query Q

    i

    can be selected (denoted by yi

    ), becauseoutput of Q

    i

    can be produced only if all constituent eventtypes are selected in memory. We formulate the resultingproblem as an optimization problem as follows.

    (IMLS) maxX

    Qi2Qni

    wi

    yi

    (3)

    s.t.X

    Ej2⌃�j

    mj

    xj

    M (4)

    yi

    =

    Y

    Ej2Qi

    xj

    (5)

    yi

    , xj

    2 {0, 1} (6)

    The objective function in Equation (3) says that if eachquery Q

    i

    is selected (yi

    = 1), then it yields an expectedutility of n

    i

    wi

    (recall that as discussed in Section 3, ni

    mod-els the expected number of query matches of Q

    i

    in a unittime, while w

    i

    is the utility weight of each query match).Equation (4) specifies the memory constraint. Since select-ing event type E

    j

    into memory consumes �j

    mj

    memory,where �

    j

    is the arrival rate of Ej

    and mj

    is the memorycost of each event instance of E

    j

    , Equation (4) guaranteesthat total memory consumption does not exceed the memorybudget M . Equation (5) ensures that Q

    i

    can be produced ifand only if all participating events E

    j

    are selected and pre-served in memory (x

    j

    = 1, for all Ej

    2 Qi

    ).

    5.1.1 A general complexity analysisWe first give a general complexity analysis. We show that

    this shedding problem is NP-hard and hard to approximateby a reduction from the Densest k-Sub-Hypergraph (DKSH).

    THEOREM 1. The problem of utility maximizing integralmemory-bound load shedding (IMLS) is NP-hard.

    The proof of the theorem can be found in Appendix D. Weshow in the following that IMLS is also hard to approximate.

    THEOREM 2. The problem of IMLS with n event typescannot be approximated within a factor of 2(logn)

    , for some� > 0, unless 3SAT 2 DTIME(2n3/4+✏ ).

    This result is obtained by observing that the reductionfrom DKSH is approximation-preserving. Utilizing an in-approximability result in [30], we obtain the theorem above(the proof is in Appendix E).

    It is worth noting that DKSH and related problems areconjectured to be very hard problems with even strongerinapproximability than what was proved in [30]. For ex-ample, authors in [24] conjectured that Maximum Balanced

    6

  • Complete Bipartite Subgraph (BCBS) is n✏ hard to approx-imate. If this is the case, utilizing a reduction from BCBSto DKSH [30], DKSH would be at least n✏ hard to approx-imate, which in turn renders IMLS n✏ hard to approximategiven our reduction from DKSH.

    While it is hard to solve or approximate IMLS efficientlyin general, in the following sections we look at constraintsthat may apply to real-world CEP systems, and investigatespecial cases that enable us to approximate or even solveIMLS efficiently.

    5.1.2 A general bi-criteria approximationWe reformulate the integral memory-bound problem into

    an alternative optimization problem (IMLSl) with linear con-straints as follows.

    (IMLSl) maxX

    Qi2Qni

    wi

    yi

    (7)

    s.t.X

    Ej2⌃�j

    mj

    xj

    M

    yi

    xj

    , 8Ej

    2 Qi

    (8)yi

    , xj

    2 {0, 1}

    Observing that Equation (5) in IMLS is essentially mor-phed into an equivalent Equation (8). These two constraintsare equivalent because y

    i

    , xj

    are all binary variables, yi

    willbe forced to 0 if there exists x

    j

    = 0 with Ej

    2 Qi

    .Instead of maximizing utility, we consider the alternative

    objective of minimizing utility loss as follows. Set ŷi

    =

    1 � yi

    be the complement of yi

    , which indicates whetherquery Q

    i

    is un-selected. We can change the utility gainmaximization IMLSl into a utility loss minimization prob-lem IMLSm. Note that utility gain is maximized if and onlyif utility loss is minimized.

    (IMLSm) minX

    Qi2Qni

    wi

    ŷi

    (9)

    s.t.X

    Ej2⌃�j

    mj

    xj

    M

    ŷi

    � 1� xj

    , 8Ej

    2 Qi

    (10)ŷi

    , xj

    2 {0, 1} (11)

    In IMLSm Equation (10) is obtained by using ŷi

    = 1� yi

    and Equation (8). Using this new minimization problemwith linear structure, we prove a bi-criteria approximationresult. Let OPT be the optimal loss with budget M in a lossminimization problem, then an (↵,�)-bi-criteria approxima-tion guarantees that its solution has at most ↵ · OPT loss,while uses no more than � · M budget. Bicriteria approx-imations have been extensively used in the context of re-source augmentation (e.g., see [43] and references therein),where the algorithm is augmented with extra resources andthe benchmark is an optimal solution without augmentation.

    THEOREM 3. The problem of IMLSm admits a ( 1⌧

    , 11�⌧ )

    bi-criteria-approximation, for any ⌧ 2 [0, 1].

    For concreteness, suppose we set ⌧ = 12

    . Then this re-sult states that we can efficiently find a strategy that incurs atmost 2 times the optimal utility loss with budget M , whileusing no more than 2M memory budget.

    PROOF. Given a parameter 1⌧

    , we construct an event se-lection strategy as follows. First we drop the integrality con-straint of IMLSm to obtain its LP-relaxation. We solve therelaxed problem to get an optimal fractional solutions x⇤

    j

    andŷ⇤i

    .We can then divide queries Q into two sets, Qa = {Q

    i

    2Q|ŷ⇤

    i

    ⌧} and Qr = {Qi

    2 Q|ŷ⇤i

    > ⌧}. Since ŷi

    denoteswhether query Q

    i

    is un-selected, intuitively a smaller valuemeans the query is more likely to be accepted. We can ac-cordingly view Qa as the set of “promising” queries, and Qras “unpromising” queries.

    The algorithm works as follows. It preserves every querywith ŷ⇤

    i

    ⌧ , by selecting constituent events of Qi

    into mem-ory. So the set of query in Qa is all accepted, while Qr is allrejected.

    We first show that the memory consumption is no morethan 1

    1�⌧M . From Equation (10), we know the fractionalsolutions must satisfy

    x⇤j

    � 1� ŷ⇤i

    , 8Ej

    2 Qi

    (12)

    In addition, we have ŷ⇤i

    ⌧, 8Qi

    2 Qa. So we conclude

    x⇤j

    � 1� ⌧, 8Qi

    2 Qa, Ej

    2 Qi

    (13)

    Since we know x⇤j

    are fractional solutions to IMLSm, wehave X

    Ej2Qam

    j

    �j

    x⇤j

    X

    Ej2Qa[Qrm

    j

    �j

    x⇤j

    = M (14)

    Here we slightly abuse the notation and use Ej

    2 Qa todenote that there exists a query Q 2 Qa such that E

    j

    2 Q.Combining (13) and (14), we haveX

    Ej2Qam

    j

    �j

    (1� ⌧) M

    Notice thatP

    Ej2Qa mj�j is the total memory consump-tion of our rounding algorithm, we have

    X

    Ej2Qam

    j

    �j

    M1� ⌧

    Thus total memory consumption cannot exceed M1�⌧ .

    We then need to show that the utility loss is bounded by afactor of 1

    . Denote the optimal loss of IMLSm as l⇤, and theoptimal loss with LP-relaxation as ¯l⇤. We then have ¯l⇤ l⇤because any feasible solution to IMLSm is also feasible tothe LP-relaxation of IMLSm. In addition, we knowX

    Qi2Qrni

    wi

    ŷ⇤i

    X

    Qi2Qa[Qrni

    wi

    ŷ⇤i

    =

    ¯l⇤ l⇤

    So we can obtain X

    Qi2Qrni

    wi

    ŷ⇤i

    l⇤ (15)

    7

  • Based on the way queries are selected, we know for everyrejected query

    ŷ⇤i

    � ⌧, 8Qi

    2 Qr (16)

    Combining (15) and (16), we getX

    Qi2Qrni

    wi

    ⌧ l⇤

    Observing thatP

    Qi2Qr niwi is the utility loss of the al-gorithm, we conclude that

    X

    Qi2Qrni

    wi

    l⇤

    This bounds the utility loss from optimal l⇤ by a factor of 1⌧

    ,thus completing the proof.

    Note that since our proof is constructive, this gives an LP-relaxation based algorithm to achieve ( 1

    , 11�⌧ ) bi-criteria-

    approximation of utility loss.

    5.1.3 An approximable special caseGiven that memory is typically reasonably abundant in to-

    day’s hardware setup, in this section we will assume thatthe available memory capacity is large enough such that itcan hold at least a few number of queries. If we set f =maxQi

    PEj2Qi

    mj�j

    M

    to be the ratio between the memory re-quirement of the largest query and available memory M . Weknow if M is large enough, then each query uses no morethan fM memory, for some f < 1.

    In addition, denote by p = maxj

    |{Qi

    |Ej

    2 Qi

    , Qi

    2Q}| as the maximum number of queries that one event typeparticipates in. We note that in practice there are problems inwhich each event participates in a limited number of queries.In such cases p will be limited to a small constant.

    Assuming both p and f are some fixed constants, we ob-tain the following approximation result.

    THEOREM 4. Let p be the maximum number of queriesthat one event can participate in, and f be the ratio betweenthe size of the largest query and the memory budget definedabove, IMLS admits a p

    1�f -approximation.

    The idea here is to leverage the fact that the maximumquery-participation p is a constant to simplify the memoryconsumption constraint, so that a knapsack heuristic yieldsutility guarantees. In the interest of space we present the fullproof of this theorem in Appendix F.

    5.1.4 A pseudo-polynomial-time solvable special caseWe further consider the multi-tenant case where multiple

    CEP applications are consolidated into one single server orinto one cloud infrastructure where the same set of underly-ing computing resources is shared across applications.

    In this multi-tenancy scenario, since different CEP ap-plications are interested in different aspects of real-world

    event occurrences, there typically is no or very limited shar-ing of events across different applications (the hospital hy-giene system HyReminder and hospital inventory manage-ment system CIMS as mentioned in the Introduction, forexample, have no event types in common. So do a net-work intrusion detection application and a financial appli-cation co-located in the same cloud). Using a hyper-graphmodel, a multi-tenant CEP system can be represented as ahyper-graph H , where each event type is represented as avertex and each query as a hyper-edge. If there is no shar-ing of event types among CEP applications, then each con-nected component of H corresponds to one CEP applica-tion. Let k be the size of the largest connected componentof H , then k is essentially the maximum number of eventtypes used in any one CEP application, which may be limitedto a small constant (the total number of event types acrossmultiple CEP applications is not limited). Assuming this isthe case, we have the following special case that is pseudo-polynomial time solvable.

    THEOREM 5. In a multi-tenant CEP system where eachCEP tenant uses a disjoint set of event types, if each CEPtenant uses no more than k event types, the problem of IMLScan be solved in time O(|⌃||Q|M2k2).

    Our proof utilizes a dynamic programming approach de-veloped for Set-union Knapsack problem [28]. The full proofof this theorem can be found in Appendix G.

    We note that we do not assume the total number of eventtypes across multiple CEP tenants to be limited. In fact, therunning time grows linearly with the total number of eventtypes and queries across all CEP tenants.

    Lastly, we observe that the requirement that events in eachCEP tenant are disjoint can be relaxed. As long as the shar-ing of event types between different CEP tenants are limited,such that the size of the largest component of H mentionedabove is bounded by k, the result in Theorem 5 holds.5.2 The fractional variant (FMLS)

    In this section, we consider the fractional variant of thememory-bound load shedding problem. In this variant, in-stead of taking an all-or-nothing approach to either includeor exclude all event instances of certain types in memory, weuse a random sampling operator [32], which samples somearriving events uniformly at random into the memory. De-note by x

    j

    2 [0, 1] the sampling probability for each eventtype E

    j

    . The fractional variant memory-bound load shed-ding (FMLS) can be written as follows.

    (FMLS) maxX

    Qi2Qni

    wi

    yi

    (17)

    s.t.X

    Ej2⌃�j

    mj

    xj

    M (18)

    yi

    =

    Y

    Ej2Qi

    xj

    (19)

    0 xj

    1 (20)

    8

  • The integrality constraints are essentially dropped fromthe integral version of the problem, and are replaced by con-straints in (20). We use fractional sampling variables x

    j

    andyi

    to differentiate from binary variables xj

    , yi

    . Note thatEquation (19) states that the probability that a query result isproduced, y

    i

    , is the cross-product of sampling rates of con-stituent events since each event is sampled randomly and in-dependently of each other.

    5.2.1 A general complexity analysisIn the FMLS formulation, if we fold Equation (19) into

    the objective function in (17), we obtain

    max

    X

    Qi2Qni

    wi

    Y

    Ej2Qi

    xj

    (21)

    This makes FMLS a polynomial optimization problem sub-ject to a knapsack constraint (18).

    Since we are maximizing the objective function in Equa-tion (21), it is well known that if the function is concave, thenconvex optimization techniques [14] can be used to solvesuch problems optimally. However, we show that except thetrivial case where each query has exactly one event (i.e., (21)becomes linear), in general Equation (21) is not concave.

    LEMMA 1. If the objective function in Equation (21) isnon-trivial (that is, at least one query has more than oneevent), then (21) is non-concave.

    We show the full proof of Lemma 1 in Appendix H in theinterest of space.

    Given this non-concavity result, it is unlikely that we canhope to exploit special structures of the Hessian matrix tosolve FMLS. In particular, convex optimization techniqueslike KKT conditions or gradient descent [14] can only pro-vide local optimal solutions, which may be far away fromthe global optimal.

    On the other hand, while the general polynomial programis known to be hard to solve [11, 46], FMLS is a specialcase where all coefficients are positive, and the constraintis a simple knapsack constraint. Thus the hardness resultsin [11, 46] do not apply to FMLS. We show the hardnessof FMLS by using the Motzkin-Straus theorem [39] and areduction from the Clique problem.

    THEOREM 6. The problem of fractional memory-boundload shedding (FMLS) is NP-hard. FMLS remains NP-hardeven if each query has exactly two events.

    The full proof of this theorem can be found in Appendix I.So despite the continuous relaxation of the decision vari-

    ables of IMLS, FMLS is still hard to solve. However, inthe following we show that FMLS has special structure thatallows it to be solved approximately under fairly general as-sumptions.

    5.2.2 Definitions of approximationWe will first describe two definitions of approximation

    commonly used in the numerical optimization literature.

    The first definition is similar to the approximation ratioused in combinatorial optimization.

    DEFINITION 7. [31] Given a maximization problem P thathas maximum value v

    max

    (P ) > 0. We say a polynomialtime algorithm has an absolute approximation ratio ✏ 2 [0, 1],if the value found by the algorithm, v(P ), satisfies v

    max

    (P )�v(P ) ✏ v

    max

    (P ).The second notion of relative approximation ratio is widely

    used in the optimization literature [22, 31, 41, 46].DEFINITION 8. [31] Given a maximization problem P that

    has maximum value vmax

    (P ) and minimum value vmin

    (P ).We say a polynomial time algorithm has a relative approxi-mation ratio ✏ 2 [0, 1], if the value found by the algorithm,v(P ), satisfies v

    max

    (P )�v(P ) ✏(vmax

    (P )� vmin

    (P )).

    Relative approximation ratio is used to bound the qualityof solutions relative to the possible value range of the func-tion. We refer to this as ✏-relative-approximation to differ-entiate from ✏-approximation used Definition 7.

    Note that in both cases, ✏ indicates the size of the gapbetween an approximate solution and the optimal value. Sosmaller ✏ values are desirable in both definitions.

    5.2.3 Results for relative approximation boundIn general, the feasible region specified in FMLS is the in-

    tersection of a unit hyper-cube, and the region below a scaledsimplex. We first normalize (18) in FMLS using a change ofvariables. Let x̄0

    j

    =

    �jmjxj

    M

    be the scaled variables. We canobtain the following formulation FMLS’.

    (FMLS’) maxX

    Qi2Qni

    wi

    Y

    Ej2Qi

    M

    �j

    mj

    x̄0j

    (22)

    s.t.X

    Ej2⌃x̄0j

    = 1 (23)

    0 x̄0j

    �jmjM

    (24)

    Note that the inequality constraint (18) in FMLS is nowreplaced by an equality constraint (23) in FMLS’. This willnot change the optimal value of FMLS as long as

    PEj

    �jmj

    M

    �1 (otherwise, although

    PEj2⌃ x̄

    0j

    = 1 is unattainable, thememory budget becomes sufficient and the optimal solutionis trivial). This is because all coefficients in (17) are posi-tive, pushing any x

    i

    to a larger value will not hurt the objec-tive value. Since the constraint (18) is active for at least oneglobal optimal point in FMLS, changing the inequality in theknapsack constraint to an equality in (23) will not change theoptimal value.

    Denote by d = max{|Qi

    |} the maximum number of eventtypes in a query. We will assume in this section that d is afixed constant. Note that this is a fairly realistic assumption,as d tends to be very small in practice (in HyReminder [48],for example, the longest query has 6 event types, so d = 6).Observe that d essentially corresponds to the degree of thepolynomial in the objective function (22).

    9

  • An approximation using co-centric balls. Using a ran-domized procedure from the optimization literature [31], weshow that FMLS’ can be approximated by bounding the fea-sible region using two co-centric balls to obtain a (loose)relative-approximation ratio as follows.

    THEOREM 7. The problem of FMLS’ admits a relativeapproximation-ratio ✏, where ✏ = 1�O

    ⇣|⌃|� d�22 (t2 + 1)� d2

    and t = min✓min

    Ej

    ⇣�jmj

    M

    ⌘, 1p

    |⌃|

    ◆.

    The proof of this result can be found in Appendix J. Notethat this is a general result that only provides a loose relativeapproximation bound, which is a function of the degree ofthe polynomial d, the number of event types |⌃|, and �jmj

    M

    ,and cannot be adjusted to a desirable level of accuracy.

    An approximation using simplex. We observe that thefeasible region defined in FMLS’ has special structure. Itis a subset of a simplex, which is the intersection between astandard simplex (Equation (23)) and box constraints (Equa-tion (24)).

    There exists techniques that produces relative approxima-tions for polynomials defined over simplex. For example, [22]uses a grid-based approximation technique, where the ideais to fit a uniform grid into the simplex, so that values on allnodes of the grid are computed and the best value is pickedas an approximate solution. Let �

    n

    be an n dimensionalsimplex, then (x 2 �

    n

    |kx 2 Zn+

    ) is defined as a k-grid overthe simplex. The quality of approximation is determined bythe granularity the uniform grid: the finer the grid, the tighterthe approximation bound is.

    The result in [22], however, does not apply here becausethe feasible region of FMLS’ represents a subset of the stan-dard simplex. We note that there exist a special case whereif �jmj

    M

    � 1, 8j (that is, if the memory requirement of asingle event type exceeds the memory capacity), the feasi-ble region degenerates to a simplex, such that we can usegrid-based technique for approximation.

    THEOREM 8. In FMLS’, if for all j we have �jmjM

    � 1then the problem admits a relative approximation ratio of ✏,where

    ✏ = O

    ✓1� k!

    (k � d)!kd

    for any k 2 Z+ such that k > d. Here k represents thenumber of grid points along one dimension.

    Note that as k ! 1, approximation ratio ✏ ! 0.The full proof of this result can be found in Appendix K.

    This result can provide increasingly accurate relative approx-imation bound for a larger k. It can be shown that for a givenk, a total of

    �|⌃|+k�1|⌃|�1

    �number of evaluations of the objective

    function is needed.

    5.2.4 Results for absolute approximation boundResults obtained so far use the notion of relative approx-

    imation (Definition 8). In this section we discuss a special

    case in which FMLS’ can be approximated relative to theoptimal value (Definition 7).

    We consider a special case in which queries are regularwith no repeated events. That is, in each query, no events ofthe same type occur more than once (e.g., query SEQ(A,B,C)is a query without repeated events while SEQ(A,B,A) is notbecause event A appears twice). This may be a fairly gen-eral assumption, as queries typically detect events of differ-ent types. HyReminder [48] queries, for instance, use norepeated events in the same query (two such example Q1and Q2 are given in the Introduction). In addition, we re-quire that each query has the same length as measured bythe number of events.

    With the assumption above, the objective function Equa-tion (22) becomes a homogeneous multi-linear polynomial,while the feasible region is defined over a sub-simplex thatis the intersection of a cube and a standard simplex. We ex-tend a random-walk based argument in [41] from standardsimplex to the sub-simplex, and show an (absolute) approx-imation bound.

    THEOREM 9. In FMLS’, if �jmjM

    s are fixed constants, inaddition if every query has no repeated event types and isof same query length, d, then a constant-factor approxima-tion can be obtained for FMLS’ in polynomial time. In par-ticular, we can achieve a (1 � �( k!

    (k�d)!kd ))-approximation,

    by evaluating Equation (21) at most�|⌃|+k�1

    |⌃|�1�

    times, where

    � = min⇣min

    j

    ⇣�jmj

    M

    ⌘, 1⌘d

    , and k > d is a parameterthat controls approximation accuracy.

    A proof of Theorem 9 can be found in Appendix L. Weuse a scaling method to extend the random-walk argument tothe sub-simplex in order to get the desirable constant factorapproximation. Note that, by selecting k = O(d2) we canget k!

    (k�d)!kd close to 1. Also note that if�jmj

    M

    � 1 for all j,� = 1 and we can get an approximation arbitrarily close tothe optimal value by using large k.

    6. CPU-BOUND LOAD SHEDDINGIn this section we consider the scenario where memory

    is abundant, while CPU becomes the limiting resource thatneeds to be budgeted appropriately. CPU-bound problemsturn out to be easy to solve.

    6.1 The integral variant (ICLS)In the integral variant CPU load shedding, we again use

    binary variables yi

    to denote whether results of Qi

    can begenerated. For each query Q

    i

    , at most ni

    number of querymatches can be produced. Assuming the utility weight ofeach result is w

    i

    , and the CPU cost of producing each resultis c

    i

    , when Qi

    is selected (yi

    = 1) a total of ni

    wi

    utility canbe produced, at the same time a total of n

    i

    ci

    CPU resourcesare consumed. That yields the following ICLS problem.

    10

  • (ICLS) maxX

    Qi2Qni

    wi

    yi

    (25)

    s.t.X

    Qi2Qni

    ci

    yi

    C (26)

    yi

    2 {0, 1} (27)

    ICLS is exactly the standard 0-1 knapsack problem, whichhas been studied extensively. We simply cite a result from [47,Chapter 5] for completeness.

    THEOREM 10. [47] The integral CPU-bound load shed-ding (ICLS) is NP-complete. It admits a fully polynomialapproximation scheme (FPTAS).

    ICLS is thus an easy variant among the load sheddingproblems studied in this work.

    6.2 The fractional variant (FCLS)Similar to the memory-bound load shedding problems, we

    also investigate the fractional variant where a sampling oper-ator is used to select a fixed fraction of query results. Insteadof using binary variables y

    i

    , we denote by yi

    the percentageof queries that are sampled and processed by CPU for out-put. The integrality constraints in ICLS are again dropped,and we can have the following FCLS formulation.

    (FCLS) maxX

    Qi2Qni

    wi

    yi

    (28)

    s.t.X

    Qi2Qni

    ci

    yi

    C (29)

    0 yi

    1, for all i (30)

    Since ni

    , wi

    , ci

    are all constants, this is a simple linearprogram problem that can be solved in polynomial time. Weconclude that FCLS can be efficiently solved.7. DUAL-BOUND LOAD SHEDDING

    Lastly, we study the dual-bound load shedding problem,where both CPU and memory resources can be limited.

    7.1 The integral variant (IDLS)The integral dual-bound load shedding (IDLS) can be for-

    mulated by combining CPU and memory constraints.

    (IDLS) maxX

    Qi2Qni

    wi

    yi

    s.t.X

    Ej2⌃�j

    mj

    xj

    M

    X

    Qi2Qni

    ci

    yi

    C

    yi

    Y

    Ej2Qi

    xj

    (31)

    xj

    , yi

    2 {0, 1}

    Binary variables xj

    , yi

    again denote event selection andquery selection, respectively. Note that in order to respectthe CPU constraint, not all queries whose constituent events

    are available in memory can be produced. This is modeledby an inequality in Equation (31).

    We show that the loss minimization version of IDLS ad-mits a tri-criteria approximation.

    THEOREM 11. Denote by M the given memory budgetC the given CPU budget, and l⇤ the optimal utility losswith that budget. IDLS admits ( 1

    , 11�⌧ ,

    1

    1�⌧ ) tri-criteria-approximation for any ⌧ 2 [0, 1]. That is, for any ⌧ 2 [0, 1],we can compute a strategy that uses no more than 1

    1�⌧M

    memory 11�⌧C CPU, and incur no more than

    l

    utility loss.

    The idea of the proof is to use LP-relaxation and round theresulting fractional solution, which is similar to Theorem 3.Detailed proof of Theorem 11 can be found in Appendix M.

    In addition, we show that the approximable special case ofIMLS (Theorem 4) also holds for IDLS. Details of the proofcan be found in Appendix N.

    THEOREM 12. Let p be the maximum number of queries

    that one event can participate in, and f =maxQi

    PEj2Qi

    mj�j

    M

    be the ratio between the size of the largest query and thememory budget. IDLS is p

    1�f -approximable in pseudo poly-nomial time.

    7.2 The fractional variant (FDLS)The fractional dual-bound problem once again relaxes the

    integrality constraints in IDLS to obtain the following FDLSproblem.

    (FDLS) maxX

    Qi2Qni

    wi

    yi

    (32)

    s.t.X

    Ej2⌃�j

    mj

    xj

    M

    X

    Qi2Qni

    ci

    yi

    C (33)

    yi

    Y

    Ej2Qi

    xj

    (34)

    xj

    , yi

    2 [0, 1]

    First of all, we note that the hardness result in Theorem 6still holds for FDLS, because FMLS is simply a special caseof FDLS without constraint (33).

    The approximation results established for FMLS, how-ever, do not carry over easily. If we fold (34) into Equa-tion (32), and look at the equivalent minimization problemby taking the inverse of the objective function, then by anargument similar to Lemma 1, we can show that objectivefunction is non-convex in general. In addition, we can showthat except in the trivial linear case, the constraint in Equa-tion (33) is also non-convex using a similar argument. So weare dealing with a non-convex optimization subject to non-convex constraints.

    It is known that solving or even approximating non-convexproblems with non-convex constraints to global optimalityis hard [14, 21]. Techniques dealing with such problems are

    11

  • relatively scarce in the optimization literature. Exploitingspecial structure of FDLS to optimize utility is an interest-ing area for future work.

    8. CONCLUSIONS AND FUTURE WORKIn this work we study the problem of load shedding in the

    context of the emerging Complex Event Processing model.We investigate six variants of CEP load shedding under var-ious resource constraints, and demonstrate an array of com-plexity results. Our results shed some light on the hardnessof load shedding CEP queries, and provide some guidancefor developing CEP shedding algorithms in practice.

    CEP load shedding is a rich problem that so far has re-ceived little attention from the research community. We hopethat our work will serve as a springboard for future researchin this important aspect of the increasingly popular CEP model.

    9. REFERENCES[1] Microsoft StreamInsight:

    http://www.microsoft.com/sqlserver/2008/en/us/r2-complex-event.aspx.[2] StreamBase: http:://www.streambase.com.[3] Sybase Aleri streaming platform high-performance CEP:

    http://www.sybase.com/files/data sheets/ sybase aleri streamingplatform ds.pdf.[4] Wavemark system: Clinical inventory management solution

    www.wavemark.net/healthcare.action.[5] A. Adi, D. Botzer, G. Nechushtai, and G. Sharo. Complex event processing for

    financial services. Money Science.[6] J. Agrawal, Y. Diao, D. Gyllstrom, and N. Immerman. Efficient pattern

    matching over event streams. In proceedings of SIGMOD, 2008.[7] M. H. Ali and C. G. et al. Microsoft cep server and online behavioral targeting.

    In proceedings of VLDB, 2009.[8] L. Aniello, G. Lodi, and R. Baldoni. Inter-domain stealthy port scan detection

    through complex event processing. In Proceedings of EWDC, 2011.[9] B. Babcock, S. Babu, R. Motwani, and M. Datar. Chain: Operator scheduling

    for memory minimization in data stream systems. In proceedings of SIGMOD,2003.

    [10] B. Babcock, M. Datar, and R. Motwani. Load shedding techniques for datastream systems. In Workshop on Management and Processing of Data Streams,2003.

    [11] M. Bellare and P. Rogaway. The complexity of approximating a nonlinearprogram. In Mathematical Programming 69, 1993.

    [12] A. Borodin and R. El-Yaniv. Online Computation and Competitive Analysis.Cambridge University Press, 1998.

    [13] J. M. Boyce and D. Pittet. Guideline for hand hygiene in healthcare settings.CDC Recommendations and Reports, 51, 2002.

    [14] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge UniversityPress, 2004.

    [15] K. Carter. Optimizing critical medical supplies in an acute care setting. RFID inHealth Care, 2009.

    [16] B. Chandramouli, J. Goldstein, and D. Maier. High-performance dynamicpattern matching over disordered streams. In proceedings of VLDB, 2010.

    [17] M. Cherniack, H. Balakrishnan, M. Balazinska, D. Carney, U. Cetintemel,Y. Xin, and S. Zdonik. Scalable distributed stream processing. In proceedings ofCIDR, 2003.

    [18] Y. Chi, H. J. Moon, and H. Hacigms. icbs: Incremental costbased schedulingunder piecewise linear slas. In proceedings of VLDB, 2011.

    [19] Y. Chi, H. J. Moon, H. Hacigms, and J. Tatemura. Sla-tree: a framework forefficiently supporting sla-based decisions in cloud computing. In proceedings ofEDBT, 2011.

    [20] A. Das, J. Gehrke, and M. Riedewald. Approximate join processing over datastreams. In proceedings of SIGMOD, 2003.

    [21] A. d’Aspremont and S. Boyd. Relaxations and randomized methods fornonconvex qcqps.

    [22] E. de Klerk, M. Laurent, and P. Parrilo. A PTAS for the minimization ofpolynomials of fixed degree over the simplex. In Theoretical Computer Science,2006.

    [23] Y. Diao, N. Immerman, and D. Gyllstrom. Sase+: An agile language for kleeneclosure over event streams. Technical report, University of Massachusetts atAmherst, 2007.

    [24] U. Feige and S. Kogan. Hardness of approximation of the balanced completebipartite subgraph problem. Technical report, Weizmann Institute, 2004.

    [25] U. Feige and M. Seltser. On the densest k-subgraph problem. Algorithmica, 29,1997.

    [26] B. Gedik, K.-L. Wu, and P. S. Yu. Efficient construction of compact sheddingfilters for data stream processing. In proceedings of ICDE, 2008.

    [27] B. Gedik, K.-L. Wu, P. S. Yu, and L. Liu. A load shedding framework andoptimizations for m-way windowed stream joins. In proceedings of ICDE, 2007.

    [28] O. Goldschmidt, D. Nehme, and G. Yu. Note: On the set-union knapsackproblem. In Applied Probability & Statistics, 2006.

    [29] D. Gyllstrom, E. Wu, H.-J. Chae, Y. Diao, P. Stahlberg, and G. Anderson.SASE: Complex event processing over streams. In proceedings of CIDR, 2007.

    [30] M. T. Hajiaghayi, K. Jain, K. Konwar, L. C. Lau, I. I. Mandoiu, A. Russell,A. Shvartsman, and V. V. Vazirani. The minimum k-colored subgraph problemin haplotyping and DNA primer selection. In International Workshop onBioinformatics Research and Applications, 2006.

    [31] S. He, Z. Li, and S. Zhang. General constrained polynomial optimization: anapproximation approach. In Technical report, 2009.

    [32] T. Johnson, S. Muthukrishnan, and I. Rozenbaum. Sampling algorithms in astream operator. In proceedings of SIGMOD, 2005.

    [33] J. Kang, J. F. Naughton, and S. D. Viglas. Evaluating window joins overunbounded streams. In proceedings of ICDE, 2003.

    [34] H. Kellerer, U. Pferschy, and D. Pisinger. Knapsack Problems. Springer, 2004.[35] W. Kleiminger, E. Kalyvianaki, and P. Pietzuch. Balancing load in stream

    processing with the cloud. In Workshop on Self-Managing Database Systems,2011.

    [36] E. D. Lazowska, J. Zahorjan, G. S. Graham, and K. C. Sevcik. QuantitativeSystem Performance. Prentice Hall, 1984.

    [37] M. Liu, M. Li, D. Golovnya, E. A. Rundensteiner, and K. Claypool. Sequencepattern query processing over out-of-order event streams. In proceedings ofICDE, 2009.

    [38] Y. Mei and S. Madden. Zstream: A cost-based query processor for adaptivelydetecting composite events. In proceedings of SIGMOD, 2009.

    [39] T. Motzkin and E. Straus. Maxima for graphs and a new proof of a theorem ofturn. Canadian Jornal of Mathamatics, 17, 1965.

    [40] S. Narayanan and F. Waas. Dynamic prioritization of database queries. Inproceedings of ICDE, 2011.

    [41] Y. Nesterov. Random walk in a simplex and quadratic optimization over simplexpolytopes. Center for Operations Research and Econometrics (CORE), 2003.

    [42] O. Papaemmanouil, U. Cetintemel, and J. Jannotti. Supporting generic costmodels for wide-area stream processing. In proceedings of ICDE, 2009.

    [43] K. P. J. Sgall and E. Torng. Handbook of scheduling: Algorithms, models, andperformance analysis. 2004.

    [44] U. Srivastava and J. Widom. Memory-limited execution of windowed streamjoins. In proceedings of VLDB, 2004.

    [45] N. Tatbul, U. Cetintemel, S. B. Zdonik, M. Cherniack, and M. Stonebraker.Load shedding in a data stream manager. In proceedings of VLDB, 2003.

    [46] S. A. Vavasis. Approximation algorithms for concave quadratic programming.Technical report, Cornell University, 1990.

    [47] V. Vazirani. Approximation Algorithms. Springer-Verlag Berlin Heidelberg,2003.

    [48] D. Wang and E. Rundensteiner. Active complex event processing over eventstreams. In proceedings of VLDB, 2011.

    [49] M. Wei, E. A. Rundensteiner, and M. Mani. Utility-driven load shedding forXML stream processing. In WWW, 2008.

    [50] M. Wei, E. A. Rundensteiner, and M. Mani. Achieving high output qualityunder limited resources through structure-based spilling in XML streams. Inproceedings of VLDB, 2010.

    [51] W. White, M. Riedewald, J. Gehrke, and A. Demers. What is “next” in eventprocessing? In proceedings of PODS, 2007.

    [52] A. Widder, R. v. Ammon, G. Hagemann, and D. Schnfeld. An approach forautomatic fraud detection in the insurance domain. In AAAI Spring Symposium:Intelligent Event Processing, 2009.

    [53] E. Wu, Y. Diao, and S. Rizvi. High-performance complex event processing overstreams. In proceedings of SIGMOD, 2006.

    [54] Y. Xing, J.-H. Hwang, U. Cetintemel, and S. Zdonik. In Providing Resiliency toLoad Variations in Distributed Stream Processing, 2006.

    [55] A. Yao. Probabilistic computations: Toward a unified measure of complexity. Inproceedings of FOCS, 1977.

    12

  • APPENDIXA. ADDITIONAL CEP JOIN SEMANTICS

    In order to define additional CEP join semantics, we de-fine two additional types of subsequences.

    DEFINITION 9. Given an event sequence S = (e1

    , e2

    , ...eN

    )

    and a subsequence S0 = (ei1 , ei2 , ...eim). The subsequence

    S0 is called a contiguous subsequence of S, if ij

    +1 = ij+1

    ,for all 1 j m� 1.

    The subsequence S0 is called a type-contiguous subse-quence of S, if for all events e

    ij of type Elj , where j 2[1,m], there does not exist another event e

    k

    in S, whereij�1 < k < ij , such that ek is also of the type Elj .

    We illustrate these definitions using the following exam-ple.

    EXAMPLE 7. Suppose there are four event types ⌃= {A,B,C, D}. Let the stream of arriving events be S = (A

    1

    , B2

    ,C

    3

    , D4

    , A5

    , B6

    , B7

    , C8

    , D9

    ), where the subscript denotesthe time-stamp of the event occurrence.

    In this example, (A1

    , B2

    , C3

    ) is a contiguous subsequence,because intuitively there is no intervening events between A

    1

    and B2

    , or B2

    and C3

    . However, (A5

    , B6

    , C8

    ) is not a con-tiguous subsequence, because there is a B

    7

    between B6

    andC

    8

    , violating the definition of contiguity.Although (A

    5

    , B6

    , C8

    ) is not a contiguous subsequence, itis a type-contiguous subsequence, because there is no eventof type B between A

    5

    and B6

    , and no event of type C be-tween B

    6

    and C8

    . The subsequence (A5

    , B7

    , C8

    ), on theother hand, is not type-contiguous subsequence because thereis an event of type B, B

    6

    , between A5

    and B7

    .

    We are now ready to define skip-till-next-match, and con-tiguity 1, as follows.

    DEFINITION 10. Join semantics skip-till-any-match, skip-till-next-match, and contiguity determine what event sub-sequence constitutes a query match. Specifically, a subse-quence S0 = (e

    i1 , ei2 , ... , eim) of S is considered a querymatch of Q = SEQ(q

    1

    , q2

    , ..., qn

    ) if:Event e

    il in S0 is of type ql

    for all l 2 [1, n] (patternmatches), and t

    in � ti1 T (Q) (within window). And inaddition,(1) For contiguity:

    The subsequence S’ has to be a contiguous subsequence.(2) For skip-till-next-match:

    The subsequence S’ has to be a type-contiguous subse-quence.(3) For skip-till-any-match:

    The subsequence S’ can be any subsequence.

    We illustrate query matches using Example 8.1An additional semantics, partitioned-contiguity, was discussedin [6]. Since partitioned-contiguity can be simulated using con-tiguity by sub-dividing event types using partition criteria, for sim-plicity it is not described here.

    EXAMPLE 8. We revisit the event sequence used in Ex-ample 7. Suppose there are three queries,Q

    1

    = skip-till-any-match(A,B,C),Q0

    1

    = skip-till-next-match(A,B,C),Q00

    1

    = contiguity(A,B,C),where T (Q

    1

    ) = T (Q01

    ) = T (Q001

    ) = 5.For Q

    1

    , (A1

    , B2

    , C3

    ), (A5

    , B6

    , C8

    ) and (A5

    , B7

    , C8

    ) areall query matches, because they are valid subsequences, thepattern matches {A,B,C} specified in the query, and eventsare within the time window 5. However, (A

    1

    , B7

    , C8

    ) is nota match even though pattern matches, because the differencein time stamp between C

    8

    and A1

    is over the window limit5.

    For Q01

    , (A1

    , B2

    , C3

    ) and (A5

    , B6

    , C8

    ) are query matches.Subsequence (A

    5

    , B7

    , C8

    ) is not a match because it is not atype-contiguous subsequence.

    For Q001

    , only subsequence (A1

    , B2

    , C3

    ) is a valid match,because (A

    5

    , B6

    , C8

    ) and (A5

    , B7

    , C8

    ) are not contiguoussubsequences.

    B. PROOF OF PROPOSITION 1PROOF. First, we show that no deterministic online shed-

    ding algorithm can achieve a competitive ratio independentof the length of the event sequence. In order to show this isthe case, it is sufficient to find an adversarial case for whicha bound on the competitive ratio is hard to produce.

    We construct such a scenario as follows. Suppose we havea universe of 2m event types, ⌃ = {E

    i

    } [ {E0i

    }, i 2 [1,m].Assume that there are m queries SEQ(E

    i

    , E0i

    ), 8i 2 [1,m],all with unit utility weight. The event stream is of the form(e

    1

    , e2

    , ..., em

    , X), where ei

    is of type Ei

    , and X is drawnfrom a uniform distribution on {E0

    i

    : i 2 [1,m]}. Supposethe system can only hold two events in memory due to mem-ory constraints.

    The optimal offline algorithm has a utility of 1. This isbecause it can look at the type of event X , denoted by E0

    l

    ,and keep the corresponding event e

    l

    of type El

    that arrivedpreviously. This ensures that one query match can be pro-duced.

    A deterministic online algorithm, on the other hand, needsto select one event into memory before seeing the event typeof X . Thus, the probability that a deterministic algorithmproduces a match is 1

    m

    . Its expected utility is also 1m

    giventhat queries all have a unit weight.

    Note that the ratio of the utility produced by the optimaloffline algorithm, and an online deterministic algorithm, is1

    m

    , where m is essentially the size of the event stream. Sincean event stream can be unbounded, the ratio can be arbitrar-ily bad.

    Next, we show that no randomized algorithm can do anybetter. Using Yao’s principle [55], we know that the ex-pected utility of a randomized algorithm against the worstcase input stream, is no more than the expected utility of thebest deterministic algorithm against the input distribution.Since we know any deterministic algorithm against the in-

    13

  • put distribution constructed above is 1m

    , it follows that theexpected utility of a randomized algorithm against the worstcase input stream from this input distribution is at most 1

    m

    ,thus completing this proof.

    C. QUERY MATCH ESTIMATIONSIn this section we describe a way in which the expected

    query matches of Qi

    over a unit time, ni

    , can be computed.It is not hard to image that n

    i

    can be estimated, because abrute force approach is to simply sample the arriving eventstream to count the number of query matches, ˆN

    i

    over time-window T (Q

    i

    ), so that the number of query matches overunit time, n

    i

    , can be simply computed as ni

    =

    ˆ

    NiT (Qi)

    . In thissection we will present an analytical approach to estimate n

    i

    by using event arrival rates �j

    under the skip-till-any-matchsemantics. Note that our discussion is only to show that thereexist ways to estimate n

    i

    , so that we can treat ni

    as constantsin our load shedding problem formulation.

    Assuming that events arrive in Possion process with ar-rival rate �

    i

    (a typical assumption in the performance mod-eling literature [36]), our estimation is produced as follows.First, for each event type E

    j

    2 Qi

    , the expected number ofE

    j

    occurrences in T (Qi

    ), denoted as lj

    , can be computedas l

    j

    = �j

    T (Qi

    ). Set �(Qi

    ) as the set of event types thatare part of Q

    i

    , |Qi

    | as the total number of events in Qi

    ,and m(E

    j

    ) be the number of occurrences of event type ei

    in Qi

    (as an example, a query Q = (A,A,B) would have�(Q) = {A,B}, m(A) = 2, m(B) = 1, and |Q| = 3).Let L =

    Pei2�(Qi) lj be the total number of occurrences

    of events relevant to Qi

    in time window T (Qi

    ), then theexpected number of query matches of Q

    i

    in T (Qi

    ) can beestimated as

    �L

    |Qi|�Q

    ei2�(Qi)Q

    m(ei)

    k=0

    (li

    � k)Q|Qi|

    r=0

    (L� r)(35)

    The formula of above is obtained as follows. Given a to-tal of L event occurrences in T (Q

    i

    ), we only pick a total|Q

    i

    | events to form a query match. Let us pick the first |Qi

    |events to form random permutations and compute the prob-ability that the first |Q

    i

    | events produces a match. Let thefirst position of Q

    i

    be an event of type Ep1 . The probability

    of actually seeing the event that type islEp1L

    . For the sec-ond event in Q

    i

    , if it is also of type Ep1 , the probability of

    seeing that event that type islEp1

    �1L�1 , otherwise it is

    lEp1L�1 , so

    on and so forth. In the end we haveQ

    Ej2�(Qi)Qn(Ej

    k=0 (li�k)Q|Qi|

    r=0 (L�r).

    Given that there are a total of�

    L

    |Qi|�

    such possible positionsout of L event occurrences and each of which is symmet-ric, the expected count of query matches can be expressedas the product of the two, thus the Equation (35). Given thatover time T (Q

    i

    ) this many number of event matches can beproduced, we can conclude that over unit time the expected

    number of query matches ni

    , is thus

    ni

    =

    1

    T (Qi

    )

    �L

    |Qi|�Q

    ei2�(Qi)Q

    m(ei)

    k=0

    (li

    � k)Q|Qi|

    r=0

    (L� r)

    D. PROOF OF THEOREM 1

    PROOF. We obtain the hardness result by a reduction fromDensest k-sub-hypergraph (DKSH). Recall that given a hy-pergraph G = (V,H), the decision version of DKSH is todetermine if there exists an induced subgraph G0 = (V 0, H 0),such that |V 0| k, and |H 0| � B for some given constantB.

    Given any instance of the DKSH problem, we construct anIMLS problem as follows. We build a bijection � : V ! ⌃,so that each vertex v

    j

    2 V corresponds to one event typeE

    j

    2 ⌃. For each hyperedge hi

    2 H , using vertices vk

    thatare endpoints of h

    i

    to construct a corresponding query Qi

    ,such that 8v

    k

    2 hi

    , �(vk

    ) 2 Qi

    . Set all Ej

    to have unit cost(m

    j

    �j

    = 1) and all Qi

    have unit weight (wi

    ni

    = 1), andlastly set the memory budget to k.

    We first show the forward direction, that is if there existsa solution to DKSH, i.e., a k-sub-hypergraph with at leastB hyper-edges, then there exists a solution to IMLS with nomore than k memory cost but achieves B utility. Supposethe subgraph G0 = (V 0, H 0) is the solution to DKSH. Inthe corresponding IMLS problem, if we keep all event types�(v0), 8v0 2 V 0, our solution has a memory cost of |V 0|,which is no more than k since G0 is a k-sub-hypergraph.This ensures the constructed solution is feasible. Further-more, the utility of the IMLS solution is exactly |H 0| by ourunit weight construction. Since we know |H 0| � B, thiscompletes the proof in this direction.

    In the other direction, let the set of events selected inIMLS be S ✓ ⌃ while the set of dropped events be D =⌃ \ S. The set of vertices V

    s

    corresponding to S inducesa subgraph G

    s

    = (Vs

    , Hs

    ). S respects memory-boundsin IMLS implies |V

    s

    | k, so Gs

    is a valid k-sub-hyper-graph. In addition, S produces utility B in IMLS ensuresthat |H

    s

    | � B, thus completing the hardness proof.

    E. PROOF OF THEOREM 2

    PROOF. We note that the reduction from Densest k-sub-hypergraph (DKSH) discussed above is approximation pre-serving. Utilizing a hardness result of DKSH [30], which es-tablishes that DKSH cannot be approximated within a factorof 2(logn)

    for some � > 0, we obtain the inapproximabilityresult of IMLS.

    F. PROOF OF THEOREM 4

    PROOF. We modify the IMLS problem in Equation (3)-(6) to a relaxed version IMLS’ as follows.

    14

  • (IMLS’) maxX

    Qi2Qni

    wi

    yi

    s.t.X

    Qi

    (yi

    X

    Ej2Qi

    �j

    mj

    ) M (36)

    yi

    =

    Y

    Ej2Qi

    xj

    xj

    , yi

    2 {0, 1}

    We essentially replace Equation (4) and Equation (5) inIMLS with Equation (36). Note that the left side of Equa-tion (36) is an overestimate of the memory consumption ofa particular event/query selection strategy. First, set x⇤

    j

    , y⇤i

    as an optimal solution to IMLS, where each event selectedparticipates at least one selected query. That is, if we setX = {x⇤

    j

    |x⇤j

    = 1, @Qi

    3 Ej

    , y⇤i

    = 1}, we have X = ;.There always exists one such optimal solution, because oth-erwise we can force x⇤

    j

    = 0, 8x⇤j

    2 X without changing theoptimal value opt⇤ (by definition no events E

    j

    correspond-ing to x⇤

    j

    2 X participates in queries Qi

    with y⇤i

    = 1), thusobtaining one optimal solution with X = ;. With this x⇤

    i

    , y⇤j

    we can show the following inequality

    X

    Ej2⌃�j

    mj

    x⇤j

    X

    Qi

    (y⇤i

    X

    Ej2Qi

    �j

    mj

    ) (37)

    This is the case because we know X = ;, so 8x⇤j

    =

    1, 9Qi

    3 Ej

    where y⇤i

    = 1. It ensures that for each term�j

    mj

    x⇤j

    where x⇤j

    = 1 from the left side of Equation (37),there exists one matching term y⇤

    i

    �j

    mj

    on the right side,where E

    j

    2 Qi

    , y⇤i

    = 1, thus providing the inequality.This inequality guarantees that any feasible solution of

    IMLS’ will be feasible for IMLS. This is because given Equa-tion (36) in IMLS’ and Equation (37), we know Equation (4)in IMLS holds, so the solution to IMLS’ will produce thesame value in IMLS but still respect the constraints in IMLS.

    In addition, we show the following inequality is true

    1

    p

    X

    Qi

    (y⇤i

    X

    Ej2Qi

    �j

    mj

    ) X

    Ej2⌃�j

    mj

    x⇤j

    (38)

    Recall that we know each event Ej

    participates at most pnumber of queries. If each event participates exactly onequery, we know

    PQi

    (y⇤i

    PEj2Qi �jmj) =

    PEj2⌃ �jmjx

    ⇤j

    .Now if each event appears in no more than p queries, thenfor all j, the coefficients of each term �

    j

    mj

    on the right handside is no more than p, thus the inequality

    With this inequality we further define the program IMLS”.

    (IMLS”) maxX

    Qi2Qni

    wi

    yi

    s.t.1

    p

    X

    Qi

    (y⇤i

    X

    Ej2Qi

    �j

    mj

    ) M (39)

    yi

    =

    Y

    Ej2Qi

    xj

    xj

    , yi

    2 {0, 1}

    Solution feasible to IMLS will be feasible to IMLS”, be-cause we only replace Equation (4) in IMLS with Equa-tion (39), and we know Equation (38) holds. Let the optimalvalue to IMLS” be opt00⇤, the optimal value of IMLS opt⇤,then

    opt00⇤ � opt⇤ (40)Furthermore, observe that both IMLS’ and IMLS” are now

    simple one dimensional knapsack problems, where IMLS”has p times more knapsack capacity than IMLS’ (becausewe can rewrite Equation (39) as

    PQi

    (y⇤i

    PEj2Qi �jmj)

    pM ). Suppose we solve IMLS’ using a simple knapsackheuristics, by picking queries (setting y

    i

    = 1) in the de-scending order of niwi

    �jmjuntil the knapsack M is filled, and

    denote the value so produced by val0. We can show val0 isgood relative to opt00⇤ in the following sense.

    val0 ⇥ p1� f � opt

    00⇤ (41)

    To see why this is the case, suppose in the knapsack inIMLS’, we cheat by allowing each query to be divided intofractional parts. The optimal value in this modified version,ˆopt

    0, can be produced by simply packing queries in the or-der of niwi

    �jmj. The difference between ˆopt0 and the heuristics

    val0 is the largest when the last query cannot fit in its entiretyusing IMLS’ heuristics. Since we know each query is rela-tively small compared to the budget M (bounded by fM ),we know

    val0

    ˆopt0� 1� f

    1

    = 1� f (42)

    Next, we know if we cheat again by allowing fractionalqueries in IMLS”, the optimal value ˆopt00 is no more than ptimes ˆopt0,

    ˆopt00 p⇥ ˆopt0 (43)

    because knapsack size has grown p times but we fill knap-sacks strictly based on their niwi

    �jmjvalues.

    Lastly, it is easy to see that

    ˆopt00 � opt00⇤ (44)

    Combining Equation (42) (43) (44) we can derive Equa-tion (41). Since we also know Equation (40), we get

    val0 ⇥ p1� f � opt

    ⇤ (45)

    15

  • thus completing the proof.

    G. PROOF OF THEOREM 5Proof Sketch. The IMLS problem studied in this work canbe formulated as a Set-union Knapsack problem [28]. A dy-namic programming algorithm is proposed in [28] for Set-union Knapsack, which however runs in exponential time. Ifwe define an adjacency graph G by representing all eventsas graph vertices, and each pair of vertices are connectedif the corresponding events co-occur in the same query. Theexponent of the running time is shown to be no more than thecut-width of the induced adjacency graph G, cw(G). Recallthat cut-width of a graph G is defined as the smallest integerk such that the vertices of G can be arranged in a linear lay-out [v

    1

    , ..., vn

    ] such that for every i 2 [1, n� 1], there are atmost k edges with one endpoint in {v

    1

    , ..., vi

    } and anotherendpoint in {v

    i+1

    , ..., vn

    }.In the context of a multi-tenant CEP system, assuming

    each non-overlapping CEP tenant uses at most k event types,then the size of the largest component of the adjacency graphG is at most k. This, combines with the fact that the degreeof each vertex in each component is at most k, ensures thatcw(G) k2. The running time of the dynamic program-ming approach in [28] can then be bounded by O(|⌃||Q|M2k2).Note that this result is pseudo-polynomial, because the run-ning time depends on the value of memory budget M insteadof the number of bits it needs to represent it.

    H. PROOF OF LEMMA 1

    PROOF. Denote H as the Hessian matrix of (21) repre-senting its second order partial derivatives. In the trivial casewhere (21) is just a linear function (each query has exactlyone event type), H is an all-ze