

Design and Evaluation of a Wide-Area Event Notification Service

Antonio Carzaniga

University of Colorado at Boulder

David S. Rosenblum

University of California, Irvine

and

Alexander L. Wolf

University of Colorado at Boulder

The components of a loosely-coupled system are typically designed to operate by generating and responding to asynchronous events. An event notification service is an application-independent infrastructure that supports the construction of event-based systems, whereby generators of events publish event notifications to the infrastructure and consumers of events subscribe with the infrastructure to receive relevant notifications. The two primary services that should be provided to components by the infrastructure are notification selection (i.e., determining which notifications match which subscriptions) and notification delivery (i.e., routing matching notifications from publishers to subscribers). Numerous event notification services have been developed for local-area networks, generally based on a centralized server to select and deliver event notifications. Therefore, they suffer from an inherent inability to scale to wide-area networks, such as the Internet, where the number and physical distribution of the service’s clients can quickly overwhelm a centralized solution. The critical challenge in the setting of a wide-area network is to maximize the expressiveness in the selection mechanism without sacrificing scalability in the delivery mechanism.

This paper presents Siena, an event notification service that we have designed and implemented to exhibit both expressiveness and scalability. We describe the service’s interface to applications, the algorithms used by networks of servers to select and deliver event notifications, and the strategies used to optimize performance. We also present results of simulation studies that examine the scalability and performance of the service.

Categories and Subject Descriptors: C.2.1 [Network Architecture and Design]: Distributed Networks; Network Communication; Network Topology; Store and Forward Networks; C.2.2 [Network Protocols]: Applications; Routing Protocols; C.2.4 [Distributed Systems]: Client/server—Distributed applications; C.2.5 [Local and Wide-Area Networks]: Internet; C.2.6 [Internetworking]: Routers; C.4 [Performance of Systems]: Design Studies; Modeling Techniques; I.6.3 [Simulation and Modeling]: Applications; I.6.4 [Simulation and Modeling]: Model Validation and Analysis; I.6.8 [Simulation and Modeling]: Types of Simulation—Discrete event

General Terms: Algorithms, Experimentation, Performance

Additional Key Words and Phrases: event notification, publish/subscribe, content-based addressing and routing

Effort sponsored by the Defense Advanced Research Projects Agency, and Air Force Research Laboratory, Air Force Materiel Command, USAF, under agreement numbers F30602-94-C-0253, F30602-97-2-0021, F30602-98-2-0163 and F30602-99-C-0174; by the Air Force Office of Scientific Research, Air Force Materiel Command, USAF, under grant number F49620-98-1-0061; and by the National Science Foundation under Grant Number CCR-9701973. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Defense Advanced Research Projects Agency, Air Force Research Laboratory or the U.S. Government.

Address: Antonio Carzaniga and Alexander L. Wolf, Department of Computer Science, University of Colorado at Boulder, 430 UCB, Boulder, CO 80309-0430, USA, {carzanig,alw}@cs.colorado.edu

Address: David S. Rosenblum, Department of Information & Computer Science, University of California, Irvine, Irvine, CA 92697-3425, USA, [email protected]

1. INTRODUCTION

The asynchrony, heterogeneity, and inherent loose coupling that characterize applications in a wide-area network promote event interaction as a natural design abstraction for a growing class of software systems. An emerging building block for such systems is an infrastructure called an event notification service [Rosenblum and Wolf 1997].

We envision a ubiquitous event notification service accessible from every site on a wide-area network and suitable for supporting highly distributed applications requiring component interactions ranging in granularity from fine to coarse. Conceptually, the service is implemented as a network of servers that provide access points to clients. Clients use the access points to advertise information about events and subsequently to publish multiple notifications of the kind previously advertised. Thus, an advertisement expresses the client’s intent to publish a particular kind of notification. Clients also use the access points to subscribe for notifications of interest. The service then uses the access points to notify clients by delivering any notifications of interest. Clearly, an event notification service complements other general-purpose middleware services, such as point-to-point and multicast communication mechanisms, by offering a many-to-many communication and integration facility.

The event notification service can carry out a selection process to determine which of the published notifications are of interest to which of its clients, routing and delivering notifications only to those clients that are interested. In addition to serving clients’ interests, the selection process also can be used by the event notification service to optimize communication within the network. The information that drives the selection process originates with clients. More specifically, the event notification service may be asked to apply a filter to the contents of event notifications, such that it will deliver only notifications that contain certain specified data values. The selection process may also be asked to look for patterns of multiple events, such that it will deliver only sets of notifications associated with that pattern of event occurrences (where each individual event occurrence is matched by a filter).

Given that the primary purpose of an event notification service is to support notification selection and delivery, the challenge we face in a wide-area setting is maximizing expressiveness in the selection mechanism without sacrificing scalability in the delivery mechanism [Carzaniga et al. 2000a]. Expressiveness refers to the ability of the event notification service to provide a powerful data model with which to capture information about events, to express filters and patterns on notifications of interest, and to use that data model as the basis for optimizing notification delivery. In terms of scalability, we are referring not simply to the number of event generators, the number of event notifications, and the number of notification recipients, but also to the need to discard many of the assumptions made for local-area networks, such as low latency, abundant bandwidth, homogeneous platforms, continuous and reliable connectivity, and centralized control. We recognize that there are other important attributes of an event notification service besides expressiveness and scalability, including security, reliability, and fault tolerance, but we do not address them in this paper.

Intuitively, a simple event notification service that provides no selection mechanism can be reduced to a multicast routing and transport mechanism for which there are numerous scalable implementations. However, once the service provides a selection mechanism, then the overall efficiency of the service and its routing of notifications are affected by the power of the language used to construct notifications and to express filters and patterns. As the power of the language increases, so does the complexity of the processing. Thus, in practice, scalability and expressiveness are two conflicting goals that must be traded off.

This paper presents Siena, an event notification service that we have designed and implemented to maximize both expressiveness and scalability. In Section 3 we describe the service’s formally defined application programming interface (API), which is an extension of the familiar publish/subscribe protocol [Birman 1993]. Several candidate server topologies and protocols are presented in Section 4. We then describe in Section 5 the routing algorithms used by the service to deliver event notifications to clients; these algorithms are designed specifically for distributed networks of event servers. This is followed by a description of strategies for optimizing the performance of the notification selection process. Supported in part by the results of simulation studies, we present an analysis of the scalability of our design choices in Section 6. In our simulation studies, we focus on two alternative service architectures, one based on a hierarchical topology, and the other based on a peer-to-peer topology. In particular, we study how these two architectures perform under various classes of applications in which the densities and distributions of clients differ. We conclude in Sections 7 and 8 with a discussion of related work and a brief indication of our future plans.

2. FRAMING THE PROBLEM AND ITS SOLUTION

As discussed in Section 1, an event notification service implements two key activities, notification selection and notification delivery. A naive approach to realizing these activities is to employ a central server where all subscriptions are recorded, where all notifications are initially targeted, where notifications are evaluated against all subscriptions, and from where notifications are sent out to all relevant subscribers. This solution is logically very simple, but is impractical in the face of scale. Clearly, the service instead must be architected as a distributed system in which activities are spread across the network, hopefully exploiting some sort of locality, and hopefully exhibiting a reasonable growth in complexity.

In its most general form, a distributed event notification service is composed of interconnected servers, each one serving some subset of the clients of the service, as shown in Figure 1. (Some use the terms proxy and broker instead of the term server.) The clients are of two kinds: objects of interest, which are the generators of events, and interested parties, which are the consumers of event notifications. Of course, a client can act as both an object of interest and an interested party. Both kinds of clients interact with a locally-accessible server, which functions as an access point to the network-wide service. In practice, the service becomes a wide-area network of pattern matchers and routers, overlaid on top of some other wide-area communication facility, such as the Internet. One reasonable allocation of such servers might be to place a server at each administrative domain within the low-level, wide-area communication network.

[Figure omitted. Labels: event service, servers, access point, object of interest, interested party; operations: advertise, publish, subscribe, notify.]

Fig. 1. Distributed Event Notification Service.

Creating a network of servers to provide a distributed service of any sort gives rise to three critical design issues.

—Interconnection topology. In what configuration should the servers be connected?

—Routing algorithm. What information should be communicated between the servers to allow the correct and efficient delivery of messages?

—Processing strategy. Where in the network, and according to what heuristics, should message data be processed in order to optimize message traffic?

These three design issues have been studied extensively for many years and in many contexts. Our challenge is to find a solution in the particular domain of wide-area event notification, leveraging previous results (both positive and negative) wherever possible.

In terms of interconnection topology, there are essentially two broad classes from which to choose: a hierarchy and a general graph. Existing distributed event notification services, such as JEDI [Cugola et al. 1998] and TIBCO’s product TIB/Rendezvous™, adopt a hierarchical topology. However, our analysis (presented in Section 6) shows that such a topology can exhibit significant performance problems. In Siena we have adopted the general graph, which in common terms means that the servers are organized in a peer-to-peer relationship, as we detail in Section 4. A hybrid of the two structures (whether a hierarchy of peers, or peers of hierarchies) is also a topology to consider, but requires a priori knowledge of the inherent structure of the service’s applications in order to make a proper subdivision among peers and hierarchies. Having such knowledge would violate the notion that the service is general purpose.

Our desire for the event notification service to be general purpose also complicates the routing problem for the service. In particular, we assume that objects of interest have no knowledge of interested parties. Therefore, event notifications cannot be addressed and routed in the same, relatively simple manner as, for example, an electronic mail message. Moreover, we cannot assume any particular locality of objects of interest and interested parties, which is a fact that bears a strong relationship to the server topology issue. At best we can only try to take advantage of any locality or structure in the message traffic as it emerges. We demonstrate below that advertisements and subscriptions serve as the basis for this.

Given these considerations, solving the routing problem can be seen as a choice among three alternatives. Common to the three alternatives is the need to broadcast some piece of information to all the servers in the network, where the broadcast is required by the lack of a priori knowledge of locality. The first alternative broadcasts notifications, which implies that notification matching is performed at each local server based on the subscriptions received at that server. This alternative suffers from the drawback that all notifications are delivered to all local servers, whether or not they are serving any parties interested in the notifications.

The second and third alternatives try to take advantage of emergent locality and structure. In particular, the second alternative involves a broadcast of subscriptions. A shortest-path algorithm is used to route notifications back to only the local servers of interested parties. Under the third alternative, advertisements are broadcast and subscriptions are then used to establish paths, akin to virtual circuits, by which notifications are routed to only the local servers of interested parties. Of course, both these alternatives suffer from the cost of having to store either all subscriptions or all advertisements at all servers. The drivers that trade off among the three alternatives are fairly straightforward to identify, but in the design of a general-purpose service, any choice will be suboptimal for some situation, as we discuss in Section 5.

Fortunately, we can improve the situation considerably by being intelligent about how we allocate the notification matching tasks within the network. This is the design issue that concerns the processing strategy. We observe that in practice many parties are interested in “similar” events. Put another way, it is likely that distinct subscriptions define partially, or even completely, overlapping sets of notifications. A similar observation can be made about objects of interest and their advertisements. We therefore sketch how this observation leads to a processing strategy for subscriptions and assume a corresponding strategy exists for advertisements; Section 5 presents a detailed discussion of these strategies.

Based on our observation about the likely relationship among subscriptions, the strategy works as follows. When a subscription reaches a server (either from a client or from another server), the server propagates that subscription only if it defines new selectable notifications that are not in the set of selectable notifications defined by any previously propagated subscription. Three benefits accrue from this approach: First, we reduce network costs by pruning the propagation of new subscriptions. Second, we reduce the storage requirements for servers. Third, by reducing the number of subscriptions held at each server, we reduce the computational resources needed to match notifications at that server. We use a similar strategy for propagation of advertisements.
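The pruning rule just described can be illustrated with a small sketch. It is not Siena's implementation: as a simplifying assumption, each subscription is modeled as a closed numeric interval over a single attribute, so that one subscription's set of selectable notifications contains another's exactly when its interval contains the other's. The names `Server` and `on_subscription` are hypothetical.

```python
from typing import List, Tuple

# Assumption: a subscription selects notifications whose single numeric
# attribute lies in the closed interval (low, high).
Interval = Tuple[float, float]


class Server:
    """Toy server that forwards a subscription only if it selects something new."""

    def __init__(self) -> None:
        # Subscriptions this server has already propagated to its neighbors.
        self.propagated: List[Interval] = []

    def on_subscription(self, sub: Interval) -> bool:
        """Return True if the subscription should be propagated further."""
        low, high = sub
        # Covered: some previously propagated subscription already selects
        # every notification that this one selects.
        if any(p_low <= low and high <= p_high for p_low, p_high in self.propagated):
            return False
        self.propagated.append(sub)
        return True


server = Server()
first = server.on_subscription((0.0, 10.0))   # nothing propagated yet
second = server.on_subscription((2.0, 5.0))   # inside (0.0, 10.0)
third = server.on_subscription((5.0, 15.0))   # extends beyond (0.0, 10.0)
```

In this toy model the second subscription is stored locally but never forwarded, which is where the network and storage savings described above would come from.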

Up to this point in the discussion we have treated notifications, subscriptions, and advertisements in rather abstract terms. We now make these concepts somewhat more concrete as a basis for the material presented in the next several sections.

As mentioned in the introduction, the information associated with an event is represented by a data structure called a notification. We refer to the data model or encoding schema of notifications as the event notification model or simply event model. Most existing event notification services adopt a simple record-like structure for notifications, while some more recent frameworks define an object-oriented model (e.g., the Java™ Distributed Event Specification [Sun Microsystems, Inc. 1998] and the CORBA Notification Service [Object Management Group 1998b]).

Closely related to the event model is the subscription language, which defines the form of the expressions associated with subscriptions. Two aspects of the subscription language are crucial to the issue of expressiveness.

—Scope of the subscription predicates. This aspect is concerned with the visibility that subscriptions have into the contents of a notification. For a record-like notification structure, visibility determines which fields can be used in specifying a subscription.

—Power of the subscription predicates. This aspect is concerned with the sophistication of operators that can be used in forming subscription predicates. The predicates apply both to any possible filtering of individual notifications and to any possible formation of patterns of multiple notifications.

The dual of the subscription language is the advertisement language, which shares the issues of scope and power, but from the perspective of an object of interest, rather than an interested party. One consequence of this difference in perspective is that interested parties may subscribe for patterns of multiple notifications, whereas objects of interest advertise only individual notifications.

The following sections elaborate on these basic concepts and our approach to achieving expressiveness and scalability.

3. API AND SEMANTICS

At a minimum, an event notification service exports two functions that together define what is usually referred to as the publish/subscribe protocol. Interested parties specify the events in which they are interested by means of the function subscribe. Objects of interest publish notifications via the function publish. Siena extends the publish/subscribe protocol with an additional interface function called advertise, which an object of interest uses to advertise the notifications it publishes. Siena also adds the functions unsubscribe and unadvertise. Subscriptions can be matched repeatedly until they are cancelled by a call to unsubscribe. Advertisements remain in effect until they are cancelled by a call to unadvertise.

Table 1 shows the interface functions of Siena. The expression given to subscribe and unsubscribe is a pattern, while the expression given to advertise and unadvertise is a filter; we discuss patterns and filters in greater detail below. The parameter identity specifies the identity of the object of interest or interested party. Objects of interest and interested parties must identify themselves to Siena when they advertise or subscribe, respectively, so that they can later cancel their own advertisements or subscriptions. The only requirement that Siena imposes on identifiers is that they be unique.

    publish(notification n)
    subscribe(string identity, pattern expression)
    unsubscribe(string identity, pattern expression)
    advertise(string identity, filter expression)
    unadvertise(string identity, filter expression)

Table 1. Interface of Siena.

We note that Siena maintains a mapping between the identities of interested parties and their handlers. A handler specifies the means by which an interested party receives notifications, either through callbacks or through messages sent via a communication protocol such as HTTP or SMTP. Separating these two concepts at the level of clients allows for the possibility of redirecting and/or temporarily suspending the flow of notifications from objects of interest to interested parties, and also supports the mobility of clients. Detailed discussion of handlers is beyond the scope of this paper.
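To make the shape of this interface concrete, the following is a minimal single-process sketch of the five functions of Table 1 together with an identity-to-handler mapping. It is an illustration under stated assumptions, not the Siena API: the class `ToyEventService` and the method `register_handler` are hypothetical, expressions are stood in for by plain Python predicates, only simple (single-filter) patterns are handled, and advertisements are merely recorded.

```python
from typing import Callable, Dict, List, Tuple

Notification = Dict[str, object]
# Assumption: a filter expression is represented by a Python predicate.
Expression = Callable[[Notification], bool]
Handler = Callable[[Notification], None]


class ToyEventService:
    """Centralized, in-memory stand-in for the interface of Table 1."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}
        self._subscriptions: List[Tuple[str, Expression]] = []
        self._advertisements: List[Tuple[str, Expression]] = []

    def register_handler(self, identity: str, handler: Handler) -> None:
        # Hypothetical helper: maps an interested party's identity to a callback.
        self._handlers[identity] = handler

    def subscribe(self, identity: str, expression: Expression) -> None:
        self._subscriptions.append((identity, expression))

    def unsubscribe(self, identity: str, expression: Expression) -> None:
        self._subscriptions = [
            (i, e) for i, e in self._subscriptions
            if not (i == identity and e is expression)
        ]

    def advertise(self, identity: str, expression: Expression) -> None:
        self._advertisements.append((identity, expression))

    def unadvertise(self, identity: str, expression: Expression) -> None:
        self._advertisements = [
            (i, e) for i, e in self._advertisements
            if not (i == identity and e is expression)
        ]

    def publish(self, n: Notification) -> None:
        # Notify each interested identity once, even if several of its
        # subscriptions match the same notification.
        interested = dict.fromkeys(i for i, e in self._subscriptions if e(n))
        for identity in interested:
            handler = self._handlers.get(identity)
            if handler is not None:
                handler(n)


service = ToyEventService()
received: List[Notification] = []
service.register_handler("alice", received.append)

def rising(n: Notification) -> bool:
    return n.get("change", 0) > 0

service.subscribe("alice", rising)
service.publish({"symbol": "DIS", "change": 1.5})   # delivered
service.publish({"symbol": "DIS", "change": -4.0})  # filtered out
service.unsubscribe("alice", rising)
service.publish({"symbol": "DIS", "change": 2.0})   # no longer subscribed
```

Keeping handlers in a separate mapping keyed by identity mirrors the separation described above: a handler can be replaced or removed without touching the stored subscriptions.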

3.1 Notifications

An event notification (or simply a notification) is a set of typed attributes. For example, the notification displayed in Figure 2 represents a stock price change event. Each individual attribute has a type, a name, and a value, but the notification as a whole is purely a structural value derived from its attributes. Attribute names are simply character strings. The attribute types belong to a predefined set of primitive types commonly found in programming languages and database query languages, and for which a fixed set of operators is defined.

    string class    = finance/exchanges/stock
    time   date     = Mar 4 11:43:37 MST 1998
    string exchange = NYSE
    string symbol   = DIS
    float  prior    = 105.25
    float  change   = -4
    float  earn     = 2.04

Fig. 2. Example of a Notification.

The justification for choosing this typing scheme is scalability. In other systems, such as the Java Distributed Event Specification [Sun Microsystems, Inc. 1998] and the CORBA Notification Service [Object Management Group 1998b], a notification is a value of some named, explicit notification type. This implies a global authority for managing and verifying the type space, something which is clearly not feasible at an Internet scale. By contrast, we define a restricted set of attribute types from which to construct (arbitrary) notifications. By having this well-defined set, we can perform efficient routing based on the content of notifications. As we discuss in Section 7, content-based routing has distinct advantages over the alternative schemes of channel- and subject-based routing.

3.2 Filters

An event filter, or simply a filter, selects event notifications by specifying a set of attributes and constraints on the values of those attributes. Each attribute constraint is a tuple specifying a type, a name, a binary predicate operator, and a value for an attribute. The operators provided by Siena include all the common equality and ordering relations (=, ≠, <, >, etc.) for all of its types; substring (∗), prefix (>∗), and suffix (∗<) operators for strings; and an operator any that matches any value.

An attribute $\alpha = (type_\alpha, name_\alpha, value_\alpha)$ matches an attribute constraint $\phi = (type_\phi, name_\phi, operator_\phi, value_\phi)$ if and only if $type_\alpha = type_\phi \wedge name_\alpha = name_\phi \wedge operator_\phi(value_\alpha, value_\phi)$. We write $\alpha \prec \phi$ to denote that an attribute $\alpha$ satisfies, or matches, an attribute constraint $\phi$. When $\alpha$ matches $\phi$, we also say that $\phi$ covers $\alpha$. Figure 3 shows a filter that matches price increases for stock DIS on stock exchange NYSE.
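The matching rule for a single attribute can be sketched in a few lines of Python. This is an illustration rather than Siena's code: the names `Attribute`, `Constraint`, `OPERATORS`, and `covers` are hypothetical, the operator spellings ("prefix", "suffix", "substring", "any") are chosen for readability, and the string operators are assumed to test the attribute value against the constraint value (for example, the attribute value starts with the constraint value).

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Operator table: each entry takes (attribute value, constraint value).
# Assumption: string operators test the attribute value against the
# constraint value, e.g. prefix means "attribute starts with constraint".
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "=": lambda a, c: a == c,
    "!=": lambda a, c: a != c,
    "<": lambda a, c: a < c,
    ">": lambda a, c: a > c,
    "substring": lambda a, c: c in a,
    "prefix": lambda a, c: a.startswith(c),
    "suffix": lambda a, c: a.endswith(c),
    "any": lambda a, c: True,
}


@dataclass(frozen=True)
class Attribute:
    type: str
    name: str
    value: Any


@dataclass(frozen=True)
class Constraint:
    type: str
    name: str
    operator: str
    value: Any = None


def covers(constraint: Constraint, attribute: Attribute) -> bool:
    """True when the attribute matches the constraint: same type, same name,
    and the operator holds between the two values."""
    return (
        attribute.type == constraint.type
        and attribute.name == constraint.name
        and OPERATORS[constraint.operator](attribute.value, constraint.value)
    )


# Attributes taken from the stock notification of Figure 2.
stock_class = Attribute("string", "class", "finance/exchanges/stock")
change = Attribute("float", "change", -4.0)

prefix_ok = covers(Constraint("string", "class", "prefix", "finance/exchanges/"), stock_class)
increase = covers(Constraint("float", "change", ">", 0.0), change)
wrong_type = covers(Constraint("integer", "change", "any"), change)
```

Note that the type comparison comes first, so a constraint never applies its operator to a value of a different declared type; in this sketch even the any operator fails when the types differ.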

    string class    >∗ finance/exchanges/
    string exchange =  NYSE
    string symbol   =  DIS
    float  change   >  0

Fig. 3. Example of an Event Filter.

When a filter is used in a subscription, multiple constraints for the same attribute are interpreted as a conjunction; all such constraints must be matched. Thus, we say that a notification $n$ matches a filter $f$, or equivalently that $f$ covers $n$ ($n \prec^N_S f$ for short):

$$n \prec^N_S f \;\Leftrightarrow\; \forall \phi \in f : \exists \alpha \in n : \alpha \prec \phi$$

A filter can have two or more attribute constraints with the same name, in which case the matching rule applies to all of them. Also, the notification may contain other attributes that have no correspondents in the filter. Table 2 gives some examples that illustrate the semantics of $\prec^N_S$. The second example is not a match because the notification is missing a value for attribute level. The third example is not a match because the constraints specified for attribute level in the subscription are not matched by the value for level in the notification.
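The definition of $\prec^N_S$ translates almost directly into code: every constraint in the filter must be matched by at least one attribute of the notification. The sketch below, which is illustrative only, checks the four alarm examples of Table 2. The tuple layouts and function names are assumptions made for this example, and operators are taken from Python's standard `operator` module.

```python
import operator
from typing import Any, Callable, List, Tuple

Attribute = Tuple[str, str, Any]                                # (type, name, value)
Constraint = Tuple[str, str, Callable[[Any, Any], bool], Any]   # (type, name, op, value)


def attribute_matches(attr: Attribute, constraint: Constraint) -> bool:
    a_type, a_name, a_value = attr
    c_type, c_name, c_op, c_value = constraint
    return a_type == c_type and a_name == c_name and c_op(a_value, c_value)


def notification_matches(notification: List[Attribute], filter_: List[Constraint]) -> bool:
    """For all constraints in the filter, there exists a matching attribute."""
    return all(
        any(attribute_matches(a, c) for a in notification)
        for c in filter_
    )


# Constraints used in the subscriptions of Table 2.
is_alarm = ("string", "what", operator.eq, "alarm")
level_gt_3 = ("integer", "level", operator.gt, 3)
level_lt_7 = ("integer", "level", operator.lt, 7)

# Notifications of Table 2 (the time value is kept as a plain string here).
timed_alarm = [("string", "what", "alarm"), ("time", "date", "02:40:03")]
level_10 = [("string", "what", "alarm"), ("integer", "level", 10)]
level_5 = [("string", "what", "alarm"), ("integer", "level", 5)]

results = [
    notification_matches(timed_alarm, [is_alarm]),                      # extra attribute is fine
    notification_matches(timed_alarm, [is_alarm, level_gt_3]),          # level is missing
    notification_matches(level_10, [is_alarm, level_gt_3, level_lt_7]), # 10 < 7 fails
    notification_matches(level_5, [is_alarm, level_gt_3, level_lt_7]),  # 3 < 5 < 7
]
```

Because the quantifier over constraints is universal, two constraints on the same attribute name behave as a conjunction without any special handling.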

3.3 Patterns

While a filter is matched against a single notification based on the notification’s attribute values, a pattern is matched against one or more notifications based on both their attribute values and the combination they form. At its most generic, a pattern might correlate events according to any relation. For example, the programmer of a stock market analysis tool might be interested in receiving price change notifications for the stock of one company only if the price of a related stock has changed by a certain amount. Rich languages and logics exist that allow one to express event patterns [Mansouri-Samani and Sloman 1997].

    notification               relation             subscription

    string what = alarm        $\prec^N_S$          string what = alarm
    time date = 02:40:03

    string what = alarm        $\not\prec^N_S$      string what = alarm
    time date = 02:40:03                            integer level > 3

    string what = alarm        $\not\prec^N_S$      string what = alarm
    integer level = 10                              integer level > 3
                                                    integer level < 7

    string what = alarm        $\prec^N_S$          string what = alarm
    integer level = 5                               integer level > 3
                                                    integer level < 7

Table 2. Examples of $\prec^N_S$.

In Siena we do not attempt to provide a complete pattern language. Our goal is rather to study pattern operators that can be exploited to optimize the selection of notifications within the event notification service. Here, we restrict a pattern to be syntactically a sequence of filters, $f_1 \cdot f_2 \cdots f_n$, that is matched by a temporally ordered sequence of notifications, each one matching the corresponding filter. An example of a pattern is shown in Figure 4, which matches an increase in the price of stock MSFT followed by a subsequent increase in the price of stock NSCP. In general, we observe that more sophisticated forms of patterns can always be split into a set of simple subscriptions and then matched externally to Siena (i.e., at the access point of the interested party), although this is likely to induce extra network traffic. We say that a pattern is simple when it is composed of a single filter, and similarly we say that a subscription is simple when it requests a simple pattern.

    string what   >∗ finance/exchanges/
    string symbol =  MSFT
    float  change >  0
  •
    string what   >∗ finance/exchanges/
    string symbol =  NSCP
    float  change >  0

Fig. 4. Example of an Event Pattern.


There are many possible semantics for the filter sequence operator. In the interests of scalability, we have opted for the simplest possible semantics, which ignores out-of-order matches of notifications due to network latency (see Section 3.8). To understand the semantics we chose, consider the pattern A·B (read “A followed by B”), which we assume to be submitted as a subscription at time $t_0$. We represent notifications that match A as $A_i^j$, meaning the notification was generated at time $t_i$ by the object of interest and was matched at time $t_j$ by the server responsible for matching the pattern (and similarly for notifications $B_i^j$ matched to B). Consider the following sequence of notifications (shown in match order):

$B_4^1 \quad A_3^2 \quad A_1^3 \quad B_2^4 \quad A_5^5 \quad B_6^6 \quad A_7^7 \quad B_8^8$

According to the semantics we chose, the server matching A·B uses the first $A_i^j$ it matches followed by the first $B_k^m$ it matches to form the first match of the pattern, such that $i < k$ and $j < m$. It then uses the next A it matches followed by the next B it matches to form the second match of the pattern, and so on. Hence, the first match of the pattern would be the sequence $A_3^2 \cdot B_6^6$, and the second match would be the sequence $A_7^7 \cdot B_8^8$. The matcher receives $B_4^1$ first but discards it because it has not yet matched an A. The first A it matches is $A_3^2$, and so it ignores all subsequent As until it matches a $B_k^m$ where $k > 3$. Thus it ignores $A_1^3$ and $A_5^5$ because it is waiting for a B, and it also ignores $B_2^4$ because it was generated before $A_3^2$. Hence $B_6^6$ is the first B that can be matched with $A_3^2$. Once this whole sequence has been matched, the matching of the pattern begins anew with the next A following $B_6^6$, which is $A_7^7$. The second match of the pattern is completed with $B_8^8$.
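The sequence semantics just described can be illustrated with a small Python sketch. It is only an illustration under simplifying assumptions: each notification is reduced to a pair (label, generation time), where the label names the one filter it matches, and the list order stands for match order. The function name `match_pattern` is hypothetical and not part of Siena.

```python
def match_pattern(pattern, events):
    """Match a filter sequence against notifications given in match order.

    pattern: list of filter labels, e.g. ["A", "B"].
    events:  list of (label, generation_time) pairs, listed in match order.
    Returns the list of complete matches, each a tuple of events.
    """
    matches = []
    partial = []  # notifications accepted so far for the current match
    for label, generated in events:
        expected = pattern[len(partial)]
        # Accept only the filter we are waiting for, and only if the
        # notification was generated after the previously accepted one.
        if label == expected and (not partial or generated > partial[-1][1]):
            partial.append((label, generated))
            if len(partial) == len(pattern):
                matches.append(tuple(partial))
                partial = []  # start matching the pattern anew
    return matches


# The example sequence from the text; position in the list is the match order.
events = [("B", 4), ("A", 3), ("A", 1), ("B", 2),
          ("A", 5), ("B", 6), ("A", 7), ("B", 8)]
result = match_pattern(["A", "B"], events)
```

For the example sequence, the sketch pairs the A generated at $t_3$ with the B generated at $t_6$, and then the A generated at $t_7$ with the B generated at $t_8$, in agreement with the two matches derived above.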

3.4 Advertisements

We have seen how the covering relation $\prec^N_S$ defines the semantics of filters in subscriptions. We now define the semantics of advertisements by defining a similar relation $\prec^N_A$. The motivation for advertisements is to inform the event notification service about which kind of notifications will be generated by which objects of interest, so that it can best direct the propagation of subscriptions. The idea is that, while a subscription defines the set of interesting notifications for an interested party, an advertisement defines the set of notifications potentially generated by an object of interest. Therefore, the advertisement is relevant to the subscription only if these two sets of notifications have a non-empty intersection.

The relation $\prec^N_A$ defines the set of notifications covered by an advertisement:

$n \prec^N_A a \Leftrightarrow \forall \alpha \in n : \exists \phi \in a : \alpha \prec \phi$

This expression says that an advertisement covers a notification if and only if it covers each individual attribute in the notification. Note that this is the dual of subscriptions, which define the minimal set of attributes that a notification must contain. In contrast to subscriptions, when a filter is used as an advertisement, multiple constraints for the same attribute are interpreted as a disjunction rather than as a conjunction; only one of the constraints need be satisfied. Table 3 shows some examples of the relation $\prec^N_A$. The second example is not a match because the attribute date of the notification is not defined in the advertisement. The fourth example is not a match because the value of attribute what in the notification does not match any of the constraints defined for what in the advertisement.


notification                      relation              advertisement
--------------------------------  --------------------  ----------------------
string what = alarm               $\prec^N_A$           string what = alarm
                                                        string what = login
                                                        string username any

string what = alarm               $\not\prec^N_A$       string what = alarm
time date = 02:40:03                                    string what = login
                                                        string username any

string what = login               $\prec^N_A$           string what = alarm
string username = carzanig                              string what = login
                                                        string username any

string what = logout              $\not\prec^N_A$       string what = alarm
string username = carzanig                              string what = login
                                                        string username any

Table 3. Examples of $\prec^N_A$.
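The four rows of Table 3 can be checked with a short Python sketch of the relation $\prec^N_A$. This is a simplified illustration, not Siena's implementation: attribute types are ignored, only the `=` and `any` operators are handled, and the names `satisfies` and `advertisement_covers` are hypothetical.

```python
def satisfies(name, value, constraint):
    """True if attribute (name, value) matches one attribute constraint."""
    c_name, op, c_value = constraint
    if name != c_name:
        return False
    return op == "any" or (op == "=" and value == c_value)


def advertisement_covers(notification, advertisement):
    """n covered by a: every attribute of n matches at least one constraint of a.

    Several constraints on the same attribute act as a disjunction.
    """
    return all(
        any(satisfies(name, value, c) for c in advertisement)
        for name, value in notification.items()
    )


# The advertisement used in every row of Table 3.
ad = [("what", "=", "alarm"), ("what", "=", "login"), ("username", "any", None)]

row1 = advertisement_covers({"what": "alarm"}, ad)
row2 = advertisement_covers({"what": "alarm", "date": "02:40:03"}, ad)
row3 = advertisement_covers({"what": "login", "username": "carzanig"}, ad)
row4 = advertisement_covers({"what": "logout", "username": "carzanig"}, ad)
```

Rows 1 and 3 evaluate to a match, while rows 2 and 4 do not: row 2 carries an attribute (date) that the advertisement does not mention, and row 4 has a value for what that satisfies none of the advertised constraints.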

3.5 Two Variants of the Semantics of Siena

We have studied two alternative semantics for Siena, a subscription-based semantics and an advertisement-based semantics.

Under the subscription-based semantics, the semantics of subscriptions is defined solely by the relation $\prec^N_S$ and its extensions to patterns. Advertisements are not enforced on notifications; they may be used for optimization purposes, or they can be ignored completely by the implementation of the service. Thus, a notification n is delivered to an interested party X if and only if X submitted at least one subscription s such that $n \prec^N_S s$.

Under the advertisement-based semantics, both advertisements and subscriptions are used. In particular, a notification n published by object Y is delivered to interested party X if and only if Y advertised a filter a that covers n (i.e., such that $n \prec^N_A a$) and X registered a subscription s that covers n (i.e., such that $n \prec^N_S s$).

Under both semantics, a notification is delivered at most once to any interested party.

3.6 Other Important Covering Relations

So far we have defined a number of relations that express the semantics of subscriptions and advertisements:

—$\alpha \prec \phi$: attribute α matches attribute constraint φ;
—$n \prec^N_S f$: notification n matches filter f, where f is interpreted as a subscription filter;
—$n \prec^N_A a$: notification n matches filter a, where a is interpreted as an advertisement filter.

From these, other relations can be derived:


—$f_2 \prec^S_S f_1$: filter $f_1$ covers filter $f_2$, where $f_1$ and $f_2$ are interpreted as subscriptions. Formally,

$f_2 \prec^S_S f_1 \Leftrightarrow \forall n : n \prec^N_S f_2 \Rightarrow n \prec^N_S f_1$

which means that $f_1$ defines a superset of the notifications defined by $f_2$.

—$a_2 \prec^A_A a_1$: filter $a_1$ covers filter $a_2$, where $a_1$ and $a_2$ are interpreted as advertisements. Formally,

$a_2 \prec^A_A a_1 \Leftrightarrow \forall n : n \prec^N_A a_2 \Rightarrow n \prec^N_A a_1$

which means that $a_1$ defines a superset of the notifications defined by $a_2$.

—$f \prec^S_A a$: filter a covers filter f, where a is interpreted as an advertisement and f is interpreted as a subscription. Formally,

$f \prec^S_A a \Leftrightarrow \exists n : n \prec^N_A a \wedge n \prec^N_S f$

which means that a defines a set of notifications that has a non-empty intersection with the set defined by f.

The relations $\prec^S_S$ and $\prec^A_A$ can also define the equality relation between filters with its intuitive meaning:

$f_1 = f_2 \Leftrightarrow f_1 \prec f_2 \wedge f_2 \prec f_1$

We now use the relations $\prec^S_S$ and $\prec^A_A$ to define the semantics of unsubscriptions and unadvertisements.

3.7 Unsubscriptions and Unadvertisements

Unsubscriptions and unadvertisements serve to cancel previous subscriptions and advertisements, respectively. Given a simple unsubscription unsubscribe(X, f), where X is the identity of an interested party and f is a filter, the event notification service cancels all simple subscriptions subscribe(X, g) submitted by the same interested party X with a subscription filter g covered by f (i.e., such that $g \prec^S_S f$). This semantics is extended easily to patterns: An unsubscription for a pattern $P = f_1 \cdot f_2 \cdots f_k$ cancels all previous subscriptions $S = g_1 \cdot g_2 \cdots g_k$ such that $g_1 \prec^S_S f_1 \wedge g_2 \prec^S_S f_2 \wedge \ldots \wedge g_k \prec^S_S f_k$. In an analogous way, unadvertisements cancel previous advertisements that are covered according to the relation $\prec^A_A$.

Note that an unsubscription (unadvertisement) either cancels previous subscriptions (advertisements) or else has no effect. It cannot impose further constraints onto existing subscriptions. For example, subscribing with a filter [price > 100] and then unsubscribing with [price > 200] does not result in creation of a reduced subscription, [price > 100, price ≤ 200]. Rather, the unsubscription simply has no effect, since it does not cover the subscription. Note also that all subscriptions covered by an unsubscription are cancelled by that unsubscription. Thus, when an interested party initially subscribes with a specific filter (say, [change > 10]), then subscribes with a more generic one (say, [change > 0]), and then finally unsubscribes with a filter that covers the more generic subscription (say, [change > 0]), the effect is to cancel all the previous subscriptions, not to revert to the more specific one [change > 10].
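The two examples in this paragraph can be reproduced with a minimal Python sketch. It assumes, purely for illustration, that every filter is a single constraint of the form attribute > bound, written as a pair (attribute, bound); the helper names `covers` and `unsubscribe` are hypothetical.

```python
def covers(f, g):
    """True if filter f covers filter g, for filters (attribute, lower_bound)
    that mean `attribute > lower_bound`.

    f covers g when every value accepted by g is also accepted by f.
    """
    return f[0] == g[0] and f[1] <= g[1]


def unsubscribe(subscriptions, f):
    """Return the subscriptions of one party that survive unsubscribe(X, f).

    Every subscription covered by f is cancelled; the rest are untouched.
    """
    return [g for g in subscriptions if not covers(f, g)]


# [price > 200] does not cover [price > 100], so nothing is cancelled.
after_price = unsubscribe([("price", 100)], ("price", 200))

# [change > 0] covers both earlier subscriptions, so both are cancelled.
after_change = unsubscribe([("change", 10), ("change", 0)], ("change", 0))
```

In the first case the subscription list is left unchanged; in the second it becomes empty, rather than reverting to [change > 10].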


3.8 Timing Issues

The semantics of Siena depends on the order in which Siena receives and processes requests (subscriptions, notifications, etc.). For instance, in the subscription-based semantics, a subscription s is effective after it is processed and until an unsubscription u that cancels s is processed.

In the most general case, a service request R, say a subscription, is generated at time $R_g$, received at time $R_r$, and completely processed at time $R_p$ (with $R_g \le R_r \le R_p$). Siena guarantees the correct interpretation of R immediately after $R_p$. Notice that the external delay $R_r - R_g$ is caused by external communication mechanisms and is by no means controllable by Siena. The processing delay $R_p - R_r$ is instead directly caused by computations and possibly by other communication delays internal to Siena.

Siena’s semantics is that of a best effort service. This means that the implementation of Siena must not introduce unnecessary delays in its processing, but it is not required to prevent race conditions induced by either the external delay or the processing delay. Clients of Siena must be resilient to such race conditions; for instance, they must allow for the possibility of receiving a notification for a cancelled subscription.

Siena associates a timestamp with each notification to indicate when it was published (see footnote 1). This allows the service to detect and account for the effects of latency on the matching of patterns, which means that within certain limits the actual order of notifications can be recognized.

4. ARCHITECTURES: SERVER TOPOLOGIES AND PROTOCOLS

The previous section describes the protocol by which clients (i.e., objects of interest and interested parties) communicate with the servers that act as the clients’ access points to the event notification service. As mentioned in Section 2, the servers themselves communicate in order to cooperatively distribute the selection and delivery tasks across a wide-area network. The servers must therefore be arranged into an interconnection topology and make use of a server/server communication protocol. Together, the topology and protocol define what we refer to as an architecture for the event notification service.

The architecture is assumed to be implemented on top of a lower-level network infrastructure. In particular, a topological connection between two servers does not necessarily imply a permanent or direct physical connection between those servers, such as TCP/IP. Moreover, the server/server protocol might make use of any one of a number of network protocols, such as HTTP or SMTP, through standard encoding and/or tunneling techniques. All we assume at this point in the discussion is that a given server can communicate with some number of other specific servers by exchanging messages. This is the same assumption we make about the communication between clients and servers.

1. With the advent of accurate network time protocols and the existence of the satellite-based Global Positioning System (GPS), it is reasonable to assume the existence of a global clock for creation of these timestamps, and it is hence reasonable for all but the most time-sensitive applications to rely on these timestamps.

In this section we consider three basic architectures: hierarchical client/server, acyclic peer-to-peer, and general peer-to-peer. We also consider some hybrid architectures. Because it is not scalable, we ignore the degenerate case of a centralized architecture having a single server.

4.1 Hierarchical Client/Server Architecture

A natural way of connecting event servers is according to a hierarchical topology, as illustrated in Figure 5. In this topology, pairs of connected servers interact in an asymmetric client/server relationship. Hence, we use a directed graph to represent the topology of this architecture, and we refer to this architecture as a hierarchical client/server architecture (or simply a hierarchical architecture). A server can have any number of incoming connections from other “client” servers, but only one outgoing connection to its own “master” server. A server that has no “master” server of its own is referred to as a root.

[Figure: servers (labeled H) and clients, connected by the client/server protocol.]

Fig. 5. Hierarchical Client/Server Architecture.

The hierarchical architecture is a straightforward extension of a centralized architecture. It only requires that the basic central server be modified to propagate any information that it receives (i.e., subscriptions, etc.) on to its “master” server. In fact, the server/server protocol we use within the hierarchical architecture is exactly the same as the protocol described in Section 3 for communication between the servers and the external clients of the event notification service. Thus, in terms of communication, a server is not distinguished from objects of interest or interested parties. In practice, this means that a server will receive subscriptions, advertisements and notifications from its “client” servers, and will send only notifications back to those “client” servers.

As we demonstrate in Section 6.2.2.3, the main problem exhibited by the hierarchical architecture is the potential overloading of servers high in the hierarchy. Moreover, every server acts as a critical point of failure for the whole network. In fact, a failure in one server disconnects all the subnets reachable from its “master” server and all the “client” subnets from each other.

4.2 Acyclic Peer-to-Peer Architecture

In the acyclic peer-to-peer architecture, servers communicate with each other symmetrically as peers, adopting a protocol that allows a bi-directional flow of subscriptions, advertisements, and notifications. Hence we use an undirected graph to represent the topology of this architecture. (As always, the external clients of the service use the standard client/server protocol described in Section 3.) The configuration of the connections among servers in this architecture is restricted so that the topology forms an acyclic undirected graph. Figure 6 shows an acyclic peer-to-peer architecture of servers. The communication between servers is represented by thick undirected lines, while the communication between clients and servers is represented by dashed arrows.

[Figure: servers (labeled A) connected by the server/server protocol; clients attach through the client/server protocol.]

Fig. 6. Acyclic Peer-to-Peer Server Architecture.

It is important that the procedures adopted to configure the connections among servers maintain the property of acyclicity, since routing algorithms might rely on the property to assume, for instance, that any two servers are connected with at most one path. However, ensuring this can be difficult and/or costly in a wide-area service in which administration is decentralized and autonomous.

As in the hierarchical architecture, the lack of redundancy in the topology constitutes a limitation in assuring connectivity, since a failure in one server S isolates all the subnets reachable from those servers directly connected to S.

4.3 General Peer-to-Peer Architecture

Removing the constraint of acyclicity from the acyclic peer-to-peer architecture, we obtain the general peer-to-peer architecture. Like the acyclic peer-to-peer architecture, this architecture allows bi-directional communication between two servers, but the topology can form a general undirected graph, possibly having multiple paths between servers. An example is shown in Figure 7.

[Figure: servers (labeled G) connected by the server/server protocol; clients attach through the client/server protocol.]

Fig. 7. General Peer-to-Peer Server Architecture.

The advantage of the general peer-to-peer architecture over the previous two architectures is that it requires less coordination and offers more flexibility in the configuration of connections among servers. Moreover, allowing redundant connections makes it more robust with respect to failures of single servers. The drawback of having redundant connections is that special algorithms must be implemented to avoid cycles and to choose the best paths. Typically, messages will carry a “time-to-live” counter, and routes will be established according to minimal spanning trees. Consequently, the server/server protocol adopted in the general peer-to-peer architecture must accommodate this extra information.

4.4 Hybrid Architectures

A wide-area, large-scale, decentralized service such as Siena poses different requirements at different levels of administration. In other words, we must account for intermediate levels between the local area and the wide area. We can potentially take advantage of these intermediate levels to gain some efficiencies by considering the use of different architectures at different levels of network granularity.

For example, in the case of a multi-national corporation, it might be reasonable to assume a high degree of control and coordination in the administration of the cluster of subnets of the corporation’s intranet. The administrators of this intranet might very well be able to design and manage the whole network of event servers deployed on their subnets, and thus it might be a good idea to adopt a hierarchical architecture within the intranet. Of course, the intranet would connect to other networks outside of the influence of the administrators. Thus, what could arise is a general peer-to-peer architecture at the global level, serving to interconnect different corporate intranets, each having a hierarchical architecture. This is illustrated in Figure 8.

[Figure: servers labeled G and H; the H servers are grouped into intranets.]

Fig. 8. Hierarchical/General Hybrid Server Architectures.

[Figure: servers labeled G and G/A, connected by a general peer-to-peer protocol and an acyclic peer-to-peer protocol.]

Fig. 9. General/Acyclic Hybrid Server Architectures.

In other cases, we might want to invert the structure, as illustrated in Figure 9. For example, suppose that some clusters of subnets carry a high degree of event-service message traffic, and for some specific applications or perhaps for security reasons, only a small fraction of that traffic is visible outside the cluster. In this case, for efficiency reasons a general peer-to-peer architecture might be preferable within the clusters, while the high-level architecture could be acyclic peer-to-peer. For every cluster, there would be a gateway server that should be able to filter the messages used for the protocol inside the cluster, and adapt them to the protocol used between clusters. For example, if a protocol is used locally within a cluster to discover minimal spanning trees, then the messages associated with that protocol should not be propagated outside the cluster.

Hybrid architectures such as these are somewhat more complicated than the three basic architectures. Nevertheless, they offer the opportunity to tailor the server/server topologies and protocols in such a way that localities can be exploited.

5. ROUTING ALGORITHMS AND PROCESSING STRATEGIES

Once a topology of servers is defined, the servers must establish appropriate routing paths to ensure that notifications published by an object of interest are correctly delivered to all the interested parties that subscribed for them. In general, we observe that notifications must “meet” subscriptions somewhere in the network so that the notifications can be selected according to the subscriptions and then dispatched to the subscribers. This common principle can be realized according to a spectrum of possible routing algorithms. One possibility is to maintain subscriptions at their access point and to broadcast notifications throughout the whole network; when a notification meets and matches a subscription, the subscriber is immediately notified locally. However, since we expect the number of notifications to far exceed the number of subscriptions or advertisements, this strategy appears to offer the least possible efficiency, and so we consider it no further for Siena.

5.1 Routing Strategies in Siena

To devise more efficient routing algorithms, we employ principles found in IP multicast routing protocols [Deering and Cheriton 1990]. Similar to these protocols, the main idea behind the routing strategy of Siena is to send a notification only toward event servers that have clients that are interested in that notification, possibly using the shortest path. The same principle applies to patterns of notifications as well. More specifically, we formulate two generic principles that become requirements for our routing algorithms:

downstream replication: A notification should be routed in one copy as far as possible and should be replicated only downstream, that is, as close as possible to the parties that are interested in it. This principle is illustrated in Figure 10.

upstream evaluation: Filters are applied and patterns are assembled upstream, that is, as close as possible to the sources of (patterns of) notifications. This principle is illustrated in Figure 11.

[Figure: servers 1 to 6 with two subscribe requests and one publish request; the replication point is marked.]

Fig. 10. Downstream Notification Replication.

[Figure: servers 1 to 6 with requests publish(X), publish(Y), and subscribe(X·Y); the points labeled filter X, filter Y, and assemble X·Y are marked.]

Fig. 11. Upstream Filter and Pattern Evaluation.

These principles are implemented by two classes of routing algorithms, the first of which involves broadcasting subscriptions and the second of which involves broadcasting advertisements:

subscription forwarding: In an implementation that does not use advertisements, the routing paths for notifications are set by subscriptions, which are propagated throughout the network so as to form a tree that connects the subscribers to all the servers in the network. When an object publishes a notification that matches that subscription, the notification is routed towards the subscriber following the reverse path put in place by the subscription.

advertisement forwarding: In an implementation that uses advertisements, it is safe to send a subscription only towards those objects of interest that intend to generate notifications that are relevant to that subscription. Thus, advertisements set the paths for subscriptions, which in turn set the paths for notifications. Every advertisement is propagated throughout the network, thereby forming a tree that reaches every server. When a server receives a subscription, it propagates the subscription in reverse, along the paths to all advertisers that submitted relevant advertisements, thereby activating those paths. Notifications are then forwarded only through the activated paths.

In the process of forwarding subscriptions, Siena exploits commonalities among the subscriptions. In particular, Siena prunes the propagation trees by propagating along only those paths that have not been covered by previous requests. The derived covering relation $\prec^S_S$ is used to determine whether a new subscription is covered by a previous one that has already been forwarded. Advertisements are treated similarly using the relation $\prec^A_A$. And although not discussed in detail here, unsubscriptions and unadvertisements are handled in a similar way as well.

Subscription forwarding algorithms realize a subscription-based semantics, while advertisement forwarding algorithms realize an advertisement-based semantics. As we show in Section 5.3, advertisement forwarding algorithms are needed in order to implement the upstream evaluation principle for event patterns.

In addition to the principles introduced above, we have also devised several other strategies that lead to optimizations in resource usage. These are discussed elsewhere [Carzaniga 1998; Carzaniga et al. 1999].

5.2 Putting Algorithms and Topologies Together

We now describe in detail how subscription forwarding and advertisement forwarding algorithms are implemented within the hierarchical and peer-to-peer architectures. In particular, we describe the principal data structures maintained by servers and the main algorithms that process the various requests coming from clients or other servers. Here we consider only simple subscriptions; Section 5.3 deals with patterns.

At a high level and in general terms, the algorithms for the acyclic and peer-to-peer architectures attempt to reduce communication, storage, and computation costs by pruning spanning trees over a network of Siena servers. More specifically, the subscription-forwarding algorithms operate by broadcasting subscriptions along spanning trees rooted at interested parties. When a server receives a new subscription, it can terminate the further propagation of that subscription if it has already propagated a more general subscription that covers the new one. In this way servers prune spanning trees along which new subscriptions are propagated. The advertisement-forwarding algorithms operate in an analogous fashion by pruning the spanning trees rooted in objects of interest. Computation of spanning trees in a network is a solved problem [Dalal and Metcalfe 1978] and, therefore, we do not discuss their construction. Instead, our focus is on the details of pruning. The algorithms for the hierarchical architectures are simpler, because subscriptions and advertisements are not propagated along spanning trees, but are merely propagated along unique paths to the root of the hierarchy.

5.2.1 The Filters Poset. In order to keep track of previous requests, their relationships, where they came from, and where they have been forwarded, event servers maintain a data structure that is common to the different algorithms and topologies. This data structure represents a partially ordered set (poset) of filters. The partial order is defined by the covering relations $\prec^S_S$ for subscription filters, and $\prec^A_A$ for advertisement filters. We denote with $P_S$ a poset defined by $\prec^S_S$, and denote with $P_A$ a poset defined by $\prec^A_A$. Figure 12 shows an example of a poset of subscriptions.

[Figure: a poset of subscription filters over the attributes orig, airline, price, dest, and x.]

Fig. 12. Example of a Poset of Simple Subscriptions. Arrows represent the immediate relation $\prec^S_S$.

Note that $\prec^S_S$ and $\prec^A_A$ are transitive relations, while the diagram and its representation in memory store immediate relationships only. In a poset $P_S$, ordered according to $\prec^S_S$, a filter $f_1$ is an immediate predecessor of another filter $f_2$ and $f_2$ is an immediate successor of $f_1$ if and only if $f_1 \prec^S_S f_2$ and there is no other filter $f_3$ in $P_S$ such that $f_1 \prec^S_S f_3 \prec^S_S f_2$. The “top-level” filters, which we refer to as roots, are those that have no successors in the poset.

When inserting a new filter f into a poset, three different cases apply that are of special interest for the forwarding algorithms:

—f is added as a root filter;
—f exists already in the poset; or
—f is inserted somewhere in the poset with a non-empty set of successors.

As we detail below, only root filters produce network traffic, due to the propagation of subscriptions (or advertisements). Thus the “shape” of a subscription (or advertisement) poset roughly reflects the degree of opportunity presented to our processing strategies. In particular, a poset that extends “vertically” indicates that subscriptions are very much interdependent and that there are just a few subscriptions summarizing all the other ones. Conversely, a poset that extends “horizontally” indicates that there are few similarities among subscriptions and that there are thus few opportunities to reduce network traffic.

5.2.2 Hierarchical Client/Server Architecture. A hierarchical server maintains its subscriptions in a poset $P_S$. Each subscription s in $P_S$ has an associated set called subscribers(s) containing the identities of the subscribers of that filter. Every server also has a variable master, possibly null, containing the identity of its “master” server.

5.2.2.1 Subscriptions. Upon receiving a simple subscription subscribe(X, f), a server E walks through its subscription poset $P_S$, starting from each root subscription, looking for a filter $f'$ that covers the new filter f and that contains X in its subscribers set: $f \prec^S_S f' \wedge X \in subscribers(f')$. If the server finds such a subscription $f'$ in $P_S$, it simply terminates the search without any effect. This happens when the same interested party (X) has already subscribed for a more generic filter ($f'$).

In case the server does not find such a subscription, the search process terminates producing two possibly empty sets $\overline{f}$ and $\underline{f}$, representing the immediate successors and the immediate predecessors of f, respectively. If $\overline{f} = \underline{f} = \{f\}$, that is, if filter f already exists in $P_S$, then the server simply inserts X in subscribers(f). Otherwise, f is inserted in $P_S$ between $\overline{f}$ and $\underline{f}$, and X is inserted in its subscribers set.

Only if $\overline{f} = \emptyset$, that is, only if f is inserted as a root subscription, does the server then forward the same subscription to its master server. In particular, if master is not null, the server (E) sends a subscription subscribe(E, f) to master.

If $\underline{f} \neq \emptyset$, the server removes X from the sets of subscribers of all the subscriptions covered by f. This is done by recursively walking breadth first through the poset $P_S$ starting from the subscriptions in $\underline{f}$. The recursion is stopped whenever X is found in a subscription (and removed). Note that, in this process, some subscriptions might be left with no associated interested parties; such subscriptions are removed from $P_S$.

We illustrate the processing of subscriptions in the hierarchical architecture with the scenario depicted in Figure 13. Figure 13a depicts a hierarchical server (1) that has two clients (a and b) and a master server (2). The server receives and processes a subscription [airline=UA] from client a. The right side of the figure shows the subscription poset $P_S$ of server 1. The new subscription is inserted as a root subscription, so server 1 forwards it to its master server (2).

In Figure 13b, server 1 receives another subscription [airline=UA, dest=DEN] from client b. Since this new subscription is already covered by the previously forwarded subscription (it is not made a root subscription in $P_S$), server 1 does not forward it to its master.

[Figure: three panels (a), (b), and (c), each showing server 1, its master server 2, clients a and b, and the poset $P_S$ of server 1 with the subscribers of each filter.]

Fig. 13. Example Scenario Using a Hierarchical Client/Server Architecture (Subscription).

In Figure 13c, server 1 processes another subscription [airline=any] from client a. This is a root subscription and so it is forwarded to server 2. In this case, server 1 eliminates a from the subscribers of all the subscriptions covered by the new one. In particular, it removes a from the first subscription [airline=UA]; because there are no other subscribers for that subscription, the subscription itself is also removed.

5.2.2.2 Notifications. When a server receives a notification n, it walks through its subscriptions poset $P_S$ breadth first looking for all the subscriptions matching n. In particular, the server initializes a queue Q with its root subscriptions. Then, the server iterates through each element s in Q. If $n \prec^N_S s$, the server appends to Q all the immediate predecessors of s that have not yet been visited. Otherwise, if $n \not\prec^N_S s$, the server removes s from the queue. When this process terminates, Q contains all the subscriptions that cover n.

The server then sends a copy of n to each subscriber of the subscriptions in Q. Independently of the matching of subscriptions, if the server has a master server and the master server was not the sender of n, then the server also sends a copy of n to its master server.
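The breadth-first matching just described can be sketched as follows. This is a minimal illustration, not the Siena implementation: it assumes filters that use only equality and the "any" operator, and the names `covers`, `matching_subscriptions`, `recipients`, and the dictionary layout of the poset are hypothetical. The publisher "p" in the example is also an assumption.

```python
from collections import deque

ANY = object()  # hypothetical marker for the "any" operator

def covers(sub_filter, notification):
    """True if the notification satisfies every constraint of the filter."""
    for attr, expected in sub_filter.items():
        if attr not in notification:
            return False
        if expected is not ANY and notification[attr] != expected:
            return False
    return True

def matching_subscriptions(roots, predecessors, filters, notification):
    """Walk the poset breadth first, starting from the root subscriptions."""
    queue = deque(roots)
    visited = set(roots)
    matched = []
    while queue:
        s = queue.popleft()
        if not covers(filters[s], notification):
            continue  # anything more specific than s cannot cover n either
        matched.append(s)
        for p in predecessors.get(s, ()):
            if p not in visited:
                visited.add(p)
                queue.append(p)
    return matched

def recipients(matched, subscribers, master, sender):
    """Subscribers of all matching subscriptions, plus the master unless it sent n."""
    out = set()
    for s in matched:
        out |= subscribers[s]
    if master is not None and master != sender:
        out.add(master)
    return out

# State of server 1 as in Figure 13b.
filters = {"ua": {"airline": "UA"}, "ua_den": {"airline": "UA", "dest": "DEN"}}
roots = ["ua"]
predecessors = {"ua": ["ua_den"]}
subscribers = {"ua": {"a"}, "ua_den": {"b"}}

n = {"airline": "UA", "dest": "DEN", "price": 310}
hits = matching_subscriptions(roots, predecessors, filters, n)
```

Pruning at a non-matching subscription is safe because a subscription that does not cover n cannot cover any filter that does.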

5.2.2.3 Unsubscription. Unsubscriptions cancel previous subscriptions, but they are not exactly the inverse of subscriptions. They are slightly more complex to handle and sometimes more expensive in terms of communication. One reason is that a single unsubscription might cancel more than one previous subscription. The other reason is that an unsubscription might cancel one or more root subscriptions, which in turn might uncover other more specific subscriptions (which in turn become new root subscriptions). In this case, the server must forward the unsubscription to its master server, and it must also forward the new root subscriptions.

More specifically, when a server receives an unsubscription unsubscribe(X, f), it removes X from the subscribers set of all the subscriptions in $P_S$ that are covered by f. The algorithm used by the server in this case is a simple variation of the algorithm that computes the set of matching subscriptions for a notification. The only difference is that the relation $\prec^S_S$ is used to fill the queue instead of $\prec^N_S$.

As a consequence of removing X, some subscriptions might remain with an empty set of subscribers. Let $S_X$ be the set of such subscriptions and let $S^r_X$ ($S^r_X \subset S_X$) be the set of those that are also root subscriptions in $P_S$. The server computes $\underline{S}^r_X$ as the union of all the immediate predecessors of each subscription in $S^r_X$. With all this, the server:

(1) removes all the subscriptions in $S_X$ from $P_S$;
(2) forwards the unsubscription for f to its master server; and
(3) sends all the subscriptions in $\underline{S}^r_X$ to its master server.
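The three steps above can be sketched as follows, using the state of Figure 13c as input. This is an illustrative sketch under several assumptions: filters use only equality and "any"; a linear scan replaces the breadth-first poset walk; messages to the master are produced only when a root subscription was cancelled; and the function and variable names are hypothetical. A fuller implementation would also need to handle uncovered subscriptions that remain covered by another surviving root.

```python
ANY = object()  # hypothetical marker for the "any" operator

def filter_covers(general, specific):
    """Equality/any filters only: every constraint of `general` holds in `specific`."""
    return all(
        attr in specific and (value is ANY or specific[attr] == value)
        for attr, value in general.items()
    )

def unsubscribe(filters, subscribers, predecessors, roots, x, f):
    """Process unsubscribe(x, f); return (new root set, messages for the master)."""
    # Remove x from every subscription covered by f and collect the emptied ones.
    emptied = set()
    for sid, flt in filters.items():
        if x in subscribers[sid] and filter_covers(f, flt):
            subscribers[sid].discard(x)
            if not subscribers[sid]:
                emptied.add(sid)

    # Emptied roots uncover their immediate predecessors.
    emptied_roots = emptied & roots
    uncovered = set()
    for sid in emptied_roots:
        uncovered |= predecessors.get(sid, set())
    uncovered -= emptied  # skip predecessors that were emptied as well

    # (1) Remove the emptied subscriptions from the poset.
    for sid in emptied:
        del filters[sid], subscribers[sid]
        predecessors.pop(sid, None)
    new_roots = (roots - emptied) | uncovered

    # (2) Forward the unsubscription and (3) the new root subscriptions.
    messages = []
    if emptied_roots:
        messages.append(("unsubscribe", f))
        messages += [("subscribe", filters[sid]) for sid in sorted(uncovered)]
    return new_roots, messages

# State of server 1 as in Figure 13c; client a then unsubscribes from [airline=any].
any_filter = {"airline": ANY}
filters = {"any": any_filter, "ua_den": {"airline": "UA", "dest": "DEN"}}
subscribers = {"any": {"a"}, "ua_den": {"b"}}
predecessors = {"any": {"ua_den"}}
roots, outbox = unsubscribe(filters, subscribers, predecessors, {"any"}, "a", any_filter)
```

After the call, the only remaining root is the subscription [airline=UA, dest=DEN], and the outbox holds the unsubscription followed by that new root subscription.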

Figure 14 continues the scenario from Figure 13c. Server 1 receives an unsubscription for [airline=any] from client a. As a consequence, it removes a from the subscribers of subscription [airline=any], which is the only subscription from a covered by the unsubscription (in this case, the two filters coincide). The subscription contains no more subscribers, and so the server removes it. But since it was also a root subscription, the server forwards the unsubscription to its master along with the new root subscription, [airline=UA, dest=DEN].

Fig. 14. Example Scenario Using a Hierarchical Client/Server Architecture (Unsubscription). [Server 1 with clients a and b and master server 2, next to the subscription poset $P_S$ of server 1; the subscription [airline=any] is cancelled.]

5.2.2.4 Advertisements. The advertisement forwarding technique does not apply to the hierarchical architecture. Although it would be possible to propagate advertisements from a server to its master, this would be useless, since the master server would never respond by sending back subscriptions. In fact, a hierarchical server considers all its clients as “normal” clients (i.e., outside the event notification service), so it would not forward subscriptions to them. In practice, advertisements and unadvertisements are silently dropped.

5.2.3 Peer-to-Peer Architectures with Subscription Forwarding. In peer-to-peer architectures, each server maintains a set neighbors containing the identifiers of the peer servers to which the server is connected. A peer-to-peer server also maintains its subscriptions in a poset $P_S$ that is an extension of the subscription poset of a hierarchical server. As in the hierarchical server, a peer-to-peer server associates a set subscribers(s) with each subscription s, and it associates an additional set with s called forwards(s), which contains the subset of neighbors to which s has been forwarded.

5.2.3.1 General vs. Acyclic Architectures. A subscription or notification is propagated from its origin to its destination following a minimal spanning tree. In an acyclic peer-to-peer architecture the path that connects any two servers (if it exists) is unique, and any such spanning tree coincides with the whole network of servers. Thus, when propagating a message m, say a subscription, a server simply sends it to all of its neighbors excluding the sender of m. Any server that propagates m is considered to be a sender of m, but the origin of a message is the (unique) event notification service access point to which the message is originally posted.

In a general peer-to-peer architecture, two servers might be connected by two or more different paths. So when a server receives a message that must be forwarded throughout the network of servers, the server must make sure to forward it only through the links of the minimal spanning tree rooted in the origin of that message. This is similar to the well-known problem of broadcasting information over a packet-switched network. In order to simplify the description of the algorithms we focus only on acyclic peer-to-peer architectures; algorithms for the general peer-to-peer architectures can be found elsewhere [Carzaniga 1998].

5.2.3.2 Peer Connection Setup. A server E1 connects to a server E2 by sending a peer_connect(E1) request to E2. E2 can either accept or refuse the connection. In case E2 accepts E1 as a peer, E2 sends a confirmation message back to E1 so that both servers add each other’s address to their neighbors set. Then the accepting server E2 forwards every root subscription in its subscriptions poset $P_S$ to the requesting server E1, adding E1 to the corresponding forwards set. Servers can also be dynamically disconnected with a peer_disconnect(E1) request. When a server E2 receives a peer_disconnect(E1), it removes E1 from its neighbors set, unsubscribes E1 for all its root subscriptions, and finally removes E1 from all its forwards sets.

5.2.3.3 Subscriptions. The algorithm by which a peer-to-peer server processes subscriptions is an extension of the algorithm of the hierarchical server. When a server receives a subscription subscribe(X, f), it searches its subscriptions poset $P_S$ for either

(1) a subscription f′ that covers f and has X among its subscribers: $f \prec^S_S f' \wedge X \in \mathit{subscribers}(f')$. In this case, the search terminates with no effect; or

(2) a subscription f′ that is equal to f and does not have X among its subscribers: $f \prec^S_S f' \wedge f' \prec^S_S f$. Here the server adds X to subscribers(f′); or

(3) two possibly empty sets $\bar{f}$ and $\underline{f}$, representing the immediate successors and the immediate predecessors of f respectively. Here the server inserts f as a new subscription between $\bar{f}$ and $\underline{f}$, and adds X to subscribers(f).

In cases 2 and 3, the server also removes X from all the subscriptions in $P_S$ that are covered by f, and then removes from $P_S$ those subscriptions that have no other subscribers.

This procedure differs from the corresponding procedure of the hierarchical server in how the peer-to-peer server forwards the subscription to its neighbors. Formally, given a subscription f in $P_S$, let forwards(f) be defined as follows:

$$\mathit{forwards}(f) = \mathit{neighbors} - \mathit{NST}(f) - \bigcup_{f' \in P_S \,\wedge\, f \prec^S_S f'} \mathit{forwards}(f') \qquad (1)$$

In other words, f is forwarded to all neighbors of the server except those not downstream from the server along any spanning tree rooted at an original subscriber of f (the second term in the formula), and those to which subscriptions f′ covering f have been forwarded already by this server (the last term in the formula).

The second term in the formula (whose functor stands for “Not on any Spanning Tree”) accounts for the fact that there may be multiple paths connecting a subscriber to potential publishers, and that therefore the propagation of a subscription f must follow only the computed spanning trees rooted at the original subscribers of f. Viewing a spanning tree rooted at f as a directed graph, we may refer to paths traveling away from f as going “downstream” with the edges, and those traveling toward f as going “upstream” against the edges. In practice, the propagation process excludes those neighbors that are not downstream from the server of interest along any spanning tree rooted at a subscriber of f. NST(f) is trivially computed for the topology of the acyclic architecture, since every spanning tree in the topology coincides with the whole topology itself. For the topology of the general architecture its computation is more complicated; however, the necessary techniques, such as link-state or distance-vector routing algorithms, are well-known and widely deployed. An alternative approach to propagating subscriptions is to use Dalal and Metcalfe’s broadcasting algorithm [Dalal and Metcalfe 1978].

The last term in the formula represents an important optimization that the server makes in the situation where more generic subscriptions have been propagated already to some neighbors.
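Equation 1 can be evaluated with plain set operations. The sketch below is restricted to the acyclic architecture and rests on an assumption inferred from the examples in the text rather than stated in it: a neighbor is downstream of the server along the tree rooted at subscriber X exactly when it differs from X, so NST(f) is the set of neighbors that are downstream of no subscriber. The function names and the `covering` and `forwarded` dictionaries are hypothetical.

```python
def nst_acyclic(subscribers, neighbors):
    """Neighbors that are not downstream of any subscriber (acyclic topology only)."""
    downstream = set()
    for x in subscribers:
        downstream |= neighbors - {x}  # everything except the subscriber itself
    return neighbors - downstream

def forwards(f, neighbors, subscribers, covering, forwarded):
    """Evaluate equation 1 for subscription f."""
    blocked = set()
    for g in covering.get(f, ()):  # subscriptions f' that cover f
        blocked |= forwarded.get(g, set())
    return neighbors - nst_acyclic(subscribers[f], neighbors) - blocked

# Server 1 with neighbors 2, 3, and 4, following the steps of the Figure 15 scenario.
neighbors = {"2", "3", "4"}
subscribers = {"any": {"3"}, "ua_den": {"2"}}
covering = {"ua_den": ["any"]}
forwarded = {}

# (a) [airline=any] arrives from server 3.
forwarded["any"] = forwards("any", neighbors, subscribers, covering, forwarded)
step_a = set(forwarded["any"])

# (b) the more specific subscription arrives from server 2.
forwarded["ua_den"] = forwards("ua_den", neighbors, subscribers, covering, forwarded)
step_b = set(forwarded["ua_den"])

# (c) local client a also subscribes for [airline=any].
subscribers["any"].add("a")
forwarded["any"] = forwards("any", neighbors, subscribers, covering, forwarded)
forwarded["ua_den"] = forwards("ua_den", neighbors, subscribers, covering, forwarded)
```

In step (a) the result is servers 2 and 4, in step (b) it is server 3 only, and in step (c) the generic subscription reaches all three neighbors while the forwards set of the covered subscription becomes empty.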

We illustrate the processing of subscriptions in the acyclic peer-to-peer architecture with the scenario depicted in Figure 15. Figure 15a shows a fragment of a peer-to-peer event notification service. In this example, server 1 is connected to servers 2, 3, and 4. Server 1 also has a local client a. Server 3 sends a subscription [airline=any] to server 1. The poset shown on the right side of the figure represents the subscription poset $P_S$ of server 1. As shown in the figure, the new subscription is inserted as a root subscription in $P_S$ and then forwarded to servers 2 and 4 but not to server 3, which is in the NST set of the subscription. In this figure and the following ones, for each subscription in $P_S$, subscribers are denoted with an outgoing arrow from the subscription, while forwards are denoted with an incoming arrow. Intuitively, arrows indicate the direction of notifications.

Figure 15b shows the effect of a second subscription [airline=UA, orig=DEN] sent to server 1 by server 2. This subscription is inserted in $P_S$ as an immediate predecessor of the previous (root) subscription and is forwarded to server 3, which is the only neighbor that is not in the forwards set for the covering subscription [airline=any].

In Figure 15c, client a subscribes for [airline=any]. In this case, the subscription is found in $P_S$ and a is simply added to its subscribers set. Because the NST set for that subscription is now empty, the subscription is then forwarded to server 3. Every time the server forwards a subscription f to a neighbor server E2, it adds E2 to the forwards set of f and consequently removes E2 from the forwards sets of all the subscriptions covered by f. In the example, the server removes 3 from the forwards set of subscription [airline=UA, orig=DEN].

5.2.3.4 Unsubscriptions. An unsubscription has the effect of removing a subscriber from a number of subscriptions in $P_S$. More specifically, when a server E receives an unsubscription unsubscribe(X, f), it removes X from the subscribers set of every subscription covered by f.

As a consequence of these cancellations, some subscriptions might remain with an empty subscribers set; such subscriptions are removed from $P_S$. The removal of X from some subscriptions might also affect the NST set of those subscriptions. In particular, removing a subscriber for a subscription means removing its distribution spanning tree, which in turn might add some neighbor servers to NST for those paths that are not on the spanning tree of any other subscriber (see equation 1). In order to reduce the forwards set of those subscriptions according to equation 1, the server forwards the corresponding unsubscriptions to every neighbor server added to NST.

The reduced forwards sets of some subscriptions might affect the forwards sets of other covered subscriptions. This effect is produced by the last term of equation 1 for the covered subscriptions. Intuitively, this means that after unsubscribing for some more generic subscriptions it might be necessary to (re)forward some more specific subscriptions whose propagation was blocked by the existence of the more generic subscriptions.

Fig. 15. Example Scenario Using an Acyclic Peer-to-Peer Architecture. [Panels (a), (b), and (c): server 1 with neighbor servers 2, 3, and 4 and local client a, next to the subscription poset $P_S$ of server 1 with the forwards and subscribers sets of each subscription.]

Fig. 16. Unsubscriptions in an Acyclic Peer-to-Peer Architecture. [Panels (a), (b), and (c): the subscription poset of server 1, with the forwards and subscribers sets of each subscription, before and during the processing of the unsubscription.]

We illustrate the processing of unsubscriptions in the acyclic peer-to-peer architecture with the scenario depicted in Figure 16. Figure 16a depicts the subscriptions poset of server 1 from Figure 15c after it has received some subscriptions from local clients and neighbor servers. This is the state of server 1 just before it receives an unsubscription with filter [airline=any] from client a.

As a first step in processing the unsubscription from client a, server 1 removes the subscriber (i.e., a) from all the subscriptions covered by the unsubscription filter [airline=any]. Figure 16b shows the subscriptions poset $P_S$ in this state. Two (root) subscriptions are affected: subscription [airline=UA] changes its NST set (which is initially empty) to include server 3, while subscription [airline=AZ] remains with an empty subscribers set. As a consequence, server 1 forwards the first unsubscription [airline=UA] to the neighbor server added to the NST set (i.e., 3) and forwards the second unsubscription [airline=AZ] to all the previous forwards 2, 3, and 4.

Finally, server 1 processes the immediate predecessors of the canceled subscriptions, since their forwards might have changed as a consequence of the previous unsubscriptions. Figure 16c shows the state of the subscription poset at this time. The subscription [airline=UA, price<500] must be forwarded to server 3 because its (only) successor has not been forwarded to server 3 (see equation 1). Subscription [airline=UA, dest=DEN] does not need to be propagated because every neighbor server has received one of its successors. Subscription [airline=AZ, price<800] has now become a root subscription and thus must be forwarded to every neighbor server except those in its NST (i.e., server 3).

5.2.3.5 Notifications. The algorithm for peer-to-peer architectures processes notifications exactly like the one that operates on the hierarchical architecture. So, a notification n is forwarded to every subscriber of s for every s that covers n.

5.2.4 Advertisement Forwarding. With the subscription forwarding algorithm presented in the previous sections, we have described almost everything needed to implement an advertisement forwarding algorithm. In fact, we can exploit the duality between subscriptions and advertisements to transpose the subscription forwarding algorithm to advertisement forwarding. To some extent, if we read the description of the subscription forwarding algorithm, replacing the terms regarding subscriptions with the corresponding terms regarding advertisements, and replacing the terms regarding notifications with the corresponding terms regarding subscriptions, we obtain an almost exact description of the advertisement forwarding algorithm.

The main difference with respect to the subscription forwarding structure is that there are actually two interacting computations; one realizes the forwarding of advertisements while the other realizes the forwarding of subscriptions. Both computations have similar data structures and similar algorithms, equivalent to the ones described above. In particular, the server has a poset of advertisements $P_A$, ordered according to the relation $\prec^A_A$, as well as a poset of subscriptions $P_S$. In $P_A$, each advertisement a has an associated set of identities advertisers(a) and another set of identities forwards(a).

These two computations interact in the sense that advertisement forwarding constrains subscription forwarding. For instance, in maintaining $P_S$, when processing a subscription s, the server does not use the global set neighbors, but instead uses a subset $\mathit{neighbors}_s \subseteq \mathit{neighbors}$ that is specific to s. $\mathit{neighbors}_s$ is defined as the set of advertisers listed in $P_A$ for all the advertisements covering s. Formally:

$$\mathit{neighbors}_s = \bigcup_{a \in P_A \,:\, s \prec^S_A a} \mathit{advertisers}(a) \cap \mathit{neighbors}$$

Note that one effect of this constraint is that new advertisements and unadvertisements are viewed by the subscription forwarding computation as new peer connections or dropped peer connections, respectively. Thus, if the server receives a new advertisement that covers a set of subscriptions $s_1, s_2, \ldots, s_k$, then the server reacts by forwarding $s_1, s_2, \ldots, s_k$ immediately to the sender of the advertisement.
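The restriction of the neighbor set to relevant advertisers can be sketched as follows. The covering relation between a subscription and an advertisement is defined earlier in the paper, so it is passed in here as a predicate; `toy_adv_covers` is a deliberately simplified stand-in used only to make the example executable, and all names are hypothetical.

```python
def neighbors_for(s, adverts, advertisers, neighbors, adv_covers):
    """Union of advertisers(a) ∩ neighbors over every advertisement a covering s."""
    result = set()
    for a_id, a_filter in adverts.items():
        if adv_covers(a_filter, s):
            result |= advertisers[a_id] & neighbors
    return result

def toy_adv_covers(adv, sub):
    """Toy stand-in: attributes constrained by both filters must agree."""
    return all(adv[k] == v for k, v in sub.items() if k in adv)

adverts = {"a1": {"airline": "UA"}, "a2": {"airline": "AZ"}}
advertisers = {"a1": {"2", "c"}, "a2": {"3"}}  # "c" is a local client, not a peer
neighbors = {"2", "3", "4"}

s = {"airline": "UA", "dest": "DEN"}
relevant = neighbors_for(s, adverts, advertisers, neighbors, toy_adv_covers)
```

Here only server 2 is relevant for s: the advertisement from server 3 concerns a different airline, and the local client c is filtered out by the intersection with neighbors.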

5.3 Matching Patterns

So far we have seen how simple subscriptions and simple notifications are handled by event servers. A major additional functionality provided by Siena is the matching of patterns of notifications. This functionality is implemented with distributed monitoring following the upstream evaluation principle set forth in Section 5.1.

To match patterns, servers assemble sequences of notifications from smaller subsequences or from single notifications. Thanks to advertisements, every server knows which notifications and which subpatterns may be sent from each of its neighbors, which is why this technique requires an advertisement-based semantics. In addition to the notifications and patterns available from its neighbors, a server might further use patterns from previous subscriptions that the server itself is already set up to recognize.

We use the term pattern factoring to refer to the process by which the server breaks a compound subscription into smaller compound and simple subscriptions. After a subscription has been factored into its elementary components, the server attempts to group those factors into compound subscriptions to forward to some of its neighbors. This process is called pattern delegation.

5.3.1 Available Patterns Table. Every server maintains a table $T_P$ of available patterns. This table is simply the advertisements poset $P_A$ that, in addition to the usual advertisements, also contains those patterns that the server has already processed. Each pattern p in $T_P$ has an associated set of identities providers(p) that contains all the peer servers from which p is available. Table 4 shows an example of a table of available patterns. The table says that notifications matching filter a1 (notifications that signal a failed login with an integer attribute named “attempts”) are available from server 2, and that notifications matching filter a2 (file modification notifications) are available from servers 2 and 3.

pattern | providers
a1: string alarm = “failed-login”, integer attempts > 0 | 3
a2: string file any, string operation = “file-change” | 2,3

Table 4. Example of a Table of Available Patterns.

5.3.2 Pattern Factoring. Let us suppose a server E receives a compound subscription subscribe(X, s), where $s = f_1 \cdot f_2 \cdot \ldots \cdot f_k$. Now, the server scans s trying to match each $f_i$ with a pattern $p_i$, or trying to match a sequence of filters $f_i \cdot f_{i+1} \cdot \ldots \cdot f_{i+k_i}$ with a single compound pattern $p_{i \ldots i+k_i}$, using patterns p that are contained in $T_P$.

For example, assuming the table of available patterns shown in Table 4, suppose server 1 receives a subscription s = f · g · h for a sequence of two “failed login” alarms with one and two attempts respectively (f = [alarm=failed-login, attempts=1], and g = [alarm=failed-login, attempts=2]), followed by a file modification event on file “/etc/passwd” (h = [file=/etc/passwd, operation=file-change]). In response to s, the server factors s, matching the three filters of s with the sequence of available patterns a1 · a1 · a2. Table 5 shows the subscription and the factoring computed by the server. Because the only operator in Siena for combining subpatterns is the sequence operator, the output of the factoring process is always a sequence.

requested | available
string alarm = “failed-login”, integer attempts = 1 | string alarm = “failed-login”, integer attempts > 0 (a1)
string alarm = “failed-login”, integer attempts = 2 | string alarm = “failed-login”, integer attempts > 0 (a1)
string file = “/etc/passwd”, string operation = “file-change” | string file any, string operation = “file-change” (a2)

Table 5. Example of a Factored Compound Subscription.
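The factoring of Table 5 can be reproduced with a small sketch. It handles only the simplest case, in which each filter is matched against a single simple pattern (compound patterns in $T_P$ are not considered), and it assumes that "matching" a filter with an available pattern means that the pattern covers the filter. Attribute types are omitted, and the representation of a filter as a dictionary from attribute name to an (operator, value) pair is hypothetical.

```python
def constraint_covers(general, specific):
    """True if every value accepted by `specific` is also accepted by `general`."""
    g_op, g_val = general
    s_op, s_val = specific
    if g_op == "any":
        return True
    if g_op == "=":
        return s_op == "=" and s_val == g_val
    if g_op == ">":
        return (s_op == "=" and s_val > g_val) or (s_op == ">" and s_val >= g_val)
    return False  # other operators are left out of this sketch

def pattern_covers(pattern, flt):
    """Every constraint of the available pattern must cover one in the filter."""
    return all(a in flt and constraint_covers(c, flt[a]) for a, c in pattern.items())

def factor(sequence, available):
    """Map each filter of the sequence to an available pattern, or return None."""
    factors = []
    for flt in sequence:
        name = next((p for p, pat in available.items() if pattern_covers(pat, flt)), None)
        if name is None:
            return None  # some filter has no available provider
        factors.append(name)
    return factors

# Available patterns of Table 4.
available = {
    "a1": {"alarm": ("=", "failed-login"), "attempts": (">", 0)},
    "a2": {"file": ("any", None), "operation": ("=", "file-change")},
}

# Compound subscription s = f · g · h from the text.
f = {"alarm": ("=", "failed-login"), "attempts": ("=", 1)}
g = {"alarm": ("=", "failed-login"), "attempts": ("=", 2)}
h = {"file": ("=", "/etc/passwd"), "operation": ("=", "file-change")}

factoring = factor([f, g, h], available)
```

For this input the result is the sequence a1, a1, a2, as in Table 5.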

5.3.3 Pattern Delegation. Once a compound subscription is divided into available parts, the server must (1) send out the necessary subscriptions to collect the required subpatterns and (2) set up a monitor that will receive all the notifications matching the subpatterns and will observe and distribute the occurrence of the whole pattern. In deciding which subscriptions to send out, the server tries to reassemble the elementary factors in groups that can be delegated to other servers, thereby implementing the upstream evaluation principle. The selection of subpatterns that are eligible for delegation follows some intuitive criteria. For example, only contiguous subpatterns available from a single source can be grouped and delegated to that source. A complete discussion of these criteria is presented elsewhere [Carzaniga 1998].

In the example of Table 5, server 1 would group the first two filters a1 · a1 and delegate the subpattern defined by the corresponding two subscriptions (f · g) to server 2. Thus, it would send a subscription subscribe(E, s1) with pattern

s1 = [string alarm = “failed-login”, integer attempts = 1] · [string alarm = “failed-login”, integer attempts = 2]

to server 2, and would send the remaining filter h using a simple subscription subscribe(E, s2) with

s2 = [string file = “/etc/passwd”, string operation = “file-change”]

to servers 2 and 3. Server 1 will then start up a monitor that recognizes the sequence (s1 = f · g) · (s2 = h).

Figure 17 depicts an example that corresponds to the tables and subscriptions discussed above. In particular, server 1 delegates f · g to server 2, subscribes for h, and monitors (f · g) · h. The diagram also shows how server 2 handles the delegated subscription. Assuming that f is available from server 5 and g is available from server 4, server 2 sends the two corresponding subscriptions to 4 and 5 and then starts up a monitor for f · g.

Fig. 17. Pattern Monitoring and Delegation. [Servers 1 through 5: server 1 receives subscribe(f·g·h) and runs monitor((f·g)·h); server 2 runs monitor(f·g).]

6. EVALUATION

Substantiating claims of scalability for an event notification service is a difficult challenge. In particular, how does one demonstrate its ability to scale, when fully doing so would require the deployment of an implementation of the service to thousands of computers across the world? Conceding to pragmatics, our approach is to build an argument based on (1) reasoning qualitatively about the rationale for the expressiveness of the notification selection mechanism, (2) performing simulation studies to determine the relative performance of the various architectures under certain hypothetical usage scenarios, and (3) constructing a prototype implementation of the service as a proof of concept.

This section presents each of these three elements of the argument. Our conclusion is that, while more study is required to fully validate the design, the early evidence strongly suggests that we have achieved our goal of developing an event notification service suitable for use at the scale of a wide-area network.

6.1 Rationale for Chosen Expressiveness

The interface to the Siena event notification service is a tailored application of the basic publish/subscribe protocol. Certain factors affecting the scalability of our design (such as network latency and data structure size) are intrinsic to the service and its use, and hence are beyond our control. The key factors we can control are the definitions of notifications, filters, and patterns, and the complexity of computing the covering relations.

Our chosen level of expressiveness in Siena represents a compromise, at which notification structure, attribute types, and attribute operators approximate those of the well-understood and widely-used database query language SQL.

The covering relations are well behaved and predictable in the sense that they exhibit an arguably reasonable computational complexity deriving from the expressiveness of filters: Assuming a brute-force and unoptimized algorithm, the complexity of determining whether a given subscription and a given notification are related by $\prec^N_S$ is O(n + m), where n is the number of attribute constraints in the subscription filter and m is the number of attributes in the notification. The complexity of computing $\prec^N_S$ reflects the computation of an intersection between the attribute values in a notification and constraints on those values appearing in a subscription. The complexity of each individual comparison is O(1) for all the predefined types we have included in Siena. The only exception is the string type, but efficient comparison algorithms are well known.

The complexities of computing $\prec^S_S$, $\prec^A_A$, and $\prec^S_A$ are all O(nm), where n and m represent the number of attribute constraints appearing in the respective subscription and/or advertisement filters. This complexity represents a comparison between each attribute constraint in one filter and any corresponding attribute constraints in the other filter. Checking a covering relation between filters amounts to a universal quantification. But given our choice of types and operators, comparing a pair of attribute constraints can be reduced to evaluating an appropriate predicate on the two constant values of the constraints, with a complexity O(1). For example, to see if [x > k1] covers [x > k2] we can simply verify that k2 ≥ k1.
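The brute-force O(nm) comparison between two filters, with an O(1) predicate per pair of constraints, might look like the sketch below. It covers only the operators =, >, <, and any, and it represents a filter as a list of (attribute, operator, value) triples; both choices are assumptions made for illustration. When a filter carries several constraints on the same attribute, the test is sufficient but not necessary, since it never combines constraints.

```python
def constraint_covers(general, specific):
    """O(1) check on the two constants: does `general` accept all that `specific` accepts?"""
    (g_op, k1), (s_op, k2) = general, specific
    if g_op == "any":
        return True
    if s_op == "=":
        if g_op == "=":
            return k2 == k1
        if g_op == ">":
            return k2 > k1
        if g_op == "<":
            return k2 < k1
    if g_op == s_op == ">":
        return k2 >= k1  # [x > k1] covers [x > k2] when k2 >= k1
    if g_op == s_op == "<":
        return k2 <= k1
    return False

def filter_covers(general, specific):
    """Brute force: each of the n constraints is compared with up to m constraints."""
    for g_attr, g_op, k1 in general:
        if not any(
            s_attr == g_attr and constraint_covers((g_op, k1), (s_op, k2))
            for s_attr, s_op, k2 in specific
        ):
            return False
    return True

broad = [("airline", "any", None)]
narrow = [("airline", "=", "UA"), ("dest", "=", "DEN")]
```

For example, `filter_covers(broad, narrow)` holds while `filter_covers(narrow, broad)` does not, and `filter_covers([("x", ">", 1)], [("x", ">", 2)])` holds in line with the k2 ≥ k1 rule from the text.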

We also restrict the expressiveness of patterns in Siena in the interests of efficiency. Patterns, as we discuss in Section 3.3, are a simple sequence of filters. The computational complexity of matching a pattern is O(l(n + m)), where l is the length of the pattern. This means that it is linear in the number of filters, whose covering relation $\prec^N_S$ has complexity O(n + m).

Our conclusion from this analysis is that the covering relations exhibit a complexity that is quite reasonable for a scalable event notification service. In fact, the factors n and m are, in practice, likely to be relatively small (typically less than 10), making the computations negligible compared to the network costs they are attempting to reduce. This is all achieved with an expressiveness that approximates SQL.

6.2 Simulation Studies

There are many questions that one could ask about a wide-area event notification service. In our initial simulation studies we have concentrated on the particular question of scalability with respect to the architectures and algorithms described in the previous sections.

6.2.1 Simulation Framework. The simulation framework we use consists of two parts: (1) a configuration of servers and clients mapped onto the sites of a wide-area network and (2) an assignment of application behaviors to objects of interest and interested parties. A site represents a subnet and its possibly many hosts, where the cost of communication between hosts within a site is assumed to be zero. The configuration of servers reflects the choice of the event notification service architecture, while the application behaviors involve the basic service requests of advertise/unadvertise, subscribe/unsubscribe, and publish.

The primary measurement of interest is an abstract quantity we refer to as cost. We assign a relative cost to each site-to-site communication in the network and then calculate the effect on this cost of varying a number of simulation parameters. In other words, we evaluate the architectures and algorithms in terms of the communication induced by the application behaviors, since we are interested in characterizing the degree to which each architecture/algorithm combination can or cannot absorb increased communication costs in the face of increasing application demands.

6.2.1.1 Network Configuration. Figure 18 shows the layered structure of a network configuration in our simulation framework. At the bottom level is a model of a wide-area network topology. This model defines sites (depicted as cubes) and links (depicted as heavy lines between cubes). To develop realistic network topologies that account for important properties such as latency, and approximate the relative costs of real wide-area networks, we use a publicly available generator of random network topologies² that implements the Transit-Stub model [Zegura et al. 1996]. (A discussion of this and other models for generating network topologies can be found elsewhere [Zegura et al. 1997].) The results we present here were obtained on networks of 100 sites.

Fig. 18. Layers in a Network Configuration. [Two layers: the event service topology above the network topology.]

We make several simplifying assumptions about sites and links at this level. First, we assume that the costs of computation at sites and communication through links are linear functions of load. Second, links have constant latency. Third, sites and links have infinite capacity. Note that one consequence of these assumptions is that our model does not account for the effect of congestion.

At the top level of the network configuration is a model of an event notification service topology. This model defines the servers (depicted as shaded ovals) and clients (depicted as white ovals), with the interconnection among servers resulting from a choice of architecture (depicted as heavy lines between servers), the assignment of a client to a server (depicted as a dotted arrow), and the mapping of clients and servers onto sites in a wide-area network (depicted as dashed lines from ovals to cubes).

As a simplification, the simulations we present here involve only homogeneous architectures, and not the hybrid architectures that are also possible (see Section 4.4). Moreover, each client represents either an object of interest or an interested party, although in general it is possible for a client to be both. Finally, we allocate one server per site, we configure every server to connect to the servers residing at its neighbor sites in the network topology, and we configure every client to use a server at its (local) site. In other words, we assume that the locations and interconnections among servers are an image of the underlying network topology. This assumption significantly reduces the parameter space in the simulation. Nonetheless, this is a reasonable assumption, since it reflects the structure of domains that characterize the Internet [Clark 1989].

² Georgia Tech Internet Topology Models (GT-ITM).

In addition to simulating the various multi-server architectures, we simulate a single-server, centralized architecture, which serves as a baseline for our comparisons; where the centralized architecture performs as well as or better than the others, it should be chosen simply because of its simplicity. Of course, the centralized architecture requires no forwarding algorithm, since it comprises a single server.

6.2.1.2 Application Behavior. The behavior of an application using the event notification service involves the collective behaviors of its objects of interest and interested parties. These individual behaviors are specified as sequences of service requests. In particular, an object of interest executes m sequences

$$\textit{advertise},\ \overbrace{\textit{publish},\ \textit{publish},\ \ldots}^{n\ \text{times}},\ \textit{unadvertise}$$

and within each cycle publishes n notifications. In addition, an average delay between publish requests, t, can be specified (with the delays generated according to a Poisson distribution). Similarly, an interested party executes p sequences

$$\textit{subscribe},\ \overbrace{\textit{recv\_notif},\ \textit{recv\_notif},\ \ldots}^{q\ \text{times}},\ \textit{unsubscribe}$$

where recv_notif represents the operation of waiting for and then receiving a notification.
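The two request sequences above can be sketched as small Python generators. This is only an illustrative sketch, not the simulator's code: the function names are hypothetical, and modeling the inter-publication delays as exponentially distributed gaps (the inter-arrival times of a Poisson process) is an assumption about what "a Poisson distribution" means here.

```python
import random


def object_of_interest_requests(m, n):
    """Yield m cycles of: advertise, n publish requests, unadvertise."""
    for _ in range(m):
        yield "advertise"
        for _ in range(n):
            yield "publish"
        yield "unadvertise"


def interested_party_requests(p, q):
    """Yield p cycles of: subscribe, q recv_notif operations, unsubscribe."""
    for _ in range(p):
        yield "subscribe"
        for _ in range(q):
            yield "recv_notif"
        yield "unsubscribe"


def publish_delays(count, mean_delay, rng):
    """Return `count` random gaps with mean `mean_delay`.

    Assumption: gaps are exponentially distributed, as in a Poisson process.
    """
    return [rng.expovariate(1.0 / mean_delay) for _ in range(count)]
```

With m = 10 and n = 10, one object of interest issues 10 × (10 + 2) = 120 requests in this sketch.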

6.2.1.3 Scenario Generation. Input to our simulation tool is generated in a two-step process. In the first step a (random) network topology is generated, as discussed above. In the second step the generated topology is combined with a scenario parameter file that specifies both the event notification service topology and the application behavior.

6.2.2 Results. The space of studies made possible by our simulation framework is quite extensive. Here we explore a portion of that space, focusing on usage scenarios that distinguish four basic architecture/algorithm combinations: centralized, hierarchical client/server with subscription forwarding, acyclic peer-to-peer with subscription forwarding, and acyclic peer-to-peer with advertisement forwarding.³ Furthermore, we are only simulating the objects of interest and interested parties associated with one particular kind of event. Because the main purpose of the simulations presented here is to highlight the relative behaviors of the architectures and algorithms, we choose not to complicate the experiments by simulating additional kinds of events (with their associated objects of interest and interested parties). While it is conceivable that this choice could affect the results in some way, our intuition tells us that our conclusions about the relative behaviors would remain the same.

³ A more extensive set of data plots is presented elsewhere [Carzaniga 1998; Carzaniga et al. 1999].

To reveal the scaling properties of the architecture/algorithm combinations, our approach is to keep the behaviors of objects of interest and interested parties constant while varying the number of objects of interest from 1 to 1000 and the number of interested parties from 1 to 10000. In all cases, the number of network sites is 100. Referring to the characterization given in Section 6.2.1, we used parameter values of m = 10 and n = 10 for objects of interest, indicating 10 sequences of 10 iterations each, and an average inter-publication delay t in the interval 2000–2500. The behavior parameters for interested parties were p = 10 and q = 10, indicating 10 iterations of 10 received notifications each.

We ran our scenarios with artificially low ratios of publications-per-advertisement and notifications-per-subscription in order to produce conservative simulation results. Applications ultimately benefit from delivery of notifications, and so advertisements and subscriptions can be considered a necessary overhead to obtain that benefit. Such low ratios then serve to exaggerate this overhead. In real applications, we would expect the service to deliver a much higher volume of notifications (with correspondingly lower overhead) than is represented by these ratios.

The results we present are all shown as plots whose data points represent the average of 10 simulation runs for the same parameter values. Except for the plots of the acyclic peer-to-peer architecture with advertisement forwarding (whose behavior is unstable and as yet unexplained), the variance within each set of 10 simulation runs was negligible and, therefore, we choose not to include error bars in the plots.

For the majority of the plots, the horizontal axis gives the number of interested parties in a logarithmic scale ranging from 1 to 10000, while the vertical axis gives a linear measure of cost. As mentioned above, the cost values are derived from an assignment of relative costs for communicating over network links. Therefore, the absolute value of a data point’s cost is meaningless, but its relative value gives a useful characterization.

In the plots below we use the following aliases for the event notification service architecture/algorithm combinations: ce=centralized, hs=hierarchical client/server with subscription forwarding, as=acyclic peer-to-peer with subscription forwarding, and aa=acyclic peer-to-peer with advertisement forwarding.

6.2.2.1 Total Cost. A basic metric for the event notification service is the total cost of providing the service. The total cost is calculated by summing the costs of all site-to-site message traffic. The total cost captures an important aspect of scalability by revealing how communication cost is affected by increases in the load presented to the service. Figure 19 compares the total costs incurred by the three distributed architectures with 1, 10, 100 and 1000 objects of interest; we omit the curves for the centralized architecture because its total cost far outweighs that of the distributed architectures, exhibiting exponential blowup starting at around 1000 interested parties in each plot. There are several interesting observations we can make about these plots.

Fig. 19. Comparison of Total Costs Incurred by Distributed Architectures. [Four plots, for objects=1, 10, 100, and 1000; arch=hs,as,aa; horizontal axis: interested parties, 1 to 10000; vertical axis: total cost (millions).]

First, when there are more than 100 interested parties, the total cost is essentially constant, meaning that there is a point beyond which there is no additional cost incurred in delivering notifications to additional interested parties. We call this the saturation point, since there is a high likelihood that there is an object of interest at every site, and thus an object of interest near every interested party. (Recall that all objects of interest are publishing the same notifications.)

Second, all the architectures scale sublinearly when the number of interested parties is below the saturation point. When below this point, it is likely that an object of interest and an interested party are not at the same site. Therefore, it is also likely that the cost to deliver a notification is nonzero. However, as the number of interested parties grows, the likelihood increases that each new interested party will be located at a site at which there is already an interested party. The marginal cost of each additional interested party decreases up to the saturation point, where the cost becomes zero. This sublinear scaling character of the architectures can be discerned more easily in the plots of Figure 20, which present the portion of the data of Figure 19 below the saturation point using a linear scale in the horizontal axis.
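The diminishing marginal cost described here can be illustrated with a simple occupancy calculation. The sketch below assumes that interested parties are placed on sites independently and uniformly at random; that placement rule is an assumption made only for illustration, as are the function names.

```python
def expected_occupied_sites(num_sites, num_parties):
    """Expected number of distinct sites hosting at least one interested party."""
    return num_sites * (1.0 - (1.0 - 1.0 / num_sites) ** num_parties)


def marginal_new_site_probability(num_sites, num_parties):
    """Probability that the next party lands on a site with no party so far."""
    return (1.0 - 1.0 / num_sites) ** num_parties
```

Under these assumptions, with 100 sites the probability that a new interested party opens up a previously unoccupied site falls steadily as parties are added, which is consistent with the sublinear growth and eventual saturation described above.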

Third, as the number of objects of interest increases, the hierarchical client/server architecture with subscription forwarding performs worse by an increasingly large constant factor as compared to the acyclic peer-to-peer architecture with subscription forwarding. This can be attributed to the fact that, while the acyclic peer-to-peer architecture is penalized by its broadcast of subscriptions, the hierarchical client/server architecture, which propagates notifications only towards the root of the hierarchy, is forced to do so whether or not interested parties exist on the other side of the root of the network. This generates a potentially significant amount of traffic in unnecessary notifications.

Fig. 20. Comparison of Total Costs Below the Saturation Point. [Four plots, for objects=1, 10, 100, and 1000; arch=hs,as,aa; horizontal axis: interested parties, 20 to 200 (linear scale); vertical axis: total cost (millions).]

Finally, the acyclic peer-to-peer architecture with advertisement forwarding displays a strikingly unstable, and as yet unexplained, cost profile for low densities of interested parties. On the other hand, its costs essentially follow those of the acyclic peer-to-peer architecture with subscription forwarding once the saturation point is passed. This effect becomes more evident as the number of objects of interest increases. We can attribute this to our conservative choice of behavior for objects of interest in these studies. In particular, an object of interest unadvertises and readvertises quite frequently, compared to the number of publications it generates at each iteration.

Overall, the acyclic peer-to-peer architecture with subscription forwarding appears to scale well and predictably under all circumstances, and thus is likely to represent a good choice to cover a wide variety of scenarios.

6.2.2.2 Cost Per Service Request. An event notification service is efficient if it can amortize the cost of satisfying newer client requests over the cost of satisfying previous client requests. This is another manifestation of the network effect. The average per-service cost is calculated by dividing the total cost, as introduced above, by the total number of client requests. A low value for this ratio indicates low overhead. Recall that for these studies we configure the network so that clients are connected to servers at their local sites and, therefore, the client-to-server communication cost is treated as zero. The per-service cost thus purely reflects the choice of architecture/algorithm combination.
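The two metrics defined so far can be sketched in a few lines of Python. The data layout is hypothetical (a list of site-to-site messages and a dictionary of relative link costs); it is not taken from the simulator and is meant only to make the definitions concrete.

```python
def total_cost(messages, link_cost):
    """Sum the relative costs of all site-to-site messages.

    messages:  iterable of (source_site, destination_site) pairs
    link_cost: dict mapping (source_site, destination_site) to a relative cost
    """
    total = 0.0
    for src, dst in messages:
        if src != dst:  # communication within a site is assumed to cost zero
            total += link_cost[(src, dst)]
    return total


def per_service_cost(messages, link_cost, num_client_requests):
    """Average cost per client request: total cost divided by request count."""
    return total_cost(messages, link_cost) / num_client_requests
```

For example, two messages over a link of cost 2 and one over a link of cost 3 give a total cost of 7; if they were induced by 14 client requests, the per-service cost is 0.5.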


We can see several interesting things in the results of the data analysis presented in Figure 21. First, the centralized architecture is unreasonable in essentially all scenarios as compared to the other architectures. Second, advertisement forwarding again shows itself unstable for high numbers of objects of interest until the saturation point in interested parties is reached. Third, for low numbers of objects of interest and low numbers of interested parties, the costs are dominated by message-passing costs internal to Siena, since there are relatively few notifications generated in the network, there are few parties interested in receiving those notifications, and there is a significant internal cost incurred in setting up routing paths from objects of interest to interested parties. The hierarchical client/server architecture with subscription forwarding does well in this situation because subscriptions are forwarded only towards the root server, resulting in lower setup costs. However, as the number of objects of interest or the number of interested parties increases, its advantage quickly disappears, recovering only beyond the saturation point for interested parties. Finally, the acyclic peer-to-peer architecture with subscription forwarding does extremely well when there is a high number of objects of interest, independent of the number of interested parties. This effect is explained in the next series of plots.

Fig. 21. Comparison of Per-Service Costs. [Four plots, for objects=1, 10, 100, and 1000; arch=ce,hs,as,aa; horizontal axis: interested parties, 1 to 10000; vertical axis: per-service cost.]

6.2.2.3 Cost Per Subscription and Per Notification. Based on the results of studying the total and per-service cost incurred by each of the four architecture/algorithm combinations, the hierarchical client/server architecture and acyclic peer-to-peer architecture, both with subscription forwarding, appear to be the two most promising choices. However, they are clearly distinguished if we examine which kind of service request each one favors for its optimizations.

The average per-subscription cost is calculated by dividing the total cost of all subscription-related messages by the number of subscriptions processed. The graph of Figure 22 shows the per-subscription cost incurred by the hierarchical client/server architecture and acyclic peer-to-peer architecture with a single object of interest. In trying to understand the different cost drivers of the two architectures, we simulated several scenarios with a single object of interest while varying only the behavioral parameters. In these cases, we observed no significant variation in cost. However, additional simulations, in which we varied the density of interested parties, highlight the difference between the two architectures. The results of these simulations are presented in Figure 22 and reveal that the costs are primarily dependent on the density of interested parties. In particular, the per-subscription cost is evidently higher for the acyclic peer-to-peer architecture than the hierarchical client/server for low densities of interested parties, while both architectures benefit from increasing densities of interested parties.

The main difference is in the way each architecture forwards subscriptions. In the acyclic peer-to-peer architecture, a subscription must be propagated throughout the network; in a network of N sites, a subscription goes through O(N) hops and, therefore, the cost is O(N). The hierarchical client/server architecture, on the other hand, requires that a subscription be forwarded only upward towards the root server; in this case the number of hops, and hence the cost, are both O(log N).
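The gap between the two bounds can be made concrete with a small hop-counting sketch. It assumes one server per site, an acyclic overlay that is a tree spanning all N servers, and, for the hierarchy, a complete tree with a fixed fanout; the fanout and the function names are illustrative assumptions, not values from the simulations.

```python
def broadcast_hops(num_sites):
    """Hops for a subscription flooded over a tree of num_sites servers.

    A tree on N nodes has N - 1 edges, and the subscription crosses each once.
    """
    return num_sites - 1


def hops_to_root(num_sites, fanout=2):
    """Worst-case hops from a server to the root of a complete fanout-ary tree."""
    depth, capacity, level_size = 0, 1, 1
    while capacity < num_sites:
        level_size *= fanout      # number of servers on the next level
        capacity += level_size    # servers that fit in a tree of this depth
        depth += 1
    return depth
```

For N = 100 and a binary hierarchy, this gives 99 hops for the broadcast against at most 6 hops towards the root.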

Fig. 22. Comparison of Per-Subscription Costs For Hierarchical Client/Server and Acyclic Peer-to-Peer Architectures under Varying Numbers of Interested Parties. [One plot, objects=1; arch=hs,as; horizontal axis: interested parties, 1 to 1000; vertical axis: per-subscription cost.]

The acyclic peer-to-peer architecture recoups its greater setup costs for subscriptions by reducing the average cost of notifications. Figure 23 compares the per-notification costs incurred by the acyclic peer-to-peer architecture and the hierarchical client/server architecture with a single object of interest. In the particular scenario of Figure 23, the difference between the per-notification costs in the two architectures is constant with respect to the number of interested parties. The same difference is clearly visible from the global per-service costs shown in Figure 21.

We observe that this constant difference depends on the number of ignored notifications. In many of the scenarios we simulated, the total number of notifications produced by objects of interest exceeds the number of notifications consumed by interested parties. For example, in the scenario represented by the results presented here, a total of 100000 notifications are produced, while each interested party consumes 100 notifications before terminating, leaving 99900 ignored notifications.

Fig. 23. Comparison of Per-Notification Costs For Hierarchical Client/Server and Acyclic Peer-to-Peer Architectures under Varying Numbers of Interested Parties. [One plot, objects=1; arch=hs,as; horizontal axis: interested parties, 1 to 1000; vertical axis: per-notification cost.]
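The arithmetic behind this example can be written out explicitly. The per-party figure follows from the parameters p = 10 and q = 10 given earlier; the total of 100000 produced notifications is taken directly from the text, and the symbol names below are introduced only for illustration.

```latex
\begin{align*}
N_{\text{consumed}} &= p \cdot q = 10 \cdot 10 = 100, \\
N_{\text{ignored}}  &= N_{\text{produced}} - N_{\text{consumed}} = 100000 - 100 = 99900.
\end{align*}
```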

The cost of ignored notifications is clearly shown by the degenerate scenarios of Figure 24, in which per-notification costs are plotted against the number of notifications published. The average per-notification cost is calculated by dividing the total cost of all notification-related messages by the number of notifications processed. The first scenario has one object of interest that emits a varying number of notifications and no interested parties at all. Here the hierarchical client/server architecture incurs a constant cost due to the fact that every notification must be propagated toward the root of the hierarchy, whereas the acyclic peer-to-peer architecture incurs no cost at all, since every notification remains local to its access server. The second scenario has one interested party that consumes exactly one notification and then terminates. Again, in the hierarchical client/server architecture, the per-notification cost is constant, while the acyclic peer-to-peer architecture incurs an initial cost for the first notification that is subsequently amortized by the zero cost of the subsequent ignored notifications.

Fig. 24. Comparison of Per-Notification Costs For Hierarchical Client/Server and Acyclic Peer-to-Peer Architectures under Varying Numbers of Publications in Two Degenerate Cases. [Two plots, for interested parties=0 and interested parties=1; arch=hs,as; horizontal axis: publications, 1 to 1000; vertical axis: avg cost per service.]


6.2.2.4 Worst-Case Per-Site Cost. A critical issue in a communication network is how it behaves in the face of congestion. In the simulation model presented here, however, we cannot directly simulate congestion (see Section 6.2.1). Nevertheless, we can ask a related question that gives us an indication of which architecture would approach a given level of congestion soonest. The metric we use is the worst-case per-site cost. It is calculated, for each scenario, by averaging the cost of communication incurred by each site over the ten simulations of that scenario, and then by computing the maximum over those average per-site costs.
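The order of the two steps (average per site first, then take the maximum) matters, and a short sketch makes it explicit. The input layout, one dictionary of per-site costs per simulation run, is a hypothetical representation chosen for illustration.

```python
def worst_case_per_site_cost(runs):
    """Maximum, over sites, of each site's cost averaged over all runs.

    runs: list of dicts, one per simulation run, mapping site -> cost
          incurred by that site in that run (a missing site counts as 0).
    """
    sites = set()
    for run in runs:
        sites.update(run)
    averages = {
        site: sum(run.get(site, 0.0) for run in runs) / len(runs)
        for site in sites
    }
    return max(averages.values())
```

With two runs in which site "a" costs 10 and 20 and site "b" costs 2 and 4, the per-site averages are 15 and 3, so the metric is 15 rather than the single-run peak of 20.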

Fig. 25. Comparison of Worst-Case Per-Site Costs. [Four plots, for objects=1, 10, 100, and 1000; arch=hs,as; horizontal axis: interested parties, 1 to 1000; vertical axis: max cost per site.]

The plots of Figure 25 show the maximum cost incurred by a site in the hierarchical client/server and acyclic peer-to-peer architectures under the same scenarios as Figure 19 and Figure 20. The hierarchical client/server architecture exhibits slightly lower worst-case per-site cost under low densities of objects of interest. For high densities of objects of interest, and therefore under high volumes of notifications, the hierarchical client/server architecture incurs a much greater maximum per-site cost than the acyclic peer-to-peer architecture. In other words, a high volume of notifications is likely to cause congestion to be reached sooner in the hierarchical client/server architecture than in the acyclic peer-to-peer architecture.

How do we explain this difference in behavior? We can see that it is not attributable to a simple overloading of the root server in the hierarchical client/server architecture. This, perhaps counterintuitive, observation is based on the fact that the acyclic peer-to-peer architecture gains advantage only when, for a given object of interest, there is no interested party on the “opposite side” of the network, and hence the traffic remains within a localized neighborhood of the object of interest. This situation rarely arises under high densities of interested parties. Thus, the difference in behavior is explained solely by the effect of relatively high numbers of ignored notifications unavoidably forwarded to the root server, as discussed above.

6.2.3 Summary. We can summarize the differences between the hierarchical client/server and acyclic peer-to-peer architectures as follows:

—The hierarchical client/server architecture has a lower per-subscription cost than the acyclic peer-to-peer (O(log N) and O(N) respectively). This cost does not depend on the behavior of objects of interest or interested parties.

—In both architectures, the subscription cost is amortized for increased densities of interested parties. The cost difference between the two architectures is also significantly reduced for high densities of interested parties.

—The cost of delivering a notification to interested parties is more or less the same for the two architectures. However, the acyclic peer-to-peer architecture has no cost for ignored notifications while the hierarchical client/server architecture pays a fixed cost (O(log N)).

In practice, the hierarchical client/server architecture should be used where there are low densities of interested parties that subscribe (and unsubscribe) very frequently. The acyclic peer-to-peer architecture is more suitable to scenarios where the total cost is dominated by notifications, and especially where the total number of notifications exceeds the number of notifications consumed by interested parties, that is, in the presence of ignored notifications.

6.3 Prototype

We have implemented a prototype of Siena that realizes the subscription-based event notification service.⁴ The current implementation of Siena offers two application programming interfaces, one for C++ and the other for Java. Both interfaces provide nearly the complete data model and subscription language described in Section 3. The time data type is the only one that has not yet been implemented.

Two event servers are also provided in the current implementation. One (written in Java) is based on the hierarchical client/server algorithm, while the other one (written in C++) is based on the acyclic peer-to-peer architecture with subscription forwarding. These two servers have been used together to form a hybrid topology.

For the client/server and server/server communication in Siena, we have developed a simple event notification protocol that we have implemented on top of TCP/IP connections. We have also encapsulated application-level protocols such as HTTP and SMTP.

7. RELATED WORK

In this section we briefly review related work in event notification services. A more complete discussion of these topics is presented elsewhere [Carzaniga 1998; Carzaniga et al. 1999].

⁴ Source and binary packages are available at http://www.cs.colorado.edu/serl/siena/.


7.1 Classification Framework

In order to understand and classify technologies that are related to Siena, we can compare them from the perspective of their server architectures, which affect scalability, and from the perspective of their subscription languages, which affect expressiveness. Table 6 presents such a comparison in terms of the architectures described in Section 4 and in terms of a classification of subscription languages shown in Table 7.

We classify subscription languages based on their scope and expressive power. Scope has two aspects: (1) whether a subscription is limited to considering a single notification (thus reducing the language to that of filters) or whether it can consider multiple notifications (thus involving both filters and patterns); and (2) whether a subscription is limited to considering a single, designated field in a notification or whether it can consider multiple fields. Expressive power is concerned with the sophistication of operators that can be used in forming subscription predicates, ranging from a simple equality predicate, to expressions involving only predefined operators, to expressions involving user-defined operators. We note that user-defined operators suffer from the disadvantage of having arbitrary, unknown, and potentially unbounded complexity. Moreover, the computation of the covering relations that allow the pruning of propagation trees, such as $\prec^S_S$, might be undecidable.

From Table 7 we derive the four classes of subscription languages used in Table 6. In a channel-based language, a client subscribes for all notifications sent across an explicitly-identified channel, which is a discrete communication path. In a subject-based language, a client subscribes for all notifications that the publisher has identified as being relevant to a particular subject, which is selected from a predefined set of available subjects. The difference between channel-based and subject-based is that a channel typically allows only a straight equality test (e.g., channel=314 or channel=“CNN”) whereas a subject often subsumes richer predicates, such as wild-card string expressions on subject identifiers (e.g., subject=“comp.os.∗”). In both cases, the filter applies to a single well-known field. In a content-based language, a client subscribes for all notifications whose content matches client-specified predicates that are evaluated on the content; evaluation of these predicates can be limited either to individual notifications (a simple content-based language) or to patterns of multiple notifications (a content-based language with patterns). We observe that subscription languages with user-defined predicates are rare; in Table 6 we have combined the language classes corresponding to predefined and user-defined predicates because only a single entry, object-oriented active databases, makes use of user-defined predicates.
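The first three language classes can be contrasted with a minimal Python sketch in which a notification is modeled as a dictionary of fields. The representation and the function names are hypothetical and are not Siena's API; patterns over multiple notifications are not covered.

```python
from fnmatch import fnmatchcase


def channel_filter(channel):
    """Channel-based: a plain equality test on one well-known field."""
    return lambda notification: notification.get("channel") == channel


def subject_filter(pattern):
    """Subject-based: a wild-card expression on one well-known field."""
    return lambda notification: fnmatchcase(notification.get("subject", ""), pattern)


def content_filter(*constraints):
    """Content-based: predicates over any fields; all of them must hold.

    constraints: (field_name, predicate) pairs.
    """
    return lambda notification: all(
        field in notification and predicate(notification[field])
        for field, predicate in constraints
    )
```

A subject filter built from "comp.os.*" accepts a notification whose subject is "comp.os.linux" but rejects "comp.lang.c", while a content filter can combine constraints on several fields, such as a symbol and a price threshold.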

In the remainder of this section, we discuss the relationship between Siena and the other technologies mentioned in Table 6 in greater detail.

7.2 Related Technologies

The idea of integrating different components by means of messages was pioneered in a research system called Field [Reiss 1990]. As in several commercial products that followed (e.g., HP SoftBench [Cagan 1990], DEC FUSE [Hart and Lupton 1995], and Sun ToolTalk [Julienne and Holtz 1994]), Field implements a message-based integrated environment in which several software development tools can cooperate by exchanging messages. Messages carry service requests to other tools or announcements of changes of state. The domain of event notifications and subscriptions in these systems is limited. Tools can generate a fixed set of messages, and in some cases (e.g., in DEC FUSE), the set of messages is statically mapped into a set of callback procedures hard-wired into the tool.

Architecture (columns): centralized; hierarchical client/server; peer-to-peer.
Subscription language (rows):
channel-based: Field [Reiss 1990]; CORBA Event Service [Object Management Group 1998a]; Java Distributed Event Specification [Sun Microsystems, Inc. 1998]; Java Message Service [Sun Microsystems, Inc. 1999]; CORBA Event Service [Object Management Group 1998a]; IP multicast [Deering and Cheriton 1990]; SoftWired’s iBus
subject-based: ToolTalk [Julienne and Holtz 1994]; NNTP [Kantor and Lapsley 1986]; JEDI [Cugola et al. to appear]; TIBCO’s TIB/Rendezvous; Talarian’s SmartSockets; Vitria’s BusinessWare
content-based: Elvin [Segall and Arnold 1997]; Keryx [Wray and Hawkes 1998]; Yu et al. [Yu et al. 1999]; Gryphon [Banavar et al. 1999]
content-based with patterns: GEM [Mansouri-Samani and Sloman 1997]; Yeast [Krishnamurthy and Rosenblum 1995]; CORBA Notification Service [Object Management Group 1998b]; object-oriented active databases† [Ceri and Widom 1996]; Siena; Siena

†Allows user-defined operators in subscription predicates; all others support only predefined operators.

Table 6. A Classification of Related Technologies.

Power \ Scope | single notification, one field | single notification, multiple fields | multiple notifications, multiple fields
simple equality | channel-based | - | -
expressions with predefined operators | restricted subject-based | restricted content-based | restricted content-based with patterns
expressions with user-defined operators | general subject-based | general content-based | general content-based with patterns

Table 7. Typical Features of Subscription Languages.

Yeast [Krishnamurthy and Rosenblum 1995] is an event-action system supporting the definition of rules that specify the actions to be taken upon the occurrence of particular events. Unlike message-based integrated environments, Yeast is a general-purpose event notification service with a rich event pattern language. The action part of a rule is a UNIX shell script. GEM [Mansouri-Samani and Sloman 1997] is a more recent language that allows one to specify event-action rules similarly to Yeast. A different, more specialized kind of system that is conceptually equivalent to an event-action system is the active database [Ceri and Widom 1996]. The main difference between an event notification service like Siena and an event-action system like Yeast is that an event notification service only dispatches event notifications, so that responses to events (i.e., the actions) are executed by interested parties externally to the service. An event-action system or an active database, on the other hand, is responsible for also executing the actions taken in response to event notifications.

The USENET News system, with its main protocol NNTP [Kantor and Lapsley 1986], is perhaps the best example of a scalable, user-level, many-to-many communication facility. News articles (modeled after e-mail messages) are propagated through a network of servers. A new server can join the network by connecting as a slave to another (master) server that is already part of the infrastructure. Articles are posted to newsgroups, which are organized in a hierarchical name/subject space. NNTP provides a primitive filtering capability, such that articles can be selected by means of simple expressions denoting sets of group names and the dates of postings. For example, a slave server can request all the articles in the groups matching comp.os.* that have been posted after a given date. Although group names and subnames reflect the general subject and content of messages, the filter that they realize is too coarse-grained for most users and definitely inadequate for a general-purpose event notification service. As a result, it is common for news readers (the client programs of USENET News) to allow users to perform additional, more sophisticated filtering over the messages that have been transferred from the server, but this is outside the purview of the service itself and thus cannot be used to reduce network traffic. The service is scalable but still quite heavyweight, and in fact the time frame of article propagation ranges from hours to days, which is inadequate for an event notification service.
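The coarse granularity of this kind of filter can be seen in a small sketch, assuming articles are dictionaries with a `group` name and a `posted` timestamp. The function `select_articles` and the record layout are illustrative assumptions; this is not the NNTP wire protocol, only the selection it expresses (a group-name wildcard plus a date).

```python
from datetime import datetime
from fnmatch import fnmatchcase

def select_articles(articles, group_pattern, since):
    """Return the articles whose group matches group_pattern and that were
    posted strictly after `since`. Article content is never inspected."""
    return [
        a for a in articles
        if fnmatchcase(a["group"], group_pattern) and a["posted"] > since
    ]

# Hypothetical article records, for illustration only.
articles = [
    {"group": "comp.os.linux", "posted": datetime(2000, 5, 2), "body": "kernel"},
    {"group": "comp.os.minix", "posted": datetime(2000, 4, 1), "body": "fs"},
    {"group": "rec.music", "posted": datetime(2000, 5, 3), "body": "jazz"},
]
selected = select_articles(articles, "comp.os.*", datetime(2000, 5, 1))
```

Nothing in the selection can refer to the `body` field, which is why any finer filtering has to happen in the client after the articles have already crossed the network.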

IP multicast [Deering 1991] is a network-level infrastructure that extends the Internet protocol to realize a one-to-many communication service. The network that realizes this extension is also referred to as the MBone. A multicast address is a virtual IP address that corresponds to a group of hosts. IP datagrams that are addressed to a host group are routed to every host belonging to the group. Hosts can join or leave a group at any time, reporting their group membership with a specific protocol [Fenner 1997]. An event notification service can be thought of as a multicast communication infrastructure in which addresses are not explicit host addresses but rather arbitrary expressions of interest, and in which subscribing is equivalent to joining a group. However, the IP multicast infrastructure has major limitations when used as a generic event notification service. The first issue is how to map expressions of interest into IP group addresses in a scalable way. A separate service, perhaps similar to DNS, could be sufficient to resolve the mapping. However, we would have to assume that there exist enough multicast addresses to map every possible expression of interest. The second and most crucial issue is the limited expressiveness of the addressing scheme itself. In fact, since IP multicast never relates two different IP groups, it would not be possible to exploit similarities between subscriptions mapped to different IP group addresses. A notification matching more than one subscription would have to be sent to several separate multicast addresses, each copy being routed in parallel and independently by the IP multicast network.
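Both limitations can be made concrete with a rough sketch, under the assumption that a subscription is a single `(field, threshold)` pair read as `field > threshold`. The `GroupDirectory` class, the `matches` and `destinations` functions, and the base address are hypothetical; the directory stands in for the DNS-like mapping service mentioned above.

```python
import ipaddress

# Illustrative base address only; real group allocation is out of scope.
BASE = ipaddress.IPv4Address("239.0.0.0")

class GroupDirectory:
    """Hypothetical mapping service from expressions of interest to groups."""

    def __init__(self):
        self._groups = {}

    def group_for(self, subscription):
        # Every distinct expression of interest consumes one group address.
        if subscription not in self._groups:
            self._groups[subscription] = BASE + len(self._groups)
        return self._groups[subscription]

def matches(subscription, notification):
    """A subscription (field, threshold) matches when field > threshold."""
    field, threshold = subscription
    return field in notification and notification[field] > threshold

def destinations(directory, subscriptions, notification):
    """Groups to which a publisher must send one copy of the notification."""
    return sorted(
        {directory.group_for(s) for s in subscriptions if matches(s, notification)}
    )

directory = GroupDirectory()
subscriptions = [("price", 50), ("price", 100)]
copies = destinations(directory, subscriptions, {"price": 120})
```

Here `("price", 50)` is strictly more general than `("price", 100)`, yet the two map to unrelated groups, so the notification with price 120 is sent twice.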

Another network-level technology that is at least somewhat related to Siena is active networks. An active network is a network with programmable switches [Tennenhouse et al. 1997]. In a sense, the programmability feature of active networks is a form of content-based routing, since the content of the packets can govern packet routing in a very expressive manner and can be used to achieve routing optimizations. But while active networks themselves are not a general-purpose event notification service like Siena, they could nevertheless possibly be used as an implementation platform.

Two prominent efforts are intended to lead specifically to event notification services. These are the CORBA Notification Service [Object Management Group 1999] and the Java™ Message Service [Sun Microsystems, Inc. 1999]. It is important to note, however, that neither of these efforts itself realizes an event notification service. Rather, they simply specify interfaces to be implemented by a service, leaving unaddressed the critical issue of how an implementation is to provide a scalable service.

Commercial products such as SoftWired’s iBus, TIBCO’s TIB/Rendezvous™, Talarian’s SmartSockets™, Hewlett-Packard’s E-speak™, and the messaging system in Vitria’s BusinessWare™ provide implementations of an event notification service. While these products support distribution of the service, they are not specifically designed to support wide-area scale to the degree discussed in this paper. In particular, they typically achieve simple distribution through a federation of centralized servers with statically configured inter-server routing.


There are several research efforts concerned with the development of an event notification service, including IBM’s Gryphon [Aguilera et al. 1999; Banavar et al. 1999], Elvin [Segall and Arnold 1997], JEDI [Cugola et al. 1998], Keryx [Wray and Hawkes 1998], and the recent work of Yu et al. [Yu et al. 1999].

Gryphon is a distributed content-based message brokering system similar to Siena. The techniques developed in Gryphon are complementary to those of Siena. In particular, Gryphon uses a fast algorithm to match a notification to a large set of subscriptions [Aguilera et al. 1999]. This algorithm, similar to the one described by Gough and Smith [Gough and Smith 1995], exploits commonalities among subscriptions. That is, whenever two or more subscriptions specify a constraint on the same attribute, the algorithm organizes them so that the value of that attribute in each notification is tested once for all the subscriptions. The main difference from Siena is that Gryphon propagates every subscription everywhere in the network, whereas Siena propagates only the most generic subscriptions.
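The two ideas contrasted here can be sketched under a deliberately simplified, assumed subscription language in which a subscription is a dictionary mapping each attribute to a threshold, read as a conjunction of `attribute > threshold` constraints. This is neither Gryphon's actual matching algorithm nor Siena's full covering relation; `covers`, `most_generic`, `build_index`, and `match_all` are hypothetical names.

```python
from bisect import bisect_left
from collections import defaultdict

def covers(general, specific):
    """True if every notification matching `specific` also matches `general`."""
    return all(
        attr in specific and specific[attr] >= bound
        for attr, bound in general.items()
    )

def most_generic(subscriptions):
    """Keep only the subscriptions not covered by another, distinct one."""
    unique = []
    for s in subscriptions:
        if s not in unique:
            unique.append(s)
    return [
        s for s in unique
        if not any(covers(other, s) for other in unique if other != s)
    ]

def build_index(subscriptions):
    """Group all constraints by attribute, sorted by threshold."""
    index = defaultdict(list)
    for sub_id, sub in enumerate(subscriptions):
        for attr, bound in sub.items():
            index[attr].append((bound, sub_id))
    for entries in index.values():
        entries.sort()
    return index

def match_all(subscriptions, index, notification):
    """Return the ids of matching subscriptions, reading each attribute
    value of the notification once for all subscriptions."""
    satisfied = defaultdict(int)
    for attr, entries in index.items():
        if attr not in notification:
            continue
        value = notification[attr]  # single read shared by all constraints
        cut = bisect_left(entries, (value,))  # entries with bound < value
        for _, sub_id in entries[:cut]:
            satisfied[sub_id] += 1
    return sorted(
        i for i, sub in enumerate(subscriptions) if satisfied[i] == len(sub)
    )

subs = [{"price": 50}, {"price": 100, "volume": 10}, {"volume": 5}]
forwarded = most_generic(subs)
matched = match_all(subs, build_index(subs), {"price": 120, "volume": 7})
```

In this example the second subscription is covered by the first, so a server following the covering idea would forward only the first and third, while the index lets a server evaluate `price` and `volume` once each regardless of how many subscriptions constrain them.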

Yu et al. propose an event notification service implemented using a peer-to-peer architecture of proxy servers, with one server being the distinguished “root” server. In a sense their architecture is an amalgam of Siena’s hierarchical and acyclic peer-to-peer architectures. They believe that a hierarchical arrangement of servers scales better than a non-hierarchical one. However, they have not yet simulated or implemented their architecture, so its scalability properties are yet to be determined.

Elvin is a centralized event dispatcher with a rich event filtering language that allows complex expressions of interest. The centralized architecture facilitates efficient event filtering, although it poses severe limitations to its scalability. Keryx also provides structured event notifications and filtering capabilities extended to the whole structure of events. Keryx has a distributed architecture similar to that of USENET News servers. The architectures of TIB/Rendezvous and JEDI are also hierarchical, although their subscriptions are based on a simplified regular expression applied to a single string: the “subject” in TIB/Rendezvous or the entire notification in JEDI. Even simpler is the selection mechanism offered by iBus, which adopts channel-based addressing.

The same channel-based addressing is specified by the CORBA Event Service [Object Management Group 1998a] and by the Java™ Distributed Event Specification [Sun Microsystems, Inc. 1998]. The shortcomings of the channel-based addressing scheme have been recognized in both the CORBA and Java communities, and these specifications have therefore recently been superseded by more advanced service specifications that are in line with the interface of Siena. In particular, the OMG has added filtering based on notification contents with the specification of the CORBA Notification Service [Object Management Group 1999], while Sun has specified a radically new service, the Java™ Message Service [Sun Microsystems, Inc. 1999], that features SQL-like message selectors.

8. CONCLUSIONS

In this paper, we have described our work on Siena, a distributed, Internet-scale event notification service. We have described the design of the interface to the service, its semantics, the topological arrangements of event servers, and the routing algorithms that realize the service over a network of servers. The simulations that we performed confirm our intuitions about the scalability of the topologies and algorithms that we have studied. To summarize, we found that the hierarchical architecture is suitable for low densities of clients that subscribe (and unsubscribe) very frequently, whereas the peer-to-peer architecture performs better when the total cost of communication is dominated by notifications. In situations where there are high numbers of ignored notifications (i.e., notifications for which there are no subscribers), the peer-to-peer architecture is also superior to the hierarchical architecture. We plan to continue exploring the parameter space of our simulations in several directions. In particular, we are simulating different ranges of behavioral parameters to see which algorithms are most sensitive to different classes of applications.

We plan on extending our design and prototype implementation of Siena in a number of ways. For instance, we plan to enhance the design of the interface and algorithms to support mobility of clients. We also plan to implement the advertisement-forwarding algorithm in the prototype, which will also allow us to apply the pattern matching optimizations that we discussed. Additionally, we plan to work on extensions that support the expression of quality of service parameters especially suited to the integration of software components. These new features would allow the implementation of grouping mechanisms, such as transactions for notifications. Finally, we plan to explore other important aspects of a wide-area event notification service. Specifically, we are currently developing a model of secure publish/subscribe communication, as well as mechanisms for reliability and fault tolerance of event notification services.

A significant new direction we intend to explore in the future is a realization of content-based routing as a fundamental network service provided within the physical network fabric itself [Carzaniga et al. 2000b]. This essentially involves replacing the architecture described at the beginning of Section 4, where we assume an event notification service such as Siena to be implemented on top of a lower-level network protocol such as TCP/IP. Viewed differently, this involves embedding the content processing and routing capabilities of the upper layer of Figure 18 within the network topology of the lower layer of that figure. Content-based routing supported in this fashion could portend even more efficient and scalable event notification services, supported by a new class of networks built from high-speed content-based routers.

Acknowledgments

We thank Gianpaolo Cugola, Elisabetta Di Nitto, Alfonso Fuggetta, Richard Hall, Dennis Heimbigner, and Andre van der Hoek for their considerable contributions in discussing and shaping many of the ideas presented in this paper.

REFERENCES

Aguilera, M. K., Strom, R. E., Sturman, D. C., Astley, M., and Chandra, T. D. 1999. Matching events in a content-based subscription system. In Eighteenth ACM Symposium on Principles of Distributed Computing (PODC ’99) (Atlanta, GA, May 4–6 1999), pp. 53–61.

Banavar, G., Chandra, T. D., Mukherjee, B., Nagarajarao, J., Strom, R. E., and Sturman, D. C. 1999. An efficient multicast protocol for content-based publish-subscribe systems. In The 19th IEEE International Conference on Distributed Computing Systems (ICDCS ’99) (Austin, TX, May 1999), pp. 262–272.

Birman, K. P. 1993. The process group approach to reliable distributed computing. Communications of the ACM 36, 12 (Dec.), 36–53.

Cagan, M. R. 1990. The HP SoftBench environment: An architecture for a new generation of software tools. Hewlett-Packard Journal: technical information from the laboratories of Hewlett-Packard Company 41, 3 (June), 36–47.

Carzaniga, A. 1998. Architectures for an Event Notification Service Scalable to Wide-area Networks. Ph.D. thesis, Politecnico di Milano, Milano, Italy.

Carzaniga, A., Rosenblum, D. S., and Wolf, A. L. 1999. Interfaces and algorithms for a wide-area event notification service. Technical Report CU-CS-888-99 (Oct.), Department of Computer Science, University of Colorado. Revised May 2000.

Carzaniga, A., Rosenblum, D. S., and Wolf, A. L. 2000a. Achieving scalability and expressiveness in an internet-scale event notification service. In Proceedings of the Nineteenth ACM Symposium on Principles of Distributed Computing (PODC 2000) (Portland, OR, July 2000), pp. 219–227.

Carzaniga, A., Rosenblum, D. S., and Wolf, A. L. 2000b. Content-based addressing and routing: A general model and its application. Technical Report CU-CS-902-00 (Jan.), Department of Computer Science, University of Colorado.

Ceri, S. and Widom, J. 1996. Active Database Systems: Triggers and Rules for Advanced Database Processing. Morgan Kaufmann, San Mateo.

Clark, D. 1989. Policy routing in internet protocols. Internet Requests For Comments (RFC) 1102.

Cugola, G., Di Nitto, E., and Fuggetta, A. 1998. Exploiting an event-based infrastructure to develop complex distributed systems. In Proceedings of the 20th International Conference on Software Engineering (ICSE ’98) (Kyoto, Japan, April 1998), pp. 261–270.

Cugola, G., Di Nitto, E., and Fuggetta, A. To appear. The JEDI event-based infrastructure and its application to the development of the OPSS WFMS. IEEE Transactions on Software Engineering.

Dalal, Y. K. and Metcalfe, R. M. 1978. Reverse path forwarding of broadcast packets. Communications of the ACM 21, 12 (Dec.), 1040–1048.

Deering, S. E. 1991. Multicast Routing in a Datagram Internetwork. Ph.D. thesis, Stanford University.

Deering, S. E. and Cheriton, D. R. 1990. Multicast routing in datagram networks and extended LANs. ACM Transactions on Computer Systems 8, 2 (May), 85–111.

Fenner, W. 1997. Internet group management protocol, version 2. Internet Requests For Comments (RFC) 2236.

Gough, J. and Smith, G. 1995. Efficient recognition of events in a distributed system. In Proceedings of the 18th Australasian Computer Science Conference (Adelaide, Australia, Feb. 1995).

Hart, R. O. and Lupton, G. 1995. DEC FUSE: Building a graphical software development environment from UNIX tools. Digital Technical Journal of Digital Equipment Corporation 7, 2 (Spring), 5–19.

Julienne, A. M. and Holtz, B. 1994. ToolTalk and open protocols, inter-application communication. Prentice-Hall, Englewood Cliffs, New Jersey.

Kantor, B. and Lapsley, P. 1986. Network news transfer protocol: A proposed standard for the stream-based transmission of news. Internet Requests For Comments (RFC) 977.

Krishnamurthy, B. and Rosenblum, D. S. 1995. Yeast: A general purpose event-action system. IEEE Transactions on Software Engineering 21, 10 (Oct.), 845–857.

Mansouri-Samani, M. and Sloman, M. 1997. GEM: A generalized event monitoring language for distributed systems. IEE/IOP/BCS Distributed Systems Engineering Journal 4, 2 (June), 96–108.

Object Management Group. 1998a. CORBAservices: Common object service specification. Technical report (July), Object Management Group.


Object Management Group. 1998b. Notification service. Technical report (Nov.), Object Management Group.

Object Management Group. 1999. Notification service. Technical report (Aug.), Object Management Group.

Reiss, S. P. 1990. Connecting tools using message passing in the Field environment. IEEE Software 7, 4 (July), 57–66.

Rosenblum, D. S. and Wolf, A. L. 1997. A design framework for Internet-scale event observation and notification. In Proceedings of the Sixth European Software Engineering Conference, Number 1301 in Lecture Notes in Computer Science (1997), pp. 344–360. Springer-Verlag.

Segall, B. and Arnold, D. 1997. Elvin has left the building: A publish/subscribe notification service with quenching. In Proceedings of AUUG97 (Brisbane, Australia, Sept. 3–5 1997), pp. 243–255.

Sun Microsystems, Inc. 1998. Java Distributed Event Specification. Mountain View, CA: Sun Microsystems, Inc.

Sun Microsystems, Inc. 1999. Java Message Service. Mountain View, CA: Sun Microsystems, Inc.

Tennenhouse, D. L., Smith, J. M., Sincoskie, W. D., Wetherall, D. J., and Minden, G. J. 1997. A survey of active network research. IEEE Communications Magazine 35, 1 (Jan.), 80–86.

Wray, M. and Hawkes, R. 1998. Distributed virtual environments and VRML: an event-based architecture. Computer Networks and ISDN Systems 30, 1–7 (April), 43–51.

Yu, H., Estrin, D., and Govindan, R. 1999. A hierarchical proxy architecture for Internet-scale event services. In Proceedings of WETICE ’99 (Stanford, CA, June 1999).

Zegura, E. W., Calvert, K., and Donahoo, M. J. 1997. A quantitative comparison of graph-based models for internet topology. IEEE/ACM Transactions on Networking 5, 6 (Dec.), 770–783.

Zegura, E. W., Calvert, K. L., and Bhattacharjee, S. 1996. How to model an internetwork. In Proceedings of IEEE INFOCOM ’96 (San Francisco, CA, April 1996), pp. 594–602.