Publisher Placement Algorithms in Content-based Publish/Subscribe
Alex King Yeung Cheung and Hans-Arno Jacobsen, University of Toronto
June 24th, 2010, ICDCS 2010
Middleware Systems Research Group


POP - Publisher Optimistic Placement


Problem
- Publishers can join anywhere in the broker overlay, typically at the closest broker
- Impact:
  - High delivery delay
  - High system utilization: matching, bandwidth, subscription storage


In many pub/sub systems, publishers join at the closest broker, whatever closest means: network-wise, geographically, etc. However, this is a problem because the publisher may be arbitrarily far away from the set of matching subscribers.

Motivation
- High system utilization leads to overloads
- High response times
- Reliability issues

Critical for enterprise-grade publish/subscribe systems:
- GooPS: Google's internal publish/subscribe middleware
- SuperMontage: Tibco's pub/sub distribution network for Nasdaq's quote and order processing system
- GDSN (Global Data Synchronization Network): global pub/sub network that allows retailers and suppliers to exchange supply chain data

Goal
- Adaptively move the publisher to the area of matching subscribers
- Algorithms should be: dynamic, transparent, scalable, robust


Our goal is to dynamically compute the best location for the publisher and transparently move the publisher to that location. The solution should complement existing pub/sub systems, scale with the number of brokers and clients, and remain robust by not introducing any single point of failure.

Terminology
[Figure: a linear overlay B1-B2-B3-B4-B5 with publisher P; relative to a reference broker, upstream is toward the publisher and downstream is away from it; publications flow downstream.]

In the rest of my presentation, I will use the terms upstream and downstream to refer to brokers that are in the direction of the publisher or away from the publisher, respectively. In this picture, if B3 is the reference broker, then the brokers between B3 and the publisher are upstream and the rest are downstream.

Publisher Placement Algorithms
- POP (Publisher Optimistic Placement)
  - Fully distributed design
  - Retrieves trace information per traced publication
  - Uses one metric: number of publication deliveries downstream
- GRAPE (Greedy Relocation Algorithm for Publishers of Events)
  - Computations are centralized at each publisher's broker, which makes implementing and debugging easier
  - Retrieves trace information per trace session
  - Can be customized to minimize delivery delay, broker load, or a specified combination of both
  - Uses two metrics: average delivery delay and total system message rate
Goal: move publishers to where the subscribers are, based on past publication traffic.

In this work, we developed two publisher placement algorithms.

Choice of Minimizing Delivery Delay or Load
[Figure: a publisher P publishes [class,'STOCK'], [symbol,'GOOG'], [volume,9900000]. One set of subscribers holds [class,=,'STOCK'], [symbol,=,'GOOG'], [volume,>,1000000]; another holds [class,=,'STOCK'], [symbol,=,'GOOG'], [volume,>,0]. The two delivery branches carry 4 msg/s and 1 msg/s. A weight slider ranges from 100% load minimization to 100% delay minimization.]
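For concreteness, here is a minimal content-based matching sketch in Python; the predicate encoding, the OPS table, and the matches helper are illustrative assumptions, not the paper's implementation:

```python
# Content-based matching sketch: a subscription is a list of
# (attribute, operator, value) predicates; a publication is a dict.
OPS = {
    '=': lambda a, b: a == b,
    '>': lambda a, b: a is not None and a > b,
    '<': lambda a, b: a is not None and a < b,
}

def matches(subscription, publication):
    """True iff every predicate in the subscription holds."""
    return all(OPS[op](publication.get(attr), value)
               for attr, op, value in subscription)

pub = {'class': 'STOCK', 'symbol': 'GOOG', 'volume': 9_900_000}
high_vol = [('class', '=', 'STOCK'), ('symbol', '=', 'GOOG'), ('volume', '>', 1_000_000)]
any_vol = [('class', '=', 'STOCK'), ('symbol', '=', 'GOOG'), ('volume', '>', 0)]
assert matches(high_vol, pub) and matches(any_vol, pub)
```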

GRAPE allows one to choose between minimizing delivery delay and minimizing load. This slide shows that relocating the publisher to minimize delivery delay may not minimize the system load.

GRAPE's 3 Phases
- Phase 1: Discover the location of publication deliveries by tracing live publication messages; retrieve trace and broker performance information
- Phase 2: Pinpoint the broker that minimizes the average delivery delay or system load, in a centralized manner
- Phase 3: Migrate the publisher to the broker decided in Phase 2, transparently, with minimal routing table update and message overhead


Just to recap the goal of POP: to achieve it, POP utilizes a 3-phase algorithm. The challenges are computational: we don't want a super elaborate probabilistic scheme for tagging.

Phase 1 Illustration

[Figure: GRAPE's data structure per publisher. Each traced publication is keyed by its message ID (e.g., B34-M213) and carries its trace session ID (e.g., B34-M212); the broker records the number of matching subscribers per publication, a bit vector marking which traced publications were delivered to local subscribers, and the total number of deliveries made to local subscribers.]

We need bit vectors because we are dealing with a content-based system.

Phase 1 Trace Data and Broker Performance Retrieval
[Figure: publisher P at B1 with subscribers at downstream brokers; replies propagate upstream and accumulate broker records along the way: Reply(B8), Reply(B7), Reply(B8, B7, B6), Reply(B8, B7, B6, B5).]
Once Gthreshold publications are traced, the trace session ends.
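A minimal sketch of the per-publisher trace state described above, assuming bit vectors are stored as integers; TraceLog and its fields are hypothetical names, not GRAPE's actual API:

```python
from dataclasses import dataclass

@dataclass
class TraceLog:
    """Per-publisher trace state kept at a broker (hypothetical sketch).

    Bit i of `bits` is set iff the i-th traced publication of the
    session was delivered to at least one local subscriber.
    """
    session_id: str            # message ID of the session's first publication
    bits: int = 0              # delivery bit vector, stored as an int
    seen: int = 0              # publications traced so far in this session
    local_deliveries: int = 0  # total deliveries made to local subscribers

    def record(self, matching_subscribers: int) -> None:
        """Log one traced publication and its local delivery count."""
        if matching_subscribers > 0:
            self.bits |= 1 << self.seen
            self.local_deliveries += matching_subscribers
        self.seen += 1

# Example: five traced publications with 5, 0, 10, 5, 1 local matches.
log = TraceLog(session_id='B34-M212')
for n in (5, 0, 10, 5, 1):
    log.record(n)
assert (log.bits, log.local_deliveries) == (0b11101, 21)
```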

Contents of Trace Reply in Phase 1
- Broker ID
- Neighbor ID(s)
- Bit vector (for estimating total system message rate)
- Total number of local deliveries (for estimating end-to-end delivery delay)
- Input queuing delay
- Average matching delay
- Output queuing delays to neighbor(s) and binding(s)

Message overhead-wise, GRAPE adds one reply message per trace session; the reply contents are sketched below.
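As a rough illustration, the reply contents could be modeled as below; all field names are assumptions, and the helper simply shows how one reply accumulates broker records on its way upstream:

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class TraceReply:
    """One broker's record inside the Phase 1 trace reply (sketch)."""
    broker_id: str
    neighbor_ids: List[str]
    bits: int                            # delivery bit vector
    local_deliveries: int                # total local deliveries this session
    input_delay_ms: float                # input queuing delay
    matching_delay_ms: float             # average matching delay
    output_delay_ms: Dict[str, float]    # per neighbor/binding output queuing delay

def extend_reply(collected: List[TraceReply], mine: TraceReply) -> List[TraceReply]:
    """Each broker appends its own record to the records gathered from
    downstream, so a single reply per session reaches the publisher's broker."""
    return collected + [mine]
```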

Phase 2 Broker Selection
- Simulate placing the publisher at every downstream broker and estimate the average end-to-end delivery delay, using:
  - Local delivery counts
  - Processing delay at each broker (queuing and matching delays)
  - Publisher ping times to each broker
- Simulate placing the publisher at every downstream broker and estimate the total system message rate, using:
  - Bit vectors

Phase 2 Estimating Average End-to-End Delivery Delay
[Figure: publisher P with a 10 ms ping time to the candidate broker B1; 1 subscriber at B1, 2 at B6, 9 at B7, 5 at B8. Each broker is annotated with its input queuing delay, matching delay, and per-destination output queuing delays, e.g., B1: 30 ms input, 20 ms matching, 100 ms output to its local binding (RMI), 50 ms output to its downstream neighbor.]

Estimated delays (ping + per-hop input, matching, and output delays, times the subscriber count):
- Subscriber at B1: 10 + (30+20+100) × 1 = 160 ms
- Subscribers at B6: 10 + [(30+20+50) + (20+5+45)] × 2 = 350 ms
- Subscribers at B7: 10 + [(30+20+50) + (20+5+40) + (30+10+70)] × 9 = 2,485 ms
- Subscribers at B8: 10 + [(30+20+50) + (20+5+35) + (35+15+75)] × 5 = 1,435 ms
- Average end-to-end delivery delay: 10 + (150 + 340 + 2,475 + 1,425) ÷ 17 = 268 ms
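A small sketch that reproduces the arithmetic above; the per-path delay sums are taken from the example, while the paths dictionary layout is mine:

```python
# Each path is a list of per-broker delay sums
# (input queuing + matching + output queuing), and the count is the
# number of subscribers at the path's end broker.
PING_MS = 10
paths = {
    'B1': ([30 + 20 + 100], 1),
    'B6': ([30 + 20 + 50, 20 + 5 + 45], 2),
    'B7': ([30 + 20 + 50, 20 + 5 + 40, 30 + 10 + 70], 9),
    'B8': ([30 + 20 + 50, 20 + 5 + 35, 35 + 15 + 75], 5),
}

total_delay = sum(sum(hops) * count for hops, count in paths.values())
total_subs = sum(count for _, count in paths.values())
average = PING_MS + total_delay / total_subs
print(f'average end-to-end delay: {average:.0f} ms')   # -> 268 ms
```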

Phase 2 Estimating Total Broker Message Rate
[Figure: the broker tree with leaf bit vectors 10000, 00001, 01111, and 00100; aggregated vectors 11111 and 01111 at the upstream brokers.]
- Bit vectors are necessary to capture publication deliveries to local subscribers in content-based pub/sub systems
- The message rate through a broker is calculated by using the OR bit operator to aggregate the bit vectors of all downstream brokers

We need bit vectors because we are dealing with a content-based system.
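A sketch of that OR-aggregation, using the bit vectors from the figure (stored as Python ints; the function name is mine):

```python
# OR-aggregate downstream bit vectors to estimate the message rate
# through a broker: a set bit means that traced publication crosses
# the broker, so rate ~ popcount / trace window.
def broker_bits(downstream: list[int], local: int = 0) -> int:
    agg = local
    for bits in downstream:
        agg |= bits
    return agg

b7, b8, b6_local, b1_local = 0b01111, 0b00100, 0b00001, 0b10000
b6 = broker_bits([b7, b8], local=b6_local)   # -> 0b01111
b1 = broker_bits([b6], local=b1_local)       # -> 0b11111
assert bin(b1).count('1') == 5               # all 5 traced pubs cross B1
```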

Phase 2 Minimizing Delivery Delay with Weight P%
1. Get publisher-to-broker ping times
2. Calculate the average delivery delay if the publisher is positioned at each of the downstream brokers
3. Normalize, sort, and drop candidates with average delivery delays greater than 100 - P
4. Calculate the total broker message rate if the publisher is positioned at each of the remaining candidate brokers
5. Select the candidate that yields the lowest total system message rate
A sketch of this selection procedure follows.
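Here is a minimal sketch of the weighted selection, assuming delays are normalized to a 0-100 scale; pick_broker and the sample numbers are illustrative:

```python
def pick_broker(delay_ms: dict[str, float], rate_of: callable, weight_p: float) -> str:
    """Keep brokers whose normalized average delivery delay is at most
    100 - weight_p, then take the lowest total system message rate
    among the survivors."""
    lo, hi = min(delay_ms.values()), max(delay_ms.values())
    span = (hi - lo) or 1.0
    normalized = {b: 100 * (d - lo) / span for b, d in delay_ms.items()}
    survivors = [b for b, n in sorted(normalized.items(), key=lambda kv: kv[1])
                 if n <= 100 - weight_p]
    return min(survivors, key=rate_of)

# With P=100 only the fastest broker survives; with P=0 the
# lowest-rate broker wins regardless of delay.
delays = {'B1': 268.0, 'B6': 180.0, 'B7': 240.0}
rates = {'B1': 40, 'B6': 90, 'B7': 55}
print(pick_broker(delays, rates.get, weight_p=50))   # -> 'B6'
```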


POP's 3 Phases
- Phase 1: Discover the location of publication deliveries by probabilistically tracing live publication messages
- Phase 2: Pinpoint the broker closest to the set of matching subscribers using trace data from Phase 1, in a decentralized fashion
- Phase 3: Migrate the publisher to the broker decided in Phase 2, transparently, with minimal routing table update and message overhead

Just to recap the goal of POP: to achieve it, POP utilizes a 3-phase algorithm. The challenges are computational: we don't want a super elaborate probabilistic scheme for tagging.

Phase 1 Publication Tracing
[Figure: broker tree B1-B8 with publisher P and subscriber groups of 2x, 4x, 3x, 1x, 9x, 5x, and 1x at downstream brokers; replies (Reply 9, Reply 5, Reply 15, Reply 15) flow upstream and are recorded in each broker's Publisher Profile Table.]
Multiple publication traces are aggregated by: S_i = α · S_new + (1 − α) · S_(i−1)

9x means 9 matching subscribers are represented by the S symbol.
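A sketch of that aggregation, which is an exponentially weighted moving average; the value of α here (0.25) is an assumption:

```python
# Exponentially weighted moving average over successive trace replies:
#   S_i = alpha * S_new + (1 - alpha) * S_{i-1}
def ewma(samples, alpha=0.25):
    s = None
    for s_new in samples:
        s = s_new if s is None else alpha * s_new + (1 - alpha) * s
    return s

# Downstream-delivery counts reported by successive traces:
print(ewma([9, 5, 15, 15]))   # smoothed estimate, ~11.1 with alpha=0.25
```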

Any questions?

Phase 2 Broker Selection
[Figure: the same broker tree; the selection message carries AdvId: P, DestId: null, and the broker list B1, B5, B6 as the algorithm walks toward B6 using each broker's Publisher Profile Table entries.]

Notice that there are a number of brokers whose PPCs were not used, which means POP does not need detailed global knowledge of the entire system, just the aggregated result.

Experiment Setup
Experiments on both PlanetLab and a cluster testbed

PlanetLab:
- 63 brokers, 1 broker per box
- 20 publishers with publication rates of 10-40 msg/min
- 80 subscribers per publisher, 1,600 subscribers in total
- Pthreshold of 50, Gthreshold of 50

Cluster testbed:
- 127 brokers, up to 7 brokers per box
- 30 publishers with publication rates of 30-300 msg/min
- 200 subscribers per publisher, 6,000 subscribers in total
- Pthreshold of 100, Gthreshold of 100

Experiment Setup - Workloads
Two workloads:
- Random scenario
  - 5% of subscribers are high-rated: they sink all traffic from their publisher
  - 25% are medium-rated: they sink ~50% of traffic
  - 70% are low-rated: they sink ~10% of traffic
  - Subscribers are randomly placed on N brokers
- Enterprise scenario
  - 5% are high-rated: they sink all traffic from their publisher
  - 95% are low-rated: they sink ~10% of traffic
  - All high-rated subscribers are clustered onto one broker, and all low-rated subscribers onto the other N-1 brokers
A sampler for the Random scenario is sketched below.
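To illustrate, a sampler for the Random scenario's subscriber mix; the sink fractions come from the slide, while the seed and population size are arbitrary:

```python
import random

# Sample a subscriber class for the Random scenario: 5% high-rated
# (sink all traffic), 25% medium-rated (~50%), 70% low-rated (~10%).
def sample_subscriber(rng: random.Random):
    r = rng.random()
    if r < 0.05:
        return ('high', 1.0)
    if r < 0.30:
        return ('medium', 0.5)
    return ('low', 0.1)

rng = random.Random(42)
classes = [sample_subscriber(rng)[0] for _ in range(1600)]
print({c: classes.count(c) for c in ('high', 'medium', 'low')})
```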


Average Input Utilization Ratio vs Subscriber Distribution Graph

Load reduction by up to 68%

In this graph, lower is better because it means lower load is imposed onto the brokers.

Average Delivery Delay vs Subscriber Distribution Graph

Delivery delay reduction by up to 68%

Average Message Overhead Ratio vs Subscriber Distribution Graph


Conclusions
- POP and GRAPE move publishers to areas of matching subscribers to:
  - Reduce load in the system to increase scalability, and/or
  - Reduce the average delivery delay of publication messages to improve performance
- POP is suitable for pub/sub systems that strive for simplicity, such as GooPS
- GRAPE is suitable for systems that strive to minimize one metric in the extreme, such as system load in sensor networks or delivery delay in SuperMontage, or that want the flexibility to adjust the performance tradeoff based on resource usage


Related Approaches
Filter-based publish/subscribe:
- Re-organize the broker overlay to minimize delivery delay and system load
- R. Baldoni et al., The Computer Journal, 2007
- Migliavacca et al., DEBS 2007

Multicast-based publish/subscribe:
- Assign similar subscriptions to one or more clusters of servers
- Suitable for static workloads
- May yield false-positive publication deliveries
- Architecture is fundamentally different from filter-based approaches
- Riabov et al., ICDCS 2002 and 2003
- Voulgaris et al., IPTPS 2006
- Baldoni et al., DEBS 2007

Let's take a look at the related approaches. In terms of filter-based publish/subscribe systems, which are the focus of our work, there are two pieces of literature that reconfigure the broker overlay to minimize system load and delivery delay. In our work, we only relocate the publisher clients while keeping the broker overlay intact. Another approach in the literature is clustering subscribers with similar interests together to eliminate any forwarding brokers. However, that system architecture is fundamentally different from our work.

Average Broker Message Rate vs Subscriber Distribution Graph

Average Output Utilization Ratio vs Subscriber Distribution Graph

Average Delivery Delay vs Subscriber Distribution Graph

Average Hop Count vs Subscriber Distribution Graph

Average Broker Message Rate vs Subscriber Distribution Graph

Average Delivery Delay vs Subscriber Distribution Graph

Average Message Overhead Ratio vs Time Graph

Message Rate vs Time Graph

Average Delivery Delay vs Time Graph

Average Hop Count vs Time Graph

Broker Selection Time vs Migration Hop Count Graph

Broker Selection Time vs Migration Hop Count Graph

Publisher Wait Time vs Migration Hop Count Graph

Results Summary
Under the random workload:
- No significant performance differences between POP and GRAPE
- The prioritization metric and weight have almost no impact on GRAPE's performance
- Increasing the number of publication samples in POP:
  - Increases the response time
  - Increases the amount of message overhead
  - Increases the average broker message rate
- GRAPE reduces the input utilization ratio by up to 68%, the average message rate by 84%, the average delivery delay by 68%, and the message overhead relative to POP by 91%

Phase 1 Logging Publication History
- Each broker records, per publisher, the publications delivered to local subscribers
- Each trace session is identified by the message ID of the first publication of that session
- The trace session ID is carried in the header of each subsequent publication message
- Gthreshold publications are traced per trace session
See the sketch of this logging step below.
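A broker-side sketch of this logging step, as referenced above; the class name, dictionary layout, and the reply placeholder are assumptions:

```python
G_THRESHOLD = 50   # publications traced per trace session (PlanetLab setting)

class PublicationHistory:
    """Per-publisher, per-session delivery logging at a broker (sketch)."""
    def __init__(self):
        # trace session ID -> (publications seen, local deliveries)
        self.sessions = {}

    def on_publication(self, session_id: str, local_matches: int) -> None:
        seen, delivered = self.sessions.get(session_id, (0, 0))
        seen, delivered = seen + 1, delivered + local_matches
        self.sessions[session_id] = (seen, delivered)
        if seen >= G_THRESHOLD:
            self.end_session(session_id)

    def end_session(self, session_id: str) -> None:
        # Placeholder: package the counts into a trace reply and send upstream.
        self.sessions.pop(session_id)

h = PublicationHistory()
h.on_publication('B34-M212', local_matches=5)
```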

POP - Intro
- POP: Publisher Optimistic Placement
- Goal: move publishers to the area with the highest publication delivery or concentration of matching subscribers

POP's Methodology Overview
A 3-phase algorithm:
- Phase 1: Discover the location of publication deliveries by probabilistically tracing live publication messages; ongoing and efficient, with minimal network, computational, and storage overhead
- Phase 2: Pinpoint the broker closest to the set of matching subscribers using trace data from Phase 1, in a decentralized fashion
- Phase 3: Migrate the publisher to the broker decided in Phase 2, transparently, with minimal routing table update and message overhead

Just to recap the goal of POP: to achieve it, POP utilizes a 3-phase algorithm. The challenges are computational: we don't want a super elaborate probabilistic scheme for tagging.

Phase 1 Aggregated Replies
[Figure: the same broker tree as before; replies (Reply 9, Reply 5, Reply 15, Reply 15) flow upstream and are recorded in each broker's Publisher Profile Table.]
Multiple publication traces are aggregated by: S_i = α · S_new + (1 − α) · S_(i−1)
9x means 9 matching subscribers are represented by the S symbol.

Any questions?

Phase 2 Decentralized Broker Selection Algorithm
- Phase 2 starts when Pthreshold publications are traced
- Goal: pinpoint the broker that is closest to the highest concentration of matching subscribers, using trace information from only a subset of brokers
- The Next Best Broker condition: the next best neighboring broker is the one whose number of downstream subscribers is greater than the sum of all other neighbors' downstream subscribers plus the local broker's subscribers (see the sketch below)
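A sketch of the Next Best Broker check against a broker's per-neighbor subscriber counts; the dictionary layout and the example counts are illustrative:

```python
# Next Best Broker condition: move toward neighbor n iff
#   downstream_subs[n] > sum(downstream_subs of all other neighbors) + local_subs
def next_best_broker(downstream_subs: dict[str, int], local_subs: int):
    total = sum(downstream_subs.values())
    for neighbor, subs in downstream_subs.items():
        if subs > (total - subs) + local_subs:
            return neighbor
    return None   # no neighbor dominates: the local broker is the best choice

# Illustrative counts: one neighbor leads to 16 downstream subscribers,
# the other to 3, and the local broker hosts 1 subscriber.
print(next_best_broker({'B5': 16, 'B2': 3}, local_subs=1))   # -> 'B5'
```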

Trace information is stored in the Publisher Profile Cache (PPC).

Phase 2 Example
[Figure: the broker tree from Phase 1; the selection message carries AdvId: P, DestId: null, and the broker list B1, B5, B6 as the algorithm walks toward B6.]
Notice that there are a number of brokers whose PPCs were not used, which means POP does not need detailed global knowledge of the entire system, just the aggregated result.

Phase 3 - Example
[Figure: migrating P to B6: each broker on the path updates the last hop of P to B6, removes all subscriptions whose last hop points toward the next broker, and forwards all matching subscriptions back as needed. Open question on the slide: how do we tell when all subscriptions are processed by B6 before P can publish again? DONE.]

Phase 2 Minimizing Load with Weight P%
1. Calculate the total broker message rate if the publisher is positioned at each of the downstream brokers
2. Normalize, sort, and drop candidates with total message rates greater than 100 - P
3. Get publisher-to-broker ping times for the remaining candidates
4. Calculate the average delivery delay if the publisher is positioned at each of the remaining downstream brokers
5. Select the candidate that yields the lowest average delivery delay
