
Event-Condition-Action Rule Languages over Semistructured Data

George Papamarkos

13/10/2006

Outline

What Event-Condition-Action (ECA) Rules are and what we can do with them?

ECA Rules for XML: ECA Language, System Architecture, Performance

ECA Rules for RDF: ECA Language, System Architecture, Performance


What is an ECA Rule?

An Event-Condition-Action rule performs actions in response to events, given that a stated condition holds

An event in a database system can be the insertion of a new tuple

The condition can be a query

The action may be a relational table update

This behaviour is called reactive functionality


What is an ECA Rule?

An ECA rule has the general syntax:

on event if condition do action

The event part specifies when the rule is triggered

The condition part determines if the data are in a particular state, in which case the rule fires

The action part describes the actions to be performed if the rule fires.
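The three parts can be illustrated with a minimal Python sketch. This is not the system described in these slides; the `EcaRule` class, the `process` function, and the `orders`/`flagged` tables are hypothetical names chosen only to show how an event, a condition and an action fit together.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class EcaRule:
    event: str                                # event type that triggers the rule
    condition: Callable[[dict, Any], bool]    # query over the data and the change
    action: Callable[[dict, Any], None]       # update performed when the rule fires

def process(rules, db, event, delta):
    """Run every rule triggered by `event` whose condition holds; return how many fired."""
    fired = 0
    for rule in rules:
        if rule.event == event and rule.condition(db, delta):
            rule.action(db, delta)
            fired += 1
    return fired

# Hypothetical rule: on insertion of an order, if its amount exceeds 1000,
# record the order id in a second table.
tables = {"orders": [], "flagged": []}
rules = [EcaRule(event="insert_order",
                 condition=lambda db, row: row["amount"] > 1000,
                 action=lambda db, row: db["flagged"].append(row["id"]))]

for row in ({"id": 1, "amount": 50}, {"id": 2, "amount": 5000}):
    tables["orders"].append(row)
    process(rules, tables, "insert_order", row)

print(tables["flagged"])
```

Only the second insertion satisfies the condition, so only that rule instance fires.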


Advantages of using ECA Rules

Allow an application's reactive functionality to be defined and managed within a single rule base rather than being encoded in the programs

Use a high-level declarative syntax and are thus amenable to analysis and optimisation techniques that cannot be applied if the functionality is encoded in program code


Outline

What Event-Condition-Action (ECA) Rules are and what we can do with them?

ECA Rules for XML: ECA Language, System Architecture, Performance

ECA Rules for RDF: ECA Language, System Architecture, Performance


ECA Rules for XML - Outline

Design issues of an ECA language for XML

The XTL Language

Implementing an XTL rules processing system

Performance Study


Design issues of an ECA language for XML

Compared with relational triggers, the following are the most important XML-specific issues in designing an ECA language for XML:

Event Granularity: Specifying the granularity at which data has been modified is more complex and requires path expressions

Action Granularity: An action may affect an entire sub-document, meaning that:
  An action can trigger a different set of events
  The analysis of which events are triggered by an action cannot be based on syntax alone


The XTL Language

The general syntax of XTL rules is:

on event if condition do action

Fragments of XPath and XQuery are used to specify the event, condition and action parts of XTL rules.

XPath is used for selecting and matching fragments of XML

XQuery is used within actions where a new XML fragment needs to be constructed


The XTL Language

Event Part

Syntax: (INSERT | DELETE) e

where e is an XPath expression evaluating to a set of nodes.

A rule is triggered if this set of nodes includes any node in the XML fragment inserted or deleted

The system-defined variable $delta contains this set of nodes and is available for use in the condition and action parts of the rule
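The triggering test for an INSERT event can be sketched in Python as follows. This is an illustration under assumptions, not the XTL implementation: it uses the limited XPath subset supported by `xml.etree.ElementTree`, the function name `triggered_delta` is hypothetical, and $delta is taken to be the nodes selected by e that lie inside the inserted fragment.

```python
import xml.etree.ElementTree as ET

def triggered_delta(root, event_path, inserted):
    """Return the nodes selected by event_path that lie inside the inserted fragment."""
    inserted_nodes = {id(n) for n in inserted.iter()}
    return [n for n in root.findall(event_path) if id(n) in inserted_nodes]

root = ET.fromstring("<shares><share><price>5</price></share></shares>")
share = root.find("share")

# The update: insert a new <price> element below <share>.
new_price = ET.SubElement(share, "price")
new_price.text = "7"

# The rule is triggered when delta is non-empty.
delta = triggered_delta(root, "./share/price", new_price)
print([n.text for n in delta])
```

An empty result means the rule is not triggered by this update; a non-empty result is what the condition and action parts would see as $delta.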


The XTL Language

Condition Part

The condition part is either the constant TRUE or one or more XPath expressions connected by the boolean connectives and, or, not.

Each of these expressions is evaluated on the data to tell whether the condition is TRUE or FALSE


The XTL Language

Action Part: The action part is a sequence of one or more actions

Syntax: INSERT r BELOW e (BEFORE | AFTER) q

r is an XQuery expression specifying the XML fragment to be inserted, e is an XPath expression specifying the set of nodes under which the new fragment will be inserted, and q is either a constant or an XPath qualifier specifying the set of nodes BEFORE or AFTER which the new nodes will be placed.

DELETE e

e is an XPath expression specifying the set of nodes to be deleted.


XTL Language

Example rule:

ON INSERT doc('s.xml')/shares/share/day-info/prices/price
IF $delta > $delta/../../high
DO DELETE $delta/../../high;
   INSERT <high>$delta/text()</high>
   BELOW $delta/../.. AFTER prices
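The effect of the example rule on a document can be sketched in Python. This is only an approximation of the rule's behaviour using `xml.etree.ElementTree`, not XTL itself; the function name `on_price_insert` and the sample document are hypothetical, and the sketch assumes that `<high>` is a child of `<day-info>`, as the condition and the INSERT action imply.

```python
import xml.etree.ElementTree as ET

def on_price_insert(day_info, price):
    """Apply the example rule to a <price> just inserted under day_info/prices."""
    high = day_info.find("high")
    # IF: the new price is greater than the recorded high
    if high is not None and float(price.text) > float(high.text):
        # DO DELETE: remove the old <high> element
        day_info.remove(high)
        # INSERT: add a new <high> below <day-info>, directly after <prices>
        new_high = ET.Element("high")
        new_high.text = price.text
        position = list(day_info).index(day_info.find("prices")) + 1
        day_info.insert(position, new_high)

day_info = ET.fromstring(
    "<day-info><prices><price>10</price></prices><high>10</high></day-info>")

# The triggering update: a new price of 12 is inserted.
price = ET.SubElement(day_info.find("prices"), "price")
price.text = "12"
on_price_insert(day_info, price)

print(ET.tostring(day_info, encoding="unicode"))
```

After the update the document holds a single `<high>` element carrying the newly inserted price.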


XTL rule processing system


XTL rule processing system - Architecture

ECA Rules Management: Validates and registers a rule to the Rule Base

ECA Rule Processing Engine: Evaluates the Event and Condition Parts of the rules and schedules their actions for execution in the Action Schedule


System Performance

The system performance was studied by:
  Developing an analytical model of the system
  Performing experiments on the actual system

We have studied the effects of rule base indexes on system performance

Performance criterion: Update response time, the mean time taken to complete all rule execution resulting from a single update submitted by a top-level update transaction


System Performance

Varying quantities: Number of rules in the rule base

Experiments on the actual system were performed with three (3) different rule sets

XML data set: a fragment of the DBLP database


System Performance - Analytical Model

The analytical model is a mathematical description of the system behaviour

Uses queueing theory to simulate the transaction queues and database processing

Uses a set of simplifying assumptions to emulate the behaviour of some system parameters (e.g. triggering probability, transaction arrival rate etc.)


System Performance - Analytical Model Results


System Performance - Analytical Model

Response time increases non-linearly for as long as the system is stable (i.e. the arrival rate in the transaction queue is less than the service rate)

After the stability point the transaction queue grows uncontrollably large, flooding the memory and slowing the system down

Reasons:
  Everything is served by a single queue
  High number of event query evaluations to find what is triggered
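The slides do not state which queueing model was used. Purely as an illustration of the stability condition, the sketch below uses the textbook single-server M/M/1 queue, an assumption made here, in which the mean response time is 1 / (service rate - arrival rate) while the arrival rate stays below the service rate. The function name and the sample rates are hypothetical.

```python
def mm1_response_time(arrival_rate, service_rate):
    """Mean time in an M/M/1 system, or infinity when the queue is unstable."""
    if arrival_rate >= service_rate:
        return float("inf")     # past the stability point the queue grows without bound
    return 1.0 / (service_rate - arrival_rate)

# Response time grows non-linearly as the arrival rate approaches the service rate.
for arrival_rate in (2.0, 5.0, 8.0, 9.5, 10.0):
    print(arrival_rate, mm1_response_time(arrival_rate, service_rate=10.0))
```

Even in this simple model the response time rises sharply near the stability point, which matches the qualitative behaviour described above.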


System Performance - Experimental Results


System Performance - Experimental Results

Differences from the analytical model are due to:
  implementation choices (use of DOM etc.), and
  the simplifying assumptions made in the analytical model


System Performance


System Performance - Indexing Rule Base


System Performance - Indexing Rule Base

Better overall behaviour and scalability characteristics, due to the smaller number of rules that need to be checked for triggering

Fewer rules checked --> fewer queries need to be evaluated
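The slides do not describe how the rule base index is built. One plausible design, given here only as a sketch, keys rules by the operation and by the last step of the event path, so that an update only evaluates the event queries of candidate rules. The `RuleIndex` class, its methods and the rule identifiers are hypothetical.

```python
from collections import defaultdict

class RuleIndex:
    """Hypothetical index from (operation, element name) to candidate rule ids."""

    def __init__(self):
        self._by_key = defaultdict(list)

    def add(self, rule_id, operation, event_path):
        # Index on the last step of the event path, e.g. "price" for ".../prices/price".
        last_step = event_path.rsplit("/", 1)[-1]
        self._by_key[(operation, last_step)].append(rule_id)

    def candidates(self, operation, element_name):
        # Only these rules need their event query evaluated for this update.
        return self._by_key.get((operation, element_name), [])

index = RuleIndex()
index.add("r1", "INSERT", "/shares/share/day-info/prices/price")
index.add("r2", "DELETE", "/shares/share")
index.add("r3", "INSERT", "/shares/share/day-info/high")

print(index.candidates("INSERT", "price"))
```

Candidates returned by such an index would still need their full event query evaluated; the index only prunes rules that cannot be triggered.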


Outline

What Event-Condition-Action (ECA) Rules are and what we can do with them?

ECA Rules for XML: ECA Language, System Architecture, Performance

ECA Rules for RDF: ECA Language, System Architecture, Performance


ECA Rules for RDF

The RDFTL ECA Language

Implementing an RDFTL processing system in P2P environments

System performance


The RDFTL Language

We have designed the language from scratch specifically for RDF

General Syntax: ON event IF condition DO action


The RDFTL Language

Event Part:

May contain let expressions of the form: LET $var := e

(INSERT | DELETE) e
e is a path expression that evaluates to a set of RDF nodes. Catches the insertion or deletion of a node

(INSERT | DELETE) triple
triple is an expression of the form (source, arc, target) specifying an RDF triple. Catches the insertion or deletion of a property in an RDF triple.

UPDATE upd_triple
upd_triple is an expression of the form (source, arc, old_target->new_target). Catches the update of a property from one RDF node to another.
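The triple form of the event part can be illustrated with a small Python sketch. It is not RDFTL: triples are plain tuples, `None` stands in as a wildcard (an assumption made here, since RDFTL uses path expressions and variables), and the function names and sample data are hypothetical.

```python
def matches(pattern, triple):
    """True if each pattern position is None (wildcard) or equals the triple's value."""
    return all(p is None or p == t for p, t in zip(pattern, triple))

def triple_event_delta(operation, pattern, update):
    """Return the changed triple if the update triggers the event, else None."""
    update_operation, triple = update
    if update_operation == operation and matches(pattern, triple):
        return triple
    return None

# Hypothetical update: a dc:creator property is inserted for resource doc42.
update = ("INSERT", ("doc42", "dc:creator", "alice"))

print(triple_event_delta("INSERT", (None, "dc:creator", None), update))
print(triple_event_delta("DELETE", (None, "dc:creator", None), update))
```

The first event specification catches the update and the second does not, because the operation differs.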


The RDFTL Language

Condition Part:
  It is a boolean-valued expression
  May consist of conjunctions, disjunctions and negations
  May also contain let expressions
  The $delta variable is bound to the set of nodes or arcs modified and caught by the event part

Action Part:
  A sequence of actions
  Each action has a syntax similar to that of the event part


RDFTL Rules in P2P Environments - System Architecture


RDFTL Rules in P2P Environments

Each peer (P) is supervised by a superpeer (SP)

The set of Ps supervised by an SP forms a peergroup

At each SP there is an RDFTL processing engine installed

Each P or SP hosts a fragment of the RDF schema that may change due to updates

Hybrid fragmentation with possible replication


RDFTL Rules in P2P Environments

Ps notify the SPs of any updates to their local data

An ECA rule generated at one P or SP may be replicated, triggered, evaluated or executed at different sites in the network.


Distributed Rule Registration

A generated rule is sent from P to SP for validation and storage

From there it is sent to all other SPs
  A replica of it will also be stored at those SPs that are e-relevant to the rule, i.e. the event part queries of the rule can be evaluated on the SP

At each SP each rule is annotated with the IDs of local peers that are e-, c- and a-relevant to the rule

c- and a-relevance have a meaning similar to e-relevance, for the condition and action parts


Distributed Rule Execution

Each SP manages its own rule execution schedule
  Each execution schedule is a sequence of updates to be executed on the local peergroup

Once an update u occurs in P, SP is notified
  SP determines if u may trigger any rule r whose event part is annotated with P’s ID.

If yes, the event query is sent to P for evaluation
  If the rule is triggered, its condition will be evaluated

If the condition is true, SP will send each instance of r’s action part to the local peers that are a-relevant to it
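The execution steps above can be sketched from the superpeer's point of view. This is a simplified illustration, not the system's actual code: rules are plain dictionaries, the event query is a callback standing in for evaluation at the peer, the condition is evaluated locally for brevity, and all names (`handle_update`, `fake_event_eval`, the peer IDs) are hypothetical.

```python
def handle_update(rules, peer_id, update, eval_event_at_peer):
    """Superpeer-side sketch: return (target peer, action, delta) entries to schedule."""
    schedule = []
    for rule in rules:
        if peer_id not in rule["e_relevant"]:
            continue    # event part is not annotated with this peer's ID
        # The event query is evaluated at the peer where the update occurred.
        delta = eval_event_at_peer(peer_id, rule, update)
        if delta and rule["condition"](delta):
            # One action instance per local peer that is a-relevant to the rule.
            schedule.extend((target, rule["action"], delta)
                            for target in rule["a_relevant"])
    return schedule

# Hypothetical rule and event evaluator, for illustration only.
rule = {"e_relevant": {"P1"}, "a_relevant": ["P2", "P3"],
        "condition": lambda delta: True, "action": "a1"}

def fake_event_eval(peer_id, rule, update):
    operation, node = update
    return [node] if operation == "INSERT" else []

schedule = handle_update([rule], "P1", ("INSERT", "n7"), fake_event_eval)
print(schedule)
```

An update from a peer that is not e-relevant to the rule, or one that does not trigger the event part, leaves the schedule empty.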


System Performance

The system performance was studied by:
  Developing an analytical model of the system
  Developing a system simulator and performing experiments with it

Performance criterion: Update response time: The mean time taken to complete all rule execution resulting from a single update submitted by a top-level update transaction


System Performance

Cases studied with both the Analytical Model and the Simulator:
  Random network topology between SPs, with various degrees of data replication
  HyperCup network topology between SPs, with various degrees of data replication

Varying quantities:
  Number of peergroups
  Number of rules


System Performance

Random topology - Replication 10% (plots: Analytical Model, Simulation)


System Performance

With a random topology the system does not scale well, even with low replication and small numbers of rules and peergroups
  Exponential update response time
  System becomes unusable due to high load


System Performance

HyperCup organises the SPs into hypercubes

HyperCup topology guarantees that:
  Each peer receives a message only once
  A total of N-1 messages is necessary to broadcast a message to N peers
  The most distant peers are reached after log2 N hops
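These guarantees can be checked on a small example. The sketch below simulates a broadcast in a complete binary hypercube, where each node forwards a message only on dimensions higher than the one it arrived on. It is an illustration under that assumption, not the simulator used in this study, and the function name is hypothetical.

```python
def hypercube_broadcast(dimensions, origin=0):
    """Broadcast from `origin` in a complete hypercube; return (messages sent, hops per node)."""
    hops = {origin: 0}
    messages = 0
    frontier = [(origin, -1)]       # (node, dimension the message arrived on)
    while frontier:
        node, arrived_on = frontier.pop()
        # Forward only on dimensions higher than the arrival dimension.
        for dim in range(arrived_on + 1, dimensions):
            neighbour = node ^ (1 << dim)
            assert neighbour not in hops    # each peer receives the message only once
            hops[neighbour] = hops[node] + 1
            messages += 1
            frontier.append((neighbour, dim))
    return messages, hops

messages, hops = hypercube_broadcast(dimensions=4)    # N = 16 peers
print(messages, max(hops.values()))
```

With 4 dimensions (N = 16) the broadcast sends 15 messages and the farthest peer is 4 hops away, that is N-1 messages and log2 N hops.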


System Performance

HyperCup - Replication 10% (plots: Analytical Model, Simulation)


System Performance

HyperCup - Replication 90% (plots: Analytical Model, Simulation)


System Performance

With HyperCup we achieve higher performance for various replication levels and numbers of peergroups
  System scales better
  System remains stable and the update response time stays within acceptable values

The analytical and simulation approaches show good agreement


Conclusions

We have described two ECA languages, one for XML and one for RDF

We have studied and defined the architectural characteristics of an ECA rule processing system in centralised and distributed environments

We have conducted a study to determine the system performance in both the centralised and distributed cases


Conclusions

The whole study shows that ECA rules are a usable technology for a variety of application environments over semistructured data


Thank you !!