
IBM Zurich Research Laboratory | Business Integration Technologies

© 2008 IBM Corporation

The 6th International Conference on Business Process Management (BPM 2008)

September 2008

The Refined Process Structure Tree

Jussi Vanhatalo, Hagen Völzer, Jana Koehler



Motivation: BPMN to BPEL Translation

<process ...>
  ...
  <if>
    <condition>...</condition>
    <flow>
      <invoke name="a1" ... />
      <sequence>
      </sequence>
    </flow>
    <else>
      <repeatUntil>
        <condition>...</condition>
      </repeatUntil>
    </else>
  </if>
</process>

[Figure: a process model in the Business Process Modeling Notation (BPMN) with start s, end e, nodes p1, p2, x1 to x4 and activities a1 to a4, decomposed into Sequence, If, Flow and Repeat-Until fragments; the corresponding process in the Business Process Execution Language (BPEL); and the parse tree with the same fragments.]



Research Problem: Parsing a Business Process Model

▪ Parsing

1) Decomposition into fragments

2) Categorization of the fragments

→ Parse tree

▪ Our contribution is a new parsing technique

– Refined process structure tree (RPST)

– Improves existing techniques by providing a more fine-grained decomposition

[Figure: the process model in BPMN (fragments Sequence, If, Flow, Repeat-Until over activities a1 to a4) and its parse tree.]



Outline

▪ Research Problem: Parsing a Business Process Model

▪ Use Cases for Parsing

▪ Requirements for Parsing and the Related Work

▪ Our Solution: The Refined Process Structure Tree



Use Cases for Parsing

▪ Translating a graph-based process model (e.g. BPMN) into a block-based process model (e.g. BPEL)

▪ Speeding up control-flow analysis [Vanhatalo et al., 2007]

▪ Pattern-based editing [Gschwind et al., 2008; Today 11:00 am]

▪ Process merging [Küster et al., 2008; Tomorrow 16:00]

▪ Understanding large process models

▪ Subprocess detection
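To make the first use case concrete, here is a minimal sketch of how a parse tree could drive a graph-to-block translation. It assumes the parse tree is already available; the tuple encoding of the tree, the `BLOCK_TAG` mapping and the function `to_block` are made up for illustration, and conditions and else branches are ignored.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from fragment categories to block element names.
BLOCK_TAG = {
    "Sequence": "sequence",
    "Flow": "flow",
    "If": "if",
    "Repeat-Until": "repeatUntil",
}


def to_block(node):
    """Turn a parse-tree node into a nested XML element.

    A node is either an activity name (str) or a pair
    (category, list of child nodes).
    """
    if isinstance(node, str):
        return ET.Element("invoke", {"name": node})
    category, children = node
    element = ET.Element(BLOCK_TAG[category])
    for child in children:
        element.append(to_block(child))
    return element


# Made-up parse tree, not the one shown on the slides.
parse_tree = ("Sequence", ["a1", ("Flow", ["a2", "a3"]), "a4"])
root = to_block(parse_tree)
print(ET.tostring(root, encoding="unicode"))
```

Each fragment of the parse tree becomes one block, so the nesting of the output mirrors the nesting of the fragments.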



Subprocess Detection

[Figure: a process model with start s, end e and activities a1 to a18, in which two fragments, P2 and P3, are marked.]



[Figure: the same model with P2 and P3 collapsed into subprocesses. P2 (with its own start s and end e) contains a2 to a8; P3 contains a11 to a14.]



Outline

▪ Problem: Parsing a Business Process Model

▪ Requirements for Parsing and the Related Work

– Uniqueness

– Modularity

– Computing the Parse Tree Fast

– Granularity




Requirement: Uniqueness

▪ The parse tree should be unique

– Motivation: The same BPMN diagram is always translated to the same BPEL process

▪ Parsing techniques presented for BPMN to BPEL translations are not unique

– Nondeterministic pattern-matching approach

[Figure: one process model (start s, nodes x1 to x4, activities a1 to a4, end e) decomposed in two different ways, giving two different parse trees: one built from Sequence 1, Sequence 2, If and Repeat-Until, the other from Sequence 1, Sequence 3, If and Repeat-Until.]



Requirement: Modularity

▪ Motivation: A local change in BPMN translates into a local change in BPEL

▪ Modular:

– Replacing a fragment with another fragment changes only the respective subtree in the parse tree

▪ Parsing techniques presented for BPMN to BPEL translations are not modular

[Figure: the process model (s, x1 to x4, a1 to a4, e) and a variant (s, x1, x2, a1, a2, a3, a5, e), with their parse trees: Sequence 1, Sequence 2, If, Repeat-Until for the first and Sequence 1, Sequence 3, If for the second.]



The Normal Process Structure Tree (NPST)

▪ The NPST is unique and modular

– Extends work on the program structure tree [Johnson et al., 1994]

– Adapted for process models [Vanhatalo, Völzer and Leymann, 2007]

[Figure: the two process models from the modularity example and their NPSTs, built from Sequence 1, If and Repeat-Until, and from Sequence 1 and If, respectively.]



The NPST is the Hierarchy of the Canonical Fragments

▪ A parse tree is a hierarchy of fragments in which no two fragments overlap

→ Some fragments must be excluded from a parse tree

▪ What makes the NPST different from the nondeterministic parse trees?

– Each fragment that overlaps some other fragment is excluded from the NPST

• Such a fragment is called non-canonical

• The non-maximal sequences are the non-canonical fragments

[Figure: the process model (s, x1 to x4, a1 to a4, e) with its NPST (Sequence 1, If, Repeat-Until) next to the two nondeterministic parse trees, which additionally contain Sequence 2 and Sequence 3.]



Requirement: Computing the Parse Tree Fast

▪ Some use cases require a fast algorithm for computing the parse tree

– Process version merging

• Process models are compared based on their parse trees

• Change operations are applied to merge the process models

• Each time a process model changes, the parse tree is recomputed

– Pattern-based editing

• Some editing operations are applicable/prevented based on the information in the parse tree

– Speeding up control-flow analysis

▪ The NPST can be computed in linear time



Requirement: Granularity

▪ Motivation: Translate more BPMN diagrams into BPEL

▪ Our new contribution is the refined process structure tree

– Extends work on a parse tree for sequential programs [Tarjan and Valdes, 1980]

– More fine-grained than any previous technique

[Figure: a process model (s, x1 to x3, a1 to a3, e) shown twice: once marked with an X, and once decomposed into If, Repeat-Until and Sequence fragments.]



Outline

▪ Problem: Parsing a Business Process Model

▪ Requirements for Parsing and Related Work


– Relaxed Notion of a Fragment

– Canonical Fragments

– The Refined Process Structure Tree

– Uniqueness, Modularity, Granularity

– A Linear Time Algorithm



Relaxed Notion of a Fragment

The commonly used notion:

▪ A fragment is a connected subgraph that has

– exactly one entry edge, and

– exactly one exit edge.

Relaxed notion:

▪ A fragment is a connected subgraph that has

– exactly one entry node, and

– exactly one exit node.

[Figure: a subgraph F between nodes u and v containing a1 and a2, drawn once for each notion.]



More Precisely:

▪ If anything inside a fragment F is executed, then

– the entry node was executed before, and

– the exit node will be executed afterwards

▪ A boundary node is an entry if

– all incoming edges are outside F, or

– all outgoing edges are inside F

▪ A boundary node is an exit if

– all incoming edges are inside F, or

– all outgoing edges are outside F

▪ A fragment F is a connected subgraph that has

– exactly two boundary nodes,

– one entry, and one exit

▪ [Tarjan and Valdes, 1980]

[Figure: three examples of a fragment F with entry u and exit v (containing a1 and a2, a1 alone, and a3), and one counterexample containing a4, labeled "Not a fragment!", whose boundary nodes are neither entries nor exits.]
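The definition above can be checked mechanically on a small graph. The following is a sketch under stated assumptions: a fragment is represented as a set of directed edges, a boundary node is taken to be a node that touches edges both inside and outside F, and the names `boundary_roles`, `is_connected` and `is_fragment` are made up for illustration.

```python
from collections import defaultdict


def boundary_roles(edges, fragment):
    """Map each boundary node of `fragment` to its roles ("entry", "exit")."""
    fragment = set(fragment)
    incoming, outgoing = defaultdict(list), defaultdict(list)
    for edge in edges:
        source, target = edge
        outgoing[source].append(edge)
        incoming[target].append(edge)
    roles = {}
    for node in {n for edge in fragment for n in edge}:
        incident = incoming[node] + outgoing[node]
        if all(e in fragment for e in incident):
            continue  # interior node: every incident edge is inside F
        role = set()
        if (all(e not in fragment for e in incoming[node])
                or all(e in fragment for e in outgoing[node])):
            role.add("entry")
        if (all(e in fragment for e in incoming[node])
                or all(e not in fragment for e in outgoing[node])):
            role.add("exit")
        roles[node] = role
    return roles


def is_connected(fragment):
    """Check that the edges of `fragment` form one connected subgraph."""
    fragment = list(fragment)
    if not fragment:
        return False
    neighbours = defaultdict(set)
    for u, v in fragment:
        neighbours[u].add(v)
        neighbours[v].add(u)
    start = fragment[0][0]
    seen, stack = {start}, [start]
    while stack:
        for n in neighbours[stack.pop()]:
            if n not in seen:
                seen.add(n)
                stack.append(n)
    return len(seen) == len(neighbours)


def is_fragment(edges, fragment):
    """Exactly two boundary nodes, one an entry and the other an exit."""
    roles = boundary_roles(edges, fragment)
    if len(roles) != 2 or not is_connected(fragment):
        return False
    first, second = roles.values()
    return (("entry" in first and "exit" in second)
            or ("exit" in first and "entry" in second))


# Toy graph: s -> u, two parallel branches u -> a1 -> v and u -> a2 -> v, v -> e.
edges = [("s", "u"), ("u", "a1"), ("a1", "v"),
         ("u", "a2"), ("a2", "v"), ("v", "e")]
one_branch = {("u", "a1"), ("a1", "v")}
both_branches = {("u", "a1"), ("a1", "v"), ("u", "a2"), ("a2", "v")}
no_entry = {("s", "u"), ("u", "a1")}
print(is_fragment(edges, one_branch),
      is_fragment(edges, both_branches),
      is_fragment(edges, no_entry))
```

In the last edge set, both boundary nodes (u and a1) qualify only as exits, so the set is rejected.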



Non-Canonical and Canonical Fragments

▪ Non-canonical fragments overlap with some fragment

▪ Canonical fragments do not overlap and thus they form a hierarchy
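A small sketch of this filter, assuming fragments are given as edge sets and that two fragments overlap when they share edges but neither contains the other. The sequence of three edges a, b, c and the helper names are chosen for illustration only.

```python
def overlaps(f, g):
    """Two edge sets overlap if they intersect but neither contains the other."""
    return bool(f & g) and not (f <= g or g <= f)


def canonical_fragments(fragments):
    """Keep only the fragments that overlap no other fragment."""
    return {name: f for name, f in fragments.items()
            if not any(overlaps(f, g) for g in fragments.values())}


# A sequence of three edges a, b, c: N1 and N2 are non-maximal sequences,
# P1 is the maximal sequence.
fragments = {
    "a": frozenset("a"), "b": frozenset("b"), "c": frozenset("c"),
    "N1": frozenset("ab"), "N2": frozenset("bc"), "P1": frozenset("abc"),
}
canonical = canonical_fragments(fragments)
print(sorted(canonical))
```

N1 and N2 overlap each other (they share b), so both are dropped; the single edges and the maximal sequence P1 remain and are properly nested.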

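[Figure: non-maximal sequences (N1, N2) versus the maximal sequence (P1, with F1, F2, F3) over edges a, b, c between s, u, v, t; non-canonical bond fragments (N1, N2) versus canonical bond fragments (B1, with F1, F2, F3) between u and v; and a further example over edges a to d with fragments R1 and B1.]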



The Refined Process Structure Tree

▪ As the canonical fragments do not overlap, they form a hierarchy.

▪ The refined process structure tree is the tree of canonical fragments of a process model G, such that the parent of a canonical fragment F is the smallest canonical fragment of G that properly contains F.
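The parent rule can be written down directly once the canonical fragments are known. This is a sketch, assuming fragments are given as edge sets that are pairwise nested or disjoint; the fragment names and edge labels are invented and do not correspond to the figure.

```python
def build_tree(fragments):
    """Map each fragment name to the name of its parent (None for the root)."""
    parent = {}
    for name, edges in fragments.items():
        # All fragments that properly contain this one, with their sizes.
        containers = [(len(other), other_name)
                      for other_name, other in fragments.items()
                      if edges < other]
        parent[name] = min(containers)[1] if containers else None
    return parent


# Invented canonical fragments: R is the whole model, F2 is nested in F1.
canonical = {
    "R": frozenset("abcdef"),
    "F1": frozenset("bcd"),
    "F2": frozenset("bc"),
    "F3": frozenset("ef"),
}
tree = build_tree(canonical)
print(tree)
```

Because canonical fragments never overlap, the fragments that contain a given fragment are nested inside one another, so the smallest one is well defined.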

[Figure: a process model G (nodes s, v1 to v7, t; edges a to o) with its canonical fragments P1, T1, B1 and S1, and the corresponding RPST with root P1 and the edges a to o as leaves.]



Properties of the Refined Process Structure Tree

▪ The RPST is:

– Unique

– Modular

– More fine-grained than

• the NPST

• the parse tree by Tarjan and Valdes

▪ It can be computed in linear time

[Figure: one process model (s, x1, x2, a1, a2, e) shown twice, marking the fragments in the NPST and the fragments in the RPST; the labels If and Repeat-Until and an X mark appear in the drawing.]



A Linear Time Algorithm for Computing the RPST

Step 1: Detect the triconnected components.

Step 2: Check whether each triconnected component is a fragment.

Step 3: Restructure the tree into the RPST.

[Figure: the process model G (nodes s, v1 to v7, t; edges a to o) shown for each step, with the tree of triconnected components (P1, T1, T2, B1, B2, P2) and the resulting RPST (P1, T1, B1, S1).]
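As a rough illustration of Step 1 only: triconnected components are delimited by pairs of nodes whose removal disconnects the undirected graph. The brute-force sketch below lists such pairs for a toy graph. It is not the linear-time algorithm of [HT73] or [GM00], it ignores multi-edges and the finer cases of the full definition, the function names are made up, and the extra edge from e back to s is my own assumption to close the toy graph into a cycle.

```python
from itertools import combinations


def is_connected(nodes, edges):
    """Check whether `nodes` are connected using only edges among them."""
    nodes = set(nodes)
    if not nodes:
        return True
    neighbours = {n: set() for n in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            neighbours[u].add(v)
            neighbours[v].add(u)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for n in neighbours[stack.pop()]:
            if n not in seen:
                seen.add(n)
                stack.append(n)
    return seen == nodes


def separation_pairs(edges):
    """Return all node pairs whose removal disconnects the remaining nodes."""
    nodes = sorted({n for edge in edges for n in edge})
    return [(u, v) for u, v in combinations(nodes, 2)
            if not is_connected(set(nodes) - {u, v}, edges)]


# Toy graph: two parallel branches between u and v, plus an assumed
# edge e -> s that closes the graph.
edges = [("s", "u"), ("u", "a1"), ("a1", "v"), ("u", "a2"),
         ("a2", "v"), ("v", "e"), ("e", "s")]
pairs = separation_pairs(edges)
print(pairs)
```

The pair (u, v) bounds the part of the graph holding a1 and a2, which is the kind of candidate that Step 2 would then test for being a fragment.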



Generalized Theory

▪ In this paper, we assumed two restrictions on process models to simplify the presented theory

– Exactly one start node and exactly one end node

– Loops must have separate entry and exit nodes

▪ We have generalized this theory for arbitrary process models

– This will be published in an extended version of this paper

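[Figure: two small process models between s and e containing activity a1: one with nodes x1 and x2, one with a single node x.]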



Conclusions

▪ Parsing business process models

– Many interesting use cases

– Requirements for a parsing technique

• Uniqueness, modularity, granularity, fast computation

▪ A new parsing technique called the refined process structure tree

– Improves existing techniques by providing a more fine-grained decomposition

– Unique and modular

– Can be computed in linear time

▪ Future work: Applying the RPST to different use cases



References

▪ [HT73] J. Hopcroft and R. E. Tarjan. Dividing a graph into triconnected components. SIAM J. Comput., 2:135–158, 1973.

▪ [Val78] Jacobo Valdes Ayesta. Parsing flowcharts and series-parallel graphs. PhD thesis, Stanford, CA, USA, 1978.

▪ [TV80] Robert E. Tarjan and Jacobo Valdes. Prime subprogram parsing of a program. In POPL ’80: Proceedings of the 7th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 95–105, New York, NY, USA, 1980. ACM.

▪ [JJP94] Richard Johnson, David Pearson, and Keshav Pingali. The program structure tree: Computing control regions in linear time. In Proceedings of the ACM SIGPLAN ’94 Conference on Programming Language Design and Implementation (PLDI), pages 171–185, 1994.

▪ [Joh95] Richard Craig Johnson. Efficient program analysis using dependence flow graphs. PhD thesis, Ithaca, NY, USA, 1995.

▪ [GM00] Carsten Gutwenger and Petra Mutzel. A linear time implementation of SPQR-trees. In Joe Marks, editor, Graph Drawing, volume 1984 of Lecture Notes in Computer Science, pages 77–90. Springer, 2000.

▪ [VVL07] Jussi Vanhatalo, Hagen Völzer, and Frank Leymann. Faster and more focused control-flow analysis for business process models through SESE decomposition. In 5th International Conference on Service-Oriented Computing (ICSOC), volume 4749 of Lecture Notes in Computer Science, pages 43–55. Springer-Verlag Berlin Heidelberg, September 2007.
