IBM Zurich Research Laboratory | Business Integration Technologies
© 2008 IBM Corporation
The 6th International Conference on Business Process Management (BPM 2008)
September 2008
The Refined Process Structure Tree
Jussi Vanhatalo, Hagen Völzer, Jana Koehler
Motivation: BPMN to BPEL Translation
<process ...>
...
<if>
<condition>...</condition>
<flow>
<invoke name="a1" ... />
<sequence>
<invoke name="a2" ... />
<invoke name="a3" ... />
</sequence>
</flow>
<else>
<repeatUntil>
<invoke name="a4" ... />
<condition>...</condition>
</repeatUntil>
</else>
</if>
</process>
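[Figure: a process model in Business Process Modeling Notation (BPMN) with start s, end e, activities a1 to a4 and gateways p1, p2, x1 to x4, decomposed into the fragments Sequence, If, Flow and Repeat-Until; the corresponding process in Business Process Execution Language (BPEL); and the parse tree with the nodes Sequence, If, Flow, Repeat-Until and the leaves a1 to a4.]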
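The step from a parse tree to block-structured BPEL can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' translator: the `Node` class, the `to_bpel` function and the node kinds are hypothetical names, and attributes and conditions are elided with `...` as on the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A parse-tree node; 'kind' and 'name' are illustrative, not from the paper."""
    kind: str                      # "invoke", "sequence", "flow", "if", "repeatUntil"
    name: str = ""                 # only used for "invoke" leaves
    children: List["Node"] = field(default_factory=list)


def to_bpel(node: Node, indent: int = 0) -> List[str]:
    """Render a parse tree as BPEL-like lines, one block element per tree node."""
    pad = "  " * indent
    if node.kind == "invoke":
        return [f'{pad}<invoke name="{node.name}" ... />']
    if node.kind in ("sequence", "flow"):
        lines = [f"{pad}<{node.kind}>"]
        for child in node.children:
            lines += to_bpel(child, indent + 1)
        return lines + [f"{pad}</{node.kind}>"]
    if node.kind == "if":
        then_branch, else_branch = node.children
        return ([f"{pad}<if>", f"{pad}  <condition>...</condition>"]
                + to_bpel(then_branch, indent + 1)
                + [f"{pad}  <else>"]
                + to_bpel(else_branch, indent + 2)
                + [f"{pad}  </else>", f"{pad}</if>"])
    if node.kind == "repeatUntil":
        (body,) = node.children
        return ([f"{pad}<repeatUntil>"]
                + to_bpel(body, indent + 1)
                + [f"{pad}  <condition>...</condition>", f"{pad}</repeatUntil>"])
    raise ValueError(f"unknown node kind: {node.kind}")


# Parse tree assumed from the BPEL on this slide: If(Flow(a1, Sequence(a2, a3)), Repeat-Until(a4)).
tree = Node("if", children=[
    Node("flow", children=[
        Node("invoke", "a1"),
        Node("sequence", children=[Node("invoke", "a2"), Node("invoke", "a3")]),
    ]),
    Node("repeatUntil", children=[Node("invoke", "a4")]),
])

print("\n".join(to_bpel(tree)))
```

Because each tree node maps to exactly one block element, a change confined to one subtree of the parse tree changes only the corresponding block of the output.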
Research Problem: Parsing a Business Process Model
▪ Parsing
1) Decomposition into fragments
2) Categorization of the fragments
→ Parse tree
▪ Our contribution is a new parsing technique
– Refined process structure tree (RPST)
– Improves existing techniques by providing a more fine-grained decomposition
[Figure: a process model in BPMN (start s, end e, activities a1 to a4, gateways p1, p2, x1 to x4) with its fragments Sequence, If, Flow and Repeat-Until, next to its parse tree.]
Outline
▪ Research Problem: Parsing a Business Process Model
▪ Use Cases for Parsing
▪ Requirements for Parsing and the Related Work
▪ Our Solution: The Refined Process Structure Tree
Use Cases for Parsing
▪ Translating a graph-based process model (e.g. BPMN) into a block-based process model (e.g. BPEL)
▪ Speeding up control-flow analysis [Vanhatalo et al., 2007]
▪ Pattern-based editing [Gschwind et al., 2008; Today 11:00 am]
▪ Process merging [Küster et al., 2008; Tomorrow 16:00]
▪ Understanding large process models
▪ Subprocess detection
Subprocess Detection
[Figure: a process model with start s, end e and activities a1 to a18, in which two fragments, P2 and P3, are marked.]
Subprocess Detection
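[Figure: the same process model with the fragments P2 (activities a2 to a8) and P3 (activities a11 to a14) shown as separate subprocesses, each with its own start s and end e; the top-level model keeps s, e, a1, a9, a10, a15 to a18 and the collapsed nodes P2 and P3.]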
Outline
▪ Problem: Parsing a Business Process Model
▪ Use Cases for Parsing
▪ Requirements for Parsing and the Related Work
– Uniqueness
– Modularity
– Computing the Parse Tree Fast
– Granularity
▪ Our Solution: The Refined Process Structure Tree
Requirement: Uniqueness
▪ The parse tree should be unique
– Motivation: The same BPMN diagram is always translated to the same BPEL process
▪ Parsing techniques presented for BPMN to BPEL translations are not unique
– Nondeterministic pattern-matching approach
[Figure: the same process model (s, x1 to x4, a1 to a4, e) parsed in two ways. One parse uses the fragments Sequence 1, Sequence 2, If and Repeat-Until; the other uses Sequence 1, Sequence 3, If and Repeat-Until. The two resulting parse trees differ.]
Requirement: Modularity
▪ Motivation: A local change in BPMN translates into a local change in BPEL
▪ Modular:
– Replacing a fragment with another fragment changes only the respective subtree in the parse tree
▪ Parsing techniques presented for BPMN to BPEL translations are not modular
[Figure: two process models and their parse trees. The first (s, x1 to x4, a1 to a4, e) is parsed into Sequence 1, Sequence 2, If and Repeat-Until; the second (s, x1, x2, a1, a2, a3, a5, e) is parsed into Sequence 1, Sequence 3 and If.]
The Normal Process Structure Tree (NPST)
▪ The NPST is unique and modular
– Extends work on the program structure tree [Johnson et al., 1994]
– Adapted for process models [Vanhatalo, Völzer and Leymann, 2007]
[Figure: the two process models from the previous slide with their NPSTs. The first is parsed into Sequence 1, If and Repeat-Until; the second into Sequence 1 and If.]
The NPST is the Hierarchy of the Canonical Fragments
▪ A parse tree is a hierarchy of fragments in which no two fragments overlap
→ Some fragments must be excluded from a parse tree
▪ What makes the NPST different from the non-deterministic parse trees?
– Each fragment that overlaps some other fragment is excluded from the NPST
• Such a fragment is called non-canonical
• The non-maximal sequences are the non-canonical fragments
[Figure: the process model (s, x1 to x4, a1 to a4, e) with the fragments Sequence 1, Sequence 2, Sequence 3, If and Repeat-Until. The NPST keeps Sequence 1, If and Repeat-Until; the two non-deterministic parse trees additionally contain Sequence 2 or Sequence 3.]
Requirement: Computing the Parse Tree Fast
▪ Some use cases require a fast algorithm for computing the parse tree
– Process version merging
• Process models are compared based on their parse trees
• Change operations are applied to merge the process models
• Each time a process model changes, the parse tree is recomputed
– Pattern-based editing
• Some editing operations are applicable/prevented based on the information in the parse tree
– Speeding up control-flow analysis
▪ The NPST can be computed in linear time
Requirement: Granularity
▪ Motivation: Translate more BPMN diagrams into BPEL
▪ Our new contribution is the refined process structure tree
– Extends work on a parse tree for sequential programs [Tarjan and Valdes, 1980]
– More fine-grained than any previous technique
[Figure: a process model (s, x1 to x3, a1 to a3, e) shown twice: once with a single fragment labelled X, and once decomposed into the fragments If, Repeat-Until and Sequence.]
Outline
▪ Problem: Parsing a Business Process Model
▪ Use Cases for Parsing
▪ Requirements for Parsing and Related Work
▪ Our Solution: The Refined Process Structure Tree
– Relaxed Notion of a Fragment
– Canonical Fragments
– The Refined Process Structure Tree
– Uniqueness, Modularity, Granularity
– A Linear Time Algorithm
Relaxed Notion of a Fragment
The commonly used notion:
▪ A fragment is a connected subgraph that has
– exactly one entry edge, and
– exactly one exit edge.
Relaxed notion:
▪ A fragment is a connected subgraph that has
– exactly one entry node, and
– exactly one exit node.
[Figure: one example per notion, each showing a fragment F between the nodes u and v in a graph with the activities a1 and a2.]
More Precisely:
▪ If anything inside a fragment F is executed, then
– the entry node was executed before, and
– the exit node will be executed afterwards
▪ A boundary node is an entry if
– all incoming edges are outside F, or
– all outgoing edges are inside F
▪ A boundary node is an exit if
– all incoming edges are inside F, or
– all outgoing edges are outside F
▪ A fragment F is a connected subgraph that has
– exactly two boundary nodes,
– one entry, and one exit
▪ [Tarjan and Valdes, 1980]
[Figure: three examples of a fragment F with entry u and exit v, and one subgraph F (with a3 and a4) that is not a fragment because its boundary nodes are neither entries nor exits.]
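The entry, exit and fragment conditions above can be checked mechanically. The following is a minimal sketch, not the authors' implementation. It assumes a graph given as a dictionary from hypothetical edge names to (source, target) pairs and a candidate fragment given as a set of edge names. It also assumes that a boundary node is a node of F that touches at least one edge outside F, which the slide does not spell out.

```python
from typing import Dict, Set, Tuple

Edges = Dict[str, Tuple[str, str]]  # edge name -> (source node, target node)


def incident(edges: Edges, node: str) -> Tuple[Set[str], Set[str]]:
    """Return the sets of incoming and outgoing edge names of a node."""
    incoming = {e for e, (_, t) in edges.items() if t == node}
    outgoing = {e for e, (s, _) in edges.items() if s == node}
    return incoming, outgoing


def is_entry(edges: Edges, frag: Set[str], node: str) -> bool:
    incoming, outgoing = incident(edges, node)
    # all incoming edges are outside F, or all outgoing edges are inside F
    return incoming.isdisjoint(frag) or outgoing <= frag


def is_exit(edges: Edges, frag: Set[str], node: str) -> bool:
    incoming, outgoing = incident(edges, node)
    # all incoming edges are inside F, or all outgoing edges are outside F
    return incoming <= frag or outgoing.isdisjoint(frag)


def is_connected(edges: Edges, frag: Set[str]) -> bool:
    """Check that the edges of F form one connected piece (ignoring direction)."""
    if not frag:
        return False
    adj: Dict[str, Set[str]] = {}
    for e in frag:
        s, t = edges[e]
        adj.setdefault(s, set()).add(t)
        adj.setdefault(t, set()).add(s)
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for nxt in adj[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(adj)


def boundary_nodes(edges: Edges, frag: Set[str]) -> Set[str]:
    """Assumption: a boundary node touches an edge in F and an edge outside F."""
    inside = {n for e in frag for n in edges[e]}
    outside = {n for e, st in edges.items() if e not in frag for n in st}
    return inside & outside


def is_fragment(edges: Edges, frag: Set[str]) -> bool:
    frag = set(frag)
    if not is_connected(edges, frag):
        return False
    boundary = list(boundary_nodes(edges, frag))
    if len(boundary) != 2:
        return False
    u, v = boundary
    return ((is_entry(edges, frag, u) and is_exit(edges, frag, v))
            or (is_entry(edges, frag, v) and is_exit(edges, frag, u)))


# Hypothetical example: s -> u, then two parallel branches via a1 and a2 to v, then v -> e.
G = {"e1": ("s", "u"), "e2": ("u", "a1"), "e3": ("a1", "v"),
     "e4": ("u", "a2"), "e5": ("a2", "v"), "e6": ("v", "e")}

print(is_fragment(G, {"e2", "e3", "e4", "e5"}))  # both branches between u and v
print(is_fragment(G, {"e2", "e3"}))              # one branch only (relaxed notion)
print(is_fragment(G, {"e2", "e4"}))              # three boundary nodes
```

In this example the single branch {e2, e3} passes the check even though u and v have further edges outside it, which is what the relaxed, node-based notion allows. Under the boundary-node assumption used here the whole graph has no boundary nodes, so it is not reported as a fragment.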
Non-Canonical and Canonical Fragments
▪ Non-canonical fragments overlap with some fragment
▪ Canonical fragments do not overlap and thus they form a hierarchy
[Figure: small example graphs with nodes s, t, u, v and edges a to d. Non-canonical fragments: the non-maximal sequences N1 and N2, and the non-canonical bond fragments N1 and N2. Canonical fragments: the maximal sequence P1 and the canonical bond fragments, with the labels B1, F1, F2, F3 and R1.]
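The canonical fragments can be selected from a given set of fragments with a simple filter. This is a minimal sketch, not the authors' method. It assumes that fragments are given as sets of hypothetical edge names and that two fragments overlap when they share an edge but neither contains the other. The example is the sequence s, t, u, v with edges a, b, c from the figure.

```python
from typing import FrozenSet, Iterable, List

Fragment = FrozenSet[str]


def overlaps(f: Fragment, g: Fragment) -> bool:
    """Assumed definition: the sets share an edge, but neither contains the other."""
    return bool(f & g) and not (f <= g or g <= f)


def canonical_fragments(fragments: Iterable[Iterable[str]]) -> List[Fragment]:
    """Keep only the fragments that overlap with no other fragment."""
    frags = [frozenset(f) for f in fragments]
    return [f for f in frags if not any(overlaps(f, g) for g in frags)]


# All fragments of the sequence s -a-> t -b-> u -c-> v:
# the single edges, the non-maximal sequences {a, b} and {b, c}, and the maximal sequence.
all_fragments = [{"a"}, {"b"}, {"c"}, {"a", "b"}, {"b", "c"}, {"a", "b", "c"}]

canonical = canonical_fragments(all_fragments)
print(sorted("".join(sorted(f)) for f in canonical))  # ['a', 'abc', 'b', 'c']
```

The two non-maximal sequences {a, b} and {b, c} overlap each other and are dropped; the maximal sequence and the single edges remain and are pairwise nested or disjoint.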
The Refined Process Structure Tree
▪ As the canonical fragments do not overlap, they form a hierarchy.
▪ The refined process structure tree is the tree of canonical fragments of a process model G, such that the parent of a canonical fragment F is the smallest canonical fragment of G that properly contains F.
[Figure: a process model G with nodes s, v1 to v7, t and edges a to o, its canonical fragments P1, T1, B1 and S1, and the corresponding tree over P1, T1, B1 and S1 with the edges a to o as leaves.]
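The parent rule in this definition can be written down directly. The following is a minimal sketch under stated assumptions: canonical fragments are given as sets of hypothetical edge names, and the example graph (s -a-> u, two parallel edges b and c from u to v, v -d-> t) and the labels P1 and B1 are illustrative. It runs in quadratic time and is not the linear-time algorithm of the talk.

```python
from typing import Dict, FrozenSet, Iterable, Optional

Fragment = FrozenSet[str]


def build_tree(canonical: Iterable[Iterable[str]]) -> Dict[Fragment, Optional[Fragment]]:
    """Map each canonical fragment to the smallest canonical fragment properly containing it."""
    frags = [frozenset(f) for f in canonical]
    parent: Dict[Fragment, Optional[Fragment]] = {}
    for f in frags:
        supersets = [g for g in frags if f < g]          # proper supersets only
        # Canonical fragments never overlap, so the supersets of f form a chain
        # and the one with the fewest edges is the unique smallest.
        parent[f] = min(supersets, key=len) if supersets else None
    return parent


# Canonical fragments of the illustrative graph: the whole sequence P1,
# the bond B1 formed by the parallel edges b and c, and the single edges.
P1 = frozenset({"a", "b", "c", "d"})
B1 = frozenset({"b", "c"})
leaves = [frozenset({e}) for e in "abcd"]

tree = build_tree([P1, B1, *leaves])
for frag, par in tree.items():
    label = "".join(sorted(frag))
    print(label, "->", "".join(sorted(par)) if par else "(root)")
```

Here the edges b and c get B1 as parent, while a, d and B1 get P1 as parent, and P1 is the root.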
Properties of the Refined Process Structure Tree
▪ The RPST is:
– Unique
– Modular
– More fine-grained than
• the NPST
• the parse tree by Tarjan and Valdes
▪ It can be computed in linear time
[Figure: a process model (s, x1, x2, a1, a2, e) shown twice, once with its fragments in the NPST and once with its fragments in the RPST; the fragment labels are If, Repeat-Until and X.]
A Linear Time Algorithm for Computing the RPST
Step 1: Detect the triconnected components.
Step 2: Check whether each triconnected component is a fragment.
Step 3: Restructure the tree into the RPST.
[Figure: the process model G (nodes s, v1 to v7, t; edges a to o) shown for each step, with the triconnected components P1, P2, T1, T2, B1, B2 and the resulting RPST fragments P1, T1, B1, S1.]
Generalized Theory
▪ In this paper, we assumed two restrictions for process models to simplify the presented theory
– Exactly one start node and exactly one end node
– Loops must have separate entry and exit nodes
▪ We have generalized this theory for arbitrary process models
– This will be published in an extended version of this paper
[Figure: two process models with start s, end e and activity a1: one with the separate nodes x1 and x2, and one with a single node x.]
Conclusions
▪ Parsing business process models
– Many interesting use cases
– Requirements for a parsing technique
• Uniqueness, modularity, granularity, fast computation
▪ A new parsing technique called the refined process structure tree
– Improves existing techniques by providing a more fine-grained decomposition
– Unique and modular
– Can be computed in linear time
▪ Future work: Applying the RPST for different use cases
References
▪ [HT73] J. Hopcroft and R. E. Tarjan. Dividing a graph into triconnected components. SIAM J. Comput., 2:135–158, 1973.
▪ [Val78] Jacobo Valdes Ayesta. Parsing flowcharts and series-parallel graphs. PhD thesis, Stanford, CA, USA, 1978.
▪ [TV80] Robert E. Tarjan and Jacobo Valdes. Prime subprogram parsing of a program. In POPL ’80: Proceedings of the 7th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 95–105, New York, NY, USA, 1980. ACM.
▪ [JJP94] Richard Johnson, David Pearson, and Keshav Pingali. The program structure tree: Computing control regions in linear time. In Proceedings of the ACM SIGPLAN’94 Conference on Programming Language Design and Implementation (PLDI), pages 171–185, 1994.
▪ [Joh95] Richard Craig Johnson. Efficient program analysis using dependence flow graphs. PhD thesis, Ithaca, NY, USA, 1995.
▪ [GM00] Carsten Gutwenger and Petra Mutzel. A linear time implementation of SPQR-trees. In Joe Marks, editor, Graph Drawing, volume 1984 of Lecture Notes in Computer Science, pages 77–90. Springer, 2000.
▪ [VVL07] Jussi Vanhatalo, Hagen Völzer, and Frank Leymann. Faster and more focused control-flow analysis for business process models through SESE decomposition. In 5th International Conference on Service-Oriented Computing (ICSOC), volume 4749 of Lecture Notes in Computer Science, pages 43–55. Springer-Verlag Berlin Heidelberg, September 2007.