/faculteit technologie management 1

Process Mining: Control-Flow Mining Algorithms

Ana Karla Alves de Medeiros

Eindhoven University of Technology
Department of Information Systems

[email protected]

/faculteit technologie management 2

Process Mining

• Short Recap

• Types of Process Mining Algorithms

• Common Constructs

• Input Format

• α-algorithm

• Heuristics Miner

• Genetic Miner

• Fuzzy Miner

/faculteit technologie management 4

Process Mining

Event Logs

1. Register at hospital
2. Talk to doctor
3. Take blood test
4. Take X-ray
5. Get results
6. Talk to doctor
7. Go home

1. Register at hospital
2. Talk to doctor
3. Undergo surgery
4. Take medicines
5. Take blood sample
6. Talk to doctor
7. Go home

/faculteit technologie management 5

Process Mining

[Figure: overview diagram with the labels Event Log, Mining Techniques, and Mined Models. The mined models shown are Process Model, Organizational Model, Social Network, Performance Analysis, and Auditing/Security. The example process model contains the tasks Start, Register order, Prepare shipment, Ship goods, (Re)send bill, Receive payment, Contact customer, Archive order, End.]

/faculteit technologie management 6

Event Logs are Everywhere!

Machines, municipalities, airports, Internet, hospitals, etc.

[Figure: example process model with the tasks Start, Get Ready, Travel by Car, Travel by Train, BETA PhD Day Starts, Give a Talk, Visit Brewery, Have Dinner, Go Home, Travel by Train, Pay for Parking, Travel by Car, End.]

/faculteit technologie management 7

Tools

• www.processmining.org

• ProM 4.2

• ProMimport

• Free tools!

/faculteit technologie management 10

Types of Algorithms

/faculteit technologie management 11

Types of Algorithms

[Figure: mined models: Process Model (Start, Register order, Prepare shipment, Ship goods, (Re)send bill, Receive payment, Contact customer, Archive order, End), Organizational Model, Social Network.]

/faculteit technologie management 12

Types of Algorithms

Auditing/Security

[Figure: compliance checking against the process model (Start, Register order, Prepare shipment, Ship goods, (Re)send bill, Receive payment, Contact customer, Archive order, End).]

/faculteit technologie management 13

Types of Algorithms

Performance Analysis

[Figure: bottlenecks and business rules shown on the process model (Start, Register order, Prepare shipment, Ship goods, (Re)send bill, Receive payment, Contact customer, Archive order, End).]

/faculteit technologie management 14

Process Mining

Event Log

Discovery Techniques: Control-Flow Mining

Mined Model

Example traces in the event log:

1. Start
2. Get Ready
3. Travel by Train
4. Beta Event Starts
5. Visit Brewery
6. Have Dinner
7. Go Home
8. Travel by Train

1. Start
2. Get Ready
3. Travel by Train
4. Beta Event Starts
5. Give a Talk
6. Visit Brewery
7. Have Dinner
8. Go Home
9. Travel by Train

1. Start
2. Get Ready
3. Travel by Car
4. Beta Event Starts
5. Give a Talk
6. Visit Brewery
7. Have Dinner
8. Go Home
9. Pay Parking
10. Travel by Car

1. Start
2. Get Ready
3. Travel by Car
4. Conference Starts
5. Join Reception
6. Have Dinner
7. Go Home
8. Pay Parking
9. Travel by Car
10. End

[Figure: mined process model with the tasks Start, Get Ready, Travel by Car, Travel by Train, BETA PhD Day Starts, Give a Talk, Visit Brewery, Have Dinner, Go Home, Travel by Train, Pay for Parking, Travel by Car, End.]

/faculteit technologie management 17

Common Constructs

• Sequence

• Splits

• Joins

• Loops

• Non-Free Choice

• Invisible Tasks

• Duplicate Tasks

[Figure: example process model with the tasks Get Ready, Travel by Car, Travel by Train, Defense Starts, Give a Talk, Ask Question, Defense Ends, Have Drinks, Go Home, Travel by Train, Pay for Parking, Travel by Car.]

/faculteit technologie management 18

Common Constructs

• Sequence

• Splits

• Joins

• Loops

• Non-Free Choice

• Invisible Tasks

• Duplicate Tasks

+ noise in logs!

[Figure: example process model with the tasks Get Ready, Travel by Car, Travel by Train, Defense Starts, Give a Talk, Ask Question, Defense Ends, Have Drinks, Go Home, Travel by Train, Pay for Parking, Travel by Car.]

/faculteit technologie management 21

Event Log: Mining XML (MXML)

The notion of which tasks belong to the same instance is crucial for applying process mining techniques!

/faculteit technologie management 22

Event Log: Mining XML (MXML)

/faculteit technologie management 23

Event Log: Mining XML (MXML)

Compulsory fields!
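The MXML screenshots did not survive in this transcript, so here is a minimal sketch of how events are grouped per process instance. The element names (WorkflowLog, Process, ProcessInstance, AuditTrailEntry, WorkflowModelElement, EventType) follow the MXML schema. The sample log is invented for illustration. Treating WorkflowModelElement and EventType as the compulsory fields is an assumption, since the slide does not name them.

```python
import xml.etree.ElementTree as ET

# Hand-written MXML-style fragment (illustrative data, not from the slides).
SAMPLE_MXML = """
<WorkflowLog>
  <Process id="hospital">
    <ProcessInstance id="case1">
      <AuditTrailEntry>
        <WorkflowModelElement>Register at hospital</WorkflowModelElement>
        <EventType>complete</EventType>
      </AuditTrailEntry>
      <AuditTrailEntry>
        <WorkflowModelElement>Talk to doctor</WorkflowModelElement>
        <EventType>complete</EventType>
      </AuditTrailEntry>
    </ProcessInstance>
    <ProcessInstance id="case2">
      <AuditTrailEntry>
        <WorkflowModelElement>Register at hospital</WorkflowModelElement>
        <EventType>complete</EventType>
      </AuditTrailEntry>
    </ProcessInstance>
  </Process>
</WorkflowLog>
"""


def traces_by_instance(mxml_text):
    """Map each process instance id to its ordered list of task names."""
    root = ET.fromstring(mxml_text)
    traces = {}
    for instance in root.iter("ProcessInstance"):
        events = []
        for entry in instance.iter("AuditTrailEntry"):
            task = entry.findtext("WorkflowModelElement")
            event_type = entry.findtext("EventType")
            # Assumed compulsory fields: reject entries that lack either one.
            if task is None or event_type is None:
                raise ValueError("audit trail entry lacks a compulsory field")
            events.append(task.strip())
        traces[instance.get("id")] = events
    return traces


traces = traces_by_instance(SAMPLE_MXML)
print(traces)
```

Without the ProcessInstance grouping, the three events would form one undifferentiated stream and no per-case ordering could be derived.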

/faculteit technologie management 26

α-algorithm

1. Read a log
2. Get the set of tasks
3. Infer the ordering relations (core step!)
4. Build the net based on inferred relations
5. Output the net

/faculteit technologie management 27

α-algorithm - Ordering Relations (>, →, ||, #)

• Direct succession: x>y iff for some case x is directly followed by y.
• Causality: x→y iff x>y and not y>x.
• Parallel: x||y iff x>y and y>x.
• Unrelated: x#y iff not x>y and not y>x.

Event stream: case 1 : task A, case 2 : task A, case 3 : task A, case 3 : task B, case 1 : task B, case 1 : task C, case 2 : task C, case 4 : task A, case 2 : task B, ...

Traces: ABCD, ACBD, EF

Direct succession: A>B, A>C, B>C, B>D, C>B, C>D, E>F

Causality: A→B, A→C, B→D, C→D, E→F

Parallel: B||C, C||B
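The four relations can be derived mechanically from the traces. The sketch below assumes each trace is a string of single-letter task names; the function name `ordering_relations` is only illustrative.

```python
from itertools import product


def ordering_relations(traces):
    """Derive >, ->, || and # from a list of traces (sequences of task names)."""
    tasks = sorted({t for trace in traces for t in trace})
    # x > y: x is directly followed by y in at least one trace.
    direct = {(x, y) for trace in traces for x, y in zip(trace, trace[1:])}
    # x -> y: x > y and not y > x.
    causal = {(x, y) for (x, y) in direct if (y, x) not in direct}
    # x || y: x > y and y > x.
    parallel = {(x, y) for (x, y) in direct if (y, x) in direct}
    # x # y: neither x > y nor y > x.
    unrelated = {(x, y) for x, y in product(tasks, repeat=2)
                 if (x, y) not in direct and (y, x) not in direct}
    return direct, causal, parallel, unrelated


direct, causal, parallel, unrelated = ordering_relations(["ABCD", "ACBD", "EF"])
print(sorted(causal))
print(sorted(parallel))
```

For the log ABCD, ACBD, EF this reproduces the relations listed on the slide: five causal pairs and the parallel pair B, C in both directions.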

/faculteit technologie management 28

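α-algorithm - Formalization

Let $W$ be a workflow log over $T$. $\alpha(W)$ is defined as follows.

1. $T_W = \{\, t \in T \mid \exists_{\sigma \in W}\; t \in \sigma \,\}$,
2. $T_I = \{\, t \in T \mid \exists_{\sigma \in W}\; t = \mathit{first}(\sigma) \,\}$,
3. $T_O = \{\, t \in T \mid \exists_{\sigma \in W}\; t = \mathit{last}(\sigma) \,\}$,
4. $X_W = \{\, (A,B) \mid A \subseteq T_W \wedge B \subseteq T_W \wedge \forall_{a \in A} \forall_{b \in B}\; a \rightarrow_W b \wedge \forall_{a_1,a_2 \in A}\; a_1 \#_W a_2 \wedge \forall_{b_1,b_2 \in B}\; b_1 \#_W b_2 \,\}$,
5. $Y_W = \{\, (A,B) \in X_W \mid \forall_{(A',B') \in X_W}\; A \subseteq A' \wedge B \subseteq B' \Rightarrow (A,B) = (A',B') \,\}$,
6. $P_W = \{\, p_{(A,B)} \mid (A,B) \in Y_W \,\} \cup \{\, i_W, o_W \,\}$,
7. $F_W = \{\, (a, p_{(A,B)}) \mid (A,B) \in Y_W \wedge a \in A \,\} \cup \{\, (p_{(A,B)}, b) \mid (A,B) \in Y_W \wedge b \in B \,\} \cup \{\, (i_W, t) \mid t \in T_I \,\} \cup \{\, (t, o_W) \mid t \in T_O \,\}$, and
8. $\alpha(W) = (P_W, T_W, F_W)$.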
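The eight steps translate almost line by line into code. The following is a minimal sketch, not the ProM implementation. It enumerates all subsets of tasks, so it only suits toy logs. It also assumes A and B are non-empty, which the slide does not state explicitly. Places are represented as hypothetical tuples ("p", A, B).

```python
from itertools import combinations


def nonempty_subsets(items):
    """All non-empty subsets as frozensets (exponential, toy logs only)."""
    items = sorted(items)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def alpha(traces):
    """Return (P_W, T_W, F_W) for a log given as a list of task sequences."""
    T_W = {t for trace in traces for t in trace}    # step 1: all tasks
    T_I = {trace[0] for trace in traces}            # step 2: first tasks
    T_O = {trace[-1] for trace in traces}           # step 3: last tasks
    direct = {(x, y) for trace in traces for x, y in zip(trace, trace[1:])}

    def causal(a, b):
        return (a, b) in direct and (b, a) not in direct

    def unrelated(a, b):
        return (a, b) not in direct and (b, a) not in direct

    subsets = nonempty_subsets(T_W)
    # Step 4: every a causes every b; A and B are internally unrelated.
    X_W = [(A, B) for A in subsets for B in subsets
           if all(causal(a, b) for a in A for b in B)
           and all(unrelated(a1, a2) for a1 in A for a2 in A)
           and all(unrelated(b1, b2) for b1 in B for b2 in B)]
    # Step 5: keep only the maximal pairs.
    Y_W = [(A, B) for (A, B) in X_W
           if not any((A, B) != (A2, B2) and A <= A2 and B <= B2
                      for (A2, B2) in X_W)]
    # Step 6: one place per maximal pair, plus source and sink places.
    P_W = {("p", A, B) for (A, B) in Y_W} | {"i_W", "o_W"}
    # Step 7: arcs into and out of each place, plus source and sink arcs.
    F_W = ({(a, ("p", A, B)) for (A, B) in Y_W for a in A}
           | {(("p", A, B), b) for (A, B) in Y_W for b in B}
           | {("i_W", t) for t in T_I}
           | {(t, "o_W") for t in T_O})
    return P_W, T_W, F_W                            # step 8


places, tasks, arcs = alpha(["ABCD", "ACBD", "EF"])
print(len(places), "places,", len(tasks), "tasks,", len(arcs), "arcs")
```

For the log ABCD, ACBD, EF the sketch yields 7 places (five internal places plus i_W and o_W), 6 tasks, and 14 arcs.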

/faculteit technologie management 29

α-algorithm - Insight

Traces: ABCD, ACBD, EF

Causality: A→B, A→C, B→D, C→D, E→F

Parallel: B||C, C||B

[Figure: the resulting net over the tasks A, B, C, D, E, F.]

/faculteit technologie management 30

α-algorithm – Log properties + target nets

• If the log is complete with respect to the relation >, it can be used to mine an SWF-net without short loops

• Structured Workflow Nets (SWF-nets) have no implicit places and the following two constructs cannot be used:

[Figure: the two forbidden constructs.]

/faculteit technologie management 31

α-algorithm – No short loops

Why no short loops?

• One-length loop: B>B and not B>B implies B→B (impossible!)

• Two-length loop: A>B and B>A implies A||B and B||A instead of A→B and B→A
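A small sketch makes the short-loop problem concrete. The two logs below are invented for illustration: one contains a loop of length one on B, the other a loop of length two between B and C. In both cases the relation derived from direct succession is "parallel", never "causal".

```python
def direct_succession(traces):
    """Set of pairs (x, y) such that x is directly followed by y in some trace."""
    return {(x, y) for trace in traces for x, y in zip(trace, trace[1:])}


def classify(direct, x, y):
    """Relation between x and y, using the definitions of ->, || and #."""
    forward, backward = (x, y) in direct, (y, x) in direct
    if forward and backward:
        return "||"
    if forward:
        return "->"
    if backward:
        return "<-"
    return "#"


# Length-one loop: B may repeat, so B>B holds and B->B can never be derived.
one_loop = direct_succession(["AC", "ABC", "ABBC"])
# Length-two loop: B, C, B, so both B>C and C>B hold.
two_loop = direct_succession(["ABD", "ABCBD"])

print(classify(one_loop, "B", "B"))
print(classify(two_loop, "B", "C"))
```

Both calls report `||`, which is why the loop structure is lost when the net is built from these relations.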

/faculteit technologie management 32

α-algorithm – Common Constructs

• No invisible tasks, non-free-choice or duplicate tasks

• No noisy logs

Why no duplicate tasks?

Why no invisible tasks?

Why noise-free logs?

/faculteit technologie management 35

Heuristics Miner

1. Read a log
2. Get the set of tasks
3. Infer the ordering relations based on their frequencies
4. Build the net based on inferred relations
5. Output the net

/faculteit technologie management 36

Heuristics Miner

Insight: The more frequently a task A directly follows another task B, and the less frequently the opposite occurs, the higher the probability that A causally follows B!
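The slide gives no formula for this insight. The sketch below assumes the dependency measure commonly associated with the Heuristics Miner, (|a>b| - |b>a|) / (|a>b| + |b>a| + 1), where |a>b| counts how often a is directly followed by b. The log, with one noisy trace, is invented for illustration.

```python
from collections import Counter


def dependency_measure(traces):
    """Return a function measure(a, b) in (-1, 1); values near 1 suggest a -> b."""
    counts = Counter((x, y) for trace in traces for x, y in zip(trace, trace[1:]))

    def measure(a, b):
        ab, ba = counts[(a, b)], counts[(b, a)]
        # The +1 in the denominator keeps rarely seen pairs from scoring 1.0.
        return (ab - ba) / (ab + ba + 1)

    return measure


# Nine regular traces and one noisy trace in which B and C are swapped.
log = ["ABC"] * 9 + ["ACB"]
measure = dependency_measure(log)
print(round(measure("A", "B"), 3))
print(round(measure("B", "C"), 3))
print(round(measure("C", "B"), 3))
```

With a threshold on this value, the single noisy occurrence of C directly followed by B does not turn B and C into parallel tasks, as it would under the α-algorithm's relations.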

/faculteit technologie management 37

Heuristics Miner – Common Constructs

• No non-free-choice or duplicate tasks

• Robust to invisible tasks and noisy logs

Why no non-free-choice?

Why is there no guarantee that the whole net will be correct?

/faculteit technologie management 40

Genetic Process Mining (GPM)

• Genetic Algorithms + Process Mining

• Genetic Algorithms
– Search technique that mimics the process of evolution in biological systems

• Advantages
– Tackle all common structural constructs
– Robust to noise

• Disadvantages
– Computational time

/faculteit technologie management 41

Genetic Process Mining (GPM)

Algorithm:

• Internal Representation
• Fitness Measure
• Genetic Operators

/faculteit technologie management 42

GPM – Fitness Measure

• Guides the search!

[Figure: process model with the tasks Start, Get Ready, Travel by Car, Travel by Train, BETA PhD Day Starts, Give a Talk, Visit Brewery, Have Dinner, Go Home, Travel by Train, Pay for Parking, Travel by Car, End.]

/faculteit technologie management 43

GPM – Fitness Measure

[Figure: process model with the tasks Start, Get Ready, Travel by Car, Travel by Train, BETA PhD Day Starts, Give a Talk, Visit Brewery, Have Dinner, Go Home, Travel by Train, Pay for Parking, Travel by Car, End.]

/faculteit technologie management 44

GPM – Fitness Measure

Punish for the number of enabled tasks during the parsing!

[Figure: process models over the tasks Start, Get Ready, Travel by Car, Travel by Train, BETA PhD Day Starts, Give a Talk, Visit Brewery, Have Dinner, Go Home, Pay for Parking, End; one model is labeled "Overgeneral solution".]
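The idea of punishing enabled tasks can be illustrated with a toy fitness function. This is not the fitness measure of the genetic miner in ProM. It is a simplified sketch in which a model is a hypothetical dictionary mapping each task to the set of tasks allowed to follow it, and `kappa` is an assumed penalty weight.

```python
def fitness(model, log, kappa=0.1):
    """Toy fitness: fraction of parsed steps minus a penalty for enabled tasks."""
    n_tasks = len({t for trace in log for t in trace})
    steps = parsed = enabled = 0
    for trace in log:
        for current, following in zip(trace, trace[1:]):
            successors = model.get(current, set())
            steps += 1
            enabled += len(successors)        # tasks enabled at this point
            if following in successors:
                parsed += 1                   # this step of the trace is allowed
    completeness = parsed / steps
    enabled_ratio = enabled / (steps * n_tasks)
    return completeness - kappa * enabled_ratio


log = ["ABCD", "ACBD"]
# Model that allows exactly the behavior seen in the log.
precise = {"A": {"B", "C"}, "B": {"C", "D"}, "C": {"B", "D"}, "D": set()}
# Overgeneral model: any task may follow any task.
overgeneral = {t: set("ABCD") for t in "ABCD"}

print(fitness(precise, log))
print(fitness(overgeneral, log))
```

Both models parse every trace, but the overgeneral one enables all four tasks at every step, so the penalty term gives it the lower fitness.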

/faculteit technologie management 45

GPM – Fitness Measure

Punish for the number of duplicate tasks with common input/output tasks!

[Figure: process models over the tasks Start, Get Ready, Travel by Car, Travel by Train, BETA PhD Day Starts, Give a Talk, Visit Brewery, Have Dinner, Go Home, Pay for Parking, End, including a model with one branch of duplicated tasks per trace; that model is labeled "Overspecific solution".]

/faculteit technologie management 46

GPM – DGA ProM Plug-in

Why does the GA Miner take so much time?

How could we improve its running time without changing the code?

/faculteit technologie management 49

Fuzzy Miner - Motivation

Mine less structured processes!

/faculteit technologie management 50

Fuzzy Miner - Motivation

/faculteit technologie management 51

Fuzzy Miner

/faculteit technologie management 52

Fuzzy Miner

No “Ask Question” or “Give Talk”!

All details!

Abstracting even more from details!

/faculteit technologie management 53

Conclusions

• The notion of a process instance is crucial!

• Ordering of tasks is the basic information

• Frequencies are important to handle noise

• Local approaches
– α-algorithm, Heuristics Miner

• Global approaches
– Genetic Miner and Fuzzy Miner