enhancing business process mining with distributed tracing data …€¦ · instrument sample...

26
Chair of Software Engineering for Business Information Systems (sebis) Faculty of Informatics Technische Universität München wwwmatthes.in.tum.de Enhancing Business Process Mining with Distributed Tracing Data in a Microservice Architecture Jochen Graeff (B.Sc.) | 21.08.2017 | Master thesis final presentation Advisor: Martin Kleehaus

Upload: others

Post on 07-Jun-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Chair of Software Engineering for Business Information Systems (sebis) Faculty of InformaticsTechnische Universität Münchenwwwmatthes.in.tum.de

Enhancing Business Process Mining with Distributed Tracing Data in a Microservice ArchitectureJochen Graeff (B.Sc.) | 21.08.2017 | Master thesis final presentationAdvisor: Martin Kleehaus

Page 2: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Agenda

MotivationResearch questionsApproach§ Build sample architecture§ Instrument sample architecture§ Develop activity generation algorithm§ Set-up extended architecture§ Analysis creation

Live DemoEvaluation§ Benefits§ Limitations

© sebisJochen Graeff – Master thesis final presentation 2

Page 3: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Motivation

How does user behaviour and system behaviour influence each other?

© sebisJochen Graeff – Master thesis final presentation 3

BUSINESS LAYER

APPLICATION LAYER>

Page 4: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Is there a gap between the layers?

© sebisJochen Graeff – Master thesis final presentation 4

S6

S9

S1

S4

S2

S3

S7

S5

S8Application

Layer

Business transactions

Run-time behavior

User session

BusinessLayer Reserve

carShow

Details

Search cars

End rental

Book car

Process discovery of multiple sessions

Search cars

Reservecar

Book car

Show detailsReport Issue

<

Process Mining

Span ID 1

Span ID 2

Span ID 3

S1

S2

S3

DistributedTracing

Analysis of single trace

Transparency

Page 5: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Research questions

RQ1. How can a relationship between business activities and a distributedapplication architecture be established?

RQ2. What data has to be extracted and how has it to be mapped to enableand store the relationship knowledge?

RQ3. How can business process mining be extended with technical aspects inorder to uncover

a) user and system throughput times for business activity executions and,b) correlations between business process performance and system

behaviour?

© sebisJochen Graeff – Master thesis final presentation 5

?

?

?

Page 6: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Build sample architecture

Jochen Graeff – Master thesis final presentation

- Car sharing platform

- Spring Cloud microservicearchitecture

- 3 x infrastructure services

- 6 x business services

1Buildsample architecture

Page 7: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Build sample architecture

© sebisJochen Graeff – Master thesis final presentation 7

EUREKADISCOVERY SERVICE

:8761

CONFIGURA-TION

SERVICE

:8888

API

ACCOUNTINGSERVICE

:6060

CARS SERVICE

:9090

MAPSSERVICE

:6061

NOTIFICA-TIONS

SERVICE

:6063

PAYMENTSSERVICE

:5050

USERSERVICE

:5050

API

API

API

API

API

API

ZUULEDGE

SERVICE

:8762

API

WEBUISERVICE

:5050

API

APIBusiness services

Infrastructure services

HTTP

Front-end services

Database access

Page 8: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Instrument sample architecture

© sebisJochen Graeff – Master thesis final presentation 8

- Car sharing platform

- Spring Cloud microservicearchitecture

- 3 x infrastructure services

- 6 x business services

1Buildsample architecture 2 Instrument

samplearchitecture 3

- Add Spring Cloud Sleuth to every business service

- append sessionIDto span in controller of webuiservice

Page 9: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Instrument sample architecture

- Instrument every business service with spring cloud sleuth in order to generate span data in the applications

§ Add dependency spring-cloud-starter-zipkin§ Set sampling rate

© sebisJochen Graeff – Master thesis final presentation 9

BUSINESSSERVICE

API

WEBUISERVICE

API

- Instrument service (as for business services)- Append sessionID in every webui service endpoint

definition: tracer.addTag("sessionID", sessionID);

Page 10: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Develop activity generation algorithm

© sebisJochen Graeff – Master thesis final presentation 10

- Car sharing platform

- Spring Cloud microservicearchitecture

- 3 x infrastructure services

- 6 x business services

1Buildsample architecture 2 Instrument

samplearchitecture 3

Develop activity generationalgorithm

- Spring Cloud Sleuth

- append sessionIDto span in controller of webuiservice

- definition of required attributes and foreign for event log

- Definition of healthy/failed activities

- user and system activities

- inputs: spans, annotation & mapping table

Page 11: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Activity generation

© sebisJochen Graeff – Master thesis final presentation 11

CASE ID ACTIVITY START_TS END_TS DURATION TYPE SERVICE_NAME

12345 Book car 07/06/17 12:45:32.000

07/06/17 12:45:32.800

800 ms user webui-service

12345 http:/initializebooking

07/06/17 12:45:32.200

07/06/17 12:45:32.700

500 ms system accounting-service

12345 http:/handlecarbooking

07/06/17 12:45:32.400

07/06/17 12:45:32.500

100 ms system cars-serivice

12345 Unlock car 07/06/17 12:37:32.400

07/06/17 12:37:35.400

3000 ms user webui-service

... ... ... ... ... ... ...

TRACE ID

SPAN ID

PARENTID

NAME TIMESTAMP DURATION

a a null http:/bookcar 07/06/17 12:45:32.000

800 ms

a b a initialize 07/06/17 12:45:32.100

600 ms

a c b http:/initializebooking

07/06/17 12:45:32.200

500 ms

a d c http:/handlecarbooking

07/06/17 12:45:32.400

100 ms

b b null http:/opencar 07/06/17 12:37:32.400

3000 ms

spans table (zipkin)

activities table

annotations table (zipkin)TRACE ID

SPAN ID

KEY VALUE

a a http.method GET

a a cs

a a cr

a a sessionID 12345

a b ss

mapping tabletechnical_activity

pretty_name

is_activity

calls_service

http:/bookcar Book car 1 webui-service

http:/initializebooking

http:/initializebooking

1 accounting-service

http:/handlecarbooking

http:/handlecarbooking

1 cars-service

http:/opencar Unlock car 1 webui-service

http:/findroute Find route 1 webui-service

Page 12: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Setup extended architecture

© sebisJochen Graeff – Master thesis final presentation 12

- Car sharing platform

- Spring Cloud microservicearchitecture

- 3 x infrastructure services

- 6 x business services

1Buildsample architecture 2 Instrument

samplearchitecture 3Develop log

generationalgorithm 4 Setup

extended architecture

- Spring Cloud Sleuth

- append sessionIDto span in controller of webuiservice

- definition of required attributes and foreign for event log

- Definition of healthy/failed activities

- user and system activities

- inputs: spans, annotation & mapping table

- Deploy MySQL- Deploy Zipkin- Deploy Celonis- Build log

generation service that executes the algorithm

3Develop activity generationalgorithm

Page 13: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

ZIPKINDISTRIBUTED TRACINGSERVICE

:9411

API

ACTIVITIESGENERATIONSERVICE

:6064

API

Setup extended architecture

© sebisJochen Graeff – Master thesis final presentation 13

EUREKADISCOVERY SERVICE

:8761

CONFIGURA-TION

SERVICE

:8888

API

ACCOUNT-ING

SERVICE

:6060

CARS SERVICE

:9090

MAPSSERVICE

:6061

NOTI-FICATIONS SERVICE

:6063

PAYMENTSSERVICE

:5050

USERSERVICE

:5050

API

API

API

API

API

API

ZUULEDGE

SERVICE

:8762

WEBUISERVICE

:5050

API

APIBusiness services

Infrastructure services

HTTP

Front-end services

Database access

MySQLDatabase

Extended services

API CELONIS PROCESS MININGTOOL

:9000

Page 14: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Analysis creation

© sebisJochen Graeff – Master thesis final presentation 14

- Car sharing platform

- Spring Cloud microservicearchitecture

- 3 x infrastructure services

- 6 x business services

1Buildsample architecture 2 Instrument

samplearchitecture 3Develop log

generationalgorithm 4 Setup

extendedarchitecture 5 Analysis

Creation

- Spring Cloud Sleuth

- append sessionIDto span in controller of webuiservice

- definition of required attributes and foreign for event log

- Definition of healthy/failed activities

- user and system activities

- inputs: spans, annotation & mapping table

- Deploy MySQL- Deploy Zipkin- Deploy Celonis- Build log

generation service that executes the algorithm

- Configure data model

- Configure continuous data load

- Create 4 analysis with§ Business§ Application§ Cross-Domain§ Single Activity

views on the process

3Develop activity generationalgorithm

Page 15: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Analysis creation

© sebisJochen Graeff – Master thesis final presentation 15

Page 16: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Live Demo

© sebisJochen Graeff – Master thesis final presentation 16

Page 17: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Benefits

© sebisJochen Graeff – Master thesis final presentation 17

Cross-domain analysis

Provides a more holistic view between the business and application layer

Resource-efficient data source

Resource-efficient (easy to implement) input source for process mining in microservicearchitectures

Portability

Approach transferable to different architectures with limited effort (i.a. due to OpenTracingstandard)

Ubiquity

Ubiquitous through distributed tracing becoming a standard tool for microservicedebugging

Flexibility on process perspectives

Process scope flexibility through appending spans with arbitrary IDs

Bottom up process discovery

Bottom up process discovery in legacy systems

Page 18: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Limitations

© sebisJochen Graeff – Master thesis final presentation 18

System under survey (SUS)

Approach only tested on SUS- Single architecture- Data volume- Data contents

Process visualisation of system activities

Petri nets not a suitable visualisation method for system activities

Performance overhead

Necessity for SampingRate=1.0 for capturing whole process instances leads to performance overhead

Real-time event handling

Presented prototype only generates event log and reloads data model every 5 min

Page 19: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Backup

§ Workflow of activity generation and data reloading§ Related work§ Process Mining§ Inputs for activity generation§ Distributed tracing§ User request and span/trace context§ ‚End car rental‘ user activity sequence diagram

© sebisJochen Graeff – Master thesis final presentation 19

Page 20: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Workflow of activity generation and data reloading

© sebisJochen Graeff – Master thesis final presentation 20

User

Perform clicksin front-end

Trigger manual activitygeneration

Log generation service

Trigger scheduledactivity generation

~ 5min

Execute activity generationalgortihm

MySQL

Write activities 

table

Write trace_span

table

Write technical_activities 

table

Insert new entry into 

reload_triggertable

PMT

Check for new entry in reload_trigger 

table

Reload data model

~ 5min

Page 21: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Related work

5.3.R

elatedw

ork

Table 5.2.: Classification of related work

Logorigin

Activitytypes

Capturedbehaviour

Type of work Evaluationenviron-ment

Systemarchitecture

Languageindepend-ence

(Near)real-time

Poggi et al. [53] Web logs UserActivities

Business Algorithmevaluation

Real-lifeevent logs

No n/a No

Abe & Kudo [7] Web logs Useractivities

Business Framework Real-lifeevent logs

n/a n/a No

Bruckmann et al. [13] n/a UserActivities,systemactivities

Businessandsystem

Architectureproposal

n/a n/a n/a Yes

Leemans & van derAalst [41]

Joinpoint-pointcutmodelinstru-mentation

Useractivities

System Instrumentationstrategy,implementation

Real-lifeevent logs

Yes Yes No

Rubin et al. [59] Custominstru-mentation

Useractivities,systemactivities

User andsystem

Experimental Real-lifeevent logs

No n/a No

Proof-of-conceptprototype of this work

Distributedtracinginstru-mentation

Useractivities,systemactivities

Business,user andsystem

Instrumentationstrategy,architecturedescription,implementation

Simulateduserrequestson testingsystem

Yes Yes Yes

95

© sebisJochen Graeff – Master thesis final presentation 21

Page 22: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Process mining

© sebisJochen Graeff – Master thesis final presentation 22

Event LogTIMESTAMP ACTIVITY CASE ID2016-03-05 22:38:41.868 SEARCH CARS #12342016-03-05 23:46:32.306 REPORT ISSUE #56782016-03-05 23:47:42.321 RESERVE CAR #12342016-03-05 23:53:12.354 SHOW DETAILS #9012... ... ...

Search cars

Reserve car

Book car

Show detailsReport issue

<

“The idea of process mining is to discover, monitor and improve realprocesses (i.e., not assumed processes) by extracting knowledge from event logs readily available in today's (information) systems.”

IEEE CIS Task Force on Process Mining

There are three Classes of Process Mining:

1. Process Discovery2. Conformance Checking3. Extension

Page 23: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Develop activity generation algorithm

© sebisJochen Graeff – Master thesis final presentation 23

spans table (zipkin)namespan_idtrace_idstart_tsparent_idduration...

annotations table (zipkin)span_idtrace_ida_keya_valuea_timestamp...

mapping tabletechnical_activitypretty_nametypeis_activitycalls_service

1 has 1..*

activities tablesessionIDactivitystart_tsend_tsdurationtrace_idspan_idservice namefailuresortingtype...

1 has 1..* 1..* has 1

Page 24: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

Distributed tracing

© sebisJochen Graeff – Master thesis final presentation 24

The path taken through a simple servicing system on behalf of user request X.

The causal and temporal relationship between four spans of a trace.

Sigelman et al. (2010). Dapper, A Large Scale Distributed Systems Tracing Infrastructure. Google Research

time

SERVICE CTraceID = 100SpanID = 103ParentID = 101

SERVICE BTraceID = 100SpanID = 102ParentID = 101

SERVICE ATraceID = 100SpanID = 101ParentID = 100

FRONT-ENDTraceID = 100SpanID = 100ParentID = null

user

FRONTENDSERVICE

SERVICE A

SERVICE B SERVICE C

rpc1

rpc2 rpc3

requestX responseX

Page 25: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

User request and span/trace context

© sebisJochen Graeff – Master thesis final presentation 25

ZUULSERVICE

ACCOUNTINGSERVICE

CARMANAGEMENTSERVICE

requestrequestrequest

responseresponseresponse

TraceID =aSpanID =asr

TraceID =aSpanID =ccs

TraceID =aSpanID =b

TraceID =aSpanID =ccr

TraceID =aSpanID =ass

TraceID =aSpanID =b

TraceID =aSpanID =csr

TraceID =aSpanID =ecs

TraceID =aSpanID =d

TraceID =aSpanID =ecr

TraceID =aSpanID =css

TraceID =aSpanID =d

TraceID =aSpanID =esr

TraceID =aSpanID =f

TraceID =aSpanID =ess

TraceID =aSpanID =f

Page 26: Enhancing Business Process Mining with Distributed Tracing Data …€¦ · Instrument sample architecture - Instrument every business service with spring cloud sleuth in order to

‚End car rental‘ user activity

CARSSERVICE

lockscar

ACCOUNTINGSERVICE

PAYMENTSERVICE

ZUULSERVICE

NOTIFICATIONSERVICE

GET /lockCar

User

GET /endRental

GET /handlePayment

PUT /notifyUser

GET /finalizeBooking

notifiesuser

handlespayment

finalizesbooking

© sebisJochen Graeff – Master thesis final presentation 26