#TDPARTNERS16 GEORGIA WORLD CONGRESS CENTER Mining medical device logs to improve operational efficiency at Siemens HC Bruce Baum – Siemens Mike Watzke – Teradata Labs


Post on 13-Aug-2020


Page 1: Mining medical device logs to improve operational efficiency at Siemens HC

#TDPARTNERS16 GEORGIA WORLD CONGRESS CENTER

Bruce Baum – Siemens
Mike Watzke – Teradata Labs

Page 2: Overview

• Siemens Introduction
• Business Problem
• Technical Solution

Page 3: Siemens – Who we are

Electrification, Automation and Digitalization are long-term growth fields of Siemens.

• Power and Gas
• Wind Power and Renewables
• Power Generation Services
• Energy Management
• Building Technologies
• Mobility
• Digital Factory
• Process Industries and Drives
• Healthcare

Page 4: Shifting markets drive need for answers

Page 5: Business Problem

1) Business problem & data overview

2) The story so far: Classical machine learning

3) Towards pattern mining for imaging devices

4) Excursion: The Siemens compute environment

Page 6: Remote Diagnostics @ Healthcare

• Siemens Healthineers medical imaging devices are used all across the world in a demanding market:
  • Minimize downtimes. Just imagine…
    • … doctors puzzling over blurred images
    • … an ER room
  • Minimize maintenance cost (personnel, material)

• Siemens answer: Remote monitoring & diagnostics
  • Goal: Exchange unplanned for planned downtime
  • Technology: Predictive maintenance (min. 3 days)
  • Critical constraint: False alarms not accepted

Page 7: Data at a Glance (CT Example)

• Regulatory constraints require focus on already existing data sources:
  • Device logs
    • Time-stamped sequence of events
    • >100m lines per device & year
  • Parts exchange data
    • Calls to service center (+ exam results)
    • <10 faults per device & year

Page 8: The story so far: Classical ML

Device Logs

timestamp            source  code  text
2014-05-17 11:31:12  A       37    xxx
2014-05-17 11:31:12  B       42    yyy
2014-05-17 11:31:13  B       17    .. hi temp (37.5) in ..

Page 9: The story so far: Classical ML

Feature Extraction

• Event counts
• Statistics for extracted values
• Derived features (grouping/ scaling/ trends/ …)
• Different time bins (day/ scan/ …)

Feature Matrix

device  episode  f1  f2    …  f10000
55049   4711     37  3.45  …  true
55049   4712     42  ?     …  false
55049   4713     17  3.12  …  true
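A minimal sketch of the feature extraction step, assuming the whitespace-separated log layout shown on the slides (timestamp, source, code, text). The function names, the feature naming scheme, and the rule that extracted values appear in parentheses are illustrative assumptions, not the Siemens implementation. It produces event counts and one value statistic per daily time bin.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Sample lines copied from the slide; the layout is assumed to be
# "<date> <time> <source> <code> <free text>".
LOG_LINES = [
    "2014-05-17 11:31:12 A 37 xxx",
    "2014-05-17 11:31:12 B 42 yyy",
    "2014-05-17 11:31:13 B 17 .. hi temp (37.5) in ..",
]

# Assumption: numeric readings are embedded in the text in parentheses.
NUMBER_IN_PARENS = re.compile(r"\(([-+]?\d+(?:\.\d+)?)\)")


def parse_line(line):
    """Split one log line into (timestamp, source, code, text)."""
    date, time, source, code, text = line.split(" ", 4)
    ts = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
    return ts, source, int(code), text


def daily_features(lines):
    """Build one feature row per day: event counts plus a mean of extracted values."""
    counts = defaultdict(Counter)
    values = defaultdict(list)
    for line in lines:
        ts, source, code, text = parse_line(line)
        day = ts.date().isoformat()
        counts[day][f"count_{source}_{code}"] += 1
        values[day].extend(float(v) for v in NUMBER_IN_PARENS.findall(text))

    features = {}
    for day, day_counts in counts.items():
        row = dict(day_counts)
        vals = values[day]
        # None plays the role of the "?" (missing value) in the feature matrix.
        row["value_mean"] = sum(vals) / len(vals) if vals else None
        features[day] = row
    return features


features = daily_features(LOG_LINES)
```

Other time bins (for example per scan) would only change how the `day` key is derived.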

Page 10: The story so far: Classical ML

Parts Exchange Data

Model Training

• Prediction horizon
• Model selection
• Feature selection

Analytical Models

Page 11: The story so far: Classical ML

• High-quality models for current use cases
• X New offers need even fewer false alarms

Page 12: Why temporal patterns?

CT log data is a “flattened” representation of overlapping processes!

[Figure: three overlapping per-patient processes (Patient 12345, Patient 98765, Patient 506156), each with the stages Prepare, Acquire, Process and Store, alongside a device-log excerpt (timestamp, source, code, text)]

Page 13: First steps to pattern mining (1/2)

Domain-specific pattern-mining algorithm in Java

• Resilience to “stray events” from parallel processes
• Anytime capability
• Support for same-time events, including random order of “almost same time”
• User-defined quality functions

X Scalability not sufficient for use case (transfer, processing)

Page 14: First steps to pattern mining (2/2)

Using Aster nPath features, instrumented with KNIME

X Millions of nPath calls, no generation in-DB, expensive grouping operation needed!

Page 15: The Siemens Smart Data Lab

[Figure: lab architecture. Labels: Data integration, Data management, Data analytics, Data presentation, Data Warehouse, Hadoop, Aster, … and others]

• 4 nodes
• 148 virtual units
• 11 TB storage
• Per node:
  • 24 cores
  • 256 GB RAM

Page 16: Technical Solution

1. Overview

2. Sequence mining algorithm and implementation

3. Experimental data and demographics

4. Data preparation

5. Sequence mining training

6. Pattern scoring

7. Findings

Page 17: Overview

• Hypothesis: Temporal sequences of events can be used to provide early warning of failures.

• Test the hypothesis with an experiment
  • Computed Tomography (CT) device logs
  • New sequence mining machine learning algorithm
  • Pattern scoring function

• Prior work: Given the large volume of data and the large pattern search space, standard sequence mining approaches failed to work

Page 18: Related Sequence Mining Work

• Frequent pattern mining (FP-growth, FrequentPaths) and Association Rules
  • Frequency is not necessarily correlated to failure

• Subgroup Discovery: related to the above but allows for a more flexible definition of the quality metric (frequency, unexpected, discriminating, …)
  • Temporal ordering of events is not considered

• Pattern Matching: requires patterns as inputs; in this context, a function such as nPath would be more appropriate for scoring

Page 19: Sequence Mining Definitions

[Figure: event sequences per device along a time axis, leading up to a failure. Labels: Devices, Time, Failure, Events, Pattern, Sequence 1, Sequence 2, Event alphabet (A-F). Example events: ABACDABFEBDBACCEBDFAC. Example pattern: BDF]

Page 20: Sequence Mining Algorithm

• Supervised machine learning algorithm; data classes are categorized based on time to failure

• Exhaustive, iterative breadth-first search of the event pattern space. Search space size is an exponential function of depth

• Matching based on solving a Constraint Satisfaction Problem

• Search space pruning
  • Quality metric: Positive Predictive Value (PPV)
  • Sequence match counts
  • Prior matched sequences <-> patterns
  • Terminate expansion of patterns with PPV = 1.0 (monotonic)
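The search loop described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the Teradata implementation: matching uses a plain ordered-subsequence test instead of the gap-constrained CSP, class 1 is taken to mean "close to failure", and the names `ppv`, `mine` and `min_ppv_by_depth` are hypothetical.

```python
def is_subsequence(pattern, events):
    """True if the pattern's events occur in order (not necessarily adjacent)."""
    remaining = iter(events)
    return all(event in remaining for event in pattern)


def ppv(pattern, sequences):
    """Return (PPV, number of matched sequences) for one candidate pattern."""
    positives = sum(1 for events, label in sequences
                    if label == 1 and is_subsequence(pattern, events))
    negatives = sum(1 for events, label in sequences
                    if label == 0 and is_subsequence(pattern, events))
    matched = positives + negatives
    return (positives / matched if matched else 0.0), matched


def mine(sequences, alphabet, min_ppv_by_depth):
    """Breadth-first pattern search with one PPV pruning threshold per depth."""
    frontier = [()]   # patterns that will be expanded in the next iteration
    finished = []     # patterns with PPV == 1.0; these are not expanded further
    for min_ppv in min_ppv_by_depth:
        next_frontier = []
        for pattern in frontier:
            for event in alphabet:
                candidate = pattern + (event,)
                score, matched = ppv(candidate, sequences)
                if matched == 0 or score < min_ppv:
                    continue  # prune: no support or too many false alarms
                if score == 1.0:
                    # Any extension matches a subset of the same sequences,
                    # so it cannot add class 0 matches.
                    finished.append(candidate)
                else:
                    next_frontier.append(candidate)
        frontier = next_frontier
    return finished, frontier


# Toy data: (event sequence, class label).
SEQUENCES = [
    (list("ABDF"), 1),
    (list("BDF"), 1),
    (list("ADB"), 0),
    (list("FDB"), 0),
]
finished, frontier = mine(SEQUENCES, "ABDF", min_ppv_by_depth=[0.5, 0.5])
```

On the toy data, the pattern (B, D) matches only the two class 1 sequences and is therefore reported without further expansion, while (D, B) matches only class 0 sequences and is pruned.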

Page 21: Sequence Mining Algorithm

[Figure: search space tree for |Events| = 3. Iteration 1: 3^1 patterns. Iteration 2: 3^2 patterns. …]

Level 4 pruning example:

• PPV = 0.93, matched 93 class 1 and 7 class 0
• PPV = 0.48, matched 48 class 1 and 52 class 0
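The exponential growth of the search space can be made explicit. Assuming patterns are ordered event strings in which events may repeat (consistent with the pattern counts shown for |Events| = 3), the number of candidate patterns at depth d over an event alphabet E, and the total up to depth D, are:

```latex
N_d = |E|^{d},
\qquad
\sum_{d=1}^{D} N_d = \frac{|E|^{D+1} - |E|}{|E| - 1} \quad (|E| > 1)
```

For |E| = 3 this gives N_1 = 3 and N_2 = 9. With roughly 2,000 distinct events, as in the experiment data, N_2 is already 4 × 10^6 before any pruning.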

Page 22: Algorithm Matching

• Matching a pattern to a sequence is based on solving a Constraint Satisfaction Problem
  • Is (BDF) a subsequence of the input sequence?
  • Constraints to be solved:
    • (Time(B,D) < Forward Gap OR Time(D,B) < Backward Gap)
      AND
    • (Time(D,F) < Forward Gap OR Time(F,D) < Backward Gap)

• Another matching approach would be a finite automaton
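A minimal sketch of gap-constrained matching by backtracking search, one simple way to solve such a constraint satisfaction problem. The exact gap semantics are an assumption: Time(X,Y) is read as the elapsed time from an occurrence of X to a later occurrence of Y, same-time events count as a forward step, and the function name `matches` is hypothetical.

```python
def matches(pattern, sequence, forward_gap, backward_gap):
    """Check whether `pattern` can be assigned to events of `sequence`.

    `sequence` is a list of (time, event) pairs. Each consecutive pair of
    pattern events must either occur in order within `forward_gap`, or in
    reverse order within `backward_gap` (to tolerate "almost same time"
    events that were logged in random order).
    """
    def extend(position, previous_index, used):
        if position == len(pattern):
            return True  # every pattern event has been assigned
        for index, (time, event) in enumerate(sequence):
            if event != pattern[position] or index in used:
                continue
            if previous_index is not None:
                delta = time - sequence[previous_index][0]
                forward_ok = 0 <= delta < forward_gap
                backward_ok = 0 < -delta < backward_gap
                if not (forward_ok or backward_ok):
                    continue
            if extend(position + 1, index, used | {index}):
                return True
        return False  # no assignment works from here: backtrack

    return extend(0, None, frozenset())


IN_ORDER = [(0, "A"), (1, "B"), (2, "D"), (3, "C"), (4, "F")]
SWAPPED = [(10, "D"), (11, "B"), (12, "F")]  # D logged just before B

in_order_match = matches("BDF", IN_ORDER, forward_gap=3, backward_gap=0)
swapped_match = matches("BDF", SWAPPED, forward_gap=3, backward_gap=2)
```

The stray events A and C in `IN_ORDER` are simply skipped, and the backward gap is what lets `SWAPPED` match even though D precedes B in the log.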

Page 23: Teradata DBS Implementation

[Figure: parallel execution across AMP 1 … AMP N. Labels: Sequences (~Map); Patterns (Vector); Duplicated; Hashed; Expand / CSP Match; Sequence : Pattern; Global (by pattern) Score and Prune; Nth instance of Local Expand and Match; Nth instance of Global Score and Prune]

Page 24: Experiment Data

• 350,000,000 CT device events
• Data Record: {device, time, event, class label}
• Training data set (60% sample), Validation data set (40%)
• ~2,000 distinct Events
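A small sketch of a repeatable 60%/40% split. The slides do not say at which level the sample is drawn, so splitting by device (to keep all events of one device on the same side) is an assumption, as are the function name and the device identifiers.

```python
import random


def split_devices(device_ids, train_fraction=0.6, seed=0):
    """Randomly assign devices to a training and a validation set."""
    rng = random.Random(seed)          # seeded for reproducible splits
    shuffled = sorted(device_ids)      # sort first so the seed fully determines the result
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return set(shuffled[:cut]), set(shuffled[cut:])


# Hypothetical device identifiers; one split per seed.
DEVICES = [f"dev{i:02d}" for i in range(10)]
splits = [split_devices(DEVICES, seed=s) for s in range(50)]
```

Varying the seed yields independent splits, which is one way to run the repeated train-and-score iterations reported in the findings.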

Page 25: Experiment Data Demographics

• Event distribution
  • X: Daily event count per device
  • Y: Frequency of specific daily event count

Page 26: Experiment Data Preparation

• Generating Episodes

• Additional transformations
  • Timestamp to Epoch
  • Event string to numeric identifier
  • Event pruning: known error events, only events that occur in failure window
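Two of the transformations above, timestamp to epoch and event string to numeric identifier, can be sketched as follows. The timestamp format is taken from the log excerpt on the earlier slides; treating timestamps as UTC, the function names, and the sample records are assumptions made for illustration.

```python
from datetime import datetime, timezone


def to_epoch(timestamp):
    """Convert 'YYYY-MM-DD HH:MM:SS' to epoch seconds (assumed to be UTC)."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def encode_events(records):
    """Map (device, timestamp, event string) to (device, epoch, event id)."""
    event_ids = {}
    encoded = []
    for device, timestamp, event in records:
        # Assign the next free integer the first time an event string is seen.
        event_id = event_ids.setdefault(event, len(event_ids))
        encoded.append((device, to_epoch(timestamp), event_id))
    return encoded, event_ids


# Hypothetical records; event strings are formed here as "<source>:<code>".
RECORDS = [
    (55049, "2014-05-17 11:31:12", "A:37"),
    (55049, "2014-05-17 11:31:12", "B:42"),
    (55049, "2014-05-17 11:31:13", "B:17"),
    (55049, "2014-05-17 11:31:14", "A:37"),
]
encoded, event_ids = encode_events(RECORDS)
```

Integer epochs make the gap comparisons in the matching step simple subtractions, and compact event identifiers keep sequences small.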

Page 27: Experiment Model Training

[Figure: Iterative Search Execution. Inputs: Sequences, Quality Metrics, Search Control. Outputs: Patterns, Pattern <-> Sequences]

Depth = 15, ~2B patterns

Page 28: Experiment Model Scoring

• Metrics: precision and recall
  • Precision (PPV) = TP / (TP + FP)
  • Recall = TP / (TP + FN)

• Device and Episode match counts

[Figure: along the time axis, matched sequences count as TP or FP and unmatched sequences as FN or TN]
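The two metrics can be computed directly from the match counts. The sketch below reuses the counts from the pruning example on the earlier slide (93 class 1 and 7 class 0 matches); the number of missed failures (FN = 207) is an invented value for illustration only, and the function name is hypothetical.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall); a metric is None when its denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall


# 93 true positives and 7 false positives, as in the PPV = 0.93 example.
# fn=207 is an illustrative assumption, not a figure from the experiment.
precision, recall = precision_recall(tp=93, fp=7, fn=207)
```

A pattern that matches nothing has no defined precision, which is why the function returns None instead of raising a division error.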

Page 29: Experiment Findings

[Figure: results of 50 iterations of train and score using 60%/40% samples; 21 NAs]

Page 30: Experiment Findings

• Very high precision values can be achieved at the expense of recall.

• Configurable PPV per search depth iteration is useful

• Common subpattern elimination
  • Results from events occurring at same epoch and backward / zero delta support

• Episode support metric contributes to precision

• Additional use cases being considered
  • Sensor data from power generation devices

Page 31: Thank You
