
Behavior Grouping based on Trajectories Mining

Shoji Hirano, Shusaku Tsumoto

Department of Medical Informatics, Shimane University, School of Medicine, Japan


Outline

• Introduction
  • Background, Objective, Approach
• Method
  • Multiscale comparison and grouping of trajectories
• Experimental Results
  • Australian Sign Language data
  • Hospital Management
• Conclusions


Temporal Data Mining

• One-Dimensional Time Series
  • Chronological Behavior of One Variable
• Two-Dimensional Time Series
  • Trajectory: Behavior of Two Variables
• Grouping of Temporal Sequences
  • Capture the dynamic behavior of temporal variables
  • 2D: Detection of co-variant variables
  • Disease grouping, ...


Discoveries from Hepatitis Data

[Figure: ALB-PLT trajectories of four cases: #602 (C5; F4), #170 (C5; F4), #558 (C15; F1), #636 (C15; F3)]

Left: ALB, PLT covariant
Right: ALB, PLT non-covariant

Two Groups of Disease Progression of Liver Fibrosis
• Group 1: ALB, PLT: decreasing
• Group 2: PLT: decreasing, ALT: stable


Trajectory Mining Process

• Segmentation and Generation of Multiscale Trajectories
• Segment Hierarchy Trace and Matching
• Calculation of Dissimilarities
• Clustering of Trajectories


Multiscale Structural Comparison

• Represent trajectories using a multiscale description
• Search for the best correspondences between partial trajectories across all scales

[Figure: Trajectory A and Trajectory B, each starting at t=0 and plotted over Attr. 1 and Attr. 2, shown at Scale 0, Scale 1, and Scale 2 with their segments]

(cf. Ueda et al. (1990))


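Multiscale Description

• Represent the convex/concave structure of trajectories on various observation scales

Trajectory representation:

$$c(t) = \left(ex_1(t), ex_2(t), \ldots, ex_I(t)\right)$$

where $ex_i(t)$, $i \in I$, is the time series of test $i$.

Time series of test $i$ at scale σ:

$$EX_i(t,\sigma) = ex_i(t) \otimes g(t,\sigma) = \sum_{n=-\infty}^{\infty} e^{-\sigma} I_n(\sigma)\, ex_i(t-n)$$

where $I_n$ is the modified Bessel function of order $n$.

Trajectory at scale σ:

$$C(t,\sigma) = \left(EX_1(t,\sigma), EX_2(t,\sigma), \ldots, EX_I(t,\sigma)\right)$$

• σ = large: global feature of the trajectory
• σ = small: local feature of the trajectory

[Figure: original trajectory C(t, 0) and smoothed trajectories C(t, σ) for small and large σ]

(cf. Mokhtarian et al. (1986))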
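The smoothing step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the infinite sum is truncated to a finite window, the series is extended at both ends by repeating its boundary values, and the names `bessel_i`, `kernel`, `smooth`, and `trajectory_at_scale` are hypothetical.

```python
import math

def bessel_i(n, x, terms=60):
    """Modified Bessel function I_n(x) for integer n, summed from its power series."""
    n = abs(n)                      # I_{-n}(x) = I_n(x) for integer n
    half = x / 2.0
    term = 1.0
    for j in range(1, n + 1):       # builds (x/2)^n / n!
        term *= half / j
    total = 0.0
    for k in range(terms):
        total += term
        term *= half * half / ((k + 1) * (k + 1 + n))
    return total

def kernel(sigma, radius):
    """Weights e^{-sigma} I_n(sigma) for n = -radius..radius (truncated sum)."""
    return [math.exp(-sigma) * bessel_i(n, sigma) for n in range(-radius, radius + 1)]

def smooth(series, sigma, radius=None):
    """EX_i(t, sigma): one time series convolved with the truncated kernel."""
    if radius is None:
        radius = int(4 * math.sqrt(sigma)) + 4   # assumed window size
    weights = kernel(sigma, radius)
    last = len(series) - 1
    out = []
    for t in range(len(series)):
        acc = 0.0
        for n in range(-radius, radius + 1):
            idx = min(max(t - n, 0), last)       # assumed boundary rule: clamp
            acc += weights[n + radius] * series[idx]
        out.append(acc)
    return out

def trajectory_at_scale(c, sigma):
    """C(t, sigma): smooth every component series of the trajectory c."""
    return [smooth(ex_i, sigma) for ex_i in c]
```

Because the weights e^{-σ} I_n(σ) sum to one over all n, a constant series stays (almost exactly) constant after smoothing, which is a convenient sanity check for the truncation radius.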


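Segment Matching based on Concave/Convex Structures

• Segment: partial trajectory between inflection points

Curvature at scale σ (2D case):

$$K(t,\sigma) = \frac{EX_1'\,EX_2'' - EX_1''\,EX_2'}{\left(EX_1'^2 + EX_2'^2\right)^{3/2}}$$

where the derivatives are obtained as

$$EX_i^{(m)}(t,\sigma) = \frac{\partial^m EX_i(t,\sigma)}{\partial t^m} = ex_i(t) \otimes g^{(m)}(t,\sigma)$$

Inflection point:

$$C_j(t,\sigma):\; K(t-1,\sigma) \times K(t,\sigma) < 0$$

Segment representation:

$$A^{(\sigma)} = \left\{ a_i^{(\sigma)} \mid i = 1, 2, \ldots, N \right\}$$

[Figure: segments $a_1^{(0)}$, $a_2^{(0)}$ of $A^{(0)}$ and the segment set $A^{(\sigma)}$, with inflection points $c_j(t,\sigma)$, for small and large σ]

(cf. Ueda et al. (1990))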
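As a rough illustration of the segmentation step, the sketch below computes the signed curvature of a sampled 2D trajectory and cuts it at sign changes of the curvature. It makes two simplifying assumptions that differ from the slide: derivatives are approximated by finite differences on an already smoothed series rather than by convolution with kernel derivatives, and the standard signed-curvature formula for planar curves is used. The function names are hypothetical.

```python
import math

def curvature(x, y):
    """Signed curvature at each sample; the two end points are set to 0."""
    n = len(x)
    k = [0.0] * n
    for t in range(1, n - 1):
        dx = (x[t + 1] - x[t - 1]) / 2.0           # first derivatives
        dy = (y[t + 1] - y[t - 1]) / 2.0
        ddx = x[t + 1] - 2.0 * x[t] + x[t - 1]     # second derivatives
        ddy = y[t + 1] - 2.0 * y[t] + y[t - 1]
        denom = (dx * dx + dy * dy) ** 1.5
        k[t] = (dx * ddy - ddx * dy) / denom if denom > 0.0 else 0.0
    return k

def inflection_points(k):
    """Indices t with K(t-1) * K(t) < 0, i.e. a sign change of the curvature."""
    return [t for t in range(1, len(k)) if k[t - 1] * k[t] < 0.0]

def segments(k):
    """Index ranges (start, end) of the partial trajectories between inflection points."""
    cuts = [0] + inflection_points(k) + [len(k) - 1]
    return list(zip(cuts[:-1], cuts[1:]))

# Toy trajectory (x, y) = (t, sin t): it bends one way before t = pi, the other way after.
ts = [0.5 + 0.1 * i for i in range(56)]
k = curvature(ts, [math.sin(t) for t in ts])
print(inflection_points(k), segments(k))
```

Repeating this at each scale σ on the smoothed trajectory yields one segment set per scale, which is the input to the matching step.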


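Multiscale Structural Comparison

Global Matching Criteria
• Minimization of total segment dissimilarity
• Complete match: the original trajectory must be formed without gaps/overlaps by concatenating the segments

Dissimilarity between two segments $a_i^{(k)}$ and $b_j^{(h)}$:

$$d\left(a_i^{(k)}, b_j^{(h)}\right) = \sqrt{\left(g_{a_i}^{(k)} - g_{b_j}^{(h)}\right)^2 + \left(\theta_{a_i}^{(k)} - \theta_{b_j}^{(h)}\right)^2} + \left|v_{a_i}^{(k)} - v_{b_j}^{(h)}\right| + \gamma \times \left(c\left(a_i^{(k)}\right) + c\left(b_j^{(h)}\right)\right)$$

• $g$: gradient
• $\theta$: rotation angle
• $v$: velocity, $v_{a_i}^{(k)} = l_{a_i}^{(k)} / n_{a_i}^{(k)}$ (length / # of points)
• $c(\cdot)$: replacement cost

[Figure: segment $a_i^{(k)}$ and segment $b_j^{(h)}$ annotated with gradient $g$, rotation angle $\theta$, and velocity $v$]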
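A minimal sketch of the segment dissimilarity follows, assuming the form d = sqrt((g_a - g_b)^2 + (θ_a - θ_b)^2) + |v_a - v_b| + γ (c(a) + c(b)), with velocity taken as length divided by number of points. The `Segment` class, its field names, and the default `gamma` are hypothetical; how gradient, rotation angle, and replacement cost are measured is not specified here and is left to the caller.

```python
import math
from dataclasses import dataclass

@dataclass
class Segment:
    gradient: float        # g
    rotation: float        # theta, rotation angle
    length: float          # l, length of the segment
    n_points: int          # n, number of data points in the segment
    cost: float = 0.0      # c, replacement cost (0 for a segment at the finest scale)

    @property
    def velocity(self) -> float:
        """v = (length) / (# of points)."""
        return self.length / self.n_points

def segment_dissimilarity(a: Segment, b: Segment, gamma: float = 1.0) -> float:
    shape = math.hypot(a.gradient - b.gradient, a.rotation - b.rotation)
    speed = abs(a.velocity - b.velocity)
    return shape + speed + gamma * (a.cost + b.cost)

a = Segment(gradient=3.0, rotation=0.0, length=10.0, n_points=10)
b = Segment(gradient=0.0, rotation=4.0, length=30.0, n_points=10, cost=0.5)
print(segment_dissimilarity(a, b, gamma=2.0))
```

The global matching would then pick, among all segment pairs across scales, the set that covers both trajectories without gaps or overlaps while minimizing the sum of these values; that search is not shown.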


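Value-based Dissimilarity of Trajectories

After structural matching, calculate the value-based dissimilarity for each pair of matched segments.

Attribute 1 dissimilarity:

$$d_{v1}(a_p, b_p) = \text{peak difference} + \frac{\text{left diff.} + \text{right diff.}}{2}$$

Attribute 2 dissimilarity:

$$d_{v2}(a_p, b_p) = \text{peak difference} + \frac{\text{left diff.} + \text{right diff.}}{2}$$

Dissimilarity of one matched pair:

$$d_{val}\left(a_p^{(0)}, b_p^{(0)}\right) = \text{cost} + \sqrt{d_{v1}^2 + d_{v2}^2}$$

Dissimilarity of two trajectories with $P$ matched pairs:

$$D_{val}(A, B) = \frac{1}{P} \sum_{p=1}^{P} d_{val}\left(a_p^{(0)}, b_p^{(0)}\right)$$

[Figure: Trajectory A and Trajectory B plotted over Attr. 1 and Attr. 2, with the center of gravity (CoG) marked]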
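The value-based dissimilarity can be sketched as below. The sketch assumes that each matched segment is summarized, per attribute, by three values (left end, peak, right end), that every "difference" is an absolute difference, and that the two attribute dissimilarities are combined as cost + sqrt(d_v1^2 + d_v2^2). These are assumptions made for illustration, and all function names are hypothetical.

```python
import math

def attribute_dissimilarity(a, b):
    """d_v for one attribute; a and b are (left, peak, right) values of a matched pair."""
    left = abs(a[0] - b[0])
    peak = abs(a[1] - b[1])
    right = abs(a[2] - b[2])
    return peak + (left + right) / 2.0

def pair_dissimilarity(a1, b1, a2, b2, cost=0.0):
    """d_val for one matched pair: attribute 1 (a1, b1) and attribute 2 (a2, b2)."""
    d_v1 = attribute_dissimilarity(a1, b1)
    d_v2 = attribute_dissimilarity(a2, b2)
    return cost + math.hypot(d_v1, d_v2)

def trajectory_dissimilarity(pairs):
    """D_val(A, B): mean of d_val over the P matched pairs."""
    return sum(pair_dissimilarity(*p) for p in pairs) / len(pairs)

pairs = [
    ((0, 0, 0), (1, 2, 1), (0, 0, 0), (1, 2, 3)),        # cost defaults to 0
    ((0, 0, 0), (1, 2, 1), (0, 0, 0), (1, 2, 3), 1.0),   # same values, cost 1
]
print(trajectory_dissimilarity(pairs))
```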


Experiment 1: ASL Data

Dataset: Australian Sign Language dataset in the UCI KDD Archive

• Time-series data on hand positions (3D) collected from 5 signers during performance of sign language.
• Used for experimental validation by Vlachos et al. at ICDE02 (as 2D trajectories) and by Keogh et al. at KDD00 (as 1D time series).
• For each signer, two to five sessions were conducted. In each session, five sign samples were recorded for each of the 95 words.
• The length of the samples varied; a sample typically contained about 50-150 time points.

[Figure: data hierarchy: signer A ... signer E → session 1 ... session n → word 1 ... word 95 → sample 1 ... sample 5; examples of "Norway"]


Experiment 1: ASL Data

Experimental Procedure

• Out of the 95 signs (words), select the following 10 signs: Norway, cold, crazy, eat, forget, happy, innocent, later, lose, spend.
• Select a pair of words such as {Norway, cold}. For each word, there exist 5 sign samples; therefore a total of 10 samples are selected.
• Calculate the dissimilarities for each pair of the 10 samples by the proposed method.
• Construct two groups by applying average-linkage hierarchical clustering.
• Evaluate whether the samples are grouped correctly.
• Apply this procedure to every pair of the 10 words (45 pairs in total per session).

[Figure: word 1 ("Norway"), sample 1 ... sample 5, and word 2 ("cold"), sample 1 ... sample 5 → pairwise comparison and grouping into two clusters → evaluate whether the groups are correct]
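The evaluation loop can be sketched as follows. This is a simplified illustration rather than the authors' code: average-linkage clustering is written out naively in plain Python, the trajectory dissimilarity is passed in as a function, and the toy data use scalar "samples" with an absolute difference standing in for the proposed dissimilarity. The names `average_linkage`, `grouped_correctly`, and `count_correct_pairs` are hypothetical.

```python
from itertools import combinations

def average_linkage(dist, n_clusters=2):
    """Agglomerative clustering with average linkage on a symmetric matrix.

    Returns a list of clusters, each a list of sample indices."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for p, q in combinations(range(len(clusters)), 2):
            total = sum(dist[i][j] for i in clusters[p] for j in clusters[q])
            d = total / (len(clusters[p]) * len(clusters[q]))
            if best is None or d < best[0]:
                best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]   # merge the closest two clusters
        del clusters[q]
    return clusters

def grouped_correctly(clusters, labels):
    """True if every cluster contains samples of a single word only."""
    return all(len({labels[i] for i in c}) == 1 for c in clusters)

def count_correct_pairs(samples_by_word, dissimilarity):
    """Run the two-word grouping test for every pair of words."""
    correct = total = 0
    for w1, w2 in combinations(sorted(samples_by_word), 2):
        samples = samples_by_word[w1] + samples_by_word[w2]
        labels = [w1] * len(samples_by_word[w1]) + [w2] * len(samples_by_word[w2])
        dist = [[dissimilarity(a, b) for b in samples] for a in samples]
        correct += grouped_correctly(average_linkage(dist, 2), labels)
        total += 1
    return correct, total

# Toy stand-in data: scalars instead of trajectories, |a - b| instead of D_val.
toy = {"w1": [0.0, 0.1, 0.2], "w2": [5.0, 5.1, 5.2], "w3": [0.05, 5.05, 9.0]}
print(count_correct_pairs(toy, lambda a, b: abs(a - b)))
```

With 10 words the outer loop visits 45 word pairs, which is the denominator used in the results.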


Experiment 1: ASL Data

Results

According to Vlachos et al., the Euclidean distance, DTW, and LCSS achieved 0.333 (15/45), 0.444 (20/45), and 0.467 (21/45), respectively. Signer/session information was not given in their paper.

Session     # of correct pairs   Ratio
andrew2     26/45                0.578
john2       34/45                0.756
john3       29/45                0.644
john4       30/45                0.667
stephen2    38/45                0.844 (best)
stephen4    29/45                0.644
waleed1     33/45                0.733
waleed2     36/45                0.800
waleed3     25/45                0.556 (worst)
waleed4     26/45                0.578
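The ratios in the results are simply the number of correctly grouped word pairs divided by the 45 possible pairs of 10 words. The short check below recomputes them from the reported counts; the variable names are only illustrative.

```python
from math import comb

# Correctly grouped word pairs per session, as reported on the slide.
correct = {
    "andrew2": 26, "john2": 34, "john3": 29, "john4": 30, "stephen2": 38,
    "stephen4": 29, "waleed1": 33, "waleed2": 36, "waleed3": 25, "waleed4": 26,
}

total_pairs = comb(10, 2)   # number of unordered pairs among 10 words
ratios = {session: n / total_pairs for session, n in correct.items()}

for session, ratio in ratios.items():
    print(f"{session:9s} {correct[session]}/{total_pairs}  {ratio:.3f}")

best = max(ratios, key=ratios.get)
worst = min(ratios, key=ratios.get)
print("best:", best, "worst:", worst)
```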


Background for 2nd Experiment

• Hospital Information System (1980's- )
  • Computerization of All Hospital Information
  • Large-Scale Databases
• Data: Order and its Record: 1 order ≈ 3 to 5 transactions
  • All the clinical actions are described as “orders”
  • Prescription
    • Doctor → (Order) → Pharmacist
  • Laboratory Examination
    • Doctor → (Order) → Laboratory


Background: HIS (2)

• Hospital Information System
  • Computerization of “Orders”
  • Results of Orders
• Data for Clinical Actions
• Reuse of Stored Data
  • Laboratory Examinations, Prescriptions, ...
  • They are “results from orders”
  • History of Orders: History of Clinical Actions
• Data-centric Hospital Management


Background: HIS (3)

• How many orders are made every day?
• A Case: Shimane University Hospital
  • 616 beds, 1000 for outpatient clinic
  • #Orders: about 8000
  • Prescription: 700, Injection: 700
  • Actions (Doctors & Nurses): 4300
• Storage of Data: 100MB/day
  • 30GB/year (cf. Image: 2.5TB/year)


Chronology of #Orders (2008.6.1~6.7)

[Figure: number of orders over one week, Sun through Sat]


Chronology of #Orders (2008.6.2)

[Figure: number of orders over one day; labeled series: Descriptions, Nursery, Documents]


#Login (2008/6/2~2008/6/7)

[Figure: number of logins over time; series: Outpatient Clinic, Wards]


Reuse of Data

• Understanding Dynamic Behavior of Hospital, Doctors and Patients: Temporal Data Mining
• Reuse of “Orders”
  • Analysis of Clinical Actions
  • Data Mining for Temporal Behaviors of Hospital or Medical Staff
• New type of Hospital Management


Co-occurrence of #Orders (2008.6.2)

[Figure: co-occurrence of order counts; labels: Records, Reservations, Prescription, Examination, Morning, Afternoon]


Experiment 2: Data of #Orders

• Data
  • # of orders for each day (2008.6.2~6.7)
• Objective
  • Find groups of similar trajectories
  • Analyze the relationships between the grouped trajectories
• Method
  • Generate a dissimilarity matrix using the proposed method
  • Perform cluster analysis using dendrograms generated by a hierarchical clustering method
• Results
  • 2 major groups: Outpatient/Ward + Ward


Clustering Results


Visualization for Clusters


Records + Reservations

Prescriptions, Examinations, Radiology, Reservations

[Figure: trajectories for Outpatient and Wards; labels: Records, Reservations, Morning, Afternoon]


Records and Nursery (Wards)

Nursery and Injections

[Figure: trajectories for Outpatient and Wards; labels: Records, Nursery, Morning, Afternoon]


Conclusions

• Presented a new method for trajectory mining
  • Trajectory representation -> multiscale structural comparison -> value-based dissimilarity -> clustering
• Application to the Australian Sign Language dataset
  • Correct grouping ratio: 0.556 (worst), 0.844 (best)
  • High robustness to noise
• Application to hospital data
  • Two groups of behavior of #Orders: Outpatient, Ward
  • Captured the macroscopic behavior of the university hospital
• Future work
  • Extension to multidimensional trajectories


Preliminary Results (3D)

Matching Results for 3-D Trajectories
