Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei


Post on 29-Jan-2016



TRANSCRIPT

Page 1: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Lisa Friedland and David Jensen

Presented by Nick Mattei

Page 2: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Introduction

Tribes – groups with similar traits in a large graph

Distinguish those that work together and move together intentionally

Page 3: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Relationship Knowledge Discovery

Exploit connections among individuals to identify patterns and make predictions

Discover underlying dependencies

Links must be inferred

Page 4: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Graph Mining

Discover Hidden Group Structures: Animal Herds, Webpages, Employees

Time Series Analysis: Co-integration (Economics)

Security and Intrusion Detection: Dynamic Networks

Page 5: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Motivation

National Association of Securities Dealers

Fraud

Collusion

4.8 Million Records

2.5 Million Reps at 560,000 Firms

100 Years of Data

Page 6: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Complications

Jobs not necessarily in order (or singletons)

20% of employees hold more than one job at a time

10% begin multiple jobs (up to 16) on one day

Leave gaps between employment

Mergers and acquisitions

Page 7: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Model

Page 8: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Finding Anomalously Related Entities

Input:

Bipartite Graph: $G = (R \cup A, E)$

Entities: $R = \{r_1, r_2, \ldots, r_n\}$ (People)

Attributes: $A = \{a_1, a_2, \ldots, a_m\}$ (Orgs.)

Entities should connect several attributes

Model co-occurrence rates of pairs of attributes
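As a rough illustration of the input described above, the sketch below stores the bipartite graph as two adjacency maps and counts how many entities link each pair of attributes. The record list, the function names, and the choice of raw counts as the co-occurrence statistic are assumptions made for this example, not details taken from the slides.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (rep, branch) edges of the bipartite graph G = (R ∪ A, E).
edges = [
    ("r1", "a1"), ("r1", "a2"),
    ("r2", "a1"), ("r2", "a2"), ("r2", "a3"),
    ("r3", "a3"),
]

def build_bipartite(edges):
    """Return (entity -> set of attributes, attribute -> set of entities)."""
    attrs_of = defaultdict(set)
    reps_at = defaultdict(set)
    for rep, branch in edges:
        attrs_of[rep].add(branch)
        reps_at[branch].add(rep)
    return attrs_of, reps_at

def attribute_pair_counts(attrs_of):
    """Count how many entities are linked to both attributes of each pair."""
    counts = defaultdict(int)
    for branches in attrs_of.values():
        for pair in combinations(sorted(branches), 2):
            counts[pair] += 1
    return counts

attrs_of, reps_at = build_bipartite(edges)
pair_counts = attribute_pair_counts(attrs_of)
```

Here `pair_counts[("a1", "a2")]` is the number of reps who are linked to both `a1` and `a2`; dividing by a per-attribute total would turn it into a rate.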

Page 9: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Algorithm

Page 10: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Simple Model Measures

JOBS = (Number of shared Jobs in the sequence)

YEARS = (Number of Years of overlap)
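The two simple measures can be sketched for a single pair of reps as follows. This assumes each job is a hypothetical `(branch, start_year, end_year)` tuple and that YEARS counts only time the two reps spent at the same branch simultaneously; the slides do not spell out either detail.

```python
def jobs_measure(seq_a, seq_b):
    """JOBS: number of branches that appear in both job histories."""
    return len({b for b, _, _ in seq_a} & {b for b, _, _ in seq_b})

def years_measure(seq_a, seq_b):
    """YEARS: total time both reps were at the same branch at the same time."""
    total = 0
    for branch_a, start_a, end_a in seq_a:
        for branch_b, start_b, end_b in seq_b:
            if branch_a == branch_b:
                # Length of the intersection of the two employment intervals.
                total += max(0, min(end_a, end_b) - max(start_a, start_b))
    return total

# Hypothetical job histories for two reps.
rep_1 = [("A", 1990, 1994), ("B", 1994, 1999), ("C", 1999, 2003)]
rep_2 = [("A", 1992, 1995), ("B", 1995, 1998), ("D", 1998, 2001)]

shared_jobs = jobs_measure(rep_1, rep_2)
shared_years = years_measure(rep_1, rep_2)
```

Neither measure accounts for how common a branch or a move is, which is the gap the probabilistic model addresses.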

Page 11: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Example Sequences

Page 12: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Probabilistic Model

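$X = P(\mathrm{Br}_A \to \mathrm{Br}_B \to \mathrm{Br}_C \to \mathrm{Br}_D) = p_A \cdot t_{AB} \cdot t_{BC} \cdot t_{CD}$

Estimate:

$p_i = P(\text{start at branch } i) = \dfrac{\#\,\text{reps ever at } i}{\#\,\text{reps in database}}$

$t_{ij} = P(\text{rep moves from } i \text{ to } j \mid \text{ever at } i) = \dfrac{\#\,\text{reps who leave } i \text{ to go to } j}{\#\,\text{reps ever at } i}$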
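A minimal sketch of these estimates, assuming each rep's history is already an ordered list of branch names with one job at a time. The data and the function names `estimate` and `sequence_probability` are hypothetical, and unseen transitions are simply given probability 0 here.

```python
from collections import Counter

# Hypothetical job histories: each rep's branches in chronological order.
histories = {
    "r1": ["A", "B", "C"],
    "r2": ["A", "B", "C"],
    "r3": ["A", "C"],
    "r4": ["B", "C"],
}

def estimate(histories):
    """Estimate p[i] and t[(i, j)] by counting reps, as on the slide."""
    ever_at = Counter()   # number of reps ever at branch i
    moves = Counter()     # number of reps who leave i and go directly to j
    for seq in histories.values():
        ever_at.update(set(seq))
        moves.update(set(zip(seq, seq[1:])))
    n = len(histories)
    p = {i: c / n for i, c in ever_at.items()}
    t = {(i, j): c / ever_at[i] for (i, j), c in moves.items()}
    return p, t

def sequence_probability(seq, p, t):
    """P(seq) = p[first branch] times the product of direct transitions."""
    prob = p.get(seq[0], 0.0)
    for i, j in zip(seq, seq[1:]):
        prob *= t.get((i, j), 0.0)
    return prob

p, t = estimate(histories)
x = sequence_probability(["A", "B", "C"], p, t)
```

A low value of `x` for a sequence that two reps nonetheless share is what marks the pair as unusual under the independence assumption.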

Page 13: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Probabilistic Model

Null Hypothesis of Independent Movement

Movement Not Random

Split and Merge

Markov Chains

Page 14: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Probabilistic Model (Different Paths)

$t_{ij}$ becomes $v_{ij}$

$v_{ij} = P(\text{move to branch } j \text{ at any point after branch } i \mid \text{currently at } i) = \dfrac{\#\,\text{reps who go to branch } j \text{ at any point after working at } i}{\#\,\text{reps ever at } i}$

Now each $v_{ij} \ge t_{ij}$, and the probabilities no longer sum to 1.
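The relaxed transition estimate can be sketched as below, again assuming ordered lists of branch names. The function name `estimate_v` and the sample histories are made up for illustration.

```python
from collections import Counter

def estimate_v(histories):
    """v[(i, j)]: fraction of reps ever at i who work at j at any later point."""
    ever_at = Counter()
    later = Counter()
    for seq in histories.values():
        ever_at.update(set(seq))
        # Every ordered pair (earlier branch, later branch), counted once per rep.
        pairs = {(seq[a], seq[b])
                 for a in range(len(seq))
                 for b in range(a + 1, len(seq))
                 if seq[a] != seq[b]}
        later.update(pairs)
    return {(i, j): c / ever_at[i] for (i, j), c in later.items()}

# Hypothetical histories.
histories = {
    "r1": ["A", "B", "C"],
    "r2": ["A", "C"],
    "r3": ["A", "B"],
}
v = estimate_v(histories)
out_of_a = v[("A", "B")] + v[("A", "C")]
```

In this toy data only one rep moves directly from A to C, but two reach C at some point after A, so the value for (A, C) exceeds the direct-transition estimate, and `out_of_a` is greater than 1.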

Page 15: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Probabilistic Model (Different Paths)

$v_{ij}$ becomes $w_{ij}$

$w_{ij} = P(\text{move to branch } j \text{ at any point simultaneous to or after branch } i \mid \text{currently at } i) = \dfrac{\#\,\text{reps who start at } j \text{ at any point simultaneous to or after starting at } i}{\#\,\text{reps ever at } i}$

Now less precise with respect to direct transitions, but more general
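Because this variant compares start dates rather than positions in a sequence, it can handle reps who hold several jobs at once. The sketch below assumes each job is a hypothetical `(branch, start_year)` pair; `estimate_w` is an illustrative name.

```python
from collections import Counter

def estimate_w(histories):
    """w[(i, j)]: fraction of reps ever at i who start at j on or after starting at i."""
    ever_at = Counter()
    starts_after = Counter()
    for jobs in histories.values():
        ever_at.update({branch for branch, _ in jobs})
        pairs = {(bi, bj)
                 for bi, si in jobs
                 for bj, sj in jobs
                 if bi != bj and sj >= si}
        starts_after.update(pairs)
    return {(i, j): c / ever_at[i] for (i, j), c in starts_after.items()}

# Hypothetical histories.
histories = {
    "r1": [("A", 2000), ("B", 2000)],   # two jobs begun at the same time
    "r2": [("A", 2001), ("B", 2003)],
}
w = estimate_w(histories)
```

For `r1` the two jobs start together, so the pair counts in both directions; for `r2` only the A-then-B direction counts.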

Page 16: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

PROB-TIMEBINS

Bins of 1 year or more

10 people worked at each branch in a bin period

$p_{iX} = \dfrac{\#\,\text{reps ever at } i \text{ during time } X}{\#\,\text{reps in DB}}$

$y_{iXjY} = \dfrac{\#\,\text{reps ever at } i \text{ during time } X \text{ and at } j \text{ during time } Y, \text{ where } Y \ge X}{\#\,\text{reps ever at } i \text{ during time } X}$
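One way to picture the binned estimates is with histories already reduced to `(branch, bin)` pairs, where the bin is an integer index. How the bins are chosen is not covered here, and excluding pairs with the same branch on both sides is an assumption of this sketch, as are all names in it.

```python
from collections import Counter

def estimate_binned(histories):
    """Return p[(i, X)] and y[(i, X, j, Y)] from (branch, bin) stays."""
    n = len(histories)
    at = Counter()     # reps at branch i during bin X
    both = Counter()   # reps at i during X and at j during Y, with Y >= X
    for stays in histories.values():
        stays = set(stays)
        at.update(stays)
        both.update({(i, x, j, y)
                     for i, x in stays
                     for j, y in stays
                     if i != j and y >= x})   # assumption: skip same-branch pairs
    p = {k: c / n for k, c in at.items()}
    y_prob = {k: c / at[(k[0], k[1])] for k, c in both.items()}
    return p, y_prob

# Hypothetical histories: (branch, time-bin index).
histories = {
    "r1": [("A", 0), ("B", 1)],
    "r2": [("A", 0), ("B", 0)],
    "r3": [("A", 1), ("B", 1)],
}
p, y_prob = estimate_binned(histories)
```

With these toy histories, half of the reps at A in bin 0 are at B in bin 1, and the other half are at B within bin 0 itself.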

Page 17: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

PROB-NOTIME

Ignores order of job moves

Use original $p_i$

$z_{ij}$ = raw number of reps who are at both branches $i$ and $j$ during their career

Transition Pr from $i$ to $j$ $= \dfrac{z_{ij}}{\#\,\text{reps ever at } i} \ne \dfrac{z_{ij}}{\#\,\text{reps ever at } j} =$ transition Pr from $j$ to $i$
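The asymmetry noted above follows from dividing the same shared count by two different totals. A small sketch, with hypothetical data and names:

```python
from collections import Counter
from itertools import permutations

def estimate_notime(histories):
    """Order-free transition estimate: z[(i, j)] / (reps ever at i)."""
    ever_at = Counter()
    z = Counter()   # z[(i, j)]: reps who work at both i and j at some point
    for seq in histories.values():
        branches = set(seq)
        ever_at.update(branches)
        # Ordered pairs in both directions, so z[(i, j)] == z[(j, i)].
        z.update(permutations(sorted(branches), 2))
    return {(i, j): c / ever_at[i] for (i, j), c in z.items()}

# Hypothetical histories; the order of jobs is deliberately ignored.
histories = {
    "r1": ["A", "B"],
    "r2": ["B", "A"],
    "r3": ["B"],
}
pr = estimate_notime(histories)
```

Both reps who were ever at A were also at B, while only two of the three reps ever at B were also at A, so `pr[("A", "B")]` and `pr[("B", "A")]` differ even though the shared count is the same.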

Page 18: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Tribe Size

Page 19: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Pairs

Page 20: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Commonality of Job Sequence

Page 21: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Disclosure Scores

Page 22: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Homogeneity and Mobility

Page 23: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns
Page 24: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns
Page 25: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Discussion

JOBS, PROB, PROB-TIME, PROB-NOTIME create tribes with higher than average disclosure scores

PROB creates more cross zip code results

PROB-TIME has higher phi-squared than all others

PROB favors large firms

Page 26: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Discussion

JOBS and YEARS compute larger connected components

JOBS and PROB find same number of tribes but pick different groups as tribes

Page 27: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Conclusions

With no explicit knowledge we can discover:

Job transitions

Geography

Career track

Page 28: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Conclusions

Needed:

Ongoing process

Multiple affiliations

Arbitrary times

Time is a paradox in domain

Page 29: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Thanks!

Time for: Questions, Comments, Smart Remarks