Evaluating Role Mining Algorithms. SACMAT’09, June 3 - 5, 2009, Stresa, Italy. Ian Molloy, Ninghui Li, Tiancheng Li, Ziqing Mao, Qihua Wang @ CERIAS Research Center, Department of Computer Science, Purdue University. Jorge Lobo @ IBM T.J. Watson Research Center. Presentation by Onur Yılmaz.

Upload: onur-yilmaz

Post on 08-Jun-2015


DESCRIPTION

Presentation of the paper: Ian Molloy, Ninghui Li, Tiancheng Li, Ziqing Mao, Qihua Wang, and Jorge Lobo. 2009. Evaluating role mining algorithms. In Proceedings of the 14th ACM Symposium on Access Control Models and Technologies (SACMAT '09). ACM, New York, NY, USA, 95-104. DOI=10.1145/1542207.1542224 http://doi.acm.org/10.1145/1542207.1542224

TRANSCRIPT

Page 1: Evaluating Role Mining Algorithms

Evaluating Role Mining Algorithms

SACMAT’09, June 3 - 5, 2009, Stresa, Italy.

Ian Molloy, Ninghui Li, Tiancheng Li, Ziqing Mao, Qihua Wang

@ CERIAS Research Center Department of Computer Science, Purdue University

Jorge Lobo @ IBM T.J. Watson Research Center

Presentation by Onur Yılmaz

Page 2: Evaluating Role Mining Algorithms

Outline

Introduction

Overview

Role Mining Algorithms

Evaluation Results

Analysis

Conclusion

Future Work

Page 3: Evaluating Role Mining Algorithms

Introduction

Aim of the study

Comprehensive study to compare role mining algorithms

What is presented?

Two new methods for generating datasets

Analysis of nine role mining algorithms

Page 4: Evaluating Role Mining Algorithms

Introduction

Role Mining

Using data mining techniques to discover roles from existing system configuration data

Page 5: Evaluating Role Mining Algorithms

Overview

3 key points:

Output of a role mining algorithm

Criteria to compare outputs of algorithms

Input datasets

Page 6: Evaluating Role Mining Algorithms

Overview: Output of Role Mining Algorithm

Existing algorithms based on their outputs:

Class 1: Outputting prioritized roles

Class 2: Outputting RBAC states

Page 7: Evaluating Role Mining Algorithms

Overview: Output of Role Mining Algorithm

Class 1: Outputting prioritized roles

Output: a prioritized list of candidate roles, each of which is a set of permissions

Examples: CompleteMiner and FastMiner

Phase 1, candidate role generation: a set of candidate roles is derived from the user-permission assignment data

Phase 2, candidate role prioritization

Page 8: Evaluating Role Mining Algorithms

Overview: Output of Role Mining Algorithm

Class 2: Outputting RBAC states

Configuration ρ = <User, Permission, UP>

RBAC state γ = <Roles, UserRoleAss, RolePermissionAss, RoleHierarchy, DirectUserPermissionAss>

Page 9: Evaluating Role Mining Algorithms

Overview: Output of Role Mining Algorithm

Class 2: Outputting RBAC states

Minimize some cost measure while finding the RBAC state

For example: number of roles, number of user assignments, etc.

Page 10: Evaluating Role Mining Algorithms

Overview: Output of Role Mining Algorithm

Class 2: Outputting RBAC states

Weighted Structural Complexity (WSC)

Sums up the number of relationships in an RBAC state, with possibly different weights for each relationship.

Page 11: Evaluating Role Mining Algorithms

Overview: Output of Role Mining Algorithm

Class 2: Outputting RBAC states

Weighted Structural Complexity (WSC)

Given a weight vector W = <wr, wu, wp, wh, wd>

wsc(γ, W) = wr * |R| + wu * |UA| + wp * |PA| + wh * |transitive_reduce(RH)| + wd * |DUPA|
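
The WSC formula on this slide can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the RBACState class and its field names are hypothetical, each relation is assumed to be stored as a set of pairs, and the role hierarchy is assumed to be acyclic.

```python
from dataclasses import dataclass, field


@dataclass
class RBACState:
    """Hypothetical container for an RBAC state (field names are illustrative)."""
    roles: set = field(default_factory=set)  # R
    ua: set = field(default_factory=set)     # UA: (user, role) pairs
    pa: set = field(default_factory=set)     # PA: (role, permission) pairs
    rh: set = field(default_factory=set)     # RH: (senior, junior) role pairs
    dupa: set = field(default_factory=set)   # DUPA: direct (user, permission) pairs


def transitive_reduce(edges):
    """Drop every edge (a, b) that is implied by another path from a to b.

    Assumes the hierarchy is a DAG.
    """
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)

    def reachable(src, dst, skip):
        # Depth-first search from src to dst that ignores the edge `skip`.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            for nxt in succ.get(node, ()):
                if (node, nxt) == skip or nxt in seen:
                    continue
                if nxt == dst:
                    return True
                seen.add(nxt)
                stack.append(nxt)
        return False

    return {(a, b) for a, b in edges if not reachable(a, b, (a, b))}


def wsc(state, w):
    """Weighted structural complexity for W = (wr, wu, wp, wh, wd)."""
    wr, wu, wp, wh, wd = w
    return (wr * len(state.roles) + wu * len(state.ua) + wp * len(state.pa)
            + wh * len(transitive_reduce(state.rh)) + wd * len(state.dupa))
```

With W = (1, 0, 0, 0, 0) the score reduces to the number of roles, and with all weights equal to 1 it counts every relationship in the state once.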

Page 12: Evaluating Role Mining Algorithms

Overview: Output of Role Mining Algorithm

Class 2: Outputting RBAC states

Weighted Structural Complexity (WSC)

Different weight vectors encode different mining objectives and minimization goals

HierarchicalMiner takes both a configuration ρ and a weight vector and aims at outputting an RBAC state with low WSC.

Graph Optimization minimizes the number of edges

Page 13: Evaluating Role Mining Algorithms

Overview: Output of Role Mining Algorithm

Class 1 vs Class 2 Algorithms

RBAC states are easy to compare

A list of candidate roles can be more useful in practice

An administrator examines the role mining results and determines whether to adopt some part of them.

The practical question is whether role mining algorithms can suggest the best candidate roles.

Page 14: Evaluating Role Mining Algorithms

Overview: Metrics for Comparing Algorithms

Two metrics:

Complexity of the RBAC state

Quality of roles

Page 15: Evaluating Role Mining Algorithms

Overview: Metrics for Comparing Algorithms

Complexity of the RBAC state

Using WSC, how well does each algorithm perform under a variety of mining objectives?

Page 16: Evaluating Role Mining Algorithms

Overview: Metrics for Comparing Algorithms

Quality of Roles

For each weight vector W, evaluate the complexity of the optimal RBAC state using only the top k roles.

Among the top k roles, how quickly do the mined roles cover the UP relation?

Among the top k roles, how well do they «resemble» the original roles?

Page 17: Evaluating Role Mining Algorithms

Overview: Input Data Type

Access Control Configuration

ρ = <User, Permission, UserPermissionRelation >

Page 18: Evaluating Role Mining Algorithms

Overview: Input Data Type

Datasets from literature

Page 19: Evaluating Role Mining Algorithms

Overview: Input Data Type

Generated Datasets

Random Data Generator

Tree-Based Data Generator

ERBAC Data Generator

Page 20: Evaluating Role Mining Algorithms

Overview: Input Data Type

Random Data Generator

Parameters: number of users, number of roles, number of permissions, maximum number of roles per user, maximum number of permissions per role

Step 1: assign permissions to roles

Step 2: assign roles to users

Step 3: derive the user-permission assignment
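
The three steps of the random generator can be sketched as below. This is only an illustration under stated assumptions: the function name random_dataset and its argument names are hypothetical, set sizes are drawn uniformly, and every role and every user receives at least one assignment. The slide does not specify these details.

```python
import random


def random_dataset(n_users, n_roles, n_perms,
                   max_roles_per_user, max_perms_per_role, seed=None):
    """Generate (role_perms, user_roles, up) from random role assignments."""
    rng = random.Random(seed)

    # Step 1: assign a random, non-empty set of permissions to each role.
    role_perms = {}
    for r in range(n_roles):
        size = rng.randint(1, min(max_perms_per_role, n_perms))
        role_perms[r] = set(rng.sample(range(n_perms), size))

    # Step 2: assign a random, non-empty set of roles to each user.
    user_roles = {}
    for u in range(n_users):
        size = rng.randint(1, min(max_roles_per_user, n_roles))
        user_roles[u] = set(rng.sample(range(n_roles), size))

    # Step 3: derive the user-permission relation UP from the two assignments.
    up = {(u, p) for u, roles in user_roles.items()
          for r in roles for p in role_perms[r]}
    return role_perms, user_roles, up
```

Because the generating roles are returned together with UP, mined roles can later be compared against the roles that actually produced the data.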

Page 21: Evaluating Role Mining Algorithms

Overview: Input Data Type

Tree-Based Data Generator

Parameters: number of users, number of permissions, height of the tree, upper bound on the number of child nodes, lower bound on the number of child nodes

Step 1: randomly generate a tree

Step 2: assign permissions to nodes in the tree

Step 3: assign users to leaf nodes
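
A possible reading of the tree-based generator is sketched below. The function name tree_dataset is hypothetical. The sketch also assumes that each permission is attached to exactly one node and that a user placed at a leaf receives the permissions of every node on the path to the root; the slide does not state either detail.

```python
import random


def tree_dataset(n_users, n_perms, height, min_children, max_children, seed=None):
    """Generate (parent, node_perms, up) from a random tree of the given height."""
    if min_children < 1 or max_children < min_children:
        raise ValueError("need 1 <= min_children <= max_children")
    rng = random.Random(seed)

    # Step 1: grow the tree level by level; node 0 is the root.
    parent = {0: None}
    level = [0]
    for _ in range(height):
        next_level = []
        for node in level:
            for _ in range(rng.randint(min_children, max_children)):
                child = len(parent)
                parent[child] = node
                next_level.append(child)
        level = next_level
    leaves = level

    # Step 2: attach each permission to one randomly chosen node.
    nodes = list(parent)
    node_perms = {n: set() for n in nodes}
    for p in range(n_perms):
        node_perms[rng.choice(nodes)].add(p)

    # Step 3: place each user at a random leaf; the user receives the
    # permissions of every node on the path from that leaf up to the root.
    up = set()
    for u in range(n_users):
        node = rng.choice(leaves)
        while node is not None:
            up.update((u, p) for p in node_perms[node])
            node = parent[node]
    return parent, node_perms, up
```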

Page 22: Evaluating Role Mining Algorithms

Overview: Input Data Type

ERBAC Data Generator

Parameters: number of users, number of business roles, number of functional roles, number of permissions

Maximum # of business roles, maximum # of functional roles, maximum # of permissions

Step 1: assign permissions to functional roles

Step 2: assign functional roles to business roles

Step 3: assign business roles to users

Page 23: Evaluating Role Mining Algorithms

Role Mining Algorithms

Class 1: CompleteMiner (CM), FastMiner (FM), DynamicMiner (DM), PairCount (PC)

Class 2: ORCA, Graph Optimization (GO), HP Role Minimization (HPr), HP Edge Minimization (HPe), HierarchicalMiner (HM)

Page 24: Evaluating Role Mining Algorithms

Role Mining Algorithms: CompleteMiner (CM)

Initial set of roles: from user permission sets

Candidate roles: all possible intersections of the initial roles

Exponential time

Prioritization of roles: based on the number of exact matches
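
The CompleteMiner pipeline on this slide can be sketched as follows. The function name complete_miner is hypothetical, permissions are assumed to be strings, and the tie-breaking rule in the final sort (larger roles first, then alphabetical) is an arbitrary choice made for this sketch, not something the slide specifies.

```python
def complete_miner(user_perms):
    """Return candidate roles ranked by exact-match count (highest first)."""
    user_sets = [frozenset(p) for p in user_perms.values() if p]
    initial = set(user_sets)

    # Close the initial roles under intersection: every intersection of k
    # initial roles is an intersection of k - 1 of them with one more.
    candidates = set(initial)
    frontier = set(initial)
    while frontier:
        new = {a & b for a in frontier for b in initial} - candidates
        new.discard(frozenset())
        candidates |= new
        frontier = new

    # Prioritize by the number of users whose permission set equals the role.
    exact = {c: user_sets.count(c) for c in candidates}
    return sorted(candidates, key=lambda c: (-exact[c], -len(c), sorted(c)))
```

The closure step is what makes the method exponential in the worst case, since the number of distinct intersections can grow exponentially with the number of users.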

Page 25: Evaluating Role Mining Algorithms

Role Mining Algorithms: FastMiner (FM)

Initial set of roles: from user permission sets

Candidate roles: only intersections between pairs of initial roles

Prioritization of roles

O(n²m)

n: users, m: permissions
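
For comparison with CompleteMiner, FastMiner's candidate generation can be sketched as below. The function name fast_miner_candidates is hypothetical and prioritization is omitted. Only pairs of initial roles are intersected, so the work is bounded by the number of pairs times the cost of one intersection.

```python
from itertools import combinations


def fast_miner_candidates(user_perms):
    """Return the initial roles plus all non-empty pairwise intersections."""
    initial = {frozenset(p) for p in user_perms.values() if p}
    candidates = set(initial)
    for a, b in combinations(initial, 2):
        if a & b:
            candidates.add(a & b)
    return candidates
```

A role that only appears as the intersection of three or more permission sets is not found by this procedure, which is the price paid for the polynomial running time.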

Page 26: Evaluating Role Mining Algorithms

Role Mining Algorithms: DynamicMiner (DM)

CM and FM -> static prioritization (does not consider candidate roles that have already been chosen)

Initial set of roles: from user permission sets

Candidate roles: all possible intersections

Prioritization of roles: the role with the highest priority is selected first

O(n * |C| * min{n, m})

C: set of candidate roles

Page 27: Evaluating Role Mining Algorithms

Role Mining Algorithms: PairCount (PC)

Newly proposed method

CM -> prioritization based on exact match counts

In reality, multiple roles are assigned to a user

PairCount: the number of pairs of users that share exactly the permission set P and nothing else

PC(P) = | { (ui, uj) | ui ≠ uj ∧ P(ui) ∩ P(uj) = P } |

O(n²m)
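
The PairCount definition can be sketched directly: for every pair of distinct users, intersect their permission sets and credit the resulting set. The function name pair_counts is hypothetical, and this sketch counts each unordered pair once, so a definition over ordered pairs would give exactly twice these values.

```python
from collections import Counter
from itertools import combinations


def pair_counts(user_perms):
    """Map each shared permission set P to the number of user pairs sharing exactly P."""
    counts = Counter()
    for (_, perms_i), (_, perms_j) in combinations(user_perms.items(), 2):
        shared = frozenset(perms_i) & frozenset(perms_j)
        if shared:
            counts[shared] += 1
    return counts
```

Calling counts.most_common() then yields the candidate roles in priority order.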

Page 28: Evaluating Role Mining Algorithms

Role Mining Algorithms: PairCount (PC)

O(n²m)

Initial set of roles: from user permission sets

Candidate roles: all possible intersections

Prioritization of roles: based on PairCounts

Page 29: Evaluating Role Mining Algorithms

Role Mining Algorithms: ORCA

Hierarchical clustering on permissions

O(m²n)

Start with a set of clusters of permissions

Find and merge the pair of clusters for which the number of users having both sets of permissions is the largest

Continue until one cluster remains or no user has permissions in two clusters
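
The clustering loop can be sketched as below. The function name orca_merges is hypothetical, and the sketch assumes that each permission starts in its own cluster, that a merged pair is replaced by its union, and that permissions are sortable. It is a naive illustration that recomputes user counts on every pass and makes no attempt to reach the stated complexity.

```python
from itertools import combinations


def orca_merges(user_perms):
    """Return the permission clusters created by successive merges."""
    perm_sets = [frozenset(p) for p in user_perms.values()]
    all_perms = sorted(frozenset().union(*perm_sets))
    clusters = [frozenset([p]) for p in all_perms]

    def support(cluster):
        # Number of users holding every permission in the cluster.
        return sum(1 for s in perm_sets if cluster <= s)

    merges = []
    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2),
                   key=lambda pair: support(pair[0] | pair[1]))
        if support(a | b) == 0:
            break  # no user has the permissions of any two clusters
        clusters = [c for c in clusters if c != a and c != b] + [a | b]
        merges.append(a | b)
    return merges
```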

Page 30: Evaluating Role Mining Algorithms

Role Mining Algorithms: HP Role Minimization (HPr)

Goal: a minimal set of roles that covers the user-permission assignment relation

O(nm)

P(u): permissions of user u

U(u): all users that have all the permissions of u

Select a user u and find the pair <U(u), P(u)>

This pair forms a «role»

All user-permission assignments between U(u) and P(u) are removed

Select the next user with the fewest uncovered permissions
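
The greedy loop described on this slide can be sketched as follows. The function name hp_role_minimization is hypothetical. The sketch takes P(u) to be the still-uncovered permissions of u and U(u) to be all users whose full permission set contains P(u), which is one reading of the slide rather than a confirmed detail of the original algorithm.

```python
def hp_role_minimization(user_perms):
    """Greedily cover the user-permission relation with (users, perms) roles."""
    full = {u: set(p) for u, p in user_perms.items()}
    uncovered = {u: set(p) for u, p in user_perms.items()}
    roles = []
    while any(uncovered.values()):
        # Select the user with the fewest (but at least one) uncovered permissions.
        u = min((x for x in uncovered if uncovered[x]),
                key=lambda x: len(uncovered[x]))
        perms = frozenset(uncovered[u])
        # U(u): every user that holds all of these permissions.
        users = frozenset(v for v in full if perms <= full[v])
        roles.append((users, perms))
        # Remove the assignments between U(u) and P(u) from the uncovered set.
        for v in users:
            uncovered[v] -= perms
    return roles
```

Each iteration fully covers the selected user, so the loop ends after at most one role per user, and no role grants a permission that a member did not already hold.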

Page 31: Evaluating Role Mining Algorithms

Role Mining Algorithms: HP Edge Minimization (HPe)

Finds an RBAC state with a minimal number of edges, called edge concentration

Similar to the Graph Optimization algorithm, except that it does not create a role hierarchy

O(k²m)

k: number of iterations

Start from HPr, greedily improve the objective function, and repeat until convergence

If two roles overlap in their permission or user sets -> restructuring

Page 32: Evaluating Role Mining Algorithms

Role Mining Algorithms: HierarchicalMiner (HM)

Concept: <P, U> such that

U contains all the users that have all permissions in P,

P contains all the permissions that are shared by all users in U

Similar to Graph Optimization, but uses a concept lattice.

Reduced family of concepts

Remove a role if the RBAC state is improved

Heuristically continue

Removing a role:

- Redistribution of users down the hierarchy

- Permissions up the hierarchy
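
The concept definition on this slide can be illustrated by computing the concept generated by a given permission set. The function name concept_of is hypothetical, and the sketch covers only this closure step, not the lattice construction or the pruning heuristic of HierarchicalMiner. When no user holds the given permissions, it follows the usual formal concept analysis convention and returns all permissions with an empty user set.

```python
def concept_of(perms, user_perms):
    """Return the concept (P, U) generated by the permission set `perms`."""
    perms = frozenset(perms)
    # U: all users that have every permission in `perms`.
    users = frozenset(u for u, p in user_perms.items() if perms <= frozenset(p))
    if users:
        # P: all permissions shared by every user in U.
        shared = frozenset.intersection(*(frozenset(user_perms[u]) for u in users))
    else:
        shared = frozenset().union(*user_perms.values())
    return shared, users
```

In the test data used for this sketch, every user holding permission a also holds b, so the concept generated by {a} has P = {a, b}.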

Page 33: Evaluating Role Mining Algorithms

Evaluation Results

For each dataset, the algorithms are ranked from 1 to N according to their ability to optimize the evaluation criteria

Two metrics mentioned before:

Comparing Complexity of the RBAC States

Comparing Prioritized Role Quality

Page 34: Evaluating Role Mining Algorithms

Evaluation Results: Comparing Complexity of the RBAC States

Role Minimization

Page 35: Evaluating Role Mining Algorithms

Evaluation Results: Comparing Complexity of the RBAC States

Edge Concentration

HM has an advantage in this test because its roles are designed for a role-hierarchy

Page 36: Evaluating Role Mining Algorithms

Evaluation Results: Comparing Complexity of the RBAC States

Allowed Noise at Direct Assignments

Dataset contains errors that should not be covered by roles.

Page 37: Evaluating Role Mining Algorithms

Evaluation Results: Comparing Complexity of the RBAC States

Discovering Original Roles

Similarity of mined roles to original data

The metric used is the average maximal Jaccard similarity

HM: The top 40+ roles are more or less the ones generated

PC: Performed the worst, generating roles farthest from the original data
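
One way to compute an average maximal Jaccard score is sketched below. The function names are hypothetical, roles are treated as sets of permissions, and the direction of the match (each original role is paired with its most similar mined role) is an assumption, since the slide does not spell it out.

```python
def jaccard(a, b):
    """Jaccard similarity |a ∩ b| / |a ∪ b| of two permission sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0


def average_maximal_jaccard(original_roles, mined_roles):
    """Average, over original roles, of the best similarity to any mined role."""
    if not original_roles or not mined_roles:
        return 0.0
    best = [max(jaccard(o, m) for m in mined_roles) for o in original_roles]
    return sum(best) / len(best)
```

A score of 1.0 means that every original role was recovered exactly.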

Page 38: Evaluating Role Mining Algorithms

Evaluation Results: Comparing Prioritized Role Quality

Quality of WSC over k-roles

Page 39: Evaluating Role Mining Algorithms

Evaluation Results: Comparing Prioritized Role Quality

Quality of Coverage

How good is the algorithm at quickly covering the UP relation?
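
Coverage by the top k roles can be sketched as below. The function name coverage is hypothetical, and the sketch assumes that a role may be given to a user only if the user already holds all of its permissions, so that no extra permissions are granted.

```python
def coverage(ranked_roles, user_perms, k):
    """Fraction of user-permission pairs covered by the top k ranked roles."""
    total = sum(len(set(p)) for p in user_perms.values())
    covered = 0
    for perms in user_perms.values():
        perms = set(perms)
        got = set()
        for role in ranked_roles[:k]:
            role = set(role)
            if role <= perms:  # the role fits without granting extra permissions
                got |= role
        covered += len(got)
    return covered / total if total else 1.0
```

Plotting this value for increasing k shows how quickly a prioritized role list covers the UP relation.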

Page 40: Evaluating Role Mining Algorithms

Analysis

Algorithms that minimize the number of roles often generate RBAC states with a larger number of edges, resulting in increased complexity.

GO generates large role hierarchies when the number of users is greater than the number of permissions.

DM is over-fitting some of the roles to cover users, and does not consider the entire resulting RBAC state.

HM is computationally and memory intensive.

Page 41: Evaluating Role Mining Algorithms

Conclusion

Aim of the study

Comprehensive study to compare role mining algorithms

What is presented?

Two new methods for generating datasets

Analysis of nine role mining algorithms

Page 42: Evaluating Role Mining Algorithms

Future Work

Handling data with attribute information

In addition to the user-permission data, attribute information may also be available.

Handling noisy data

In some scenarios, the input user-permission data may contain noise.
