

Annual Conference of ITA (ACITA 2009)

A Metadata Algebra for Sharing Tactical Information
Mudhakar Srivatsa†, Dakshi Agrawal† and Steffen Reidt‡
† IBM T. J. Watson Research Center
‡ Royal Holloway, University of London

Secure Information Flow
  Need for information flows across traditional organizational boundaries
  Military coalitions
    Multiple countries and multiple teams (special forces vs. search & rescue)
    Share tactical intelligence and reconnaissance information
  Business-to-Government
    SEC: Securities and Exchange Commission
    EDGAR: Electronic Data Gathering, Analysis, and Retrieval system
    XBRL: eXtensible Business Reporting Language
  Business-to-Business
    Collaborations (e.g., supply chain management)
    Web services (e.g., mash-ups)
  [Figure: a collaboration in which Data X and Data Y flow between parties. An operation encounters a situation where information within data X, Y may be very useful.]

Risk of (not) sharing information
  Risk of unauthorized information disclosure (e.g., leaked secrets, mission failure)
  Risk of not sharing information (e.g., mission failure, loss of life, loss of business)
  Risk is inevitable: how to systematically manage risk?
    How to quantify risk? Value of object * probability of leakage?
    How to enable risk-based information sharing?
    How to control overall risk?
  Delivering the right information to the right person at the right time

State of the art
  Rigid, overly conservative, and fails to adequately capture dynamic security attributes arising in tactical missions
  MLS-like (multi-level security) access control
    Label information with a sensitivity level (e.g., unclassified, classified, secret, top secret)
    Entities have a security clearance level
    Access check: clearance level ≥ sensitivity level?
  Static security labels and entity credentials
    Fails to consider dynamic security attributes such as the time sensitivity of tactical information
    In a tactical mission, entity credentials (e.g., trust, allegiance, need-to-know) and strategies may evolve dynamically
  Rigid access control
    Boolean 0/1 access control decision
    Does not adequately capture risk due to information sharing
    Does not support expressive reasoning: Why is an object not shareable? What form of the object may be shareable?

Overly conservative security calculus
  $x = g(y_1, y_2, \ldots, y_n)$
  $\mathrm{label}(x) = \max(\mathrm{label}(y_1), \mathrm{label}(y_2), \ldots, \mathrm{label}(y_n))$
  Does not consider downgrading transforms
  Monotonicity problem: eventually most derived objects are labeled top secret, so most legitimate accesses are denied (or delayed due to manual intervention)
  “MLS has a tendency to inhibit legitimate information flows” [MITRE06]
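The monotonicity problem can be illustrated with a small sketch of the max-label rule. The `Level` enum and the object names (`report`, `fused`, `summary`) are hypothetical and only serve the illustration; the rule itself is the label(x) = Max(...) rule stated on the slide.

```python
from enum import IntEnum


class Level(IntEnum):
    """Sensitivity levels from the slide, ordered from lowest to highest."""
    UNCLASSIFIED = 0
    CLASSIFIED = 1
    SECRET = 2
    TOP_SECRET = 3


def derived_label(*input_labels):
    """MLS rule: a derived object takes the maximum label of its inputs."""
    return max(input_labels)


# Hypothetical derivation chain: labels can only go up, never down.
report = derived_label(Level.UNCLASSIFIED, Level.CLASSIFIED)
fused = derived_label(report, Level.TOP_SECRET)
summary = derived_label(fused, Level.UNCLASSIFIED)

print(report.name, fused.name, summary.name)
```

Once a single top-secret input enters the chain, every later derived object stays top secret, even if the transform that produced it discarded the sensitive part.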

Strict obligation enforcement across domains is infeasible

Cannot mediate access to information once it leaves the domain

Digital Rights Management (DRM) is a hard problem!

Tracking provenance across semantic transform(s)

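Value Arithmetic
  Support dynamic attributes, information downgrade and fusion
  Notation
    Object $x$; metadata vector $\mathbf{x} \in M$ (vector space)
    Value function $\Gamma: M \to (F \to F)$
    Example: $\mathbf{x} = (10, 2)$ gives $\Gamma_x(t) = 10 - 2t$ (linear decay) or $\Gamma_x(t) = 10\,e^{-2t}$ (exponential decay)
    $\Gamma$ maps a metadata vector to a time-decaying value function
  Assumptions
    $0 \le \Gamma_x(t) < \infty$ for all $t$
    Object $x$ is contained in object $y$ $\Rightarrow$ $\Gamma_x(t) \le \Gamma_y(t)$ for all $t$
    $\Gamma_x$ is continuous and differentiable in $t$, and $\partial \Gamma_x / \partial t \le 0$ for all $t$
  Value arithmetic
    Information loss and gain (entropy-based metrics)
    $x = f(y_1, y_2, \ldots, y_n)$ on objects; $\mathbf{x} = g(\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_n)$ on metadata vectors
    $g$ is homomorphic to $f$: it preserves information downgrade and fusion semantics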
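A minimal sketch of the two example value functions for the metadata vector (10, 2), assuming the vector is read as (initial value, decay rate). The function names are hypothetical. The linear form is only evaluated on 0 ≤ t ≤ 5, since 10 - 2t would drop below zero after that and violate the non-negativity assumption.

```python
import math


def linear_value(metadata):
    """Gamma for linear decay: v0 - rate * t (meaningful while t <= v0 / rate)."""
    v0, rate = metadata
    return lambda t: v0 - rate * t


def exponential_value(metadata):
    """Gamma for exponential decay: v0 * exp(-rate * t)."""
    v0, rate = metadata
    return lambda t: v0 * math.exp(-rate * t)


x = (10.0, 2.0)
times = [i * 0.5 for i in range(11)]  # 0.0, 0.5, ..., 5.0

for gamma in (linear_value(x), exponential_value(x)):
    values = [gamma(t) for t in times]
    # Check two of the slide's assumptions on this grid:
    assert all(v >= 0 for v in values)                       # value is non-negative
    assert all(a >= b for a, b in zip(values, values[1:]))   # value never increases
```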

Empirical Value Computation
  $\Gamma_x(t) = \sum_i \Gamma_{y_i}(t) \cdot 2^{-(I(y_i \mid x, B) - I(x \mid y_i))}$
  $y_i = \{y_1, \ldots, y_{i-1}, y_{i+1}, \ldots, y_n\}$
  $B$: background/public knowledge
  Self-information $I(y \mid x)$: minimum number of information bits required to learn $y$ given $x$
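The empirical value formula can be evaluated directly once the self-information terms are known. The sketch below assumes those terms are given as plain numbers of bits; the input values (3 bits, 1 bit, 0 bits) and the two decay functions are made up for illustration, and the helper names are hypothetical.

```python
import math


def weight(bits_to_learn_input, bits_to_learn_output):
    """2^-(I(y_i | x, B) - I(x | y_i)): how much of y_i's value carries over to x."""
    return 2.0 ** -(bits_to_learn_input - bits_to_learn_output)


def derived_value(inputs, t):
    """Gamma_x(t) as the weighted sum of the input value functions.

    inputs: list of (gamma_y, I(y|x,B) in bits, I(x|y) in bits).
    """
    return sum(gamma(t) * weight(i_y, i_x) for gamma, i_y, i_x in inputs)


# Hypothetical inputs: x reveals little about y1 (3 bits still missing)
# and more about y2 (1 bit missing); x is fully determined by each input.
inputs = [
    (lambda t: 10.0 * math.exp(-2.0 * t), 3.0, 0.0),
    (lambda t: 4.0 * math.exp(-1.0 * t), 1.0, 0.0),
]

print(derived_value(inputs, 0.0))  # 10/8 + 4/2 = 3.25
```

The more bits an observer still needs to recover an input from the output, the smaller the share of that input's value assigned to the output.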

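Metadata Calculus
  Operators on metadata vectors: $+: M \times M \to M$ and $\cdot: F \times M \to M$
  Strong homomorphism
    $\mathbf{x} = \mathbf{y}_1 + \mathbf{y}_2 \Rightarrow \Gamma_x = \Gamma_{y_1} + \Gamma_{y_2}$
    $\mathbf{x} = a \cdot \mathbf{y}_1 \Rightarrow \Gamma_x = a\,\Gamma_{y_1}$
  Properties
    Commutative: $\mathbf{y}_1 + \mathbf{y}_2 = \mathbf{y}_2 + \mathbf{y}_1$
    Associative: $(\mathbf{y}_1 + \mathbf{y}_2) + \mathbf{y}_3 = \mathbf{y}_1 + (\mathbf{y}_2 + \mathbf{y}_3)$
    Distributive + in M: $a \cdot (\mathbf{y}_1 + \mathbf{y}_2) = a \cdot \mathbf{y}_1 + a \cdot \mathbf{y}_2$
    Distributive . in M: $a \cdot (b \cdot \mathbf{y}_1) = (a\,b) \cdot \mathbf{y}_1$
    Distributive + in F: $(a + b) \cdot \mathbf{y}_1 = a \cdot \mathbf{y}_1 + b \cdot \mathbf{y}_1$
    Zero vector $\mathbf{0}$: $\mathbf{0} + \mathbf{y}_1 = \mathbf{y}_1$
    Scalar 1: $1 \cdot \mathbf{y}_1 = \mathbf{y}_1$
  Deducing output metadata
    $x = f(y_1, y_2, \ldots, y_n)$
    $\Gamma_x(t) = \sum_i \Gamma_{y_i}(t) \cdot 2^{-(I(y_i \mid x, B) - I(x \mid y_i))}$
    $\mathbf{x} = \sum_i 2^{-(I(y_i \mid x, B) - I(x \mid y_i))} \cdot \mathbf{y}_i$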
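One way to see the + and . operators at work is a sketch restricted to exponential decay. The representation used here (a dict from decay rate to amplitude, standing for a sum of exponential terms) is an assumption made for illustration and is not taken from the slides; `add`, `scale` and `value` are hypothetical names.

```python
import math


def add(m1, m2):
    """The + operator: merge two metadata vectors, summing amplitudes per rate."""
    out = dict(m1)
    for rate, amp in m2.items():
        out[rate] = out.get(rate, 0.0) + amp
    return out


def scale(a, m):
    """The . operator: multiply every amplitude by the scalar a."""
    return {rate: a * amp for rate, amp in m.items()}


def value(m, t):
    """Gamma: the value function encoded by metadata m, evaluated at time t."""
    return sum(amp * math.exp(-rate * t) for rate, amp in m.items())


y1 = {2.0: 10.0}   # 10 * e^(-2t)
y2 = {0.5: 4.0}    # 4 * e^(-0.5t)
x = add(y1, y2)

for t in (0.0, 0.7, 3.0):
    # Strong homomorphism: the value of the sum is the sum of the values.
    assert abs(value(x, t) - (value(y1, t) + value(y2, t))) < 1e-9
    assert abs(value(scale(0.25, y1), t) - 0.25 * value(y1, t)) < 1e-9

assert add(y1, y2) == add(y2, y1)   # commutative
print(len(y1), len(y2), len(x))     # the summed vector holds both terms
```

In this representation exact additivity is paid for in size: when the decay rates differ, the summed vector has to keep one entry per input term.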

Strong homomorphism
  Impossible to achieve strong homomorphism without incurring metadata expansion
  Optimal expansion rate depends on the decay function
  Linear and exponential decay functions
    $x = y_1 + y_2 \Rightarrow |x| = |y_1| + |y_2|$, where $|x|$ is the size of metadata vector $x$
    $x = y_1 + y_2 + \cdots + y_n \Rightarrow |x| \sim 2^{n-1} \cdot |y_i|$
    Optimal metadata expansion rate is exponential

Weak homomorphism
  $x = y_1 + y_2 \Rightarrow \Gamma_x \ge \Gamma_{y_1} + \Gamma_{y_2}$
  $x = a \cdot y_1 \Rightarrow \Gamma_x = a\,\Gamma_{y_1}$
  Weak homomorphism results in conservative value estimation
  Constructively show that one can achieve weak homomorphism without metadata expansion
  Trade off tightness of value estimates against metadata size using B-splines
  Support automated deduction of output object metadata

B-Splines
  Spline
    Parametric curve defined by piecewise polynomials over a finite set of control points $\{p_i\}$: $S(p_1, p_2, \ldots, p_n)$
  B-spline (basis spline)
    Spline with minimal support (most compact)
    Applications in computer graphics: smoothing
    We use a special kind of B-spline: the clamped uniform cubic B-spline
  Strong convex hull property of B-splines
    A B-spline is guaranteed to be contained within the convex hull of its control poly-line
  Homomorphic + and . operators on B-splines
    $S(p_1^1, p_2^1, \ldots, p_n^1) + S(p_1^2, p_2^2, \ldots, p_n^2) = S(p_1^1 + p_1^2, p_2^1 + p_2^2, \ldots, p_n^1 + p_n^2)$
    $a \cdot S(p_1, p_2, \ldots, p_n) = S(a\,p_1, a\,p_2, \ldots, a\,p_n)$
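The two operator identities hold because a B-spline is a linear combination of its control points. The sketch below checks this numerically for a clamped uniform cubic B-spline evaluated with the standard Cox-de Boor recursion. As a simplification it uses scalar control values rather than 2-D control points, so the convex hull property reduces to the curve staying between the smallest and largest control value. All names and sample values are hypothetical.

```python
def clamped_uniform_knots(n, degree=3):
    """Knot vector on [0, 1]: end knots repeated degree + 1 times, uniform interior."""
    if n < degree + 1:
        raise ValueError("need at least degree + 1 control points")
    spans = n - degree
    interior = [j / spans for j in range(1, spans)]
    return [0.0] * (degree + 1) + interior + [1.0] * (degree + 1)


def basis(i, k, u, knots):
    """Cox-de Boor recursion for the i-th basis function of degree k at u."""
    if k == 0:
        return 1.0 if knots[i] <= u < knots[i + 1] else 0.0
    left = right = 0.0
    if knots[i + k] > knots[i]:
        left = (u - knots[i]) / (knots[i + k] - knots[i]) * basis(i, k - 1, u, knots)
    if knots[i + k + 1] > knots[i + 1]:
        right = ((knots[i + k + 1] - u) / (knots[i + k + 1] - knots[i + 1])
                 * basis(i + 1, k - 1, u, knots))
    return left + right


def spline(points, u):
    """Evaluate S(p_1, ..., p_n) at parameter u in [0, 1]."""
    if u >= 1.0:
        return points[-1]  # clamped: the curve ends at the last control value
    knots = clamped_uniform_knots(len(points))
    return sum(p * basis(i, 3, u, knots) for i, p in enumerate(points))


P1 = [10.0, 7.0, 5.0, 3.0, 2.0, 1.0]
P2 = [4.0, 3.5, 3.0, 2.0, 1.0, 0.5]
summed = [a + b for a, b in zip(P1, P2)]

for k in range(21):
    u = k / 20
    # S(P1) + S(P2) = S(P1 + P2) and a . S(P1) = S(a * P1)
    assert abs(spline(P1, u) + spline(P2, u) - spline(summed, u)) < 1e-9
    assert abs(0.5 * spline(P1, u) - spline([0.5 * p for p in P1], u)) < 1e-9
    # Scalar form of the convex hull property
    assert min(P1) - 1e-9 <= spline(P1, u) <= max(P1) + 1e-9
```

Because both operators act on the control values alone, the result of an addition has the same number of control values as each operand.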

B-Splines and Weak Homomorphism
  How to use B-splines?
  Slow-decreasing value functions
    $f$ is decreasing
    The derivative of $f$ is non-decreasing
    Linear and exponential decay functions are slow-decreasing
  Basic algebraic result
    One can always construct a convex hull dominating slow-decreasing functions
    Construct a B-spline over the control points of the convex hull
    Weak homomorphism follows from the strong convex hull property
    Spline > convex hull > value function
  Detailed proofs in paper
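Why a dominating envelope exists for slow-decreasing functions can be seen from a simpler construction than the one in the paper. A function whose derivative is non-decreasing is convex, so a chord between two points on its graph lies on or above it. The sketch below uses a piecewise-linear envelope through samples at shared times; this is an illustrative stand-in, not the authors' B-spline construction, and `TIMES`, `sample` and `envelope` are hypothetical names.

```python
import math
from bisect import bisect_right

TIMES = [0.0, 0.5, 1.0, 2.0, 4.0]  # hypothetical sample times shared by all objects


def sample(gamma):
    """Fixed-size metadata: the value function sampled at the shared times."""
    return [gamma(t) for t in TIMES]


def envelope(values, t):
    """Piecewise-linear interpolation through (TIMES[i], values[i])."""
    j = min(max(bisect_right(TIMES, t) - 1, 0), len(TIMES) - 2)
    t0, t1 = TIMES[j], TIMES[j + 1]
    w = (t - t0) / (t1 - t0)
    return (1 - w) * values[j] + w * values[j + 1]


def g1(t):
    return 10.0 * math.exp(-2.0 * t)


def g2(t):
    return 4.0 * math.exp(-0.5 * t)


p1, p2 = sample(g1), sample(g2)
px = [a + b for a, b in zip(p1, p2)]   # metadata of x = y1 + y2, same size as p1

for k in range(41):
    t = k / 10
    # Each envelope dominates its own convex, decreasing value function ...
    assert envelope(p1, t) >= g1(t) - 1e-9
    # ... and the envelope of the summed metadata dominates the true sum.
    assert envelope(px, t) >= g1(t) + g2(t) - 1e-9
```

The estimate is conservative between sample times and exact at them, and adding two metadata vectors does not increase their size.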

Metadata Size vs. Tightness
  Removing a control point does not violate weak homomorphism
  Algebraic result: convex hull $C_{P-p} > C_P$
    $C_P > \Gamma_x \wedge S(P-p) > C_{P-p} \Rightarrow S(P-p) > \Gamma_x$
    Note: $S(P-p)$ may not completely dominate $S(P)$
  Minimum curvature heuristic
    Remove a control point $p_i$ such that $|\pi - \theta|$ is minimum, where $\theta$ is the angle $p_{i-1}\,p_i\,p_{i+1}$
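The minimum curvature heuristic can be sketched as follows, assuming control points are (t, value) pairs and that only interior points are candidates for removal. The function names and sample points are hypothetical.

```python
import math


def interior_angle(a, b, c):
    """Angle theta at b in the poly-line a-b-c, in radians (0..pi)."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return abs(math.atan2(cross, dot))


def remove_flattest_point(points):
    """Drop the interior control point p_i that minimises |pi - theta|."""
    if len(points) < 3:
        return list(points)  # no interior point to remove
    best = min(
        range(1, len(points) - 1),
        key=lambda i: abs(math.pi - interior_angle(points[i - 1], points[i], points[i + 1])),
    )
    return points[:best] + points[best + 1:]


# Hypothetical control poly-line: the point at t = 3 lies on a straight
# segment between its neighbours, so it contributes almost no curvature.
points = [(0.0, 10.0), (1.0, 6.0), (2.0, 4.0), (3.0, 3.9), (4.0, 3.8)]
print(remove_flattest_point(points))
```

Applying the function repeatedly shrinks the metadata one control point at a time, always giving up the point where the poly-line is closest to straight.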

Metadata calculus enables scalable solutions for deducing security metadata
Enrich security metadata and calculus to meet new requirements

Future Work
  Background information and uncertainty: domain-specific models with P7
  Disinformation: extend using belief calculus (e.g., Dempster-Shafer)
  Bootstrapping: deducing the value of human-authored documents (an open problem in NLP)