[ieee 2014 ieee international conference on pervasive computing and communication workshops (percom...

Multitiered Inference Management Architecture for Participatory Sensing

Stephen Pipes
Emerging Technology Services, IBM

Hursley Park, Winchester, U.K.

Supriyo Chakraborty
Electrical Engineering Department

University of California, Los Angeles

Abstract: This paper describes a multitiered architecture for realizing an inference management firewall (IMF) that employs context-aware information masking techniques for systematic management of the risk-vs-value trade-off of sensor data. Previously we have demonstrated an initial implementation of the IMF running as messaging services on the Information Fabric, which is a middleware asset developed under the International Technology Alliance (ITA) research program. Furthermore, we have presented an additional asset, recently implemented on a commercially-available mobile device running the Android operating system, which is intended to operate as an information source and first-line inference management capability at the edge of the network. The low cost and widespread use of Android-based mobile devices offers a popular platform for crowdsourced participatory sensing. The focus of our current work is on the integration of these two technology assets in support of policy-managed, sensor-driven workflows in coalition scenarios.

I. INTRODUCTION

Key to the operational efficiency of any information network is balancing the amount of information shared with a consumer: on the one hand it must be sufficient for the consumer to perform computation and provide utility to the publisher, and on the other hand it must be sanitized enough to prevent the leakage of sensitive detail that poses privacy risks to the publisher. In [3] we present research towards a principled approach for managing the privacy versus utility trade-off during information sharing. The approach taken is based on the idea of treating potential inferences from shared data as primitives with which to reason about what a consumer can learn. Inferences that can be shared are specified in a whitelist and sensitive inferences that cannot be shared are specified in a blacklist. These lists are used as factors to determine in what form the information may be shared (i.e. with or without obfuscation of that information prior to sharing).
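The role of the two lists can be illustrated with a minimal Python sketch. The sensor-to-inference table, the function name and the three-way outcome (including outright suppression) are assumptions made for illustration; they are not taken from [3].

```python
# Hypothetical table of inferences each sensor could enable (illustrative only).
INFERENCES_BY_SENSOR = {
    "accelerometer": {"activity", "transport_mode"},
    "gps": {"location", "transport_mode"},
    "microphone": {"conversation", "stress"},
}


def sharing_decision(sensor, whitelist, blacklist):
    """Decide in what form one sensor stream may be shared.

    Returns "share" when no blacklisted inference is possible,
    "obfuscate" when both allowed and sensitive inferences are possible,
    and "suppress" when only sensitive inferences would be served.
    """
    possible = INFERENCES_BY_SENSOR.get(sensor, set())
    wanted = possible & whitelist   # inferences the consumer may draw
    risky = possible & blacklist    # inferences the publisher wants hidden
    if not risky:
        return "share"
    if wanted:
        return "obfuscate"
    return "suppress"
```

For instance, with a whitelist of {"transport_mode"} and a blacklist of {"location"}, the hypothetical "gps" stream would be released only after obfuscation, whereas the "accelerometer" stream could be shared unchanged.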

To complement our theoretical investigation, we envision a novel multitiered inference management firewall (IMF) spanning the end-to-end information flow path from the publisher to the consumer. We divide the architecture of the IMF into three general subsystems: first, the policy enforcement points (PEP) [6], which operate at the core of the information network; second, the policy enforcement subsystem, ipShield [4], which operates at the network endpoints; and finally, the communication and associated subsystem, which integrates the different tiers into a logically single firewall. As part of prior work, we have implemented the two subsystems which enforce the policies at various points within the information network. Based on the publish-subscribe messaging pattern, the network core is designed as a messaging hub, which consists of one or more instances of the ITA Information Fabric [8] (herein Fabric) and applies obfuscation operations on information in the presence of variable trust. This subsystem demonstrates how the inference management capability is applied to the data as they pass through network nodes en route to their destination. The network endpoint is a low-powered mobile device that is capable of sensing its environment using sensors such as a camera, accelerometer, magnetometer and gyroscope. The popularity of low-cost mobile devices encourages their use for collaborative participation, such as when a collective decision is made on the basis of numerous individual opinions or observations. Scenarios in which crowdsourcing is used to form a collective opinion demonstrate the effectiveness and widespread adoption of such techniques. ipShield implements the subsystem which allows users to apply fine-grained policies on the sensor data before their release to the network.

In this paper, we focus on our current work to integrate the subsystems above to create an integrated inference management architecture. The design is still at an early stage, and we recognize that several challenges need to be addressed. First, the information publisher is required to configure privacy policies whose enforcement is shared between the network core and endpoints. This requires defining a suitable policy schema. Second, bi-directional exchange of control information needs to be established between the core and the endpoints. The endpoints need to provide the core with release policies according to the publisher’s privacy preferences. Similarly, the core must provide information about policies to be applied at the information consumer endpoints. This policy management scheme must be resilient to changes in operational dynamics, such as the ad hoc addition and removal of network nodes, and must operate robustly in challenging environmental conditions whereby power availability and network connectivity may be significantly limited or sporadically available. Furthermore, our proposed IMF architecture must take information security into account and be designed to mitigate identified vulnerabilities.

II. ARCHITECTURE COMPONENTS

Fig. 1 depicts an information flow in our integrated IMF architecture. Two network endpoints are connected to the network core. One endpoint represents the information source (sensor) and the other endpoint is the information sink (subscriber). The network core consists of services in support of information sharing between endpoints and other related life-cycle methods, such as sensor registration or policy deployment. Information sourced from sensors operating at the network edge is subject to policies on the mobile device (whitelist / blacklist) before being shared with the network core. Policy enforcement ensures that only apps installed on the mobile device that are authorized to access sensor data may do so, thus mitigating the threat of information leaking to an unauthorized app. Such an authorized app acts as a network client and is able to route data onto the network core. From here, the information is handled by the messaging infrastructure, which is depicted as a database named Information Fabric. This messaging infrastructure will optimally route the information to the appropriate recipients (subscribers) that reside at network endpoints. Depending on routing policy, this may involve delivering multiple copies of the information. Additional policy may be applied to the information at each network node en route to an endpoint. This way local policy may be applied at appropriate points in the network, such as on an organization boundary.

The First International Workshop on Crowdsensing Methods, Techniques, and Applications, 2014

978-1-4799-2736-4/14/$31.00 ©2014 IEEE

[Figure 1 labels: ipShield, sensors, published information, Information Fabric, source registration, network obfuscator, subscriber, whitelist/blacklist policies; tiers: edge of the network (source), core of the network, edge of the network (sink).]

Fig. 1. End-to-end information flow in the multitiered inference management firewall (IMF).

Our IMF architecture consists of two core components: an Android-based mobile device, operating at the network endpoint, and messaging middleware, based on the Fabric technology, which connects network endpoints.

A. ipShield

ipShield [1], [4] is a framework which is built by modifying the Android operating system. It operates at the network edge and allows an agent to specify their privacy preferences in terms of inferences and, if required, configure fine-grained privacy rules for every accessed sensor on a per app basis at run-time. ipShield instruments Android and allows monitoring of the sensors accessed by an app regardless of whether they are specified explicitly by the app at install time. To our knowledge, ipShield is the first system that tracks innocuous sensors, access to which is not mediated through protected APIs provided by the OS. Second, the framework uses a list of inferences that could be made using the accessed sensors as the language for conveying the risk of sharing data. Based on various user studies [5], [7], we believe it is an important step towards making the agent understand the privacy implications of sharing data. The agent can then specify privacy preferences in the form of a prioritized blacklist of private inferences and a prioritized whitelist of allowed inferences. Third, the framework provides recommendations about possible privacy rules by translating the blacklisted and whitelisted inferences into privacy actions on individual sensors. Finally, the agent is provided with an option to directly configure context-aware fine-grained privacy actions on different sensors on a per app basis. These actions range in complexity from simple suppression to setting constant values, adding noise of varying magnitude, and finally to playback of synthetic sensor data streams.
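The four kinds of privacy action listed above can be sketched as simple functions over a list of sensor samples. This is an illustrative Python sketch only; the function names, the uniform noise model and the cyclic playback are assumptions and do not reflect the actual ipShield implementation, which operates inside the Android sensor stack.

```python
import random


def suppress(samples):
    """Simple suppression: release no samples at all."""
    return []


def constant(samples, value=0.0):
    """Replace every sample with the same fixed value."""
    return [value for _ in samples]


def perturb(samples, magnitude, rng=None):
    """Add uniform noise in [-magnitude, +magnitude] to each sample."""
    rng = rng or random.Random()
    return [s + rng.uniform(-magnitude, magnitude) for s in samples]


def playback(samples, synthetic):
    """Substitute a pre-recorded synthetic stream, repeating it as needed."""
    return [synthetic[i % len(synthetic)] for i in range(len(samples))]
```

Under these assumptions, `perturb(readings, 0.5)` keeps each released value within 0.5 of the true reading, while `playback(readings, synthetic)` releases a stream of the same length that carries none of the true values.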

B. Information Fabric

The Fabric [8] is a two-way messaging bus and set of middleware services connecting the network’s assets. The Fabric leverages the publish-subscribe messaging pattern with multi-hop capabilities and ensures that messages are propagated efficiently, without duplication and with the minimum use of valuable network bandwidth. Publish-subscribe messaging provides a one-to-many distribution mechanism that utilizes a central hub, or broker, through which all messages pass. A powerful feature of the publish-subscribe messaging pattern is the decoupling of publishers and subscribers. Clients only have to deal with their connection to the message broker; they have no knowledge of the source or destination of the messages, unless it is specifically included in the content of the message. Whilst this abstraction helps to minimize client complexity, it is necessary for the messaging middleware to have access to certain knowledge about the network, such as node ownership, or about network users, such as a message recipient’s identity, in order to properly evaluate privacy policy.
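The decoupling of publishers and subscribers can be illustrated with a toy in-memory broker. This Python sketch is a single-process illustration of the publish-subscribe pattern in general; the class and topic names are hypothetical, and it does not model the Fabric's multi-hop routing or its actual API.

```python
from collections import defaultdict


class Broker:
    """Minimal hub: clients only ever talk to the broker, never to each other."""

    def __init__(self):
        # topic name -> list of subscriber callbacks
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback to receive every message published on a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver one message to all subscribers; return the delivery count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)
```

A publisher calling `broker.publish("sensors/accel", reading)` does not learn who, if anyone, received the reading, which is why the broker itself needs access to attributes such as recipient identity before it can evaluate privacy policy.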

The IMF extends the base capability of the Fabric through the deployment of services that operate in the Fabric. The Fabric supports the addition of extra capability in the form of pluggable modules (called plugins). We have previously utilized this plugin framework to specify policy, in terms of conditions and actions, on the information that flows through a Fabric node. We have demonstrated in [6] that attributes, such as information classification, source or destination, can be extracted by the Fabric and evaluated by policy. If a condition evaluates to true then an inference management action will be performed on that information flow. We show that by adopting this approach, information can be selectively obfuscated depending on organization memberships of the information sender and information receiver.
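The condition-and-action style of policy can be sketched as follows. The Python code below is a hypothetical illustration, not the Fabric plugin interface: the `Message` and `Rule` types, the organization attributes and the location-coarsening action are all assumed for the example.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Message:
    payload: dict          # sensor values carried by the message
    classification: str    # e.g. "private"
    source_org: str        # organization of the sender
    destination_org: str   # organization of the receiver


@dataclass(frozen=True)
class Rule:
    condition: Callable[[Message], bool]    # evaluated on message attributes
    action: Callable[[Message], Message]    # obfuscation applied when true


def apply_rules(message, rules):
    """Run a message through each rule in order, as a node plugin might."""
    for rule in rules:
        if rule.condition(message):
            message = rule.action(message)
    return message


def coarsen_location(message):
    """Illustrative obfuscation: round coordinates to one decimal place."""
    coarse = {key: round(value, 1) for key, value in message.payload.items()}
    return replace(message, payload=coarse)


# Obfuscate only when sender and receiver belong to different organizations.
cross_org_rule = Rule(
    condition=lambda m: m.source_org != m.destination_org,
    action=coarsen_location,
)
```

With this rule, a message passing between members of the same organization is forwarded unchanged, while one crossing an organization boundary has its coordinates coarsened before delivery.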



III. FRAMEWORK REQUIREMENTS

We have previously considered inference management in terms of obfuscation actions performed independently by a mobile device operating in isolation and by a network of Fabric nodes. Since it is now our intention to develop an integrated inference management capability, we consider privacy implications of releasing information from the mobile device at the network endpoint over the network.

A. Scenarios

We explore requirements by way of two scenarios in which information is shared over the Fabric in accordance with privacy policy specified by the information publisher. The scenarios demonstrate the implications of a variable risk-utility trade-off on the disclosure of information in the network. For each scenario, the information publisher specifies a privacy policy at the network endpoint.

In the first scenario, the publisher adopts a high-risk, low-utility posture. Privacy policy is specified with the aim of minimizing inferences on sensor data. Extensive obfuscation of information at the publisher node, prior to sharing on the Fabric, reduces the likelihood of disclosure.

Fig. 2 depicts a network configuration in which information disclosure to the network is minimal, reducing the likelihood of inferences on that information. The direction of information flow between network nodes is indicated by a directed edge, from the publisher at the root node to subscribers in the network core. The degree of intended information disclosure is indicated on a node as either High or Low.

In an operational context, this limited ability to infer knowledge may be undesirable. By adjusting the risk-utility ratio, the publisher is able to share sensor data in a form that is better suited for current operational needs. In the second scenario, the publisher aims to improve utility (whilst accepting some risk), by disclosing more information; the publisher intends to mitigate the risk of sensitive information leaking to an unauthorized subscriber by delegating sharing policy to the network core. The publisher may achieve delegation by specifying privacy policy on the shared information (and trusting that the network core will enforce the policy as intended). The publisher may specify privacy in terms of one or more of the following types of inferences:

• Informational, such as classification;

• Network, such as a subscriber’s identity or affiliation;

• Contextual, such as temporal or spatial aspects.

Fig. 3 depicts how, by utilising the distributed policy-based inference management capability of the IMF, it is feasible to share information sufficiently with authorized partners whilst limiting disclosure to other network nodes. At each node, obfuscation operators limit the distribution of possible inferences according to information sharing policy.

B. Expressing Policy

The role of policy management traditionally resides with network nodes through which information flows. In our previous work on inference management in the Fabric [6], this role is held by the Fabric node administrator. This role is shared with information publishers as we move towards a view of policy management across multiple network tiers. We examine the requirements of multitiered policy management in the context of two approaches:

[Figure 2: node tree with one node labeled H (High) and six nodes labeled L (Low).]

Fig. 2. Fully-restricted disclosure for high risk-low utility policy.

[Figure 3: node tree with four nodes labeled H (High) and three nodes labeled L (Low).]

Fig. 3. Partially-restricted disclosure for low risk-high utility policy.

1) Policy is defined and enforced by nodes in the network core;

2) Policy is defined by information publishers and enforced by core network nodes.

In the first approach, an organisation wishes to enforce inference management policy consistently across all nodes in the network. The organisation mandates that all information flowing through the network must observe enforced policy, regardless of any privacy requirements set out by the information publishers. In this approach, information sharing is governed exclusively by the owning organisation, with information publishers having no control over the inference management capability (and thus the risk-utility posture) of the network.

In the second approach, inference management is a shared responsibility between information publishers and network owners. Publishers may specify privacy policy on information which is then enforced at appropriate points in the network. For example, a publisher may decide to share information with members of a specific organisation and wish to withhold it from all other users. The policy defined by the publisher may state this requirement in terms of the recipient’s affiliation and in a form that the information broker nodes can evaluate and enforce. To achieve this, an extension is required to our previously-used policy description language to allow expression of network characteristics in support of this multitiered view of privacy.

To fully support these approaches, a policy model is required that encapsulates aspects of both the publisher’s domain and of the network domain. Fig. 4 depicts a message flow from publisher to subscriber, via the Fabric. Some pertinent characteristics for each actor (publisher, message broker and subscriber) are listed.

Fig. 4. Entity attributes for expressing privacy policy.

For instance, the publisher runs an app on a smartphone that collects sensor data; inferences achievable from those data are managed by ipShield, which facilitates the construction of privacy policy in terms of user context, data source (e.g. accelerometer, proximity or ambient light sensor), data recipient (an app running on the smartphone) and a predetermined action on that data (e.g. suppress, constant or perturb). This may be sufficiently descriptive for privacy specification in scenarios where data are not shared with the network. However, the primary goal of our multitiered architecture is to share information over a network in accordance with privacy policy. To achieve this goal, we propose an extension to the ipShield policy model to support expression of network characteristics. At the publisher, the notion of data source is extended to include a message publisher’s identity, such as name, and other attributes, such as affiliation. This extension introduces a network-centric perspective of the data source. A message published by the data source on a message topic may be specified with a classification that reflects its sensitivity. Similarly, message brokers in the Fabric and message subscribers may have an identity, such as a name, and other attributes, such as affiliation and clearance. For the purposes of policy management, message brokers and subscribers are all network nodes. Based on our domain analysis, we have developed the following expression for policy:

if (∧ contexts) from a (source) to a (destination) on a (resource) perform an (action)

where

resource = sensor | topic | node
source = app | publisher
destination = app | subscriber

The action may be used to evaluate attributes associated with nodes or messages, such as the message classification and subscriber clearance, and to apply obfuscation functions on messages:

action = if (message.classification == "private" and destination.clearance == "basic") then actionType = perturb
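One way to make the policy expression concrete is the following Python sketch. It is an assumption-laden illustration rather than the policy language itself: the `Policy` class, its `evaluate` method, the dictionary-based attributes and the example context check are all hypothetical, and the case in which the action's condition does not hold (returning no action) is our assumption, since the expression above only defines the positive branch.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Policy:
    contexts: List[Callable[[dict], bool]]  # all must hold (the conjunction)
    source: str                             # "app" or "publisher"
    destination: str                        # "app" or "subscriber"
    resource: str                           # "sensor", "topic" or "node"
    action: Callable[[dict, dict], Optional[str]]

    def evaluate(self, context, source, destination, resource,
                 message, recipient):
        """Return the action type to perform, or None if the policy is silent."""
        applies = (
            all(check(context) for check in self.contexts)
            and source == self.source
            and destination == self.destination
            and resource == self.resource
        )
        return self.action(message, recipient) if applies else None


def perturb_private_for_basic(message, recipient):
    """The example action: perturb private messages for basic clearance."""
    if (message["classification"] == "private"
            and recipient["clearance"] == "basic"):
        return "perturb"
    return None  # assumption: no obfuscation requested otherwise


policy = Policy(
    contexts=[lambda ctx: ctx.get("location") == "base"],  # hypothetical context
    source="publisher",
    destination="subscriber",
    resource="topic",
    action=perturb_private_for_basic,
)
```

Evaluating this policy for a private message bound for a subscriber with basic clearance yields "perturb"; a subscriber with any other clearance, or a context in which the location check fails, yields no action.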

Furthermore, a mechanism is required with which to communicate policy over a network, from information publisher, via the network core, to information subscribers. Our preferred approach is to encode policy with the message, perhaps with an XML schema, to ensure that messaging policy is communicated consistently between network endpoints. The design of the policy-based messaging schema is beyond the scope of this paper.
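Purely to illustrate the idea of carrying policy alongside the message, the Python sketch below wraps a payload and two policy fields in an XML envelope using the standard library. The element and attribute names are invented for this example and should not be read as the schema, which remains to be designed.

```python
import xml.etree.ElementTree as ET


def encode(topic, payload, classification, action_type):
    """Wrap a payload and its policy in one XML envelope (hypothetical layout)."""
    envelope = ET.Element("message", {"topic": topic})
    policy = ET.SubElement(envelope, "policy")
    ET.SubElement(policy, "classification").text = classification
    ET.SubElement(policy, "actionType").text = action_type
    ET.SubElement(envelope, "payload").text = payload
    return ET.tostring(envelope, encoding="unicode")


def decode(xml_text):
    """Recover the policy fields and payload at a broker or subscriber."""
    envelope = ET.fromstring(xml_text)
    return {
        "topic": envelope.get("topic"),
        "classification": envelope.findtext("policy/classification"),
        "actionType": envelope.findtext("policy/actionType"),
        "payload": envelope.findtext("payload"),
    }
```

Because the policy travels in the same envelope as the data, any node that decodes the message sees the same policy fields that the publisher attached.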

IV. THREAT ANALYSIS

The potential for attacks on the system by an adversary operating within the network must be considered during the design of the integrated architecture. Several threats are summarized in this section. We have limited our analysis to the three main entities in the publish-subscribe messaging pattern: publisher, broker and subscriber, and have also considered potential threats for network-based communication. For each case, the goal of the adversary is to achieve one or more of:

• Leaking sensitive information; gaining read access to the information is sufficient;

• Modifying sensitive information; gaining read and write access to the information is required.

Sensitive information is considered to be either inference management policy or sensor data.

A. Publisher

A secure environment is provided at the network endpoint by ipShield by extending Android’s security mechanisms. ipShield has components which run as a trusted process in user space. To establish trust, ipShield relies on secure key generation and management. The key is used to validate that the process requesting data obfuscation is the trusted app. The adversary could try to break the key management scheme and masquerade as the trusted app. We also consider attacks on information spaces that are accessible to other processes. Our attack scenario involves an adversary running an app on the endpoint device in an attempt to leak information from the device. To achieve this, the adversary must gain access to the information as it resides either in memory or on the filesystem.

Android provides process-level isolation which offers protection from other processes operating on the system. However, mechanisms exist to facilitate inter-process communication; these include shared memory spaces and memory-mapped files. An adversary would aim to utilise such mechanisms to interact directly with the memory of ipShield. However, these attacks would need to circumvent the process-level memory isolation enforced by the underlying Linux kernel. Thus, ipShield builds on top of the security model provided by the kernel and is at least as safe.

B. Message Broker

The Fabric consists of several distinct components which together form a distributed message-bus capability on which information is published, transformed and delivered to subscribers. The channel through which information is coordinated is the message topic. Message topics are defined in a global (with respect to the network domain) namespace and are visible to network clients. Perhaps the most straightforward form of attack would be for an adversary to subscribe to one or more topics and simply wait for information to be published. The Fabric will duly ensure that subscribers receive a copy of information that is published on a topic. This way, the information publisher may unwittingly leak information to the adversary.

An adversary may also attempt to mount an impersonation attack by deliberately registering topics that are similar in appearance to legitimate ones, in order to trick publishers into using them. If successful, the adversary may cause a denial of service by effectively denying information sharing on genuine topics, to which legitimate clients subscribe.

The Fabric is designed to operate in demanding scenarios in which the network topology may rapidly change as nodes enter and leave the network. The Fabric deals with dynamic changes by efficiently managing the registration of new nodes. Such flexibility offers a potential opportunity for an adversary to register a rogue node within the network of legitimate Fabric nodes. Once registered, the adversary may deploy various tactics on the information that flows through the rogue node to achieve the goals of leaking or modifying sensitive information.

A final consideration is that an adversary may attempt to attack a legitimate node that is currently registered in the network in order to modify its configuration. One possible route is to install a Fabric module that reveals sensitive information to the adversary, or modifies information in an unauthorized fashion.

C. Subscriber

Policy enforcement must extend from one end of the message network to the other: from publisher to subscriber. An adversary may view a subscriber as a potential point of weakness in the system if that subscriber does not provide adequate controls on all information received. Mishandling by the subscriber of sensitive information may lead to attacks on information confidentiality, integrity or availability, and it is the responsibility of the subscriber to ensure that sufficient access controls, in accordance with policy, are in place.

D. Network

An adversary may take steps to subvert the protections in place for inter-node communication. Whilst data transmitted over the network may be encrypted to defend from casual eavesdropping, this solution alone is not infallible. Key management is a perpetual problem in a cryptographic system; ensuring that cipher keys remain in authorized hands only is just one of the many challenges to be solved. A robust process for provisioning digital certificates, and for revoking compromised ones, is essential if publishers and subscribers are to have confidence in the integrity of the network. Impersonation attacks are feasible where a properly managed and enforceable digital certificate scheme is not in place. Likewise, vulnerabilities in challenge-response protocols used during key phases of communication, such as user authentication or session negotiation, may be exposed by an adversary who is able to subvert their security mechanisms. Such attacks may lead to session hijacking or user impersonation and may commonly be mounted as man-in-the-middle attacks on the network.

V. FRAMEWORK REALIZATION

This section summarises our work towards implementing the integrated IMF architecture.

A. Implementation

The proposed implementation consists of the two core assets of the IMF, namely Fabric and ipShield. At the network core, the IMF is implemented as one or more code modules called message plugins. These plugins operate within the Fabric run-time and permit operations on the information flowing through the Fabric Manager. The Fabric Manager is configured with a policy that governs how obfuscation functions are applied to information flow. The Fabric administrator has responsibility for the configuration process; it is this role that has the authority to reconfigure the basic operation of the Fabric Manager and its various services.

Fig. 5 shows two components in the Fabric where policy may be applied to information workflow: on information received by a Fabric node (inbound) and on information sent by a node (outbound). Traditionally, the Fabric administrator exclusively manages policy enforced at these points: policy may be applied at the level of a Fabric node, by Task (a logical grouping of Fabric assets) or by Actor (a subscriber). For our integrated architecture, we propose an extension to the way Fabric views individual endpoints in the information network; currently limited to information consumers¹, this view should be extended to include information publishers in order that privacy policy may be linked to individual information sources, suitable for the privacy-motivated scenarios considered in previous work [3], [4].

As described in an earlier section, privacy policy may express constraints in terms of the information being shared, the intended recipient and the environmental context in which that information is shared.

An enhancement to the current ipShield implementation is required in order to integrate an MQTT-based [2] message client. This client will allow ipShield to communicate with a message broker, which will be used to publish messages to the Fabric and to receive responses. This messaging client is provided as a Java library, which can be loaded on the Android operating system and linked with the ipShield binary. Fig. 6 extends the traditional view of the ipShield architecture with the messaging client (as part of the Trusted App component). We plan to develop ipShield for use at a network endpoint whereby it encodes and publishes messages, containing sensor data and metadata, along with privacy policies, and receives policy updates from the network.

¹ For historical reasons the Fabric provides a logical construct, the Actor, for representing information consumers but offers no formal equivalent for information producers. Data sensors and other information-producing devices were, at the time of original design, considered to be simple entities with no notion of privacy. With more sophisticated devices, such as smartphones, and increasingly privacy-aware users, there is now a need to extend the Fabric to cope with privacy requirements of the individual.



Fig. 5. Fabric message plugin architecture.

[Figure 6 labels: Hardware; Linux Kernel; Libs, Daemons, Sensor HAL, Other Native Services; Android Runtime/Dalvik; SDKs; System Server with Sensor Service and Other Services; Sensor Manager; Other SDKs; Rule-Based Obfuscator; Trusted App (part of Inference Management System) with Context Engine, Inference Management Firewall Configurator and MQTT Based Rules Publisher; Whitelist and Blacklist of inferences; Information Fabric User; Information Fabric Administrator; flows labeled Rules, Context, Sensor Data and Binder.]

Fig. 6. ipShield using MQTT to publish/subscribe rules.

VI. DISCUSSION

This paper elaborates on current efforts in the design of a nascent multitiered IMF architecture and its realization across both the Fabric at the network core and ipShield at the network endpoint. This architecture will enable much finer control to be exercised over the risk versus utility trade-off during information sharing than is currently possible. With this architecture, policy writers may take a network-wide view of policy management. The multitiered architecture will provide mechanisms for robust policy distribution to support the shared responsibility of policy configuration and enforcement, allowing both administrators in the network core and information publishers at the network edge to coordinate effectively. Given that the IMF capability extends to the operating system of endpoint devices, the multitiered architecture is suitable for a variety of scenarios, ranging from personal mobile sensing to crowdsourced sensing applications.

Solution requirements include development of an expressive policy language that supports concepts relating to network core and endpoint characteristics and to the information under disclosure. Also required is a means to link policy to information that is shared over the network to ensure consistent policy enforcement at all points in the information workflow. Finally, the solution must be resilient to threats to information confidentiality, integrity and availability throughout the information life cycle. These requirements are guiding the research and development activities that are currently underway.

During implementation of the multitiered solution, experimental testing will be carried out towards validating our theoretical work on inference management and policy enforcement, and quantitative results will be reported.

ACKNOWLEDGMENTS

This research was sponsored by the U.S. Army Research Laboratory and the U.K. Ministry of Defense under Agreement Number W911NF-06-3-0001 and by the NSF under award #0910706. The views and conclusions contained in this document are those of the author(s) and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Army Research Laboratory, the U.S. Government, the U.K. Ministry of Defense or the U.K. Government or the NSF. The U.S. and U.K. Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

REFERENCES

[1] ipShield: A Framework For Enforcing Context-Aware Privacy. http://tinyurl.com/ipshieldgit.

[2] MQ Telemetry Transport. http://mqtt.org/.

[3] S. Chakraborty, N. Bitouze, M. Srivastava, and L. Dolecek. Protecting data against unwanted inferences. ITW ’13, 2013.

[4] S. Chakraborty, K. R. Raghavan, M. P. Johnson, and M. B. Srivastava. A framework for context-aware privacy of sensor data on mobile systems. HotMobile ’13, pages 11:1–11:6, 2013.

[5] A. P. Felt, E. Ha, S. Egelman, A. Haney, E. Chin, and D. Wagner. Android permissions: user attention, comprehension, and behavior. SOUPS ’12, pages 3:1–3:14, 2012.

[6] S. Pipes, B. Hardill, C. Gibson, M. Srivastava, and C. Bisdikian. Exploitation of distributed, uncertain and obfuscated information. https://www.usukita.org/node/2048.

[7] A. Raij, A. Ghosh, S. Kumar, and M. Srivastava. Privacy risks emerging from the adoption of innocuous wearable sensors in the mobile environment. CHI ’11, pages 11–20, 2011.

[8] J. Wright, C. Gibson, F. Bergamaschi, K. Marcus, R. Pressley, G. Verma, and G. Whipps. A dynamic infrastructure for interconnecting disparate ISR/ISTAR assets (the ITA Sensor Fabric). Fusion ’09, 2009.

