User-Aware Privacy Control via Extended Static-Information-Flow Analysis
Xusheng Xiao, Nikolai Tillmann, Manuel Fahndrich, Jonathan de Halleux, Michal Moskal
Presented by: Abishek Krishnan
Outline
• Introduction
• Privacy Control Mechanisms
• Types of Information Flow/Identification
• Information Flow Analysis (Overview)
• Simplified Language
• Summaries of Basic Blocks and Actions
• Evaluation
• Related Work
• Conclusion
Introduction
• Modern mobile device platforms have a central app store for downloading applications.
• These applications access mobile device resources such as photos, location, and other private information.
• These applications may leak private user information through output channels.
Privacy Control Mechanisms
• Manual app validation
• Access control granting
• User Aware Privacy Control mechanism
User Aware Privacy Control mechanism
• Reduces the effort of app validation and access granting by computing information flows.
• Classifies information flows as safe or unsafe.
• A source is the origin of private information.
• A sink is a point where information leaks from the app.
User Aware Privacy Control mechanism
Fig 1: Information flow view of sample script
User driven access control
• Real information
• Anonymized information
• Abort execution
Tamper Analysis
• Extends the static analysis to compute information flows and to check for tampered information, in order to classify each information flow as safe or unsafe.
• Helps to better understand how apps handle private information flows and how privacy control can be improved.
• A vetted sink presents an explicit dialog requesting the user's permission before the information being shown escapes.
• Example: sharing a photo taken with the camera shows a dialog so the user can review the picture before it leaves the device. Such untampered flows should be safe. A malicious app, however, could encode the user's phone number into the pixels.
Performance of Information Flow analysis
• A prototype of this privacy control was built in TouchDevelop to analyze published scripts and to present privacy settings to the user based on the analysis and policy.
• Out of 546 published scripts, 172 use private sources, but only 78 flow private information to a sink. Among these 78, the approach classifies 24 as safe, reducing the need to make access-granting choices to a mere 10.1% (54) of scripts.
Classified Information Flow
• The example shows how classified information flows among values such as Number and String.
Line 4: Variable loc contains geolocation information obtained via GPS.
Line 5: The location is transformed to a string and assigned to s.
Line 6: The location string is rendered as text in picture p.
Line 7: The share action leaks the classified information to Facebook.
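The four annotated lines can be traced with a minimal sketch of explicit source-to-sink propagation. The instruction encoding, the operation names (`senses.current_location`, `to_string`, `draw_text`, `share`), and the labels are hypothetical stand-ins for the script in the figure, not the script itself.

```python
# Hypothetical encoding of the four script lines: (target, operation, arguments).
PROGRAM = [
    ("loc", "senses.current_location", []),  # line 4: read the GPS location
    ("s", "to_string", ["loc"]),             # line 5: convert it to a string
    ("p", "draw_text", ["s"]),               # line 6: render the string in a picture
    (None, "share", ["p"]),                  # line 7: share the picture
]

# Assumed annotations: which operations are sources and which are sinks.
SOURCES = {"senses.current_location": "GeoLocation"}
SINKS = {"share": "Sharing"}


def explicit_flows(program):
    """Propagate source labels through assignments and collect (source, sink) pairs."""
    env = {}       # variable -> set of source labels it may carry
    flows = set()
    for target, op, args in program:
        labels = set()
        for arg in args:
            labels |= env.get(arg, set())
        if op in SOURCES:
            labels.add(SOURCES[op])
        if op in SINKS:
            flows |= {(src, SINKS[op]) for src in labels}
        if target is not None:
            env[target] = labels
    return flows


flows = explicit_flows(PROGRAM)
```

Under these assumptions, `flows` contains the single pair `("GeoLocation", "Sharing")`, mirroring the leak described for line 7.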
Reference type information flow
• Line 5: A message is added to the message collection.
• Line 6: msg is classified.
• Line 7: msg2 contains information of other messages.
Implicit Information flow
• Arises from conditional control structures such as if statements, where the condition depends on classified information.
• Lines 10, 11
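A minimal sketch of how an implicit flow can be captured, assuming a simplified if statement whose branches contain only assignments. The function name, the variable names, and the `GeoLocation` label are hypothetical; the idea shown is only that anything assigned under a classified condition inherits the labels of that condition.

```python
def analyze_conditional(env, condition_var, branch_assignments):
    """Return the abstract environment after an if statement.

    env maps variables to sets of source labels. branch_assignments lists
    (target, argument_variables) for assignments inside either branch.
    """
    # pc holds the labels the branch decision depends on.
    pc = set(env.get(condition_var, set()))
    result = {var: set(labels) for var, labels in env.items()}
    for target, args in branch_assignments:
        labels = set(pc)
        for arg in args:
            labels |= env.get(arg, set())
        # Join with the old value: the other branch may leave target unchanged.
        result[target] = result.get(target, set()) | labels
    return result


# The condition depends on the location; both branches assign a constant to msg.
env = {"loc_is_home": {"GeoLocation"}, "msg": set()}
after = analyze_conditional(env, "loc_is_home", [("msg", []), ("msg", [])])
```

Although `msg` is only ever assigned constants, `after["msg"]` carries `GeoLocation`, because the choice of constant reveals the condition.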
Capability Identification
• Tells users what kinds of mobile device resources are being used.
• If information from sources such as pictures, emails, or phone numbers flows to a sink, it may be identified as unsafe.
• Sharing is a vetted sink.
• Web is an unvetted sink.
Automated Capability Identification:
• Static analysis automatically identifies all the application capabilities.
• All TouchDevelop APIs were manually annotated with source and sink information.
• Each action in a script is parsed into an abstract syntax tree (AST), and each statement node is scanned automatically to identify which sources and sinks are used.
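The AST scan described above can be sketched as follows. This is an illustration only: Python's `ast` module stands in for the TouchDevelop parser, and the annotation table and API names (`senses.current_location`, `media.take_picture`, `social.share`, `web.upload`) are hypothetical.

```python
import ast

# Hypothetical manual annotations: API name -> (kind, capability label).
ANNOTATIONS = {
    "senses.current_location": ("source", "GeoLocation"),
    "media.take_picture": ("source", "Camera"),
    "social.share": ("sink", "Sharing"),
    "web.upload": ("sink", "Web"),
}


def dotted_name(node):
    """Turn an expression such as a.b.c into the string 'a.b.c', else None."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def capabilities(script_text):
    """Scan every call node and collect the annotated sources and sinks."""
    sources, sinks = set(), set()
    for node in ast.walk(ast.parse(script_text)):
        if isinstance(node, ast.Call):
            kind, label = ANNOTATIONS.get(dotted_name(node.func), (None, None))
            if kind == "source":
                sources.add(label)
            elif kind == "sink":
                sinks.add(label)
    return sources, sinks


SCRIPT = """
loc = senses.current_location()
pic = media.take_picture()
social.share(pic)
"""
found_sources, found_sinks = capabilities(SCRIPT)
```

This scan only reports which capabilities are present. It does not say whether a source actually reaches a sink, which is what the information flow analysis adds.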
Information Flow Analysis(Overview)
• The approach statically computes information flow using abstract interpretation.
• Information flows from source s1 to sink s2 whenever source s1 appears in the abstract state of sink s2.
• The state maps local variables to sets of sources.
• It maps a single mutable location for each kind to a set of sources.
• It maps sinks to the sources flowing to that sink.
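One possible shape for this abstract state, as a sketch: three maps whose values are sets of sources, joined by pointwise union where control flow merges. The class name, field names, and labels are assumptions made for illustration.

```python
from dataclasses import dataclass, field


def _join_maps(a, b):
    """Pointwise union of two maps from keys to sets of sources."""
    return {k: set(a.get(k, set())) | set(b.get(k, set())) for k in set(a) | set(b)}


@dataclass
class AbstractState:
    local_vars: dict = field(default_factory=dict)  # local variable -> sources
    mutable: dict = field(default_factory=dict)     # kind of mutable value -> sources
    sinks: dict = field(default_factory=dict)       # sink -> sources flowing into it

    def join(self, other):
        """Merge two states, for example at the end of an if statement."""
        return AbstractState(
            _join_maps(self.local_vars, other.local_vars),
            _join_maps(self.mutable, other.mutable),
            _join_maps(self.sinks, other.sinks),
        )

    def flows(self):
        """A source flows to a sink when it appears in that sink's set."""
        return {(src, sink) for sink, srcs in self.sinks.items() for src in srcs}


then_state = AbstractState(local_vars={"s": {"GeoLocation"}},
                           sinks={"Sharing": {"GeoLocation"}})
else_state = AbstractState(local_vars={"s": {"Camera"}},
                           mutable={"Picture": {"Camera"}})
merged = then_state.join(else_state)
```

Keeping one mutable location per kind is coarse, since all pictures share one entry, but it keeps the state small and finite.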
• Implicit Flows:
• A special additional local variable called pc is used.
• At a conditional, pc is assigned the source information of the condition at the entry of both branches.
• At each basic block, pc is defined by its value at the dominator block instead of at all predecessor blocks.
• Inter Procedural analysis:
• Computes the summaries of the basic blocks in an action and uses these summaries to compute the summary of the action.
• Mutable and Immutable values:
• Each value has two separate parts:
• Immutable part
• Mutable part
• Number, String, and GeoLocation have only immutable parts.
• Picture has both a mutable and an immutable part.
• Embedded Reference:
• Values may have embedded references to other values that can be mutable.
• The analysis keeps track of the directed edges from one mutable location to another.
• It does not support references from the immutable part to the mutable part.
Simplified Language
• The input program consists of a number of actions, where each action has a number of parameters and any number of results.
• The body of an action consists of a control flow graph of basic blocks with distinguished entry and exit blocks.
• The instructions inside a block have one of the following forms:
• Simple assignment
• Primitive invocation with parameters
• Conditional branch
Summaries of basic blocks and actions
• Separated into three parts:
• Local variable information
• pc information for implicit flow
• Mutable state information
• Fixpoint computation over the following data structure
Block Summary:
• Entry Block
• Initialize Lpre to map each parameter local i to the singleton {Parameter(i)}
• All other locals map to the empty set
• PCpre is the singleton {PCin}
• Spre is empty for the entry block
• The information in Rpre and Mpre keeps track of the assumptions under which the action has been analyzed.
• Non Entry Blocks
• The locals on entry to a block are the union of the post local states of all predecessor blocks.
• The PC classification is the post PC classification of the immediate dominator of block b.
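The block rules above can be sketched as a small fixpoint loop. This is an illustration under several assumptions: only the local and pc parts are modeled (no S, M, or R), the instruction encoding is hypothetical, immediate dominators are supplied by hand, and raising pc at branch targets is one possible reading of how conditionals feed pc.

```python
def analyze_action(blocks, preds, idom, entry, params):
    """Iterate block summaries until nothing changes.

    blocks: name -> list of instructions, where an instruction is either
      ("assign", dst, [srcs]) or ("branch", cond, then_block, else_block).
    preds: name -> predecessor blocks; idom: name -> immediate dominator.
    """
    l_post = {b: {} for b in blocks}       # block -> {local: labels} at exit
    pc_post = {b: set() for b in blocks}   # block -> pc labels at exit
    pc_raise = {b: set() for b in blocks}  # labels added at branch targets
    changed = True
    while changed:
        changed = False
        for b, instrs in blocks.items():
            if b == entry:
                env = {p: {f"Parameter({p})"} for p in params}
                pc = {"PCin"}
            else:
                env = {}
                for p in preds[b]:  # union of predecessor post states
                    for var, labels in l_post[p].items():
                        env.setdefault(var, set()).update(labels)
                pc = pc_post[idom[b]] | pc_raise[b]
            for instr in instrs:
                if instr[0] == "assign":
                    _, dst, srcs = instr
                    env[dst] = set(pc).union(*(env.get(s, set()) for s in srcs))
                else:
                    _, cond, *targets = instr
                    for t in targets:
                        extra = env.get(cond, set()) - pc_raise[t]
                        if extra:
                            pc_raise[t] |= extra
                            changed = True
            if env != l_post[b] or pc != pc_post[b]:
                l_post[b], pc_post[b] = env, pc
                changed = True
    return l_post, pc_post


# if p then x := const else x := const; afterwards y := const
blocks = {
    "entry": [("branch", "p", "then", "else")],
    "then": [("assign", "x", [])],
    "else": [("assign", "x", [])],
    "exit": [("assign", "y", [])],
}
preds = {"then": ["entry"], "else": ["entry"], "exit": ["then", "else"]}
idom = {"then": "entry", "else": "entry", "exit": "entry"}
l_post, pc_post = analyze_action(blocks, preds, idom, "entry", ["p"])
```

Because the exit block takes pc from its immediate dominator rather than from its predecessors, `y` is not classified by `p`, while `x`, assigned inside the branches, is.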
• Action Summary
• Each action has a single exit block.
• The summary of an action is the post state of the exit block of the action.
• For each action, keep track of the initial M and R under which it was analyzed in the information for its entry block.
• If there is a call to the action with a larger M or R, update the information for the entry block and propagate the changes through the blocks of the action.
• Summary of Action FOO
• Tampered Information:
• A computed source-to-sink information flow may not be enough to validate scripts as good or bad.
User Aware Privacy Control
• Applies static analysis to compute information flow on a per-script and per-action basis.
• Shows summaries of which sources flow to which sinks in each script.
• Flows are classified as:
• Safe Flows
• Untampered flow to a vetted sink
• A vetted sink results in an explicit dialog at runtime
• Example: Post to Facebook prompts the user to review the information before the actual sharing happens.
• Unsafe Flows
• All other flows, including untampered flows to unvetted sinks (web).
• Update the policy based on user feedback
• Granting Accesses:
• The user is presented with all sources appearing in unsafe flows.
• Options: real information or anonymized information
• Default Settings:
• Chosen to keep the user safe and to minimize the effort in granting access.
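The classification and access-granting rules above can be sketched as a small policy function. The `Flow` record, the function names, and the set of vetted sinks are hypothetical; the rule itself follows the slides: a flow is safe only if it is untampered and ends in a vetted sink.

```python
from dataclasses import dataclass

# Assumption taken from the slides: sharing is vetted, web is not.
VETTED_SINKS = {"Sharing"}


@dataclass(frozen=True)
class Flow:
    source: str
    sink: str
    tampered: bool


def classify(flow):
    """Safe means an untampered flow to a vetted sink; everything else is unsafe."""
    if not flow.tampered and flow.sink in VETTED_SINKS:
        return "safe"
    return "unsafe"


def sources_to_prompt(flows):
    """Sources the user has to decide on: those appearing in unsafe flows."""
    return {f.source for f in flows if classify(f) == "unsafe"}


script_flows = [
    Flow("Camera", "Sharing", tampered=False),    # reviewed in a dialog
    Flow("Contacts", "Sharing", tampered=True),   # could hide data from the user
    Flow("GeoLocation", "Web", tampered=False),   # unvetted sink
]
prompted = sources_to_prompt(script_flows)
```

With these three flows, only `Contacts` and `GeoLocation` would be shown to the user; `Camera` could default to real information.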
Evaluation
• TouchDevelop was chosen as the platform:
• Source code availability
• Script source is made available through publishing.
• Simplicity
• The expressiveness of the TouchDevelop language enables applications in far fewer lines.
• The static information flow analysis is integrated into the server part of TouchDevelop.
• Each and every script is analyzed automatically and the resulting flow information informs the privacy settings when the user installs the script.
• Of the 546 scripts in the experiments, 395 have between 0 and 80 lines of code (LOC).
• Information Flow Summary:
• Shows the advantage of using information flow from sources to sinks to classify scripts, as opposed to the mere presence of both sources and sinks.
• Out of 546 scripts, 242 have either a source or a sink.
• For an information flow to exist, a script must have at least one source and one sink.
• 89 scripts have both a source and a sink, of which only 11 have no information flows.
• This reduces the prompting by 48.26% over the traditional approach.
• Using actual information flows as computed by the analysis further reduces prompting by 12.36% (from 89 to 78).
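The two percentages can be reproduced with a short calculation. The baseline of 172 scripts for the traditional approach is an assumption: it is the number of scripts using private sources reported earlier in the talk, and it is the value that yields 48.26%.

```python
# Counts reported in the talk.
SCRIPTS_WITH_PRIVATE_SOURCE = 172   # assumed baseline: prompt whenever a source is used
SCRIPTS_WITH_SOURCE_AND_SINK = 89
SCRIPTS_WITH_FLOWS = 78


def reduction_percent(before, after):
    """Relative reduction from before to after, in percent, rounded to 2 places."""
    return round(100 * (before - after) / before, 2)


by_presence = reduction_percent(SCRIPTS_WITH_PRIVATE_SOURCE, SCRIPTS_WITH_SOURCE_AND_SINK)
by_flows = reduction_percent(SCRIPTS_WITH_SOURCE_AND_SINK, SCRIPTS_WITH_FLOWS)
```

Here `by_presence` evaluates to 48.26 and `by_flows` to 12.36, matching the reported figures.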
• Safe Scripts:
• Tamper analysis is used to further eliminate the need to ask users for permission to grant access.
• Static analysis is applied to the 78 subject scripts that have information flows to measure the number of scripts having safe flows.
• Sink web is an unvetted sink.
• The results show that 45 (57.69%) scripts have safe flows.
• 54 (69.23%) have unsafe flows.
• Among the 54, 40 scripts have flows to unvetted sinks and 47 have tampered information flows.
• Based on the safe/unsafe flow summary, 24 (30.77%) scripts have only safe flows.
• Among the 21 scripts that have both safe and unsafe flows, none are mix scripts.
• Current access granting allows users to grant access based only on sources instead of on flows.
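The script counts above are consistent with each other, which a short check makes explicit. The only assumption is that each of the 78 scripts has at least one safe or unsafe flow, which holds because all 78 have information flows.

```python
TOTAL_WITH_FLOWS = 78
WITH_SAFE_FLOWS = 45
WITH_UNSAFE_FLOWS = 54


def percent(part, whole):
    return round(100 * part / whole, 2)


# Inclusion-exclusion: scripts counted in both groups.
both = WITH_SAFE_FLOWS + WITH_UNSAFE_FLOWS - TOTAL_WITH_FLOWS
only_safe = WITH_SAFE_FLOWS - both

safe_pct = percent(WITH_SAFE_FLOWS, TOTAL_WITH_FLOWS)
unsafe_pct = percent(WITH_UNSAFE_FLOWS, TOTAL_WITH_FLOWS)
only_safe_pct = percent(only_safe, TOTAL_WITH_FLOWS)
```

This gives 21 scripts with both kinds of flows and 24 with only safe flows, and the three percentages come out as 57.69, 69.23, and 30.77.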
• Safe Sources:
• Consider the total number of times a user would have to change the default settings for a source in order to give full access to all scripts.
• This is the total number of times the source appears in a given context.
• Among 33 scripts that have source camera, 24 have source camera as a safe source.
• 9 scripts have source camera in tampered flows.
• 25 scripts have safe sources of contacts.
• Only 5 scripts have source contacts appearing in untampered flows.
• 47.06% (56) of 119 sources are safe and are allowed to use real information directly.
• Among 63 unsafe sources, 7 are unsafe solely due to flows to unvetted sinks.
• The remaining 56 sources appear in tampered information flows.
• Using tamper analysis and vetted sinks with information flow, the approach reduces the burden to 63 changes, an overall reduction of 58.6%.
Generalization
• Issues to be addressed to generalize this approach to other mobile platforms such as Android and iOS:
• These platforms have a much larger API surface than TouchDevelop, so annotating the APIs with source, sink, and flow information takes a major effort.
• The languages used, for example Java and C#, provide more ways to obscure flows than our scripting language.
• The static analysis would have to be complemented with dynamic analysis to address issues such as indirect flows through mutable storage.
Limitations
• The handling of implicit flows may produce false negatives.
• A script can store a classified picture in the media library, and the picture can later be shared through Facebook by a different application. Our approach does not consider what happens to the picture after it is stored in the library.
Related Work
• User Aware Application capabilities:
• Android and the social network platform Facebook use manifests to show application capabilities and request permissions at install time.
• The capabilities shown in the manifests are claimed by the developers, or are only part of the requested application capabilities.
• Felt et al. proposed a technique that uses static analysis to map API calls used by applications to permissions. However, they adopt an automated testing methodology.
• Access Granting
• Android and Facebook use manifests. Once permission is given by users, it cannot be changed.
• Instead of only showing information about access to resources, our approach presents information flows to describe what applications may do with private information.
• It also gives users a way to try out an application before it uses private information, and these settings can be changed at will.
Conclusions
• This work provides a user-aware privacy control approach based on static information flow analysis extended with tamper analysis.
• The experiment results show that the approach computes useful information flows and can be used to automatically provide default privacy settings for each script that keeps the users safe without any user intervention.
• Does away with the need for manual app validation
Questions???