

PuPPIeS: Transformation-Supported Personalized Privacy Preserving Partial Image Sharing

Jianping He†, Bin Liu∗, Deguang Kong∗, Xuan Bao∗, Na Wang∗, Hongxia Jin∗, George Kesidis†

[email protected], [email protected] {doogkong,xbxuanbao8}@[email protected], [email protected], [email protected]

†The Pennsylvania State University ∗Samsung Research America

Abstract—Sharing photos through Online Social Networks has become increasingly popular. However, it poses a serious threat to end users, as private information in the photos may be inappropriately shared with others without their consent. This paper proposes the design and implementation of a system built on a dynamic privacy-preserving partial image sharing technique (named PuPPIeS), which allows data owners to stipulate specific private regions (e.g., a face or an SSN) in an image and set a different privacy policy for each user accordingly. As a generic technique and system, PuPPIeS targets threats of over-privileged and unauthorized sharing of photos on the photo service provider (e.g., Flickr, Facebook) side. To this end, PuPPIeS leverages image perturbation to “encrypt” the sensitive areas of the original images; it therefore naturally supports popular image transformations (such as cropping and rotation) and is compatible with most image processing libraries. Extensive experiments on 19,000 images demonstrate that PuPPIeS is very effective for privacy protection and incurs only a small computational overhead. In addition, PuPPIeS offers high flexibility for different privacy settings and is very robust to different types of privacy attacks.

I. INTRODUCTION

Fueled by the pervasive use of Online Social Networks (OSNs), photo sharing has become part of the modern lifestyle in the Internet era. Recent studies [1], [2] show that images have surpassed plain text as the number one sharing format among OSN users. For instance, every second, nearly 4,000 photos are uploaded to Facebook and around 4,600 photos are exchanged through Snapchat. Behind this phenomenal growth, new cloud computing technologies have been deployed to support these services. One common practice of photo sharing services is to use centralized Photo Sharing Platforms (PSPs) as the backbone, where photos are uploaded, stored and shared in the cloud. This arrangement leaves common users no choice but to trust the security mechanisms of the cloud service provider.

Clearly, in this setting, users’ privacy may be at stake [3]. For example, adversaries may try to take advantage of such centralization to steal and leak users’ shared photos. In 2014, several unfortunate events of this kind happened, involving popular service providers such as Apple [4] and Snapchat [5]. In addition, photo owners face the danger of privacy leakage after they upload images to PSPs, because PSPs may access and process users’ photos without explicitly asking for users’ consent [6]. Furthermore, unprotected sharing of photos may unintentionally hurt photo owners’ privacy.

A Motivating Example Consider a scenario in which Alice and Bob took a photo together. Alice posted the photo on Facebook and shared it with Bob. Bob may like or comment on

(a) An original image (b) The perturbed image

Fig. 1. An example of protecting users’ privacy by encrypting part of the image. The sensitive region (the two people) is marked with a bounding box. The insensitive region (the Statue of Liberty in the background) is left unprotected, without any hiding or occlusion. The original image is downloaded from [7].

this photo. Then Bob’s friends could see the photo through Bob’s comment. However, Alice may not want her photo to be accessed on the PSP side by users other than her friends.

Instead of sharing a photo with public users, one naive solution to protect photo privacy is to store it locally. However, this goes against the growing usage of cloud services and cannot take advantage of the services provided by PSPs. Furthermore, it may also prevent fair use of regions of a photo that raise no privacy concerns. In an image, the sensitive regions generally occupy only some parts, while the insensitive regions (such as the background) can still be used for image recognition and understanding purposes. For example, Fig. 1(a) shows an image with the Statue of Liberty as the background, and Fig. 1(b) shows the perturbed image after encrypting the sensitive regions. When both images are submitted to Google Image Search, Fig. 2 shows that the top-10 search results are both relevant and highly overlapping, suggesting that perturbed partial images remain useful even with occluded sensitive regions. Clearly, not sharing photos at all forfeits this kind of opportunity.

Therefore, beyond this naive solution, a question naturally follows: can we protect the privacy of an image and at the same time take advantage of the storage resources and sharing capability offered by PSPs, without sacrificing the usability of images? Furthermore, the problem itself is more


challenging because a practical solution must support realistic sharing scenarios. In fact, a practical solution needs to address the following challenges (C1–C3).

C1: Maximal Usability The solution is expected to provide maximal usability of images while protecting their privacy. It should also maximize the use of public cloud storage. Furthermore, users must be able to reconstruct the perturbed or encrypted photos they receive.

C2: Transformation Supported To work with photo sharing on PSPs, the mechanism must support popular image transformations and standard image processing techniques, because such transformations and processing steps (e.g., cropping, compression) are widely applied by PSPs.

C3: Personalized Privacy Flexibility Private regions in images are relative and personal. The proposed technique should give users the flexibility to customize desired privacy levels. Ideally, the private information that the image owner must store locally should be minimal.

However, current solutions for image privacy protection are still limited in addressing these challenges1. For example, many image protection works (e.g., [8], [9], [10], [11], [12], [13]) only support full-image encryption before photos are shared with others; unfortunately, this greatly decreases the usability of the image itself. A few methods ([14], [15]) maximize the usability of images by supporting partial image sharing. They are, however, not compatible with the popular image transformations widely used in PSPs, which suppresses their practical availability. Furthermore, few (if any) works (e.g., [8], [9], [10], [11], [12], [14], [15]) provide users with the flexibility to customize personalized privacy settings in image privacy control.

Against this backdrop, in this paper we develop a transformation-supported privacy-preserving partial image sharing technique to protect image privacy. In this new scenario, an image is segmented into a public region and one or more private regions. Each private region is protected using an image perturbation technique with a specific security key, and the public region together with the encrypted private regions is stored and shared on PSPs. The image owner can distribute security keys to other users through secure channels, and a user with one or more keys can decrypt (some of) the corresponding private regions encrypted with the keys she holds. In the context of image sharing, the security key is in fact the private matrix used in the image perturbation scheme, shared between the sender and receiver2. As such, we protect image privacy without sacrificing the usability of the image itself. Furthermore, our scheme supports popular image transformations and allows a user to achieve a personalized desired privacy level; a user therefore has more flexibility to control the privacy settings in image sharing.
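As a high-level sketch of this scheme, the snippet below scrambles only a designated private region of a pixel array with a per-region key and recovers it with the same key. The `Region` and `perturb_region` names are ours, and the keyed XOR stream is only an illustrative stand-in for the private-matrix DCT perturbation this paper actually uses:

```python
import hashlib
from dataclasses import dataclass

@dataclass
class Region:
    x: int; y: int; w: int; h: int   # bounding box of one private region

def keystream(key: bytes, n: int) -> bytes:
    """Toy keyed byte stream (SHA-256 in counter mode); a stand-in for
    the paper's private-matrix perturbation of DCT coefficients."""
    out = bytearray()
    ctr = 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return bytes(out[:n])

def perturb_region(pixels, region, key):
    """Invertibly scramble only the pixels inside one private region."""
    for r in range(region.y, region.y + region.h):
        row = pixels[r]
        ks = keystream(key + r.to_bytes(4, "big"), region.w)
        for i in range(region.w):
            row[region.x + i] ^= ks[i]

# The same call with the same key undoes the perturbation (XOR is an
# involution), so a receiver holding the key for region i recovers
# exactly that region; everyone else sees it scrambled.
img = [[(r * 16 + c) % 256 for c in range(16)] for r in range(16)]
original = [row[:] for row in img]
face = Region(4, 4, 8, 8)
perturb_region(img, face, b"alice-to-bob-key")
assert img != original                    # region is scrambled
perturb_region(img, face, b"alice-to-bob-key")
assert img == original                    # recovered with the key
```

Pixels outside the region are never touched, which mirrors how the public region stays usable on the PSP while each private region is gated by its own key.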

An Illustration Going back to the story about Alice and Bob: Alice can easily enforce her preference by encrypting her face region before posting the photo to Facebook and distributing the key only to her friends, including Bob. At the receiver side, only her friends can use the shared key to decrypt and

1Please refer to Section II-B for more details.
2Please refer to Section III for more details.

(a) Top-10 search results using Fig. 1(a).

(b) Top-10 search results using Fig. 1(b).

Fig. 2. Top-10 search results in Google Search Engine using Fig. 1.

see the corresponding encrypted regions. In this way, the user-desired image privacy level is achieved, which prevents the unintentional privacy leak3. An illustration of this example is shown in Fig. 3.

To summarize, this paper makes the following contributions:

• We propose a partial image sharing technique, PuPPIeS, to protect image privacy in photo sharing. To our knowledge, this is the first image privacy protection work that supports all three essential features: partial image sharing, popular image transformations, and personalized privacy control and enforcement on photos.

• To address the challenge of maximal usability (C1), PuPPIeS supports partial image sharing by segmenting images into sensitive and insensitive regions using region of interest (ROI) detection.

• To address the challenge of supporting image transformations (C2), PuPPIeS leverages image perturbation and proposes several DCT-coefficient-preserving methods to perturb images, which are transparent to image transformation techniques and work with existing image processing libraries without any extra changes.

• To address the challenge of personalized privacy flexibility (C3), PuPPIeS allows users to customize their privacy-sensitive regions, and therefore supports perturbing any specific private region according to the privacy sharing settings.

• We evaluate PuPPIeS on real-world, large-scale popular image datasets totaling 19,000 images. On the four datasets, PuPPIeS provides satisfactory protection of users’ privacy with a reasonable overhead. Moreover, our method is very robust to different types of privacy attacks, from brute-force attacks to sophisticated inference attacks such as face detection on the PSP side.

II. BACKGROUND & RELATED WORK

In this section, we first introduce the JPEG standard and image transformation techniques, and then review existing efforts on privacy-preserving image protection and sharing.

3Notice that we do not claim the proposed scheme can completely prevent intentionally harmful behavior by Bob, if any, that leads to Alice’s privacy leak. To the best of our knowledge, no solution really solves this scenario, because Bob, if he wants, can re-share a decrypted image instead of directly sharing the one Alice encrypted and posted. This is similar to a secret password being revealed through, say, collusion. This paper targets image privacy attacks on the PSP side and solves the problem from the image content perturbation perspective, which is the focus of this paper. A standard cryptographic method is used to distribute the keys.



(a) An original image. What Mr. Einstein’s and Mr. Chaplin’s friends see.

(b) What is actually stored and shared on a PSP (in the public cloud).

(c) What only Mr. Einstein’s friends see.

(d) What only Mr. Chaplin’s friends see.

Fig. 3. A motivating example of the privacy-preserving partial image sharing scheme.

A. JPEG image format

An RGB (Red, Green and Blue) image is generally described by pixels; each pixel is denoted by an 8-bit value (in the range [0...255]) in each of three channels, where each value characterizes one particular color. Transforming an RGB image to the JPEG [16] format usually takes four steps.

Step 1. Transform the RGB image to its YUV representation. While the RGB space represents an image by color values, the YUV representation maps an image to three color layers: Y, U and V4.

Step 2. DCT transformation. Each YUV layer is divided into 8x8 pixel blocks, to each of which the DCT is applied. The DCT is a frequency-domain linear transformation and is invertible. The output for each block is an 8x8 raw DCT coefficient matrix. In each such matrix, the first entry, i.e., the (scaled) mean value of all pixels, is called the Direct Current (DC) component, and the rest are called Alternating Current (AC) components.

Step 3. Quantization. Raw DCT coefficients are floating-point numbers; in this step each coefficient is divided by its entry in a quantization table and rounded to the nearest integer. Note that the quantization step size is not the same for all DCT coefficients. For most natural images, lower-frequency coefficients carry more visual information than higher-frequency ones [16], [17]. Therefore, to achieve a better compression ratio, the JPEG algorithm tends to use larger quantization step sizes for higher-frequency coefficients.

Step 4. Entropy coding. In this step, DC and AC coefficients are separately encoded using differential value encoding and run-length encoding (RLE), respectively. These initial encoded outputs are further compressed using the Huffman coding algorithm. All visual information is then contained in the DCT coefficients, the quantization tables, and a Huffman coding table.
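Steps 2–3 can be sketched for a single 8x8 block in plain Python. The quantization table below is a toy stand-in that merely mimics JPEG’s coarser steps at higher frequencies; it is not the standard luminance table:

```python
import math

N = 8

def dct2_block(block):
    """2-D DCT-II (orthonormal) of an 8x8 pixel block, level-shifted by -128 (Step 2)."""
    def c(k):  # orthonormal scaling factor
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += ((block[x][y] - 128)
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

def quantize(coeffs, qtable):
    """Step 3: divide each coefficient by its step size and round."""
    return [[round(coeffs[u][v] / qtable[u][v]) for v in range(N)]
            for u in range(N)]

# Toy quantization table: larger step sizes for higher frequencies,
# mimicking JPEG's coarser quantization of high-frequency ACs.
qtable = [[8 + 4 * (u + v) for v in range(N)] for u in range(N)]

flat = [[128] * N for _ in range(N)]     # a uniform gray block
coeffs = dct2_block(flat)
quant = quantize(coeffs, qtable)         # all zeros for a uniform block
```

Note how a uniform block quantizes to all zeros: the DC entry captures the (level-shifted) block mean and every AC entry vanishes, which is why flat regions of natural images compress so well.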

B. Image transformation

Image transformations are widely used to improve image quality or to save storage. Popular transformations applied by PSPs include scaling, cropping, compression, and other linear or revertible transformations. Scaling changes the total number of pixels of an image by resizing it, and includes downscaling and upscaling. Cropping cuts away the areas of an image outside a selected rectangular region. Compression decreases the image size (in bytes) without changing the pixel dimensions. Other linear or revertible transformations include rotation, filtering [18], overlapping, etc.
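For concreteness, these transformations can be illustrated on a bare 2-D pixel array; the minimal implementations below (rectangular crop, nearest-neighbor 2x downscaling, 180-degree rotation) are illustrative stand-ins, not what any particular PSP runs:

```python
def crop(pixels, x, y, w, h):
    """Keep only the w x h rectangle whose top-left corner is (x, y)."""
    return [row[x:x + w] for row in pixels[y:y + h]]

def downscale2x(pixels):
    """Nearest-neighbor 2x downscaling: keep every second row and column."""
    return [row[::2] for row in pixels[::2]]

def rotate180(pixels):
    """180-degree rotation: reverse the row order, then each row."""
    return [row[::-1] for row in pixels[::-1]]

img = [[r * 10 + c for c in range(6)] for r in range(4)]
assert crop(img, 2, 1, 3, 2) == [[12, 13, 14], [22, 23, 24]]
assert downscale2x(img) == [[0, 2, 4], [20, 22, 24]]
assert rotate180(rotate180(img)) == img   # rotation is revertible, as noted above
```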

4A monochromatic image only has the Y layer. Nevertheless, this does not affect the algorithms in this paper because each layer is processed independently.

C. Related work

C.1 Securing data on PSPs. Plenty of existing works discuss how to secure data on PSPs, e.g., [19], [20], [21], [22], [23], [24], [25]. Bellare et al. [26] and Senftleben et al. [27] proposed approaches to encrypt data while preserving its format. However, these techniques do not support image transformations because they are agnostic of the image domain. Nilizadeh et al. [28] proposed a decentralized architecture for OSNs. Anderson et al. [29] proposed a privacy-enabling OSN architecture over traditional OSNs. Egele et al. [30] proposed COMPA to detect compromised accounts in OSNs. These schemes, however, are orthogonal to our work in this paper.

C.2 Image and video privacy work summary. Many image content protection methods have been proposed for image privacy protection, e.g., [14], [8], [9], [10], [11], [12], [15], [13]. According to the type of signal encrypted in the images, they can be divided into several categories: (1) encrypting the Huffman coding table [8]; (2) encrypting the quantization table [9]; (3) encrypting the DCT dictionary [10]; (4) encrypting the coefficients [11], [12], [15], [13]. Most of them (e.g., [14], [15], [13]), however, are incompatible with popular image transformations such as scaling, since they were not designed specifically for photo sharing via PSPs. Furthermore, few methods support a partial image sharing scheme. Compared to these works, PuPPIeS is the first that supports, with low overhead, all of the important and indispensable features: partial image sharing, popular image transformations, and personalized privacy flexibility. A thorough comparison of PuPPIeS to existing techniques is given in Table I.

C.3 Method review. To see how existing techniques fall short of solving the image privacy problem, let us briefly review each method. Wu et al. [8] proposed Multiple Huffman coding Tables (MHT) to encrypt images and preserve image privacy. In MHT, the secret tables cannot be shared with any third party. Therefore, PSPs are unable to parse the image data appropriately, since they have no information about the coding table actually in use, and hence cannot support transformations such as compression. Like MHT, the quantization table encryption method proposed by Chang et al. [9] supports neither image compression nor scaling. The secret DCT transformation dictionary encryption method [10] does not support scaling well, because the representative pixels can be linear combinations of encrypted and non-encrypted pixels, which are unknown to PSPs even though this information is critical for performing scaling.

The permuting of DCT coefficients within each image



TABLE I. COMPARISON OF DIFFERENT IMAGE PRIVACY PROTECTION METHODS.

Method | Encrypted Signals | Partial Image Sharing | Scaling | Cropping | Compression | Rotation
Cryptagram [14] | File bit stream | ✓ | × | × | × | ×
Multiple Huffman coding Tables (MHT) [8] | Huffman coding tables | × | × | ✓ | × | ×
Chang et al. [9] | Quantization table | × | × | ✓ | × | ✓
Aharon et al. [10] | DCT transformation dictionary | × | × | ✓ | ✓ | ✓
Unterweger et al. [11] | Coefficients | × | × | ✓ | ✓ | ✓
Dufaux et al. [12] | Coefficients | × | × | ✓ | ✓ | ✓
Steganography [15] | Coefficients | ✓ | × | × | × | ✓
P3 [13] | Coefficients | × | × | ✓ | ✓ | ✓
PuPPIeS (ours) | Coefficients | ✓ | ✓ | ✓ | ✓ | ✓

(a) The original scaled image. (b) The recovered image of (a) in P3. (c) The recovered image of (a) in PuPPIeS.

Fig. 4. Compared to the scaled original image (a), many fine details are lost in the recovered image (b) using the P3 method, while the recovered image (c), using our method PuPPIeS, is exactly the same as the original image (a) at the receiver side.

block scheme [11] does not support scaling in the pixel domain, because the permutation applied in the DCT domain changes the original pixels in an unpredictable way, and this unpredictability prevents PSPs from performing a visually meaningful scaling transformation. For the same reason, flipping the signs or bits of coefficients [12] does not support scaling. The scheme adopted in Cryptagram [14], which encrypts images at the bit level and stores the encrypted bits in pixel blocks, cannot support popular transformations because the pixel data is parsed incorrectly on the PSP side.

C.4 Limitations of P3. The scheme most closely related to ours is P3 [13], which splits an image into two images (a public image and a private image) based on a pre-defined threshold: the public image has all DC coefficients removed and the AC coefficients clipped at the threshold, while the private image stores the DC coefficients and the compensations to the AC coefficients w.r.t. the threshold. However, P3 has inherent limitations. (i) P3 works at the whole-image level only and does not differentiate between regions of a photo. (ii) P3 is not designed to support the image transformations widely used in image processing libraries. (iii) P3’s privacy-preserving capability and cloud storage utilization could be further improved; for example, in the P3 scheme, many fine details of an image are lost when it is recovered at the client side.
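P3’s split, as described above, can be sketched on the quantized coefficients of one block (index 0 being the DC); the flat coefficient layout and the threshold value are illustrative simplifications of the scheme, not P3’s actual implementation:

```python
def p3_split(coeffs, threshold):
    """Split one block's coefficients into a public part and a private part.
    coeffs[0] is the DC coefficient; the rest are ACs. Per the description
    above: the DC and the portion of each AC exceeding the threshold go to
    the private image; the clipped ACs stay in the public image."""
    public, private = [], []
    for i, c in enumerate(coeffs):
        if i == 0:                      # DC is removed from the public image
            public.append(0)
            private.append(c)
        elif abs(c) > threshold:        # store the compensation privately
            clipped = threshold if c > 0 else -threshold
            public.append(clipped)
            private.append(c - clipped)
        else:
            public.append(c)
            private.append(0)
    return public, private

def p3_merge(public, private):
    """Receiver side: add the private compensations back in."""
    return [p + q for p, q in zip(public, private)]

block = [512, -30, 14, 3, -3, 21, 0, 1]
pub, priv = p3_split(block, threshold=15)
assert p3_merge(pub, priv) == block      # lossless when both parts are kept
assert pub[0] == 0 and max(abs(c) for c in pub) <= 15
```

Keeping both parts makes the merge lossless in this toy version; the public part alone carries only small-magnitude AC coefficients, which is why it reveals little but also looks poor on its own.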

Therefore, to address these limitations, a more robust image privacy sharing scheme is desirable. This motivates the design and implementation of PuPPIeS. A thorough study of the advantages of PuPPIeS over P3 is given in Section V. Fig. 4 demonstrates an example of image recovery results using PuPPIeS and P3. Clearly, many fine details are lost in the recovered image Fig. 4(b) using the P3 method, while the recovered image Fig. 4(c) using PuPPIeS is exactly the same as the original image Fig. 4(a) at the receiver side.

III. PuPPIeS DESIGN

A. Threat Model and Assumptions

Threat Model In this paper, we consider privacy threats to images’ private regions (such as faces and SSNs) on the PSP side, i.e., unauthorized access to photos at PSPs. We assume PSPs are not as trustworthy as some have claimed, i.e., they are semi-honest. On one hand, a PSP itself may apply computer vision and pattern recognition techniques to user photos to provide certain functionality (e.g., the face tagging service provided by Facebook [6]). On the other hand, PSPs store massive amounts of user data, making them ideal attack targets [31], [4], [5]. Obviously, it is desirable to prevent the private regions of images from being accessed by unauthorized users on the PSP side.

Assumptions We assume one’s friends are trustworthy and their devices are not compromised. Friends’ misbehavior may still raise privacy issues for users, as described in the Alice-Bob example above; we are more interested in unintentional image privacy leakage on OSNs. Intentional image privacy leakage5, such as taking a screenshot of a private photo or recording it with a camera, is out of the scope of this paper. We assume the key distribution and management process is secured using standard cryptographic methods.

B. Design Goals and Principles

As illustrated in Section I, the developed system is expected to (1) protect image privacy effectively; (2) support popular image transformations; (3) maximize cloud storage usage; and (4) incur minimal overhead in CPU, storage and network bandwidth. The developed system should also overcome the limitations of P3. These design goals are consistent with the challenges (C1–C3) we need to address. Our system design is guided by the following key observations (O1–O3).

5To defend against such attacks, what is needed is social engineering work rather than pure technology work.



• O1: Private content typically occupies only small areas of an image. Examples include human faces in a landscape photo, private text (e.g., an SSN or a password) in an indoor picture, or sensitive objects (valuables, a license plate, a home address) in a street snapshot. Excluding these sensitive regions, the other regions of an image are safe to store in the cloud and expose to the public. They may even be used by the PSPs to support fair use of the partial image. This motivates us to adopt a partial image encryption and sharing technique to maximize the usability of images (challenge C1).

• O2: Image perturbations have been widely used in ways that remain compatible with popular image transformations and image processing libraries. This gives us the opportunity to leverage image perturbation and propose novel and efficient methods, such as manipulating DCT coefficients, that support different image transformations (challenge C2).

• O3: An image owner may have different sharing preferences when she shares an image with different groups of users. Most likely, however, such differences in sharing policy only need to be reflected in the small sensitive areas discussed above; the majority of an image can be kept unmodified even under different sharing policies. In the story about Alice and Bob, the photo owner, Alice, may want to hide her face from non-friends but share the other areas of the image with everyone. These observations motivate us to implement a system that offers enough flexibility to achieve users’ personalized privacy settings (challenge C3).

[Fig. 5 diagram components: Sender (Image Owner) with RoI Detection and Recommendation and Image Perturbation; a Private Matrix Sharing Channel carrying private matrices and public parameters; an Image Sharing Channel to the PSP (which may perform image transformations); and Receivers performing Image Reconstruction.]

Fig. 5. System architecture of PuPPIeS (sender-side, receiver-side and PSP).

C. PuPPIeS Overview

Fig. 5 illustrates the overall design of PuPPIeS. Three parties are involved in PuPPIeS: an image sender, one or more image receivers6, and a PSP. The workflow of PuPPIeS is as follows.

6Note that, in OSNs, a receiver can be a group of users sharing the same privacy privilege.

[Fig. 6 diagram: an original image passes through ROI detection; each ROI region 1…n is perturbed via DCT coefficient perturbation with private matrix 1…n, yielding an encrypted image; the insensitive regions are unchanged.]

Fig. 6. Image perturbation process on the original image at sender-side.

[Fig. 7 diagram: an encrypted image is split by ROI decomposition using public data; each ROI region 1…n is recovered via DCT coefficient reconstruction with private matrix 1…n, yielding the original image; the original insensitive regions are unchanged.]

Fig. 7. Image reconstruction process on the perturbed (untransformed) image at receiver-side. The image is not transformed at PSP side (scenario 1).

[Fig. 8 diagram: an encrypted & transformed image is split by ROI decomposition using public data; each transformed ROI region 1…n is reconstructed using shadow ROI 1…n, yielding a transformed image; the transformed insensitive regions are unchanged.]

Fig. 8. Image reconstruction process on the perturbed and transformed image at receiver-side. The image is transformed at PSP side (scenario 2).

1. Sender-side The sender side consists of two key operations: (1) region of interest (ROI) detection and recommendation; and (2) image perturbation on the ROIs. Before a sender uploads an image, the object detection and recommendation engine (Section IV-A) is automatically triggered to discover the possible privacy-sensitive regions of interest (ROIs) in the image. The sender can accept the recommended ROIs or instead customize the privacy-sensitive regions manually, and then choose a receiver to share with. She can also choose the privacy level of the image by adjusting parameters (e.g., mR and K in Algorithm 3). A default privacy setting (Table IV) is recommended for general users.

Note that an image may have more than one ROI. Given these ROIs, the image perturbation module is triggered, where each ROI can be perturbed using a different private matrix and shared with different receivers. Since DCT coefficient blocks are an essential characteristic of images, we apply the image perturbation technique to the coefficient blocks in the ROIs only. Moreover, image perturbation is adopted for its low computational complexity and its flexibility in supporting image transformations on the perturbed ROIs.

After the above two steps, the original image is encrypted in the ROI regions and remains intact in the insensitive regions, as shown in Fig. 1(b). Fig. 6 illustrates the flow of the image encryption process on the sender side.

2. Receiver-side After downloading the perturbed image, the receiver retrieves the public data about this image and then recovers the perturbed ROIs. The public data includes mR, K, the position and size of each ROI, ZInd (the indices of the new zero set)

[Figure 9 omitted: the shadow ROI generator takes private matrix i and public data as input and outputs shadow ROI i.]

Fig. 9. Shadow ROI generation process, given private matrix P and public data including the transformation type on PSPs.



(a) An original image with marked ROI at the sender side. (b) The perturbed image using PuPPIeS-C at the sender side. (c) The 180°-rotated perturbed image on PSPs. (d) The reconstructed rotated and perturbed image at the receiver side.

Fig. 10. The flow of the image perturbation and reconstruction process. The image is perturbed at the sender side, transformed at the PSP side, and reconstructed at the receiver side. In this example, the transformation operation is rotation at the PSP side. The image is from the PASCAL dataset.

in Algorithms 2 and 3, the ID of the private matrix Pi used to encrypt region i, and the transformation type at the PSP side. All of this public data can be accessed by anyone. A receiver first conducts ROI decomposition by retrieving the sizes and locations of the ROIs. After recovering the ROI regions, the receiver reconstructs the original (transformed) image using the shared private matrix Pi obtained over the secure channel between the image owner and the receiver. Depending on whether the perturbed image is transformed at the PSP side, there are two scenarios, shown in Figs. 7 and 8 respectively.

Scenario 1: no image transformation at PSP-side

For each perturbed ROI, once the receiver obtains the corresponding private matrix P′ and the public data, such as the position and size of the ROI and ZInd (the indices of the new zero set) in Algorithm 2, the original image can be easily recovered, since all of this information is either public or shared with the receiver. Fig. 7 illustrates this process.

Furthermore, Lemma III.1 guarantees the exact recovery of the perturbed image at the receiver side. Let ei be the i-th coefficient in the 64-dimensional vector e corresponding to a block in an ROI of the perturbed image, pi the i-th entry in the vectorized private matrix P′ after being normalized by mR in PuPPIeS (Algorithm 2), and bi the i-th entry in the 64-dimensional reconstructed coefficient vector. Exact recovery means that bi can be accurately computed given ei and pi.

Lemma III.1. The reconstructed DCT coefficient bi is given by

    bi = ((ei − pi + 1024) mod 2048) − 1024.

Please refer to the Appendix for detailed proof.
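The lemma can also be checked numerically. In the sketch below, the forward `perturb` step is our assumption of how a perturbed coefficient is normalized back into the JPEG range [−1024, 1023]; the `recover` step is exactly the formula of Lemma III.1 (function names are ours):

```python
def perturb(b, p):
    # assumed forward step: add the private entry, wrap back into [-1024, 1023]
    return ((b + p + 1024) % 2048) - 1024

def recover(e, p):
    # Lemma III.1: b = ((e - p + 1024) mod 2048) - 1024
    return ((e - p + 1024) % 2048) - 1024
```

Since b + 1024 always lies in [0, 2047], the modular subtraction undoes the wrap-around exactly, for every b and p in the JPEG coefficient range.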

Scenario 2: image transformation at PSP-side

In this scenario, PSPs conduct image transformations on the perturbed images. The process is similar to the reconstruction of perturbed images, except that the reconstruction must be conducted on images that are both perturbed and transformed. The key idea is that the perturbed and transformed ROIs can still be recovered using the private matrix (Algorithm 2), even after being transformed. The only difference is that a different private matrix P′′, which we call the “shadow ROI matrix”, is used. The shadow ROI matrix is generated from its corresponding private matrix P′ and the public data, including the transformation type used at the PSP side. Then, following the same process as in Scenario 1,

we can recover the perturbed and transformed image, as shown in Fig. 8. More details are presented in Section IV-C.

Note that the receiver may not be able to recover all the perturbed ROIs, because these ROIs may be perturbed using different private matrices and the receiver may obtain only some of these matrices from the sender, according to the sender's privacy and sharing preferences.

3. PSP We assume PSPs (such as Flickr and Facebook) can perform any image transformation operations (such as cropping, rotation and filtering). PSPs also store the perturbed images and the public parameters of the images. All of these operations can be done via general file storage and retrieval APIs.

4. Communications between sender and receiver The sender only needs to share the private matrix P with the receiver via a secure channel7. Note that this channel can be any logically secure communication channel, e.g., an insecure channel combined with a key exchange algorithm [32].
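The private matrix itself need not be transmitted in full: for illustration, both sides could derive it from a short secret agreed over the secure channel. The helper below is hypothetical (not part of PuPPIeS); it assumes entries of P are drawn uniformly from the JPEG coefficient range.

```python
import hashlib

import numpy as np


def derive_private_matrix(shared_secret: bytes, region_id: int) -> np.ndarray:
    # Hypothetical helper: expand a shared secret into a reproducible
    # 8x8 private matrix with entries in [-1024, 1023].
    seed = hashlib.sha256(shared_secret + region_id.to_bytes(4, "big")).digest()
    rng = np.random.default_rng(int.from_bytes(seed, "big"))
    return rng.integers(-1024, 1024, size=(8, 8))
```

Sender and receiver calling this with the same secret and region ID obtain identical matrices, so only the short secret needs to be exchanged once, regardless of how many ROIs are perturbed.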

5. Summary PuPPIeS includes three key components: (i) ROI detection and recommendation at the sender side and ROI decomposition at the receiver side (Section IV-A); (ii) image perturbation (i.e., DCT coefficient perturbation) at the sender side and perturbed image recovery (i.e., DCT coefficient reconstruction) at the receiver side (Section IV-B); (iii) perturbed and transformed image reconstruction at the receiver side (Section IV-C). In the next section, we elaborate on the details of each component.

IV. PuPPIeS ALGORITHM DETAILS

A. ROI detection and recommendation

Although privacy is subjective, ROIs in images still share common characteristics – they are usually human faces, sensitive text (SSN/phone/credit card numbers) or particular objects (e.g., buildings, cars, animals). To automatically detect ROIs in an image, we built our detection module on the techniques of Face Detection [33], Optical Character Recognition (OCR) [34] and General Object Detection [35], which are widely used in the pattern recognition and computer vision communities. Given an image, the ROI detection module is automatically triggered to detect all human faces, text blocks and the top-N general objects in the image.

The key observation is that there are overlapping regions among the regions detected by the face, OCR and object

7In this paper, we do not assume the image itself can be shared via secure channels, because the overhead would generally be high given the large size of the image.



[Figure 11 omitted: plot of the size of the private part (bytes) versus the number of private matrices (2–32), with curves for PuPPIeS, P3-PASCAL and P3-INRIA.]

Fig. 11. The size of private parts in P3 and PuPPIeS. Fig. 12. Detected ROIs. All 3 images are from the PASCAL dataset.

detection, due to the fact that each detection technique (face, text or general object) runs on the image individually. To overcome this limitation, our system implements a method that splits the overall detected regions into disjoint regions by merging all regions from the different object detection results. The resulting disjoint rectangular regions are recommended to image owners as potential ROIs. Illustrative examples of split results are shown in Fig. 12. This splitting has the advantage that image owners can easily secure disjoint regions using different private matrices, and it is also easy for owners to combine multiple disjoint regions into one new region for integral treatment such as encryption.
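One simple way to obtain pairwise-disjoint candidate ROIs from overlapping detector outputs is sketched below: it repeatedly merges any two overlapping rectangles into their bounding box. This is a cruder variant of the splitting described above (the paper's exact splitting procedure is not specified here), and the function names are ours.

```python
def overlaps(a, b):
    # rectangles as (x, y, width, height); True if interiors intersect
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def merge_to_disjoint(rects):
    # Repeatedly merge any two overlapping rectangles into their bounding
    # box until all candidate ROIs are pairwise disjoint.
    rects = list(rects)
    changed = True
    while changed:
        changed = False
        for i in range(len(rects)):
            for j in range(i + 1, len(rects)):
                if overlaps(rects[i], rects[j]):
                    ax, ay, aw, ah = rects[i]
                    bx, by, bw, bh = rects[j]
                    x, y = min(ax, bx), min(ay, by)
                    w = max(ax + aw, bx + bw) - x
                    h = max(ay + ah, by + bh) - y
                    rects[i] = (x, y, w, h)
                    del rects[j]
                    changed = True
                    break
            if changed:
                break
    return rects
```

Each returned rectangle can then be assigned its own private matrix, matching the per-region policy described above.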

In addition to accepting, denying or modifying the recommended ROIs, this module also allows image owners to manually mark other ROIs that are not automatically recognized. The chosen ROIs are then passed to the image perturbation algorithm for the subsequent perturbation operations. In practice, this module can log different image owners' choices and preferences, which makes it possible to train an automated detection and recommendation classifier that captures users' privacy preferences and thereby offers personalized privacy settings for the photo sharing service. We skip further discussion of personalized privacy-preserving photo sharing operations due to space limits.

B. Image Perturbation and Reconstruction on DCT Coefficients

We now show how to encrypt an image using DCT coefficient perturbation at the sender side. As shown in Lemma III.1 and Section III-C, the reconstruction can then be done based on the encryption scheme and its parameters, such as the private matrix P, mR, K and Q.

B.1 Why image perturbation for photo privacy protection?

We are aware that blurring [36] and masking are the most popular approaches widely used to modify image coefficients and hide selected areas of an image (such as faces and people). However, neither blurring nor masking is invertible, so they permanently damage the chosen areas of an image. From this perspective, blurring and masking cannot support our photo sharing scenario. Instead, we leverage the image perturbation technique and design encryption methods based on DCT coefficients to achieve photo privacy preservation.

B.2 How to secure ROIs?

Let R be a specific ROI region detected by the ROI module. To secure the ROI, the key idea of our method is to perturb region R using a private matrix P. The size of the private matrix P is determined by the size of each block in region R. In the rest of the paper, we assume each region is divided into fixed 8x8 blocks; correspondingly, the size of matrix P is 8x8. For computational convenience, we vectorize matrix P into a private vector P′, a 64-dimensional vector that preserves the values of P. The question, then, is how to use the private vector P′ to perturb region R. We present four methods as follows.

1. PuPPIeS-N: equally treating all coefficients

An intuitive approach is to treat each DCT coefficient block Bk (1 ≤ k ≤ K, where K is the number of blocks) in region R independently and add the corresponding element of matrix P to each of these blocks.

Mathematically, for a given block Bk in region R, its DCT coefficients are encoded as a 64-dimensional vector. Let Bk = {bki, 0 ≤ i ≤ 63} be the DCT coefficient vector corresponding to the k-th block in region R, where bki is the i-th entry of the k-th block Bk. Let P′ = {p′i, 0 ≤ i ≤ 63} be the private vector, whose i-th entry is p′i. Then, for the k-th block of image region R, its DCT coefficients can be encrypted into a 64-dimensional vector Ek = {eki, 0 ≤ i ≤ 63}, whose elements are given by eki = bki + p′i.

However, this approach has a major drawback. The DC coefficients bk0 of all blocks k, which contain the most important visual information of an image – as shown in Fig. 13 and Fig. 14 – are all secured by the same single number p′0. Because p′0 is a random number in a very small range, i.e., −1024 ≤ p′0 ≤ 1023 as specified in the JPEG standard, an adversary could easily retrieve the DC components of the image and recover the bk0 values using a brute-force search.

2. PuPPIeS-B: treating DC and AC differently

Given the above observation, we propose a more robust scheme that treats the DC and AC components differently, so that the DC coefficients are secured by different random numbers. More specifically, we perturb the entries of Ek as follows:

For DC components: ek0 = bk0 + p′ℓ, where ℓ = (k mod 64),
For AC components: eki = bki + p′i, for 1 ≤ i ≤ 63,    (1)

where mod is the modulo operation. In the JPEG standard, all DC and AC coefficients lie in the range −1024 ≤ bki ≤ 1023; therefore, the result obtained from Eq. (1) should be normalized back into this range.
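A per-block sketch of Eq. (1), including the normalization step. The wrap-around into [−1024, 1023] is our assumption of how the normalization is realized; it is the variant that makes Lemma III.1-style recovery exact.

```python
import numpy as np


def perturb_block_B(B, P, k):
    # B: 64 DCT coefficients of the k-th block (zig-zag order)
    # P: the 64-entry vectorized private matrix P'
    E = B.copy()
    E[0] = B[0] + P[k % 64]   # DC: rotate through all 64 private entries
    E[1:] = B[1:] + P[1:]     # AC: element-wise perturbation
    return ((E + 1024) % 2048) - 1024   # normalize into [-1024, 1023]
```

Because the DC entry of block k is offset by P[k mod 64] rather than a single shared value, recovering the DC plane now requires guessing all 64 private entries instead of one.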

Compared to the intuitive method, this simple yet efficient improvement of perturbing the DC coefficients with the whole private



(a) An image with only the DC component preserved. (b) An image with only the AC component preserved.

Fig. 13. Separation of DC and AC components in Fig. 1(a).

(a) Original image. (b) Preserve DC component only. (c) Preserve AC component only.

Fig. 14. Separating DC and AC components in Fig. 14(a).

vector P′, instead of the same single value, results in a more robust scheme, which we denote as our base algorithm (PuPPIeS-B) for image perturbation.

For demonstration purposes, we apply PuPPIeS-B to Fig. 15(a), which contains two ROIs; the resulting image is shown in Fig. 15(b). Clearly, it is hard to tell the content of the original ROIs from the perturbed image.

To understand the extra overhead introduced by PuPPIeS-B, we apply it to the PASCAL Visual Object Classes Challenge 2007 [38] dataset8. In this experiment, we perturb the whole image to simulate the worst-case overhead. Table II shows the statistics of the normalized file sizes in the dataset after perturbation. As we can see, PuPPIeS-B increases the image size by about 10 times, which is unacceptable and can put great storage and processing pressure on PSPs. Thus, an approach that reduces the size of perturbed images is highly desirable. For a fair comparison, the image sizes reported hereafter are relative to the original image size, i.e., 1.46 means a 46% overhead.

3. PuPPIeS-C: Reducing perturbed image size

The reason behind the bloated image size is simple. The default Huffman coding tables are optimized for each original

8Please refer to Section V for more details.

TABLE II. NORMALIZED SIZE OF PERTURBED IMAGES IN THE PASCAL DATASET (UPPER BOUND NUMBERS OBTAINED BY PERTURBING WHOLE IMAGES).

scheme              | mean  | median | std   | min  | max
PuPPIeS-Base        | 10.45 | 9.69   | 3.88  | 5.01 | 85.8
PuPPIeS-Compression | 1.46  | 1.41   | 0.230 | 1.15 | 6.26
PuPPIeS-Zero        | 1.23  | 1.22   | 0.064 | 1.10 | 1.80

image according to the distribution of its DCT coefficients: the more frequently a DCT coefficient value occurs, the shorter its Huffman codeword.

The default Huffman coding tables, however, become suboptimal, because PuPPIeS-B adds random numbers to the coefficients, which changes the frequency distribution and thus leads to extra overhead. For example, it is possible that after perturbation the longest codeword, instead of the shortest, is used to represent the most frequent coefficient, which completely defeats the Huffman coding. To address this limitation, we propose to rebuild the Huffman coding tables based on the distribution of the DCT coefficients after perturbation.

Theoretically9, the efficiency of Huffman coding is negatively correlated with the value range of the random numbers in the private matrix P. Therefore, a possible optimization when generating the entries of P is to use narrower ranges for the random numbers that perturb higher-frequency coefficients, and wider ranges for the random numbers that perturb lower-frequency coefficients. In other words, lower-frequency coefficients get stronger protection through larger-range randomness, since the visual information of most natural images is concentrated in the lower-frequency coefficients [16], [17].

Following the above principles, we revise PuPPIeS-B into PuPPIeS-C. In PuPPIeS-C, another 8x8 matrix Q, which we call the private range matrix, controls the range of each entry in the private matrix P. Let Q′ be the vectorized private range matrix, i.e., a 64-dimensional vector with the same elements as Q. In the actual implementation, we use the corresponding element of the private matrix P to perturb the DC coefficient, and the corresponding elements of P, bounded by the private range matrix Q, to perturb the AC coefficients (see Lines 3 and 6 in Algorithm 1).
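The per-block step of PuPPIeS-C can be sketched as follows; the final wrap-around normalization is an assumption on our part, consistent with the JPEG range discussed for Eq. (1), and the function name is ours.

```python
import numpy as np


def perturb_block_C(B, P, Q, k):
    # B: 64 DCT coefficients of the k-th block; P: vectorized private matrix;
    # Q: vectorized private range matrix bounding the AC perturbation
    E = B.copy()
    E[0] = B[0] + P[k % 64]            # DC perturbed by the full-range entry
    E[1:] = B[1:] + (P[1:] % Q[1:])    # AC perturbed within range Q'_i
    return ((E + 1024) % 2048) - 1024  # normalize into the JPEG range
```

Note that where Q′i = 1, the AC coefficient is left unchanged, since any value mod 1 is 0; narrowing Q therefore directly narrows the perturbation of the corresponding frequency.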

As shown in Algorithm 3, we generate the private range matrix Q′ using two parameters, mR and K, where mR is the minimum range of the entries in P and K is the number of coefficients the algorithm perturbs. Both mR and K are adjustable input parameters. In Algorithm 3, higher-frequency coefficients correspond to narrower ranges of the entries in Q′. For example, if K = 1, Algorithm 1 only perturbs the DC coefficients; if K is greater than 63, the range of the higher frequencies is set to mR.
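Algorithm 3 translates directly into a few lines (a sketch; `gen_range_matrix` is our name for it):

```python
def gen_range_matrix(mR, K):
    # Ranges halve from 2048 down toward mR across frequencies;
    # entries at index >= K get range 1, i.e. are not perturbed (p mod 1 = 0).
    Q = []
    r = 2048
    for i in range(64):
        Q.append(r)
        if r > mR:
            r //= 2
        if i >= K:
            r = 1
    return Q
```

With the default medium setting (mR = 32, K = 8 from Table IV), the range starts at 2048 for the DC entry, halves per step, and flattens at 32 until the K-th index, after which the remaining entries are 1.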

We also apply PuPPIeS-C to each image in the PASCAL dataset, with the privacy level set to medium (Table IV). As shown in Table II, PuPPIeS-C reduces the overhead significantly: the median and average image sizes (relative to the original image size) decrease from 9.69 and 10.45 to 1.41 and 1.46, respectively.

Public data and parameters shared between the sender and receiver In PuPPIeS-B and PuPPIeS-C, the parameters R, mR and K are public, and are stored together with the perturbed image, e.g., in the description of the image. An image may have both perturbed and unperturbed regions. Users need to download the public parameters only if they are interested in the perturbed regions. So, if the unperturbed regions are viewed

9The proof is straightforward but verbose, and thus omitted due to space limitations.



(a) An original image. (b) PuPPIeS-B perturbation. (c) PuPPIeS-C perturbation via Algorithm 1. (d) PuPPIeS-Z perturbation via Algorithm 2.

Fig. 15. Perturbing car plates in Fig. 15(a) using different algorithms. The original photo is downloaded from [37].

Algorithm 1: PuPPIeS-C, perturb a given ROI of an image with a flexible overhead
Input: a ROI region R in the original image I, the vectorized private matrix P′, and the vectorized private range matrix Q′
Output: an image I′ perturbed on the given ROI using P′
1: k ← 0 {Bk denotes the k-th block in R}
2: for each DCT coefficient block Bk in R do
3:   Bk0 ← Bk0 + P′j, where j = (k mod 64)
4:   k ← k + 1
5:   for i ← 1 to 63 do
6:     Bki ← Bki + (P′i mod Q′i)
7: write all coefficient blocks to a new image I′
8: return I′

by many users, we can further reduce the size of the perturbed images by shifting some information into the public parameters, as shown below.

Algorithm 2: PuPPIeS-Z, selectively perturb a given ROI of an image
Input: a ROI region R, the vectorized private matrix P′, and the vectorized private range matrix Q′
Output: an image I′ perturbed on the given ROI using P′, and the positions of the new zeros (ZInd)
1: k ← 0 {Bk denotes the k-th block in region R}
2: ZInd ← ∅
3: for each DCT coefficient block Bk in R do
4:   Bk0 ← Bk0 + P′j, where j = (k mod 64)
5:   k ← k + 1
6:   for i ← 1 to 63 do
7:     if Bki ≠ 0 then
8:       Bki ← Bki + (P′i mod Q′i)
9:       if Bki = 0 then
10:        add (k, i) to ZInd
11: write all coefficient blocks to a new image I′
12: return I′, ZInd

4. PuPPIeS-Z: Shifting perturbed image size to public parameters

Generally, the more consecutive zeros among the AC coefficients of the DCT, the higher the encoding efficiency and compression ratio [16]. Thus, instead of perturbing these consecutive zero coefficients, we can simply skip them. However, perturbing a non-zero AC entry may change its value to zero by adding an unpredictable random amount, and such a zero cannot be distinguished from the skipped original zeros. This causes problems if the perturbed ROIs need to

Algorithm 3: Privacy range matrix Q′ generation
Input: (i) mR: minimum range of entries in P; (ii) K: number of coefficients to be perturbed
Output: vectorized privacy range matrix Q′
1: r ← 2048
2: for i ← 0 to 63 do
3:   Q′i ← r
4:   if r > mR then
5:     r ← r/2
6:   if i ≥ K then
7:     r ← 1
8: return Q′

be recovered for sharing purposes.

Given the above considerations, we add an additional check: if an AC entry becomes zero after perturbation, we record the position of the newly generated zero using the DCT coefficient block index and the entry index within the block. For each new zero, we need 28 bits to record its position in the image (2 bits for the layer – there are at most 3 layers in a JPEG image – 16 bits for the block index and 6 bits for the entry index). Based on these changes, we propose PuPPIeS-Z, which further reduces the size of perturbed images compared to PuPPIeS-C but introduces more public parameters to store.
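The position encoding is straightforward bit packing; the sketch below uses the field widths stated above (function names are ours):

```python
def pack_zero_pos(layer, block, entry):
    # 2 bits for the JPEG layer, 16 bits for the block index,
    # 6 bits for the entry index within the 8x8 block
    assert 0 <= layer < 4 and 0 <= block < 2**16 and 0 <= entry < 64
    return (layer << 22) | (block << 6) | entry


def unpack_zero_pos(code):
    # invert the packing: top bits are the layer, low 6 bits the entry
    return code >> 22, (code >> 6) & 0xFFFF, code & 0x3F
```

Each element of ZInd can then be stored as one small integer rather than a tuple, keeping the public-parameter overhead compact.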

In the actual implementation, we store the positions of all new zero coefficients in a set, which we call the new zero index set (denoted ZInd); see Line 10 in Algorithm 2. Leaking the new zero index set does not break users' privacy. Thus, the public parameters in PuPPIeS-Z include R, mR, K and ZInd.

When applying PuPPIeS-Z to the whole images in the PASCAL dataset with the privacy level set to medium (Table IV), the median and average image sizes after perturbation are further reduced from 1.41 and 1.46 to 1.22 and 1.23, as shown in Table II. Fig. 15(d) illustrates a perturbed version of Fig. 15(a) using PuPPIeS-Z. However, ZInd also brings an additional overhead of 12%-36% of the original image size compared to PuPPIeS-C. As discussed, this overhead is stored as public parameters. Section V-B discusses in detail the applicable scenarios of PuPPIeS-C and PuPPIeS-Z.

C. Transformed image reconstruction at receiver-side

The perturbed images must first be uploaded to PSPs before being downloaded by receivers for sharing. After uploading, image owners or PSPs may perform various transformations such as scaling, cropping and compression.



Below, we show that our algorithms support popular transformations well, i.e., the perturbed and transformed ROIs can still be recovered using the private matrix P even after being transformed10. Note that the following arguments hold for all versions of the encryption algorithm at the sender side in the PuPPIeS system, because they are designed in a similar manner (using P to perturb DCT coefficient blocks) and differ only in the choice of the variant of P.

C.1. Supporting linear transformations

In general, there are two types of linear transformations, operated either in the YUV domain or in the frequency domain [39]. YUV domain transformations, such as scaling, cropping and rotation, directly change the Y, U or V values of the involved pixels, while frequency domain transformations, such as filtering and overlapping, change the DCT coefficients of an image.

First, consider YUV domain transformations (such as scaling, cropping and rotation), which can be viewed as linear algebra operations on pixel blocks in the YUV domain. Each 8x8 DCT coefficient block in the frequency domain can be transformed back to an 8x8 pixel block in the YUV domain by a linear transformation, and vice versa. Let b be the YUV domain representation of a DCT coefficient block B in an ROI; then b = f(B), where f(·) is a linear transformation. After perturbation, we have B + P in the frequency domain, and in the YUV domain we further have f(B + P) = f(B) + f(P) = b′, where f(B) is a YUV pixel block of the original ROI and f(P) is the YUV representation of P.

Now consider the whole perturbed ROI of K blocks. It can be viewed as the original ROI plus a “shadow ROI” formed by the corresponding f(P)s of the K blocks. Performing a YUV domain transformation on a perturbed ROI is therefore equivalent to: (1) performing the transformation on the original ROI; (2) performing exactly the same transformation on the shadow ROI; and (3) adding the outputs of (1) and (2).

Therefore, given a transformed and perturbed ROI, the receiver side can easily derive the f(P)s from the private matrices, perform the same transformation on the resulting shadow ROI, and finally subtract the transformed shadow ROI from the transformed and perturbed ROI to recover the transformed original ROI. This explains why PuPPIeS can recover transformed images using existing image processing libraries, without requiring any changes to the existing transformation implementations. Fig. 16 demonstrates an example of supporting scaling. Support for frequency domain linear transformations (such as filtering and overlapping) can be explained in the same way by simply replacing f(B), f(P) and f(B + P) with B, P and B + P, respectively.
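The three-step equivalence can be demonstrated with a toy example in the pixel domain, using a 180° rotation as the linear transformation (the numbers are illustrative; linearity is what makes the transformation commute with the addition of the shadow):

```python
import numpy as np

rng = np.random.default_rng(0)
roi = rng.integers(0, 256, (8, 8))        # original ROI, pixel domain
shadow = rng.integers(-50, 50, (8, 8))    # f(P): pixel-domain shadow of P
perturbed = roi + shadow                  # what the PSP stores and transforms

transformed = np.rot90(perturbed, 2)      # PSP applies a 180-degree rotation
recovered = transformed - np.rot90(shadow, 2)  # subtract transformed shadow
assert (recovered == np.rot90(roi, 2)).all()   # equals the transformed original
```

The same subtraction pattern works for any linear transformation, since f(roi + shadow) = f(roi) + f(shadow).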

C.2. Supporting non-linear compression.

PuPPIeS can also support some non-linear transformations such as compression. To support the compression transformation, both the quantization tables of the original images (denoted

10In this paper, we assume the transformations that have been applied to an uploaded image are known to PuPPIeS. This is a necessary assumption for all advanced image sharing mechanisms that support image transformations, such as P3 [13].

TABLE III. DATASETS USED IN OUR EXPERIMENTS.

dataset | image number | mean size | typical resolution | experiment
Caltech | 450          | 152 KB    | 896×592            | face detection
FERET   | 11,338       | 10.4 KB   | 256×384            | face recognition
INRIA   | 1,491        | 1,842 KB  | 2448×3264          | all others
PASCAL  | 4,952        | 84 KB     | 500×330            | all others

by T) and the quantization tables of the perturbed images (denoted by T′) need to be used. Given T and T′, a receiver can determine each perturbed but not compressed DCT coefficient block, say B′. Then, similar to the linear transformation case above, one can obtain the original DCT coefficient block B by subtracting P from B′, and then compute the compressed DCT coefficient block of B using T′. This process is repeated for all B′ in a given ROI, so that the receiver finally recovers the compressed original ROI.

D. Practical extensions and others

In the description above, we assumed that the DC and AC coefficients share the same private matrix P during perturbation. As an extension, one may apply two private matrices, PDC and PAC, to perturb the DC and AC coefficients independently and enhance security; we actually use this variant in practice. Furthermore, users can even use an arbitrary number of private matrices to perturb different coefficient blocks of an ROI. In that case, the total number of bits securing the DC and AC coefficients of an ROI increases linearly with the number of private matrices used, as shown in the experiment section.

V. EXPERIMENT EVALUATION

A. Experiment Design

We mainly consider two types of metrics: overhead and privacy preservation capability. Image storage size, which affects the network resources needed to share images and the storage resources on PSPs, is used to evaluate the overhead of our algorithms.

Dataset. Four commonly used datasets serve different purposes; more details can be found in Table III. (i) The Caltech face dataset [40], which contains 450 JPEG color images of the faces of 27 people, is used for face detection experiments. (ii) The FERET dataset [41], which includes 11,338 facial images, is used for face recognition experiments. (iii) The INRIA dataset [42] contains 1,491 high-resolution JPEG color images of rivers, mountains, small towns, etc. (iv) The PASCAL Visual Object Classes Challenge 2007 dataset [38] contains 4,952 JPEG color images of various objects, people, animals, buildings, etc., with low to medium resolutions.

Implementation. The implementation of PuPPIeS is based on Libjpeg version 8d [43] and OpenCV [44]. In the current implementation, users can set the desired privacy level to low, medium or high. The security analysis of the different levels can be found in Section VI, and the mapping of privacy levels to the parameters mR and K is listed in Table IV. The exploration of finer-grained privacy levels is left to future work. The P3 algorithm [13] is also implemented for performance comparison, with a threshold of 20 for splitting public and private parts, as recommended by its authors. We implemented PuPPIeS-Z11 (Algorithm 2), PuPPIeS-C

11In the experiment result tables shown below, PuPPIeS-Compression denotes PuPPIeS-C and PuPPIeS-Zero denotes PuPPIeS-Z.



(a) An original image. (b) The perturbed image using PuPPIeS-Z via Algorithm 2. (c) The scaled perturbed image of (b). (d) The reconstructed scaled image of (c).

Fig. 16. Demonstration of PuPPIeS's capability to recover the scaled perturbed image. The original image is from the PASCAL dataset.

TABLE IV. PRIVACY LEVEL vs. PARAMETERS IN PERTURBATION ALGORITHMS.

Privacy level | mR   | K
Low           | 1    | 1
Medium        | 32   | 8
High          | 2048 | 64

[Figure 17 omitted: bar charts of normalized image size (with standard deviation error bars) versus privacy level (low, medium, high) for PuPPIeS-Compression and PuPPIeS-Zero; panel (a) PASCAL, panel (b) INRIA.]

Fig. 17. Normalized image size after perturbation using the different privacy settings in Table IV. Error bar is standard deviation.

(Algorithm 1) and P3 on the Windows platform12, running on a Samsung ATIV 9 Plus laptop with 8 GB of memory and an i7-4500U 1.8 GHz CPU.

B. Storage Overhead

B.1 Privacy Settings vs. Perturbed Image Size In PuPPIeS, users can adjust the parameters (mR and K) to trade off privacy level against perturbed image size. In Table IV, we have already quantified the privacy level via the number of secure bits for the low, medium and high privacy settings. In this section, we further quantify the perturbed image size.

12As an ongoing effort, we are implementing PuPPIeS on the Android platform, running on a Galaxy S5.

[Figure 18 omitted: plots of normalized public-part size (with standard deviation error bars) versus ROI area (20%-100%) for PuPPIeS-Compression, PuPPIeS-Zero, PuPPIeS-Zero without newZeroIndex, and P3; panel (a) PASCAL, panel (b) INRIA.]

Fig. 18. Normalized size of the public part of different methods. The solid line with triangles marks the average size of public images in P3. Error bar is standard deviation.

Fig. 17 shows the normalized image size after perturbation under different privacy settings, using the PASCAL and INRIA datasets. In this experiment, the whole images are perturbed to quantify the maximum possible overhead. As Fig. 17 indicates, the perturbed image size increases with the privacy level. When the privacy level is high, the perturbed image size increases by 5 times and 8 times for PuPPIeS-C on the PASCAL and INRIA datasets, respectively. When the privacy level is medium, the perturbed image size drops to around 1.1 to 2. The size gap between PuPPIeS-C and PuPPIeS-Z increases with the privacy level, for the following reason. The higher the privacy level, the more high-frequency coefficients are perturbed. Recall that for most natural images, the high-frequency coefficients contain many consecutive zeros. While these consecutive zeros are kept in PuPPIeS-Z, most of them are changed to non-zero numbers after perturbation by PuPPIeS-C, which degrades the efficiency of the run-length coding (Section II-A). We also notice that perturbing only the DC coefficient (corresponding to the low privacy level) incurs negligible overhead. Considering the trade-off between privacy level and perturbed image size, we set medium as the default privacy level; the remaining experiments are conducted under this default setting.

B.2 Public Part The public part in PuPPIeS includes the perturbed image and the public parameters. In PuPPIeS, not all areas of an image need to be perturbed; the bigger the total ROI area, the higher the storage overhead. Fig. 18 shows the total normalized size of the public part for PuPPIeS-C and PuPPIeS-Z under different ROI area percentages. The size of the public part increases linearly with the percentage of ROI area.

In most cases, the size of the public part in PuPPIeS-Z is higher than in PuPPIeS-C, because PuPPIeS-Z has an extra public parameter, ZInd, which introduces an extra overhead of 12% to 36% of the total size. This dictates the applicable scenarios of the two algorithms. PuPPIeS-C amortizes most of the storage overhead over the perturbed images themselves, while PuPPIeS-Z shifts some storage overhead to the public parameters. Therefore, PuPPIeS-C is applicable in general scenarios, while PuPPIeS-Z is better suited to scenarios in which the unperturbed regions of images are what the majority of users view, so that the extra ZInd only needs to be downloaded by the minority interested in the perturbed ROIs. As also shown in Fig. 18, the public part in PuPPIeS-Z becomes smaller without ZInd.

Compared to PuPPIeS, the size of the public part in P3 is much



(a) An original image. (b) The public part of PuPPIeS-Z via Algorithm 2.

(c) The public part using P3. (d) The private part using P3.

Fig. 19. Encryption results using PuPPIeS and P3. The private part of PuPPIeS -Z only includes two private matrices.

TABLE V. UPPER BOUND OF IMAGE ENCRYPTION/DECRYPTION TIME USING PuPPIeS-Z. TIMES ARE IN MILLISECONDS (MS).

Dataset   Mean   Median   Max    Min   Std
INRIA     198    156      554    7.5   99.2
PASCAL    20.3   16.0     62.0   0     8.0

less¹³, as shown in Fig. 18. The reason is quite obvious: P3 is designed to eliminate all meaningful information from the public part, while PuPPIeS is designed to partially perturb ROIs and, at the same time, maximize the use of the cloud storage service. Therefore, most regions in PuPPIeS are kept the same as those in the original image, which gives PuPPIeS a larger public part. An example illustrating this difference is shown in Fig. 19.

B.3 Private Part The private part in PuPPIeS only includes private matrices. Two private matrices, PDC and PAC (discussed in Section IV-D), can be used to perturb one or more ROIs. In other words, the size of the private part in PuPPIeS is only related to the number of private matrices associated with the perturbed image. As a comparison, the private part in P3 is a private image. Fig. 11 shows both the average size of the private parts in P3 and the size of the private parts in PuPPIeS with an increasing number of private matrices.

The size of the private parts in P3 does not vary because P3 perturbs whole images. For images with low to medium resolutions (e.g., the PASCAL dataset), the size of the private parts in PuPPIeS is smaller than that in P3 when the number of private matrices is less than 26. For images with high resolutions (e.g., the INRIA dataset), PuPPIeS reduces the size of the private parts significantly, by more than 93%, compared to P3. These numbers once again show that PuPPIeS is promising in saving local storage by shifting most of the overhead needed to achieve the desired privacy protection to the public cloud.

C. Computational overhead

PuPPIeS is lightweight: the only operation in encryption/decryption is to add/subtract private matrices. Table V shows the encryption/decryption time using PuPPIeS-Z. Even for large images, the average processing time is less than 200 ms on a laptop¹⁴. Notice that the table shows upper-bound processing times, since we set the ROI to be the whole image; in practice, the actual processing time is much less. The encryption/decryption time for PuPPIeS-C is roughly the same. We also measured the processing time of the automated ROI detection and recommendation discussed in Section IV-A. The result, including the time for splitting detected ROIs into disjoint rectangular areas, is 3.85 s on average (min = 2.67 s, max = 5.38 s, median = 3.81 s). The currently employed Object Detection technique [35] takes more than 99% of the total processing time.

¹³This size in P3 is a constant for a given dataset because P3 works on whole images.

¹⁴In our implementation on a Galaxy S5, the time to encrypt/decrypt a typical whole image (750×750) is less than 170 ms.
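To make the lightweight encryption concrete, here is a minimal numpy sketch of the add-with-wraparound operation on a single 8×8 coefficient block. The function names and RNG setup are ours, not from the PuPPIeS implementation; the wraparound follows Eq. (2) in the Appendix:

```python
import numpy as np

def perturb_block(block, p):
    """Encrypt one 8x8 DCT coefficient block by adding a private matrix,
    wrapping the result back into the JPEG coefficient range [-1024, 1023]."""
    return (block + p + 1024) % 2048 - 1024

def recover_block(enc, p):
    """Decrypt by subtracting the same private matrix with the same wraparound."""
    return (enc - p + 1024) % 2048 - 1024

rng = np.random.default_rng(0)
block = rng.integers(-1024, 1024, size=(8, 8))  # a hypothetical ROI coefficient block
p = rng.integers(0, 2048, size=(8, 8))          # private matrix after normalization
enc = perturb_block(block, p)
assert np.array_equal(recover_block(enc, p), block)  # exact round-trip
```

Because the only work per block is one addition and one modulo, the cost grows linearly with the ROI area, which is consistent with the sub-200 ms times in Table V.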

D. Advantages of our method over P3

As shown in the above experiments, PuPPIeS performs far better than P3 in terms of privacy-preserving capability and cloud storage utilization, even when applied to the whole image. For example, the private part generated by P3 is much larger than that generated by PuPPIeS, since PuPPIeS shifts more image volume (in bytes) to the public part and thus maximizes the use of cloud storage.

We now illustrate why P3 cannot support image transformations. P3 generates two parts for a given image: a public image stored by the PSP and a private image stored by a trustworthy third party. Suppose the PSP scales down the public image using, say, the widely used Libjpeg library [43]. Without modifying the transformation implementation of Libjpeg, the client can only perform exactly the same transformation (i.e., scaling) on the private image and then combine the downloaded scaled public part with the locally scaled private part. Fig. 4(b) shows the output. Ideally, the recovered image should be close to the output of applying scaling directly to the original image, Fig. 4(a). However, the recovered image in Fig. 4(b) loses many fine details of the original image, e.g., the texture on the books. This is because, in P3, the sign information of the DCT coefficients is lost after scaling the private image using standard image processing libraries. Therefore, image quality distortion can occur when combining the scaled public and private parts. In other words, P3 requires different implementations of a given transformation for the public image and the private image, while the transformation algorithm in Libjpeg only works for the public image. A possible solution is to re-implement the scaling functions of Libjpeg for the private image; this, however, is undesirable because it requires substantial effort.

In contrast, PuPPIeS does not lose any fine detail even when the public part is transformed using a standard image transformation library such as Libjpeg – we just apply the same scaling operation, without changing the implementation of the algorithm library, to a private-matrix-based "shadow ROI" (Section IV-C), and a straightforward combination of the scaled public part and the scaled "shadow ROI" recovers the exact image of Fig. 4(a).

VI. PRIVACY ANALYSIS

In this section, we conduct a privacy analysis under attacks. We consider brute force attacks and image inference attacks,



(a) SIFT feature matching using P3. One feature matched.

(b) SIFT feature matching using PuPPIeS-Z. No feature matched.

Fig. 20. SIFT feature matching results between encrypted images and original images. For each image, the left side is the original image and the right side is the encrypted image. A line connecting the left and right sides indicates a matched SIFT feature.

Fig. 21. The CDF of the ratio of detected pixels in edge detection for perturbed images encrypted by PuPPIeS-Z and the public part of P3 (x-axis: normalized number of matched pixels; y-axis: CDF; curves: P3 and PuPPIeS-Zero).

Fig. 22. The cumulative face recognition ratio for perturbed images of PuPPIeS-Z and the public part of P3 (x-axis: rank; y-axis: ratio; curves: P3 and PuPPIeS-Zero).

and evaluate the privacy disclosure of perturbed images.

A. Brute force attack

The privacy protection of the PuPPIeS algorithms depends on two private matrices, PDC and PAC. Each entry in PDC and PAC is randomly chosen in the range [−1024, 1023] and can be represented as an 11-bit number. Since each matrix covers an 8×8 region, an adversary needs to guess 11×(8×8) = 704 bits to recover the DC coefficients of a ROI region.

The number of bits needed to secure the AC coefficients, i.e., the number of bits coding the 63 higher-frequency coefficients in PAC, is controlled by the user's privacy setting, and is fixed to 1, 90, and 631 for the low, medium, and high privacy levels, respectively (computed according to Algorithm 3 and Table IV). The overall privacy protection is bounded by summing the secure bit counts of the independent random matrices PDC and PAC generated by Algorithm 3 using mR and K in Table IV, yielding 705, 794, and 1335 bits for the low, medium, and high privacy levels, respectively.

These secure bit counts are far longer than the NIST¹⁵ standard of 256 bits [45], and it is practically impossible to directly check more than 2^704 images to find the best match to the original one. Therefore, a naive brute force attack is very unlikely to reconstruct an image perturbed using either PuPPIeS-C or PuPPIeS-Z.
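The bit-count arithmetic above can be checked with a few lines of Python (the per-level AC bit counts are the values reported for Algorithm 3 and Table IV; the constant names are ours, for illustration):

```python
# Each private-matrix entry lies in [-1024, 1023], i.e., 11 bits.
ENTRY_BITS = 11

# Full 8x8 P_DC matrix: 11 x (8 x 8) = 704 bits.
dc_bits = ENTRY_BITS * 8 * 8

# Secure bits for the AC coefficients per privacy level.
ac_bits = {"low": 1, "medium": 90, "high": 631}

# Overall bound: independent P_DC and P_AC bits summed per level.
totals = {level: dc_bits + bits for level, bits in ac_bits.items()}
print(dc_bits, totals)  # 704 {'low': 705, 'medium': 794, 'high': 1335}
```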

B. Image Inference attacks

We consider image inference attacks, which utilize pattern recognition and computer vision techniques (e.g., feature learning, face detection) to analyze images with the purpose of discovering semantic information from perturbed images. In this section, we evaluate the resistance and robustness of

¹⁵National Institute of Standards and Technology.

PuPPIeS to these attacks. In particular, we consider five types of inference attacks: (1) scale-invariant feature transform (SIFT) feature attack; (2) edge detection attack; (3) face detection attack; (4) face recognition attack; (5) signal correlation attack. We analyze the countermeasures to these attacks as follows.

B.1 SIFT feature attack SIFT feature extraction is a pre-processing step in image recognition tasks such as robotic mapping and navigation, image stitching, 3D modeling, gesture recognition, video tracking, individual identification of wildlife, and match moving. We extract SIFT features from original and perturbed images in the PASCAL dataset. Specifically, we run SIFT feature extraction on original images in the PASCAL dataset, public parts encrypted by P3, and images perturbed by our proposed PuPPIeS-C and PuPPIeS-Z. The ROI is set to be the whole image to accommodate P3, which can only protect whole images. There are on average 1,500 features found in each original image. The average number of matched features is far less than 1 for both our algorithms and P3. More significantly, for more than 90% of the images, the features found in the perturbed version do not match any features found in the original version. Therefore, PuPPIeS protects image privacy as well as P3 with respect to potential attacks using SIFT features. Fig. 20 illustrates an example of SIFT feature matching results using the different encryption algorithms. This indicates that even if an adversary extracts all SIFT features in the perturbed image, she cannot recover any of the features in the original image.

B.2 Edge detection attack. Edge detection aims at detecting pixels whose values change sharply in digital images, such as along curved line segments. The well-known Canny edge detection algorithm [46] is applied in our experiment. Fig. 21 shows the cumulative distribution function (CDF) of the ratio of pixels identified by edge detection in images perturbed by PuPPIeS-Z and in the public part of P3. Clearly, PuPPIeS-Z and P3 have similar performance in our experiment, with less than 5% of pixels marked as edges. In our experiment, PuPPIeS-C has results similar to PuPPIeS-Z. This implies that, in the perturbed images, it is difficult to draw meaningful conclusions from such a small portion (<5%) of identified edges.

B.3 Face detection attack. Face detection [33], such as the Haar cascading method, is a popular technique for discovering ROIs in a photo, especially for social life photos, which usually contain human faces. The Haar classifier from the OpenCV library is applied to perturbed images of PuPPIeS and public parts of P3 to detect human faces. We run face detection on the Caltech face dataset. The total number of faces detected in the original images is 596¹⁶. The numbers of faces detected in images perturbed by PuPPIeS-C and PuPPIeS-Z are 53 and 52, respectively, which are less than the number of faces detected in the public parts of P3 – 140. So, PuPPIeS protects image privacy better than P3 with respect to face detection attacks. This demonstrates that very limited face information (less than 53/596 = 8.89%) is preserved in images perturbed by PuPPIeS.

B.4 Face recognition attack. A face recognition attack searches for and recognizes known faces in a perturbed image

¹⁶In this experiment, we count only the correctly detected faces, i.e., the ground-truth faces in the original images.



(a) An original image. (b) A perturbed image in the text area via PuPPIeS.

(c) A recovered image via the guessed private matrix. (d) A recovered image via feature correlation.

(e) A recovered image via PCA.

Fig. 23. Reconstruction of a simple perturbed image: a white background image with "Hello World!" in the foreground.

based on a pre-built database, returning the closest match or a ranked list of matches from the database. The PCA-based algorithm [47] and its implementation [48] are employed in our experiment. Running a standard face recognition algorithm on an image generates a set of ranked candidates from the face database. In the experiment, we pick the top-k (k = 1, 5, 10, etc., is a parameter) candidates and verify whether the ground-truth face is contained in the top k candidates. Then, across all perturbed images, we calculate the ratio of correctly recognized images to the total number, and denote this ratio as r. The results for top-k candidates with k ∈ [1, 50] are shown in Fig. 22, which plots the cumulative face recognition ratio of perturbed images encrypted by the different algorithms. PuPPIeS performs much better than P3 in this experiment. For P3, the ratio is as high as 50%. For PuPPIeS-Z, the ratio of finding a matching face in the top k = 50 candidates is no more than 5%. PuPPIeS-C has results similar to PuPPIeS-Z. This means that face recognition is extremely difficult on the perturbed images, since the recognition rate is only around 5%, and therefore can hardly be exploited by an adversary.

B.5 Signal correlation attack A signal correlation attack aims at recovering the perturbed ROIs by taking advantage of the potentially high spatial correlation of signals in images. There are several ways to achieve this goal, e.g., inference of the private matrix P based on the continuity of image signals, recovery of perturbed areas based on feature correlations, or other dimension reduction methods such as principal component analysis (PCA). We implemented three representative methods and evaluated how well they help reconstruct areas perturbed by PuPPIeS.

(1) Perturbation matrix inference based on continuity of image signals. The intuition is that perturbed areas and unperturbed areas share many similar signals due to the continuity of image regions. In our implementation, we first retrieve the upper-left coefficient matrix, which contains the full perturbation information according to PuPPIeS, and subtract an unperturbed coefficient matrix obtained by averaging all unperturbed regions. The result of the subtraction is treated as the inferred matrix, which is then used to decrypt the perturbed ROI. (2) Recovery of perturbed areas based on feature correlations. Motivated by [49], we use a weighted linear combination of the values of a pixel's neighbors in unperturbed areas to infer the pixels in perturbed ROIs. In our experiment, an iterative encrypted-value prediction process starts from the outermost encrypted pixels and proceeds to the innermost ones in a spiral manner. Initially, all pixels in the ROI are marked as encrypted. Starting from the outermost ones, the value of each encrypted pixel is reset to the average value of its ℓ closest non-encrypted pixels, and the reset pixel is then marked as non-encrypted. This process continues until no encrypted pixel is left. (3) Recovery of perturbed areas based on PCA. PCA [50] is widely used for image recovery by maximizing the signal covariance in the projected low-dimensional space. The preserved principal components capture the most significant signals in the original images. We run PCA on the perturbed images and use the top k principal components to recover the original images. Empirically, k is set to 2C–3C, where C is the number of classes in the image dataset.
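Attack (2) can be sketched as follows. This is our simplified reading of the outside-in prediction process: each boundary pixel is averaged over its already-decrypted 4-neighbors rather than its ℓ closest non-encrypted pixels, so it is an approximation of the method above, not the paper's implementation:

```python
import numpy as np

def neighbor_average_attack(img, mask):
    """Simplified sketch of inference attack (2): repeatedly reset each
    still-'encrypted' boundary pixel (mask == True) to the mean of its
    already-'non-encrypted' 4-neighbors, working from the outside of the
    ROI inward until no encrypted pixel is left."""
    img = img.astype(float).copy()
    mask = mask.copy()
    h, w = img.shape
    while mask.any():
        ys, xs = np.nonzero(mask)
        progressed = False
        for y, x in zip(ys, xs):
            vals = [img[ny, nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx]]
            if vals:  # pixel on the current boundary of the encrypted region
                img[y, x] = np.mean(vals)
                mask[y, x] = False
                progressed = True
        if not progressed:  # safety net: ROI covers the whole image
            break
    return img
```

On smooth regions this fills the ROI with a blurred extrapolation of the surroundings, which is consistent with why the attack recovers none of the perturbed content in Fig. 23.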

Fig. 23 demonstrates an example of image recovery results using the above three methods on the simplest image: a white background with "Hello World!" in the foreground. None of the three methods can recover any part of the perturbed region. Since this is a very simple setting, we have good reason to believe that the results would be even worse for more complicated perturbed and transformed real-world images. This inspired us to conduct a user study examining users' perceptions of the recovered images.

User study on users' responses to recovered images

In the user study, we first randomly selected 50 photos from 4 datasets. We then applied PuPPIeS-Z and PuPPIeS-C to these 50 photos, generating 100 fully encrypted photos. After that, we applied the aforementioned three signal correlation attacks to each of the 100 encrypted photos, producing 100 recovered photos. We recruited participants (N = 53) using Amazon's Mechanical Turk (MTurk), a recruitment source that has become popular for conducting online user experiments in recent years. We restricted participation to MTurkers with a North American IP address and a Human Intelligence Task (HIT) approval rate of 90% or higher (www.mturk.com). We compensated participants 50 cents for a completed test. In our sample, 56.60% were male, 79.24% had completed at least a high school education, 77.36% were Caucasian, and the average age was 29.12.

In our test, participants were first instructed to complete an online pretest questionnaire collecting their demographic information. After the pretest, participants were randomly presented 10 distinct recovered photos. For each photo, we asked participants to indicate the context of the photo they saw through an open-ended question – "for the photo presented above, it is a photo describes:". At the end of the test, each participant was thanked for participating and given a randomly generated unique confirmation code for redeeming payment through MTurk. Unsurprisingly, none of our participants was able to describe the original contents of the recovered photos. Instead, mosaic, confusion, and failure to recognize the content of the photos were mentioned most frequently.

“Nothing but mosaic.” (P15)

“I have no idea what’s going on there.” (P33)



Summary These studies suggest that a signal correlation attack is unable to restore the perturbed images.

C. Limitations and Discussions

Firstly, PuPPIeS cannot protect an image from malicious authorized users: intentional image privacy attacks, such as taking a screenshot or re-recording an image, are out of the scope of this paper. Secondly, in order to support different image transformations, PuPPIeS needs the cooperation of PSPs to obtain information about the transformations a PSP performs; inferring the transformations on the PSP side is beyond the responsibility of PuPPIeS. Finally, the current implementation of PuPPIeS supports the Joint Photographic Experts Group (JPEG) format [16], which is the most popular image standard, used by almost all PSPs [51]. As a continuing effort, we plan to keep exploring the possibility of supporting other image and video standards.

VII. CONCLUSIONS

In this paper, we proposed PuPPIeS, a lightweight privacy-preserving system for partial image sharing. The key idea is to perturb different private regions of an image and save the perturbation matrices as private information on the owner's device. Upon sharing, the private matrices can be securely delivered to the appropriate receivers, while the public regions are preserved intact and can be accessed by both the website and other common users. Experiments on real datasets show that our algorithms are robust to inference attacks, flexible in setting desired privacy levels, and effectively shift most storage overhead to the public cloud. The whole framework is based on image perturbation analysis. As future work, we may leverage cryptanalysis to study multi-party privacy preserving image sharing.

REFERENCES

[1] Social photos generate more engagement: New research. http://www.socialmediaexaminer.com/photos-generate-engagement-research/.
[2] Why visual content will rule digital marketing in 2014. http://www.steamfeed.com/visual-content-will-rule-digital-marketing-2014/.
[3] Y. Shoshitaishvili, C. Kruegel, and G. Vigna. Portrait of a privacy invasion – detecting relationships through large-scale photo analysis. 15th Privacy Enhancing Technologies, 2015.
[4] iCloud image leakage. http://www.theguardian.com/technology/2014/sep/01/naked-celebrity-hack-icloud-backup-jennifer-lawrence.
[5] Hackers get their hands on 100k 'deleted' Snapchat images. http://www.foxnews.com/tech/2014/10/12/hackers-eye-release-100k-deleted-snapchat-images/.
[6] Facebook shuts down Face.com APIs, KLIK app. http://www.cnet.com/news/facebook-shuts-down-face-com-apis-klik-app/.
[7] http://www.fourjandals.com/other/six-months-old/.
[8] C. Wu and C. Kuo. Design of integrated multimedia compression and encryption systems. Multimedia, IEEE Trans. on, 7(5):828–839, 2005.
[9] C. Chang, C. Chen, and L. Chung. A steganographic method based upon JPEG and quantization table modification. Information Sciences, 141(1):123–138, 2002.
[10] M. Aharon, M. Elad, and A. Bruckstein. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. Signal Processing, IEEE Trans. on, 54(11):4311–4322, 2006.
[11] A. Unterweger and A. Uhl. Length-preserving bit-stream-based JPEG encryption. In Proc. ACM MMSec, 2012.
[12] F. Dufaux and T. Ebrahimi. Scrambling for privacy protection in video surveillance systems. Circuits and Systems for Video Technology, IEEE Trans. on, 18(8):1168–1174, 2008.
[13] M. Ra, R. Govindan, and A. Ortega. P3: Toward privacy-preserving photo sharing. In Proc. USENIX NSDI, 2013.
[14] M. Tierney, I. Spiro, C. Bregler, and L. Subramanian. Cryptagram: photo privacy for online social media. In Proc. ACM COSN, 2013.
[15] N. Johnson and S. Jajodia. Exploring steganography: Seeing the unseen. Computer, 31(2):26–34, 1998.
[16] G. Wallace. The JPEG still picture compression standard. Consumer Electronics, IEEE Trans. on, 38(1):xviii–xxxiv, 1992.
[17] C. Hsu and J. Wu. Hidden digital watermarks in images. Image Processing, IEEE Trans. on, 8(1):58–68, 1999.
[18] Image filtering tutorial. http://lodev.org/cgtutor/filtering.html.
[19] F. Beato, M. Kohlweiss, and K. Wouters. Scramble! your social network data. In PET, pages 211–225. Springer, 2011.
[20] E. D. Cristofaro, C. Soriente, G. Tsudik, and A. Williams. Hummingbird: Privacy at the time of Twitter. In IEEE Security and Privacy, 2012.
[21] A. Juels and A. Oprea. New approaches to security and availability for cloud data. Communications of the ACM, 56(2):64–73, 2013.
[22] B. Lau, S. Chung, C. Song, Y. Jang, W. Lee, and A. Boldyreva. Mimesis Aegis: A mimicry privacy shield – a system's approach to data privacy on public cloud. In USENIX Security, pages 33–48, 2014.
[23] B. H. Kim, W. Huang, and D. Lie. Unity: secure and durable personal cloud storage. In Proc. 2012 ACM Cloud Computing Security Workshop, pages 31–36. ACM, 2012.
[24] M. Bellare, S. Keelveedhi, and T. Ristenpart. DupLESS: Server-aided encryption for deduplicated storage. In Proc. 22nd USENIX Security Symposium, 2013.
[25] S. Bugiel, S. Nurnberger, A.-R. Sadeghi, and T. Schneider. Twin clouds: Secure cloud computing with low latency. In Communications and Multimedia Security, pages 32–44. Springer, 2011.
[26] M. Bellare, T. Ristenpart, P. Rogaway, and T. Stegers. Format-preserving encryption. In Selected Areas in Cryptography, pages 295–312. Springer, 2009.
[27] M. Senftleben, M. Bucicoiu, E. Tews, F. Armknecht, S. Katzenbeisser, and A.-R. Sadeghi. MoP-2-MoP – mobile private microblogging. In FCDS, pages 384–396. Springer, 2014.
[28] S. Nilizadeh, S. Jahid, P. Mittal, N. Borisov, and A. Kapadia. Cachet: a decentralized architecture for privacy preserving social networking with caching. In CoNEXT, pages 337–348. ACM, 2012.
[29] J. Anderson, C. Diaz, J. Bonneau, and F. Stajano. Privacy-enabling social networking over untrusted networks. In Proc. 2nd ACM Workshop on Online Social Networks, pages 1–6. ACM, 2009.
[30] M. Egele, G. Stringhini, C. Kruegel, and G. Vigna. COMPA: Detecting compromised accounts on social networks. In NDSS, 2013.
[31] M. Balduzzi, J. Zaddach, D. Balzarotti, E. Kirda, and S. Loureiro. A security analysis of Amazon's Elastic Compute Cloud service. In Proc. ACM SAC, 2012.
[32] W. Diffie and M. Hellman. New directions in cryptography. Information Theory, IEEE Trans. on, 22(6):644–654, 1976.
[33] Face detection using Haar cascades. http://docs.opencv.org/trunk/doc/py_tutorials/py_objdetect/py_face_detection/py_face_detection.html.
[34] Tesseract OCR project. https://code.google.com/p/tesseract-ocr/.
[35] B. Alexe, T. Deselaers, and V. Ferrari. What is an object? In Proc. IEEE CVPR, 2010.
[36] A. Berg and J. Malik. Geometric blur for template matching. In Proc. IEEE CVPR, 2001.
[37] http://marsasadventures.blogspot.com/2010_02_01_archive.html.
[38] M. Everingham, L. Van Gool, C. Williams, J. Winn, and A. Zisserman. VOC 2007. www.pascal-network.org/challenges/VOC/voc2007/workshop/index.html.
[39] A. K. Jain. Fundamentals of Digital Image Processing. 1989.
[40] Caltech CV group. http://www.vision.caltech.edu/html-files/archive.html.
[41] FERET dataset. http://www.nist.gov/itl/iad/ig/colorferet.cfm.
[42] H. Jegou, M. Douze, C. Schmid, et al. Hamming embedding and weak geometry consistency for large scale image search – extended version. 2008.
[43] libjpeg 8d. https://github.com/omcfadde/jpeg-8d.
[44] OpenCV. http://opencv.org/.
[45] NIST recommendation for key management. http://csrc.nist.gov/publications/nistpubs/800-57/sp800-57_part1_rev3_general.pdf, page 67.
[46] J. Canny. A computational approach to edge detection. Pattern Analysis and Machine Intelligence, IEEE Trans. on, (6):679–698, 1986.
[47] M. Turk and A. Pentland. Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3(1):71–86, 1991.
[48] Face recognition algorithms. http://cs.colostate.edu/evalfacerec.
[49] R. Garnett, T. Huegerich, C. Chui, and W. He. A universal noise removal algorithm with an impulse detector. Image Processing, IEEE Trans. on, 14(11):1747–1754, 2005.
[50] Z. Huang, W. Du, and B. Chen. Deriving private information from randomized data. In Proc. ACM SIGMOD, pages 37–48, 2005.
[51] Usage of image formats on the Internet. http://w3techs.com/technologies/overview/image_format/all.

Appendix: Proof of Lemma III.1

According to Algorithms 1 and 2, we have

    ei = (bi + pi + 1024) mod 2048 − 1024.    (2)

Because pi is normalized by the parameter mR, we have pi ∈ [0, 2047], thus pi = pi mod 2048. Since bi ∈ [−1024, 1023], we have (bi + 1024) ∈ [0, 2047]. The decrypted coefficient is

    b̂i = (ei − pi + 1024) mod 2048 − 1024
       = ((bi + pi + 1024) mod 2048 − pi) mod 2048 − 1024.    (3)

Note that (bi + pi + 1024) ∈ [0, 4094] because pi ∈ [0, 2047].

(i) If (bi + pi + 1024) ∈ [0, 2047], the inner mod is the identity, so

    b̂i = ((bi + pi + 1024) − pi) mod 2048 − 1024
       = (bi + 1024) mod 2048 − 1024,   where (bi + 1024) ∈ [0, 2047]
       = bi.    (4)

(ii) Otherwise, (bi + pi + 1024) ∈ [2048, 4094], the inner mod subtracts 2048, and

    b̂i = ((bi + pi + 1024 − 2048) − pi) mod 2048 − 1024
       = (bi − 1024) mod 2048 − 1024
       = bi,    (5)

since (bi − 1024) ∈ [−2048, −1] and hence (bi − 1024) mod 2048 = bi + 1024.

Cases (i) and (ii) cover all situations, and therefore b̂i = bi. ∎
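The case analysis can also be verified exhaustively by machine. The following numpy snippet (ours, for illustration) checks Eq. (2) and Eq. (3) over every possible (bi, pi) pair:

```python
import numpy as np

# All possible coefficient values b_i and private values p_i.
b = np.arange(-1024, 1024).reshape(-1, 1)   # b_i in [-1024, 1023]
p = np.arange(0, 2048).reshape(1, -1)       # p_i in [0, 2047]

e = (b + p + 1024) % 2048 - 1024            # encryption, Eq. (2)
b_hat = (e - p + 1024) % 2048 - 1024        # decryption, Eq. (3)

# Lemma III.1: decryption recovers every coefficient exactly,
# and the perturbed value stays in the valid range [-1024, 1023].
assert (b_hat == b).all()
assert e.min() == -1024 and e.max() == 1023
```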
