    SmartEye: Assisting Instant Photo Taking via Integrating User Preference with Deep View Proposal Network

    Shuai Ma1,2*, Zijun Wei3*, Feng Tian1,2, Xiangmin Fan1,2, Jianming Zhang4, Xiaohui Shen5, Zhe Lin4, Jin Huang1,2, Radomír Měch4, Dimitris Samaras3, Hongan Wang1,2

    1 State Key Laboratory of Computer Science and Beijing Key Lab of Human-Computer Interaction, Institute of Software, Chinese Academy of Sciences.
    2 School of Computer Science and Technology, University of Chinese Academy of Sciences.
    3 Stony Brook University. 4 Adobe Research. 5 ByteDance AI.

    {zijwei, samaras}, {tianfeng,xiangmin,huangjin,hongan}, {jianmzha,zlin,rmech}

    *: Both authors have contributed equally to this work; : Corresponding author

    ABSTRACT

    Instant photo taking and sharing has become one of the most popular forms of social networking. However, taking high-quality photos is difficult as it requires knowledge and skill in photography that most non-expert users lack. In this paper we present SmartEye, a novel mobile system to help users take photos with good compositions in-situ. The back-end of SmartEye integrates the View Proposal Network (VPN), a deep learning based model that outputs composition suggestions in real time, and a novel, interactively updated module (P-Module) that adjusts the VPN outputs to account for personalized composition preferences. We also design a novel interface with functions at the front-end to enable real-time and informative interactions for photo taking. We conduct two user studies to investigate SmartEye qualitatively and quantitatively. Results show that SmartEye effectively models and predicts personalized composition preferences, provides instant high-quality compositions in-situ, and outperforms the non-personalized systems significantly.

    CCS CONCEPTS

    • Human-centered computing → Human computer interaction (HCI); Interactive systems and tools;

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from
    CHI 2019, May 4-9, 2019, Glasgow, Scotland UK
    © 2019 Association for Computing Machinery.
    ACM ISBN 978-1-4503-5970-2/19/05. . . $15.00

    KEYWORDS

    Photo composition, user preference modeling, interactive feedback, deep learning

    ACM Reference Format:
    Shuai Ma, Zijun Wei, Feng Tian, Xiangmin Fan, Jianming Zhang, Xiaohui Shen, Zhe Lin, Jin Huang, Radomír Měch, Dimitris Samaras, Hongan Wang. 2019. SmartEye: Assisting Instant Photo Taking via Integrating User Preference with Deep View Proposal Network. In CHI Conference on Human Factors in Computing Systems Proceedings (CHI 2019), May 4-9, 2019, Glasgow, Scotland, UK. ACM, New York, NY, USA, Paper 471, 12 pages.

    1 INTRODUCTION

    With the proliferation of photo-taking smartphones, personal photography has become a significant part of contemporary life: people engage in capturing every memorable moment in their lives [37, 50] and sharing these photographs through social networks such as Instagram, Snapchat, and others. However, although smartphone cameras have expanded the accessibility of photography to general users by making photography simple, cheap, and ubiquitous, taking high-quality and visually aesthetic photos is still quite challenging as it requires expertise in photography such as deciding the distance to target objects, their poses and the overall scene compositions. Non-expert photographers may have difficulty adjusting their compositions to achieve visually pleasing photographs.

    One way to help non-expert users get high-quality photos is to apply intelligent photo composition algorithms to photos for offline post-processing. Researchers have proposed various methods for auto-composition [11, 20, 26, 30, 33, 45, 57]. These algorithms generally require users to take a photo first, save it, and then feed it into a re-composition pipeline. Compared to the in-situ pipeline proposed in this work, post-processing approaches have several limitations: i) they take extra storage and time; ii) post-processing can only operate on already taken photos, hence potentially good compositions can be overlooked during capturing; iii) most of these methods ignore user-specific preferences in composition. Recent work [43] addresses such personalized preferences in an offline manner. However, it requires additional data collection and significant computational resources for offline model-training.

    Figure 1: (a) Given an input photo on the upper left, the View Proposal Network (VPN) suggests a set of diversified well-composed views (as shown on the upper right); the P-Module adjusts the suggestions (as shown on the bottom) based on the learned user preferences in real-time. (b) SmartEye learns user preferences interactively and progressively: as the user selects their favorite composition at the bottom of the screen, the P-Module gets updated. Thus SmartEye will suggest more personalized compositions the longer it is used.

    Hence, a successful in-situ composition suggestion system should address the following challenges: first, online methods should provide knowledgeable feedback in real time to avoid the unpleasant experiences caused by lags in control [25]; second, interactions with the interface should be intuitive; third, since photo composition is well known to be highly subjective as individual users have very different visual preferences [6], the proposed model should take into account personalized composition preferences and model these preferences interactively and effectively.

    To address these challenges, we propose SmartEye. The back-end of SmartEye is based on a View Proposal Network (VPN), a deep learning based model that outputs composition suggestions in real time. The VPN is general-purpose as it learns photo compositions from a large scale dataset with crowdsourced annotations. We propose to append an additional module (P-Module) to model personalized composition preferences. As opposed to the offline machine learning approaches [2, 5, 6, 22, 43], in our work, the P-Module is designed to model and predict the user's preferences interactively and progressively, following a standard interactive machine learning paradigm [16]. We then combine the VPN scores and the P-Module scores using a memory-based weighting algorithm [13]. As SmartEye captures each user's preferences, it offers user-specific composition recommendations (Fig. 1). The front-end of SmartEye is a novel interface which enables a set of useful functions (Fig. 2) named smart-viewfinder, smart-zoom and smart-score. With the interface, the users can interact with the back-end algorithm effectively and intuitively.

    We have conducted two user studies to investigate the overall performance of SmartEye and its effectiveness in modeling and predicting personalized composition preferences. Quantitative results reveal that photos taken with SmartEye have significantly higher agreement with the compositions selected by the user, compared to the tested real-time in-situ photo composition systems. Qualitative results show that SmartEye effectively models personalized preferences and that users enjoy interacting with it.

    To the best of our knowledge, this is the first work that incorporates personalized preference modeling into a mobile, in-situ, user assistance system for photo-capture. The contributions of this work are: 1) We propose SmartEye, a novel in-situ mobile system: the back-end integrates a deep composition suggestion model (VPN) and a user preference module (P-Module) to provide personalized compositions; the front-end presents a novel interface with multiple functions that allow users to interact with the system effectively and intuitively. 2) We show in qualitative and quantitative results from user studies that SmartEye outperforms the examined baselines significantly both in the quality of the suggested compositions and in the level of user satisfaction. The surveys of participants additionally provide valuable implications and recommendations for designing systems for other subjective tasks.
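The combination of general-purpose VPN scores with personalized P-Module scores mentioned above can be illustrated with a small sketch. This is not the memory-based weighting algorithm of [13], whose details are not given in this section. It is a stand-in under stated assumptions: the function names, the `half_life` parameter, and the saturating weight `n / (n + half_life)` are all hypothetical and chosen only to show how trust could shift toward the personalized scores as the user makes more selections.

```python
from typing import List, Sequence


def personal_weight(num_selections: int, half_life: float = 10.0) -> float:
    """Hypothetical weight on the personalized score.

    Starts at 0 with no user feedback and approaches 1 as selections
    accumulate. `half_life` (an assumed parameter) is the number of
    selections at which both score sources are weighted equally.
    """
    if num_selections < 0:
        raise ValueError("num_selections must be non-negative")
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return num_selections / (num_selections + half_life)


def combine_scores(
    vpn_scores: Sequence[float],
    personal_scores: Sequence[float],
    num_selections: int,
    half_life: float = 10.0,
) -> List[float]:
    """Blend general (VPN-like) and personalized (P-Module-like) scores per view."""
    if len(vpn_scores) != len(personal_scores):
        raise ValueError("score lists must have the same length")
    w = personal_weight(num_selections, half_life)
    return [(1.0 - w) * v + w * p for v, p in zip(vpn_scores, personal_scores)]


def rank_views(
    vpn_scores: Sequence[float],
    personal_scores: Sequence[float],
    num_selections: int,
    half_life: float = 10.0,
) -> List[int]:
    """Return candidate view indices ordered from best to worst blended score."""
    blended = combine_scores(vpn_scores, personal_scores, num_selections, half_life)
    return sorted(range(len(blended)), key=lambda i: blended[i], reverse=True)
```

With no recorded selections, the ranking in this sketch follows the general-purpose scores alone; as selections accumulate, the personalized scores increasingly decide the order, which mirrors the progressive behavior described for SmartEye.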


    2 RELATED WORK

    Photo Composition Suggestion: Algorithms

    There are two major types of photo composition algorithms: rule-based and learning-based. Rule-based models generally utilize empirical rules such as the rule of thirds [28, 54] or symmetry [8]. However, photo composition is a complex task, and applying only pre-defined rules may lead to sub-optimal results. Learning-based algorithms learn composition from data of different types [4, 17, 54] using various models [38, 55].

    Recently, learning-based models using deep neural networks have achieved promising results. Many deep learning methods have been proposed to find views with good compositions for various applications, such as image cropping [8, 9, 17, 21, 26, 30, 31, 54, 55], image thumbnail generation [15] and view recommendation [7, 10, 48]. Despite the improved performance, these deep learning models are based on an inefficient sliding-window paradigm to evaluate candidates, thus ver