

MobiMed: Comparing Object Identification Techniques on Smartphones

Andreas Möller, Stefan Diewald, Luis Roalter
Technische Universität München, Munich, Germany
[email protected], [email protected], [email protected]

Matthias Kranz
Luleå University of Technology
Department of Computer Science, Electrical and Space Engineering
Luleå, Sweden
[email protected]

ABSTRACT
With physical mobile interaction techniques, digital devices can make use of real-world objects in order to interact with them. In this paper, we evaluate and compare state-of-the-art interaction methods in an extensive survey with 149 participants and in a lab study with 16 participants regarding efficiency, utility and usability. Besides radio communication and fiducial markers, we consider visual feature recognition, reflecting the latest technical expertise in object identification. We conceived MobiMed, a medication package identifier implementing four interaction paradigms: pointing, scanning, touching and text search.

We identified both measured and perceived advantages and disadvantages of the individual methods and gained fruitful feedback from participants regarding possible use cases for MobiMed. Touching and scanning were evaluated as fastest in the lab study and ranked first in user satisfaction. The strength of visual search is that objects need not be augmented, opening up physical mobile interaction as demonstrated in MobiMed for further fields of application.

Author Keywords
Physical mobile interaction, object identification, touching, pointing, scanning

ACM Classification Keywords
H.5.m. Information Interfaces and Presentation (e.g. HCI): Miscellaneous

General Terms
Experimentation, Human Factors, Performance

INTRODUCTION
More than a decade ago, the idea of bridging the gap between the virtual and physical world was coined [22], and later referred to by the term physical mobile interaction

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
NordiCHI'12, October 14-17, 2012, Copenhagen, Denmark.
Copyright © 2012 ACM 978-1-4503-1482-4/12/10...$15.00

Figure 1. MobiMed identifies drug packages using different methods of physical mobile interaction: pointing, touching, scanning, or text search. For a distinctive definition of the interaction techniques, see the Related Work section. The subject's face was blurred for privacy reasons.

[16]. This concept subsumes any system where digital devices interact with physical objects in their environment [9]. Typical interaction paradigms such as touching, pointing and scanning [21, 17] have been presented, as well as numerous fields of application in which they have been used and evaluated, such as identifying and remote-controlling devices at home or at the workplace [8, 23], and interacting with displays or posters in public spaces [7, 20, 2].

Today, both quantitative and qualitative results of prior comparisons are valid only with restrictions, for several reasons. Applications were realized with communication technologies that were popular at the time but are no longer today (such as infrared). Interaction methods were restricted by hardware limitations, e.g. VGA-resolution cameras [12], or were entirely custom-made, e.g. with a laser pointer and photo diodes [17], and have not been further pursued. A more detailed discussion of previously presented approaches is provided in the Related Work section.

Meanwhile, smartphones offer high-resolution cameras and plenty of processing power, which opens up new technological possibilities and enables approaches formerly not possible. Earlier approaches had in common that real-world artifacts needed to be augmented with markers or similar in order to be interacted with. Today's fast multi-core smartphones enable vision-based feature recognition, making real-time object recognition applicable for the pointing technique and allowing it to function with any object, so that items no longer need to be augmented.

In addition, people now have prior experience with some types of physical interaction, unlike some years ago, when researchers reported on subjects without prior knowledge in their studies [10]. Scanning visual codes, for example, has become widely popular, e.g. for product price comparisons. This must be taken into account, since users might prefer techniques they know a priori, or which seem at first glance more suitable, faster, or more intuitive. This evolution justifies an updated comparative view on object identification techniques.

The contribution of our paper is twofold. First, we present an updated view and comparison of physical mobile interaction techniques in light of today's prevalent technologies and the advances of mobile phones in recent years. Second, we incorporate these techniques into an example application, MobiMed, and evaluate them in a medically motivated use case. We conducted a large online study to obtain an up-to-date view of real users' opinions on the different techniques and their perceived advantages and disadvantages today. With a smaller user group, we conducted a lab study with a prototype in which we confirmed the large-scale results and compared four techniques regarding their efficiency, utility and perceived usability.

The remainder of the paper is structured as follows. We start with an overview of physical interaction techniques and related work. Subsequently, we introduce the MobiMed application and use case, and explain the implemented object identification methods. After that, the two studies (online survey and lab study) are described and their results are presented and discussed. We conclude with a summary of our findings and a brief outlook.

PHYSICAL INTERACTION TECHNIQUES
Approaches for identifying objects with smartphones and interacting with them are manifold. In accordance with previous research, we distinguish three general paradigms: pointing, touching and scanning [21, 17].

Pointing
Pointing denotes the process of aiming at an object with the smartphone and is considered a natural technique, since we are used to pointing at things with our fingers [17]. The pointing paradigm can be implemented with an optical beam (e.g. infrared or laser) or visual codes [12, 17, 21]. Both techniques require the target object to be augmented and a line of sight between the mobile device and the target object. Wu et al. detected the pointing direction with the phone's accelerometer [23]; however, the position of the target objects in the room must be known. Due to the availability of high-quality cameras in smartphones, the popularity of one-dimensional (e.g. bar codes) and two-dimensional visual codes (e.g. quick response (QR) codes) is steadily increasing. Infrared is meanwhile rarely found in smartphones and thus a less attractive interaction technology. With increasing processing power, content-based image retrieval (CBIR, based on visual feature recognition) has become an alternative for identifying objects. Föckler et al. [6] present a mobile phone museum guide that visually recognizes exhibits and provides additional information on them. A system by Nistér et al. [11] finds similar-looking CD covers based on a reference image in real time. However, feature recognition is often more error-prone than explicit visual codes.

Touching
Touching is a proximity-based approach that allows identifying an object by bringing the phone close to it. Objects are augmented with electronic tags based on radio communication [22], such as RFID (radio frequency identification) or NFC (near field communication) tags, which can be read with a capable smartphone at a distance of a few centimeters. The touching modality has e.g. been used for interaction with posters [20] and public displays [7] via a grid of NFC tags. Benelli et al. [3] presented scenarios for the medical area involving mobile devices. They propose NFC-based access control for work hour tracking in hospitals, as well as rapid exchange of patient information and case histories.

Scanning
In accordance with a definition by O'Neill et al. [12], scanning is a proximity-based approach where the user points a camera at a visual tag (cf. 'to scan a bar code'). While the touching technique requires the target to be in direct proximity, scanning works for objects in close proximity (as far as the visual code is readable).

Scanning can also be understood as searching for available services in an environment via wireless techniques such as Bluetooth or WLAN [21, 17]. In our work, however, we use the term scanning in the sense of targeting a fiducial marker.

Comparative Work
Rukzio et al. [17] compared three mobile physical interaction techniques regarding performance, cognitive load, physical effort and other factors: objects could be touched (using NFC tags and an NFC-enabled phone), scanned (using Bluetooth access points and a corresponding mapping of nearby objects), or pointed at with a light beam (using a laser pointer attached to a smartphone and a light sensor attached to the object). A further factor of distinction was the distance to the objects to interact with: scanning was preferred for more distant objects, pointing for objects in intermediate range, and touching for close objects. The authors could not give universal recommendations, since questionnaires and experiments showed that the preferred technique depends on location, activity and motivation.

In a trial by O'Neill et al. [12] where untrained users interacted with posters using two-dimensional fiducial markers and RFID tags, the visual QR codes were read faster. Another study, conducted by Mäkelä et al. [10], investigated the usability of RFID and 2D bar codes and revealed that users have a limited understanding of both interaction techniques. Visual codes vary in their dimension, layout, size and data density, which influences reading speed [13]. Higher data density also places higher requirements on camera and processing power. Ailisto et al. [1] reviewed visual codes, infrared, RFID and Bluetooth as base technologies for physical selection. They pointed out that infrared has the advantage of bidirectional communication and mentioned the wide support in devices as a benefit of this method compared to others. Due to the progress in mobile hardware, these results no longer apply today.

Reischach et al. [14] proposed the use of physical mobile interaction for product identification and demonstrated an advantage of NFC and barcode identification over conventional product search in terms of task completion time and perceived ease of use.

Physical Mobile Interaction vs. Object Identification
Our comparison of techniques is motivated by physical mobile interaction as understood by e.g. Valkynnen et al. [21] or Rukzio et al. [17]. However, in order to interact with a physical object, we need to identify it in the first place. This includes the aspect of recognizing its presence as well as distinguishing it from other, similar objects. Hence, we generalize the concept of interaction techniques and speak of object identification techniques in the remainder of this paper. We investigate not universal, but domain-specific object identification applicable to our scenario described in the following section. Our ambition is to recognize objects, such as drug packages, with state-of-the-art smartphones in reasonable time (i.e., on the order of magnitude of a few seconds).

MOBIMED: IDENTIFYING DRUGS WITH A SMARTPHONE
In times of food supplements and vitamin compounds, and with regard to the aging society, managing multiple drugs is an issue people are struggling with. We observe an increase of health-related apps and services, such as drug reference guides1, pill reminders or identifiers2 and food ingredient databases3. This development motivated us to combine personal medication support with physical mobile interaction as the use case for our evaluation. We conceived MobiMed, a system for identifying medication packages, as an example application to evaluate object identification techniques on the smartphone. MobiMed can be imagined as a digital package insert replacement, applicable e.g. when the original package insert is lost, or for quickly comparing medications. It provides detailed information on drugs, such as active ingredients, application areas, intake instructions and side effects.

Use Case
In order to evaluate MobiMed, it is important for subjects to understand how the application is used in a specific scenario. We conceived the following use case, which we presented to study participants, to motivate MobiMed.

John, 45, architect, is an active golfer and likes cycling tours in his holidays. He regularly takes food supplements (carotenes, Vitamin E) and anticoagulants, since he is a cardiac patient. He needs to take up to four different medications a day at different intervals, which is why he is sometimes not sure about the correct dosage. Since he

1 WebMD. http://www.webmd.com/mobile
2 Drugs.com. http://www.drugs.com/apps
3 FDDB. http://fddb.info

Figure 2. Principle of drug identification with MobiMed. Drug packages have bar codes, a unique identification number, inherent visual features and are augmented with NFC tags (Step 1). By pointing at a package or scanning the bar code (using the portable device's camera), the package is identified visually. By touching, the package's NFC tag is read. Additionally, the drug can be searched by its name or unique ID (Step 2). Once the package has been identified, drug information is retrieved from a database and displayed on the phone (Step 3).

had trouble pulling the blister packages out of the box, he removed the package inserts and cannot refer to the instructions. John points at the drug package with MobiMed, which identifies the medication by the appearance of the box. He gets detailed information on the product and scrolls to the correct dosage instruction.

The other day, John wants to get an influenza medication at the pharmacy. Since he is allergic to acetaminophen, he scans the package with MobiMed and checks whether the product contains the substance in order to know if he can safely buy it.

Prototype Implementation
The interplay between drug packages, smartphone and the MobiMed server is illustrated in Fig. 2. The package's bar code or NFC tag and inherent visual features allow its identification by four methods: pointing, scanning, touching and text search. For the first three methods, the phone's camera and NFC reader are used. Text input is performed on the phone's soft keyboard (see Fig. 3, left image). The user can switch between these identification methods in the MobiMed application via four tabs at the top of the screen (see Fig. 3).
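The tab-based switching between the four methods can be sketched as a simple handler registry. This is an illustrative sketch, not MobiMed's actual code: the class, method and database names are assumptions, and the shared flow (method-specific input, then a common database lookup) is inferred from the description above.

```python
class IdentificationTabs:
    """Sketch of tab-based method switching (hypothetical names).
    Every handler maps raw input (camera frame, NFC payload, or typed
    text) to a list of candidate drug IDs; the rest of the flow,
    resolving candidates against the drug database, is shared."""

    def __init__(self, drug_db):
        self.drug_db = drug_db   # e.g. {pzn: {"name": ...}, ...}
        self.handlers = {}

    def register(self, tab_name, handler):
        self.handlers[tab_name] = handler

    def identify(self, tab_name, raw_input):
        """Run the selected method and resolve candidates to entries."""
        candidates = self.handlers[tab_name](raw_input)
        return [self.drug_db[c] for c in candidates if c in self.drug_db]
```

For scanning and touching, the handler would simply return the decoded ID as a single-element list; pointing would return the ranked candidate list delivered by the server.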

MobiMed was implemented in Android 2.3 using the default NFC and Camera APIs. For the scanning modality, the ZXing barcode reader library4 was used.

The drug information database and ~100,000 reference images for feature recognition are stored on a server, where the feature extraction is also performed. Our database currently contains entries on more than 47,000 drugs, making MobiMed's product search an extensive real-world scenario. Drug information as well as reference images were retrieved from online pharmacy websites. The individual identification methods work as described in the following.

Scanning
In many countries, medical products have a unique identification number, such as the NDC (National Drug Code) in the US, the PZN (Pharmazentralnummer) in Germany, or the

4 ZXing. http://code.google.com/p/zxing/


PPN (Pharmacy Product Number), which will be introduced in the European Union and encoded in a two-dimensional DataMatrix code on every product package. Currently, the bar code on each drug package holds a 7-digit unique PZN and can hence be used to unambiguously recognize a product. The bar code is targeted (scanned) with the phone's camera at a distance of a few centimeters. Upon recognition of the code, the name of the medication appears in a popup and the user is asked whether it is the one searched for. After confirmation, the details page for the medication is displayed (see Fig. 3, right image). Recognition starts immediately in the camera preview screen; it is not necessary to take a photo.
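As an illustration of how a decoded bar code could be checked before the database lookup: the 7-digit PZN carries a check digit, commonly described as a modulo-11 scheme with weights 2 to 7 over the first six digits. The following sketch assumes that scheme (it is not specified in the paper), and the hook name and database entry are made up:

```python
def is_valid_pzn7(pzn: str) -> bool:
    """Check a 7-digit PZN against its check digit (assumed scheme:
    weights 2..7 over the first six digits, modulo 11)."""
    if len(pzn) != 7 or not pzn.isdigit():
        return False
    check = sum(int(d) * w for d, w in zip(pzn, range(2, 8))) % 11
    # codes whose checksum would come out as 10 are not issued
    return check != 10 and check == int(pzn[6])

def on_barcode_decoded(code: str, drug_db: dict):
    """Hypothetical hook called once the barcode library has decoded
    the code: validate the PZN, then fetch the entry that populates
    the confirmation popup (None if the code is not a plausible PZN)."""
    return drug_db.get(code) if is_valid_pzn7(code) else None
```

Rejecting implausible codes early avoids a server round trip for stray bar codes (e.g. EAN codes of non-drug products) caught by the camera.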

Touching
For this method, medication packages were enhanced with NFC tags on which we stored the same unique PZN as in the bar code. We used MIFARE Ultralight tags operating at 13.56 MHz (NFC Forum Type 2) with 48 bytes of memory, complying with the ISO 14443A standard. Touching a drug package with an NFC-capable phone reads the PZN stored in the tag and shows the drug's information page.
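The paper does not detail the on-tag data layout; assuming the PZN is written as a standard NDEF Text record (per the NFC Forum Text RTD), reading it back amounts to parsing a few header bytes. A minimal sketch for a single short record:

```python
def parse_ndef_text_record(record: bytes) -> str:
    """Extract the text payload from one NDEF short record of type 'T'.
    Minimal sketch: no record chunking, no multi-record messages,
    UTF-8 text only."""
    flags = record[0]
    if not flags & 0x10:
        raise ValueError("only short records (SR bit set) are handled")
    type_len = record[1]
    payload_len = record[2]
    if record[3:3 + type_len] != b"T":
        raise ValueError("not a Text record")
    payload = record[3 + type_len:3 + type_len + payload_len]
    lang_len = payload[0] & 0x3F   # low 6 bits: language code length
    return payload[1 + lang_len:].decode("utf-8")
```

A tag storing the (made-up) PZN 1234562 with language code "en" needs 13 bytes for the record, so the 48 bytes of a Type 2 tag are ample.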

Pointing
Each drug package has inherent visual features, such as logos, imagery, colors, package shape and imprinted text. These characteristics can serve for identification, which is referred to as content-based image retrieval (CBIR). We use an extension of the vocabulary tree approach [19] with a training set of more than 100,000 images. At a distance of 20-50 centimeters, the user points the camera at the drug package, and the camera continuously captures several query images. A visual feature tracker identifies characteristic features (maximally stable extremal regions, MSER) and matches them with reference images in the database. Moving the camera while pointing is unproblematic; varying perspectives in the query images even improve matching. The result is a candidate list of potential matches from which the user selects the desired medication. The correct hit is usually on the first page of results.
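The server-side matching itself is beyond a short example, but the ranking that produces the candidate list can be illustrated with a toy bag-of-visual-words scorer. This is a drastic simplification of the vocabulary tree approach [19], and all data and names here are invented:

```python
import math
from collections import Counter

def idf_weights(references):
    """Inverse document frequency of each visual word across all
    reference images (rare words are more discriminative)."""
    n = len(references)
    df = Counter(w for words in references.values() for w in set(words))
    return {w: math.log(n / df[w]) for w in df}

def rank_candidates(query_words, references):
    """Score each reference image by IDF-weighted overlap with the
    query's visual words; return drug IDs, best match first."""
    idf = idf_weights(references)
    def score(words):
        return sum(idf.get(w, 0.0) for w in set(query_words) & set(words))
    return sorted(references, key=lambda d: score(references[d]),
                  reverse=True)
```

The real system quantizes MSER descriptors into visual words with the tree and aggregates scores over several query frames, which is why varying perspectives improve the ranking.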

Text Search
In this modality, a free text search is performed on all database fields, so that drugs can be found based on their PZN as well as by their name, active ingredients, side effects and so on. A list of search results that match the specified search term is presented to the user.
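A minimal sketch of such a field-agnostic search; the field names and the simple substring matching rule are assumptions, as MobiMed's actual query logic is not described at this level:

```python
def text_search(term, drug_db):
    """Case-insensitive substring search over the PZN and every field
    of each entry (name, active ingredients, side effects, ...)."""
    t = term.lower()
    return [
        (pzn, entry) for pzn, entry in drug_db.items()
        if any(t in str(v).lower() for v in [pzn, *entry.values()])
    ]
```

Searching every field means a query like "headache" finds drugs by side effect as well as by name, matching the behavior described above.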

Omission of Speech Recognition
Speech recognition on mobile devices is experiencing a revival with Apple's Siri and Google Now. We did not include voice-based interaction as a modality in our example scenario for several reasons. Medication names are often hard to pronounce or similar-sounding, which is still problematic for speech recognition engines. Often, variants exist with different dosages but similar names, which entails a high risk of inappropriate results. For these reasons, we did not consider speech input appropriate for identifying drugs.

ONLINE SURVEY
At an early stage of the development of MobiMed, we conducted a large-scale survey with 149 participants to investigate the following research questions (RQ):

Figure 3. Left: Text search for medications in MobiMed. Right: The result screen with detailed drug information. The top of the screen shows tabs for switching between the identification methods.

• RQ1: What advantages and disadvantages of identification techniques as presented in MobiMed do people see?

    • RQ2: Which method is preferred by users a priori?

• RQ3: What potential do people see for MobiMed as a whole?

The gathered responses are a priori feedback, i.e. users' opinions are reflected based on textual descriptions of MobiMed and the incorporated methods, without having used it themselves.

Questionnaire
Since our goal here was broad feedback, and a lab study with 100+ participants would not have been feasible, the survey was conducted as an online questionnaire. At the beginning, the concept of MobiMed, the use case and the four identification methods were introduced and illustrated with descriptive screenshots. By explaining in detail for each method how the process of identifying a package works, we tried to give subjects the best possible idea of the system without them actually using it.

Then followed two sets of questions addressing the individual research questions. The first set addressed the advantages and disadvantages of MobiMed's identification techniques, and of MobiMed as a whole. In the second block, we asked whether subjects would use the app themselves, how much money they would spend on it, and which other functions they would like to see in it. All questions were asked in a neutral manner (e.g. explicitly asking for both advantages and disadvantages) to exclude a confirmation bias.

Participants
The questionnaire was uploaded as a Human Intelligence Task (HIT) to Amazon Mechanical Turk5. Completion of a questionnaire was compensated with $0.30. Excellent answers in

5 Mechanical Turk. https://www.mturk.com


length and quality were rewarded with a bonus of an additional $0.50. 149 participants worldwide (thereof 100 living in the US), aged between 17 and 79 (average age: 31 years, SD = 11), took part in the survey; 74 were female, 75 male. 110 participants owned a smartphone.

Results and Discussion of the Online Survey
RQ1: Advantages/Disadvantages of Identification Techniques
We collected a variety of statements on the presented methods, giving an impression of what people did and did not like. Items were formulated as open questions with free-text answers. Hence, results reflect main tendencies and opinions, but cannot expose the spectrum of answers in its entirety.

Scanning
People considered this method a 'quick and convenient' way to identify packages. Further attributes mentioned repeatedly were 'precise', 'easy', 'specific' and 'cool'. Respondents particularly liked that 'you can know exactly that it is the right product', since the bar code uniquely identifies it, even if brand names or packages look similar. There was much familiarity with this technique due to the market penetration of bar code scanner apps. Some people reported that they 'scan products all the time' with their phones when shopping, for price comparisons. This a priori knowledge might have biased participants' responses towards 'scanning'.

As drawbacks, people mentioned the necessity of a camera and a readable bar code. Furthermore, it takes time to find the bar code on the product, and it could be too small to be recognized (due to lighting conditions, a damaged package, or a weak camera). Usability problems were reported more frequently for bar code scanning than for other methods, perhaps because people have already tried this method for themselves. Statements such as 'doesn't always work for me' and 'sometimes hard to focus the bar code' indicate that users have experienced problems on their own, which makes them perceive scanning as more difficult than other methods.

Touching
NFC tags were affirmed to allow fast and precise identification, combined with good usability. It was highlighted that touching the object with the phone is the only necessary user interaction, which makes the method, according to respondents, 'hassle-free', 'fool-proof' and suitable for 'old people and non-expert persons'. Other adjectives used were 'modern', 'cool', and 'satisfying to [...] get so much information so quickly'.

Downsides mentioned were the 'extensive' requirements: an NFC-capable phone, augmented medication packages, and increased energy consumption on the phone. People mentioned the potential error-proneness of NFC, being a novel technology, which indicates that they were less familiar with NFC and thus more skeptical, compared to bar codes. Other disadvantages addressed costs (NFC augmentation would raise drug prices6) and privacy concerns due to the radio technology. People here seemed to overestimate the proximity range

6 Actually, the mass market price of RFID tags is

Drawbacks mentioned were the high amount of interaction with the device (i.e., the necessity to type), longer search time and potential misspelling (which is likely for complicated drug names and medical terms). Not only was the problem of no results due to typos seen, but also that 'incorrect wording [...] could end up giving people information on the wrong medicine'. Also, similar items in the result list could lead to confusion. Text search was considered more difficult than the other methods, especially due to small on-screen keyboards on smartphones.

RQ2: Preferences
When asked which method respondents would prefer, 48.3% chose scanning, 25.3% text search, 13.6% touching, and 12.2% pointing (see Fig. 4).

Figure 4. Preferences for identification methods in the online survey. Respondents liked scanning most (probably due to higher familiarity than with NFC or visual search), followed by text search, touching and pointing.

The high preference for scanning and text search could be explained by their level of familiarity. Most smartphone users are experienced with text search and bar code scanning, while they are less familiar with NFC and visual feature recognition. 'I have experience using other software using barcodes, and have liked their ease', one person stated. A respondent who chose text search said: 'This is a tried and true way of researching information'. Previous knowledge and positive experience may have attracted respondents to choose a 'familiar' method as their favorite.

It seemed more difficult for people to evaluate pointing (visual search) and touching (NFC) without hands-on experience. In particular, recognizing objects just by pointing was partly seen as 'science fiction', and the range of NFC was overestimated, leading to the assumption that closely placed products could interfere with each other and make targeting the desired one difficult. Some users even worried about being inundated with information when passing by the shelves in a store without having a hand in the matter.

RQ3: Potential of MobiMed
Respondents highly appreciated MobiMed. 81.8% answered 'yes' to the question 'Would you use such a system as described above?'. They liked the idea of finding drug information quickly and easily, and envisaged various focus groups who could benefit from MobiMed, such as pharmacists, doctors, or people who take multiple drugs. One person said it is 'perfect for if you have something at home that you want to find somewhere so you can pick more up or learn more about it'.

On average, interviewees would spend $8.40 for MobiMed (with a standard deviation of $17.12). The high variance is rooted in the difference between older and younger respondents: those under 25 would pay $6.34 on average, those older than 25 on average $14.01. There are two possible reasons: First, older people might have a higher need for medical applications, so that the personal value is higher for them. Second, they have a different idea of product pricing than younger people, who are used to getting software in mobile app stores for small amounts of money.

LAB STUDY (PROTOTYPE EVALUATION)
After having collected large-scale a priori feedback on identification methods in the MobiMed use case, we wanted to verify our findings with a smaller number of participants in a hands-on lab study. We investigated the following research questions (RQ):

• RQ1: Which object identification method is superior in terms of efficiency?

• RQ2: Which method is preferred by users in practical use of the MobiMed application?

• RQ3: What potential do people see for MobiMed as a whole after having used the application?

RQ1 was evaluated in an experiment, RQ2 and RQ3 with a questionnaire after the experiment. The lab study's research questions corresponded to those of the online survey (see previous section) and confirmed the initial results. While the online survey revealed a priori findings (i.e. before subjects actually used the application) from a large user group, this time the research questions were investigated quantitatively and qualitatively in a smaller-scale experiment.

Participants
16 people (6 female, 10 male), aged between 22 and 69 years (average age: 31, SD = 12), took part in the evaluation. All of them had a mobile phone and used it regularly; 9 were smartphone owners. No subject had physical disabilities that could have hindered the execution of the demanded tasks (such as difficulties with holding the smartphone steady). Participants were recruited among acquaintances of some of the researchers; none were involved in the project. They were rewarded for their participation with a small compensation in the form of sweets.

Experimental Task
Subjects identified medication packages using the four methods described above: scanning, touching, pointing and text search (conditions). Each participant ran through all conditions (within-subjects design) in one of four orders, which were counterbalanced using a 4×4 Latin Square [5]. The experiments were conducted at a table in a separate, brightly lit room at a medical office. 13 NFC-augmented medication packages (see Fig. 5) were placed in a box on the table. Subjects were handed a Samsung Nexus S smartphone with MobiMed installed. They were introduced to the device and the task, and could familiarize themselves with the phone and the MobiMed app prior to the experiment.

Figure 5. Drug packages as they were used in the experiment. All packages carry bar codes; our sample packages for the study were additionally augmented with NFC tags for the touching method.
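Such a counterbalancing scheme can be illustrated with a minimal sketch (our own illustration using a cyclic construction; the study specifies only 'a 4×4 Latin Square', not the exact order assignment):

```python
# Illustrative sketch: a 4x4 Latin square for counterbalancing the order
# of conditions across participants. Cyclic construction (hypothetical,
# not necessarily the assignment used in the study).
CONDITIONS = ["scanning", "touching", "pointing", "text search"]

def latin_square(items):
    """Row i is the item list rotated by i positions, so every condition
    appears exactly once in each row and each column."""
    n = len(items)
    return [[items[(i + j) % n] for j in range(n)] for i in range(n)]

orders = latin_square(CONDITIONS)
for row in orders:
    print(row)  # participant groups 1-4 each get one of these orders
```

Each of the four orders is assigned to four participants, so every condition occurs equally often in every position of the sequence.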

For each of the four conditions, 10 packages had to be identified. Participants were asked to fetch one package at a time out of the box (blind draws; the order was randomized) and to identify it with MobiMed. After successful identification of 10 packages, they were put back in the box and the condition was changed. In the text search condition, users were free to type either the drug name or the identification number (PZN) printed below the bar code on each package.

During the experiment, subjects were encouraged to express any thoughts that came to their mind ('think aloud'). Fig. 6 shows subjects performing different identification methods. The experiment took about 30 minutes per participant. A researcher was present during the entire time.

Data Collection
Quantitative Data
A timer built into the application (with millisecond resolution) measured the time needed for each single identification. The measurements were saved to a log file on the smartphone. Subjects were asked to tap a button on the screen when they were ready, which started the recognition process. At this moment, the timer was initialized; it was stopped when the identification process was complete and the drug's description page was shown.

For touching and scanning, the measured time was equivalent to the recognition time of the NFC tag or the bar code. For pointing and text search, recognition led to a result list, since these methods can return ambiguous hits. In these cases, the total time was the sum of the recognition time and the time the user took to select the correct list item. The timer was always stopped upon the first appearance of the drug's description screen, i.e. it was assumed that no corrections of the user's choice from the list were required.
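The timing logic described above can be sketched roughly as follows (class and method names are hypothetical; the actual MobiMed implementation runs on Android and is not published):

```python
import time

class IdentificationTimer:
    """Sketch of the measurement logic: started on the 'ready' tap,
    stopped when the drug's description page appears. For pointing and
    text search, the elapsed time thus includes list-selection time."""

    def __init__(self):
        self._start = None
        self._method = None
        self.log = []  # (method, elapsed_ms) pairs, as written to the log file

    def start(self, method):
        # Called when the subject taps the 'ready' button.
        self._method = method
        self._start = time.monotonic()  # monotonic clock; ms resolution suffices

    def stop(self):
        # Called on the first appearance of the description screen.
        elapsed_ms = (time.monotonic() - self._start) * 1000.0
        self.log.append((self._method, elapsed_ms))
        return elapsed_ms
```

Using a monotonic clock (rather than wall-clock time) avoids measurement errors if the system clock is adjusted mid-experiment.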

Figure 6. Subjects (faces were blurred for privacy reasons) identifying drug packages with MobiMed in the lab study: bar code scanning (1), touching (2), pointing (3) and checking results after text search (4).

Qualitative Data
Qualitative feedback was gathered in a questionnaire after the experimental task (addressing RQ2 and RQ3). Subjects were left alone while they filled out the form to prevent any influence from a researcher present in the room.

In the questionnaire, people were asked how they liked each identification method and whether they would continue using the app with each particular method. Moreover, the utility and usability of the MobiMed app were evaluated.

Results of the Lab Study
RQ1: Experimental Comparison of Methods
A summary of the measurements is shown in Fig. 7. Subjects needed 1.8 seconds (all figures are mean values) to identify a single drug by touching the NFC tag. This was significantly faster than scanning the bar code (13.5 s), pointing at the object (16.4 s) and using text search (20.5 s).

The standard deviations (SD) were 3.7 s for touching, 9.6 s for scanning, 6.1 s for pointing and 22.9 s for text search. Although scanning was performed slightly faster than pointing, the higher variance for scanning and some large single values (above 40 s, maximum at 61.7 s) indicate that subjects sometimes struggled with this identification method. In the experiments, this became especially evident when the camera had focusing problems with small bar codes.

While the upper quartile of the NFC identification time ends at 3.9 s, some subjects took more than 10 s for the task, with a maximum of 18.1 s. We explain this variation with the different procedures we observed in the experiment. While most participants located the NFC tag on the package prior to pressing the start button on the phone, a few others pressed the button first and then began to turn the package in their hands to find where to touch it with the NFC reader.

Figure 7. Box plots indicating speed measures of object identification with MobiMed (figures are means). Boxes represent the interquartile range (IQR), whiskers represent extrema. Maxima for scanning (61.7 s) and text search (102.3 s) were cut off for better readability.

The high variance for text search reflects the diverging typing abilities of participants. With a maximum of 102.3 s, text search took more than five times longer than the longest NFC identification (18.1 s). It can also be explained by the length of some drug names, which provides no upper bound for text input length. It is worth mentioning that these results were still obtained under 'ideal' conditions, i.e. under the assumption that subjects selected the correct item from the result list. In practice, the need to correct accidental choices might entail even longer total times for the text search modality.

RQ2: Qualitative Comparison of Methods
Fig. 8 visualizes users' preferences for the individual methods in the experiment. The ratings reflect the measured identification times, i.e. faster methods were rated better. On a 7-step Likert scale (−3 = strongly disagree, 3 = strongly agree), participants responded with 3.0 that they liked touching (SD = 0.0), which was also the fastest method. Scanning, the second fastest method, was rated with 2.0 (SD = 1.2). Pointing received a rating of 1.4 (SD = 1.9) and text input was rated with 0.2 on average (SD = 2.0). When asked whether they would use the methods in the future, the pattern was generally the same, but at a lower agreement level. Subjects agreed with 2.8 (SD = 0.4) that they would like to use the touching method. The scores for scanning and pointing were 1.9 (SD = 1.0) and 0.6 (SD = 2.0), respectively. The wish to use text search was expressed with only −1.1 (SD = 1.8).

Figure 8. Likert scale averages on whether users liked a specific identification method and whether they would use this method in the future (−3 = strongly disagree, 3 = strongly agree).

RQ3: Utility and Usability of MobiMed as a Whole
In order to evaluate the perceived usability and general acceptance of MobiMed, we used the System Usability Scale (SUS) [4], which contains a set of 10 items rated on a 5-step Likert scale (1 = strongly disagree, 5 = strongly agree). The set is a mixture of positive and negative statements, such as 'I thought the system was easy to use' (positive), 'I felt very confident using the system' (positive) or 'I thought there was too much inconsistency in this system' (negative). A SUS score ranging from 0 to 100 is then calculated from the individual items (negative statements are inversely incorporated).

Figure 9. Individual aspects of the System Usability Score for MobiMed on a 5-step Likert scale (1 = lowest, 5 = highest agreement with statements). Statements are abbreviated; for exact wording, see [4]. Total SUS score: 88.0.
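The scoring procedure can be sketched as follows (a minimal sketch of Brooke's standard SUS formula [4]; function and variable names are our own):

```python
def sus_score(responses):
    """SUS score (0-100) from ten 1-5 Likert responses, per Brooke [4].

    Odd-numbered (positive) items contribute (response - 1); even-numbered
    (negative) items are inversely incorporated as (5 - response). The
    summed contributions (0-40) are scaled by 2.5 to yield 0-100.
    """
    assert len(responses) == 10, "SUS requires exactly ten item responses"
    total = sum(
        (r - 1) if i % 2 == 0 else (5 - r)  # 0-based index: even index = odd item
        for i, r in enumerate(responses)
    )
    return total * 2.5

# Fully positive answers (5 on positive items, 1 on negative ones) give 100:
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # → 100.0
```

MobiMed's reported score of 88.0 thus corresponds to summed item contributions of 35.2 out of 40.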

Fig. 9 illustrates the SUS values for the individual items. Positive items were usually rated with a score of more than 4, except the statement 'I think that I would like to use this system frequently', which was rated with 3.3. Subjects on average disagreed with negative items, with scores below 1.4. Only one participant stated to 'need the support of a technical person' to use the system. The resulting SUS score is 88.0 out of 100 possible points, which can be considered a good result [18], so we can assume that subjects appreciated the MobiMed app and did not find major usability concerns.

In order to evaluate MobiMed's utility in everyday life, we asked people which information sources they use to learn about medications, their active ingredients, dosage, side effects etc. 75% ask their doctor or pharmacist, 69% read the package insert, 56% stated they consult books or the internet, and 13% use other sources. 75% of subjects use more than one single source to get information on drugs. After the study, 14 out of 16 participants (88%) declared they were interested in using MobiMed as an alternative source to inform themselves about drugs. Not only is this an indicator that subjects appreciated the prototype; MobiMed was also the most popular individual source of information of all.

Additional Feature Suggestions
With an open question, we asked subjects for desired additional features in MobiMed. They came up first and foremost with shopping-related functions: looking up prices, finding cheaper generic products, providing a list of suppliers, and the possibility of direct ordering. Subjects were also interested in active ingredient analysis: MobiMed could suggest products that show fewer cross-correlated side effects for a specific combination of components. One participant suggested a tool that helps diagnose based on symptoms a user enters. Several subjects were interested in a personalized medication management tool that allows managing drug intake, creates medication lists and reminds users to take their pills.

DISCUSSION AND SUMMARY
We investigated the advantages and disadvantages, efficiency and user acceptance of four object identification techniques using the example of a drug information system (RQ1 and RQ2), and evaluated the utility and usability of MobiMed, a prototype implementation thereof (RQ3). We examined the research questions in two steps: qualitatively with 149 users in an online survey, and both quantitatively and qualitatively with 16 subjects in a lab study.

We summarize our findings and observations in five main points. Results from both studies coincide in large parts, although we found some divergences, which we also try to explain below.

1. Physical Mobile Interaction is Popular and Efficient
In the lab study, subjects preferred all physical mobile interaction techniques over conventional text search, and would also prefer them in future use. Quantitative measures showed that physical mobile interaction techniques were faster for identifying drug packages than text search, and that the standard deviation was significantly lower. Today, physical mobile interaction is still widely neglected compared to traditional input approaches, although all hardware requirements are now met (which they were not even a few years ago). Our findings motivate fostering the use of object identification techniques in everyday applications, making interaction with the physical world more effective and intuitive.

2. Touching and Scanning Evaluated Best
In the comparison of the individual techniques, touching and scanning were evaluated as the fastest methods, and were also preferred by study participants. Touching was significantly faster than scanning, and also adopted more positively by subjects in the lab study.

In the online survey, people responded differently: almost half of the respondents stated that they liked scanning most, followed by text search. A reason for this high agreement might simply be that they knew these methods due to their wide popularity, while they could not imagine as well how touching using NFC or pointing using visual search would work. Lab study participants, who tried all modalities, might have been guided by their actual experience rather than by their previous knowledge.

It remains to be seen how the further spread of NFC on the market, e.g. through digital payment services such as Google Wallet, will influence people's attitude towards this interaction technique. Identifying everyday objects with NFC is currently not yet practical, since few products are augmented with tags. However, widespread use of NFC to improve customer services has been suggested previously [15], and could be promoted by cheap tags, which are already integrated in clothing.

3. Visual Search as Interesting Alternative for Future Systems
The pointing paradigm, which we realized by visual feature-based search, was less popular in our study and slightly slower than touching and scanning. However, pointing almost reached the performance of scanning, and had no outliers like scanning had due to unreadable bar codes. We believe that future hardware and improved implementations could further increase the recognition speed and reliability of visual search, and outperform the scanning of visual codes in terms of both speed and usability. The major advantage of visual search is that it works for any object, which opens up this method for a variety of applications beyond identifying marker-augmented products. In the online survey, subjects recognized this potential, mentioning that there is no need to search for the bar code and that it is the most natural way of physical mobile interaction.

4. Best Method Depends on Scenario
We had to confine ourselves to a selected task in the lab study in order to compare the performance of the four identification methods, but subjects were creative about alternative use cases of MobiMed. Mentioned were, among others, product comparisons in the pharmacy, online price checks, a medication diary and pill reminder systems that incorporate side effects of combinations of active ingredients. It became clear that the 'best' interaction method depends on the selected scenario. A particular example is the question whether a method should return unambiguous results or a result list. For product comparisons, multiple results as produced by visual search are desired, while reliable information for drug intake at home requires a method that provides a unique result, e.g. scanning the bar code. Visual search can be an interesting alternative when the bar code is not readable or not visible, e.g. when drug packages are placed behind glass in the pharmacy, or when information should be retrieved from a picture, e.g. on a website. NFC tags are not yet in widespread use, but were included in the comparison as they are a state-of-the-art technology on the rise. Both the short identification time and the compelling user feedback for touching using NFC imply that this method could be an interesting alternative in cases where objects can be augmented with respective tags.

It also turned out that performance is only one factor in user preferences, which suggests future research on the factors that influence the likability of physical mobile interaction techniques for a specific scenario.

5. General Demand for Medical Apps
Responses from both the online survey and the lab study indicate a high level of interest in medical apps. This might be related to an increased awareness of a healthy lifestyle (which includes interest in food supplements and ingredients), but also to a rising need for medication support in light of an aging society. Our studies indicate a high demand for support in managing and retrieving information on drugs, their dosage, effects and interactions, which can be complicated for lay people. MobiMed was evaluated by subjects as a helpful complement to other information sources such as the pharmacist's advice.

OUTLOOK AND FUTURE WORK
In future work, we will obtain more detailed data with respect to human and system performance. The results of this work showed that performance and user preference do not necessarily coincide. They raise the new research question of whether a more 'likable' interaction technique may be an argument against a performant but unpopular method, and what actually accounts for pleasant physical mobile interaction. We will further investigate other scenarios of physical mobile interaction in order to gain insights into adequate identification methods for individual use cases. In particular, we will refine visual feature recognition for the pointing modality to evaluate in which cases it could possibly replace the scanning of bar codes.

ACKNOWLEDGMENTS
We would like to thank Hans Heid for his valuable work with the MedScan prototype. Furthermore, we thank all online and lab study subjects for their participation.

REFERENCES
1. Ailisto, H., Korhonen, I., Plomp, J., Pohjanheimo, L., and Strömmer, E. Realising physical selection for mobile devices. In Mobile HCI 2003 Conference, Physical Interaction (PI03) Workshop on Real World User Interfaces, Udine, Italy (2003).

2. Ballagas, R., Borchers, J., Rohs, M., and Sheridan, J. G. The smart phone: A ubiquitous input device. IEEE Pervasive Computing 5, 1 (Jan. 2006), 70–77.

3. Benelli, G., and Pozzebon, A. Near field communication and health: Turning a mobile phone into an interactive multipurpose assistant in healthcare scenarios. Biomedical Engineering Systems and Technologies (2010), 356–368.

4. Brooke, J. SUS – a quick and dirty usability scale. Usability Evaluation in Industry 189 (1996), 194.

5. Cochran, W., and Cox, G. Experimental Designs. John Wiley & Sons, 1957.

6. Föckler, P., Zeidler, T., Brombach, B., Bruns, E., and Bimber, O. PhoneGuide: museum guidance supported by on-device object recognition on mobile phones. In Proceedings of the 4th International Conference on Mobile and Ubiquitous Multimedia, MUM '05, ACM (New York, NY, USA, 2005), 3–10.

7. Hardy, R., and Rukzio, E. Touch & interact: touch-based interaction of mobile phones with displays. In Proc. of the 10th Intl. Conference on Human-Computer Interaction with Mobile Devices and Services, MobileHCI '08, ACM (NY, USA, 2008), 245–254.

8. Kranz, M., Möller, A., Hammerla, N., Diewald, S., Plötz, T., Olivier, P., and Roalter, L. The mobile fitness coach: Towards individualized skill assessment using personalized mobile devices. Pervasive and Mobile Computing, Special Issue on Mobile Interaction with the Real World (2012), DOI=10.1016/j.pmcj.2012.06.002.

9. Kranz, M., and Schmidt, A. Prototyping smart objects for ubiquitous computing. In Proceedings of the International Workshop on Smart Object Systems in Conjunction with the Seventh International Conference on Ubiquitous Computing (Sept. 2005).

10. Mäkelä, K., Belt, S., Greenblatt, D., and Häkkilä, J. Mobile interaction with visual and RFID tags: a field study on user perceptions. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '07, ACM (New York, NY, USA, 2007), 991–994.

11. Nistér, D., and Stewénius, H. Scalable recognition with a vocabulary tree. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, IEEE (2006), 2161–2168.

12. O'Neill, E., Thompson, P., Garzonis, S., and Warr, A. Reach out and touch: using NFC and 2D barcodes for service discovery and interaction with mobile devices. Pervasive Computing (2007), 19–36.

13. Querini, M., Grillo, A., Italiano, G. F., and Lentini, A. 2D color barcodes for mobile phones. International Journal of Computer Science & Applications, Special Issue on Multimedia Applications, vol. 8 (2011), 136–155.

14. Reischach, F., Michahelles, F., Guinard, D., Adelmann, R., Fleisch, E., and Schmidt, A. An evaluation of product identification techniques for mobile phones. In Proceedings of the 12th IFIP TC 13 International Conference on Human-Computer Interaction: Part I, INTERACT '09, Springer-Verlag (Berlin, Heidelberg, 2009), 804–816.

15. Resatsch, F., Karpischek, S., Sandner, U., and Hamacher, S. Mobile sales assistant: NFC for retailers. In Proceedings of the 9th International Conference on Human-Computer Interaction with Mobile Devices and Services, MobileHCI '07, ACM (New York, NY, USA, 2007), 313–316.

16. Rukzio, E. Physical Mobile Interactions: Mobile Devices as Pervasive Mediators for Interactions with the Real World. PhD thesis, 2006.

17. Rukzio, E., Leichtenstern, K., Callaghan, V., Holleis, P., Schmidt, A., and Chin, J. An experimental comparison of physical mobile interaction techniques: Touching, pointing and scanning. UbiComp (2006), 87–104.

18. Sauro, J. Measuring usability with the System Usability Scale (SUS). http://www.measuringusability.com/sus.php, 2011.

19. Schroth, G., Al-Nuaimi, A., Huitl, R., Schweiger, F., and Steinbach, E. Rapid image retrieval for mobile location recognition. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE (2011), 2320–2323.

20. Siorpaes, S., Broll, G., Paolucci, M., Rukzio, E., Hamard, J., Wagner, M., and Schmidt, A. Mobile interaction with the internet of things. In Adjunct Proc. of Pervasive, Citeseer (2006).

21. Välkkynen, P., Korhonen, I., Plomp, J., Tuomisto, T., Cluitmans, L., Ailisto, H., and Seppä, H. A user interaction paradigm for physical browsing and near-object control based on tags. In Mobile HCI, Physical Interaction Workshop on Real World User Interfaces (Udine, Italy, 2003), 31–34.

22. Want, R., Fishkin, K., Gujar, A., and Harrison, B. Bridging physical and virtual worlds with electronic tags. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems: The CHI is the Limit, ACM (1999), 370–377.

23. Wu, J., Pan, G., Zhang, D., Li, S., and Wu, Z. MagicPhone: pointing & interacting. In Proceedings of the 12th ACM International Conference Adjunct Papers on Ubiquitous Computing, Ubicomp '10, ACM (New York, NY, USA, 2010), 451–452.

