Graspables: Grasp-Recognition as a User Interface

Brandon Taylor
MIT Media Lab, 20 Ames St. E15-346
Cambridge, MA 02139
[email protected]

V. Michael Bove, Jr.
MIT Media Lab, 20 Ames St. E15-368B
Cambridge, MA 02139
[email protected]

    ABSTRACT

The Graspables project is an exploration of how measuring the way people hold and manipulate objects can be used as a user interface. As computational ability continues to be implemented in more and more objects and devices, new interaction methods need to be developed. The Graspables System is embodied by a physical set of sensors combined with pattern recognition software that can determine how users hold a device. The Graspables System has been implemented in two prototypes, the Bar of Soap and the Ball of Soap. Applications developed for these prototypes demonstrate the effectiveness of grasp-recognition as an interface in multiple scenarios.

    Author Keywords

    Grasp, User Interface.

    ACM Classification Keywords

H.1.2: Models and Principles: User/Machine Systems. H.5.2: Information Interfaces and Presentation: User Interfaces. K.8.0: Personal Computing: General. J.7: Computer Applications: Computers in Other Systems.

    INTRODUCTION

The Graspables are devices developed to explore how measuring the way people grasp objects can enhance user interfaces. This paper will first attempt to explain the rationale and inspiration behind using grasp-recognition as an interface. It will provide a detailed description of how the sensors and software were implemented in the two Graspables prototypes. Next follows a discussion of the applications that have been developed to explore and demonstrate the capabilities of grasp-recognition. Lastly, there will be a discussion of our experiences with grasp-recognition, including methods for evaluation and improvement.

The origin of the Graspables can be traced back to a high-level discussion of ways to improve multi-function handheld devices. It was suggested that an ideal multi-function device would need to be capable of two things: it would need to automatically infer what users want to do with it and it would need to be able to alter its affordances accordingly. When it wasn't being used, the device would simply appear to be an undifferentiated block, like a bar of soap.

While the Graspables may not completely fulfill this vision, the idea of creating devices that implicitly understand users' intentions, without the need for menus and direct commands, was the launching point for the project. As the project evolved, emphasis shifted away from multi-function handhelds to exploring how basic manipulations of objects can contain useful information. In other words, what can you learn from the way people hold an object? Can you distinguish whether a user wants to make a phone call or just look up a contact by the way they hold their phone? Can a golf club predict a slice if it is gripped improperly?

In pursuing these questions, the Graspables were constrained by the desire to have a system that could be realistically implemented in existing objects and devices. This led us to shy away from approaches that require elaborate sensing environments or expensive input devices. The hope was that the right combinations of sensors and software could give objects an enhanced understanding of their users' actions without limiting portability or affordability.

Another key aspect of the research was the focus placed on objects themselves. Instead of focusing on just creating a new interface method or a specific type of controller, we were very interested in understanding and exploring how people interact with a variety of different objects. Our view was that understanding how people grasp and interact with a coffee cup is potentially just as valuable as understanding how they interact with electronics. Thus, we wanted a system that could be implemented into arbitrary geometries.

    Background

Nearly twenty years ago, Mark Weiser coined the term Ubiquitous Computing to describe the idea of a vast network of computing devices interacting unobtrusively to enhance productivity. While the proliferation and dispersion of computational power has certainly occurred, it has not yet "vanish[ed] into the background" [17].


Projects like the Graspables fit into the realm of Ubiquitous Computing by trying to expand the ways in which computers are controlled. By developing grasp-recognition as a user interface, it is hoped that users can be presented with a more natural method of interacting with devices. Instead of seeing a device and trying to imagine how its menu system and buttons are mapped, grasp-recognition can leverage users' intuitions about how devices should be used for certain functions.

Studies have been performed demonstrating how certain computerized tasks can be more easily accomplished when properly modeled by physical devices [5]. Early work on the concept of Graspable User Interfaces suggested that by facilitating 2-handed interactions, spatial caching, and parallel position and orientation control, physical computing could provide a richer interface than virtual, graphics-based interfaces [4].

A grasp-recognition based interface would, by virtue of its nature, capitalize on these advantages. Rather than creating controls through arbitrary key-mappings, the physical nature of the Graspables provides suggestive physical affordances and passive haptic feedback by approximating and representing real-world objects.

    Motivation

Portable device interfaces provide a distinct challenge for designers. For more complex portable systems, there is a natural desire to mimic the operational semantics of the computer as much as possible. People are accustomed to the window metaphor of most desktop computer GUIs, so it makes sense to leverage this knowledge to some extent.

A common approach in portable devices is to imitate the clicking and dragging functions of a mouse with a touchscreen. Full keyboards are often implemented either in the form of physical buttons or virtual ones. Both approaches have drawbacks. Physical buttons are necessarily small and always present, even when an application only needs a subset of the keys. Virtual buttons, on the other hand, provide no tactile feedback, which can render them unusable to certain groups of users.

These issues have led researchers to explore other interaction methods that may end up being more appropriate for handheld devices. A common example is the use of accelerometers in many cameras and phones for switching between portrait and landscape views. Another approach is to capitalize on the inherent mobility of handheld devices by exploring gestures as an interaction method. Studies have explored using gestures for tasks ranging from the detection of common usage modes [9] to the mapping of functions to relative body positions [1]. While it is hard to predict what new interfaces will catch on, successes like the iPhone's Multi-Touch display provide encouragement for continuing research.

In implementing the Graspables System into the Bar of Soap and the Ball of Soap, we were interested in how the devices' geometries impact what objects they can easily represent.

The most common input devices, such as mice, keyboards and even video game controllers, generally sacrifice representation in favor of more robust, general controls. Over time, these systems develop semantics of their own (think how similar most video game controllers are, or how people expect to be able to click icons in graphical interfaces) and people hardly even think about the control metaphors. However, there are exceptions.

Tablet and stylus systems exist to better bridge the gap between writing or drawing and computers. Video games can use specialized peripherals such as steering wheels or guns. These examples highlight the importance of objects for some tasks. While there is likely no way to completely avoid the tradeoff between robust and representative controls, it is certainly worth exploring how new interfaces can create more literal interactions and of what value these may be.

RELATED WORK

The Huggable is a robotic Teddy Bear being designed by the Personal Robots group at the MIT Media Lab to provide therapeutic interactions similar to those of companion animals. The Huggable is being developed with the goal of properly detecting the affective content of touch [15]. Towards this end, the Huggable is equipped with an array of sensors that detect the proximity of a human hand, measure the force of contact and track changes in temperature. The data from these sensors is then processed to distinguish interactions such as tickling, poking or slapping [14].

From a technical perspective, the goals of the Huggable are very similar to those of the Graspables System. Both seek to identify and understand the ways users manipulate an object. In many ways, the Huggable could be viewed as a sophisticated example of a grasp-recognition system. That said, there are obvious differences between the Graspables System described in this paper and the hardware/software system of the Huggable. The sensing hardware of the Huggable, for example, relies on dense arrays of Quantum Tunneling Composite (QTC) force sensors and broad electric field sensors for touch sensing, whereas the Graspables are implemented with a dense set of capacitive sensors. Additionally, whereas the Huggable is intimately connected to the Teddy Bear form factor, our work demonstrates a system that can readily be adapted into various geometries for different uses.

The Tango is a whole-hand interface designed by the Multisensory Computation Laboratory at Rutgers for the manipulation of virtual 3D objects [11]. The device is a hand-sized spherical object with a 3-axis accelerometer and an 8x32 capacitive sensing grid housed in a compressible dielectric material. The Tango is calibrated to detect variations in pressure from which a simplified hand model can be estimated.

The Tango uses spherical harmonics to create a rotationally invariant map of pressures [8]. These pressure maps can then be reduced using principal component analysis and classified using k-nearest neighbors. A 3D virtual environment in which the Tango was used to manipulate virtual objects was also developed.
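
To make the reduce-then-classify pattern concrete, a generic Python sketch follows. This is not the Tango's actual implementation; the array shapes, component count, and neighbor count are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical training data: each row is a flattened 8x32 pressure map.
X_train = np.random.rand(200, 8 * 32)        # placeholder pressure maps
y_train = np.random.randint(0, 5, size=200)  # placeholder grasp labels

# Reduce the 256-dimensional maps to a few principal components, then
# classify new maps by their nearest neighbors in the reduced space.
pca = PCA(n_components=10)
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(pca.fit_transform(X_train), y_train)

def classify_pressure_map(pressure_map):
    """Label a single 8x32 pressure map with its nearest-neighbor grasp class."""
    reduced = pca.transform(pressure_map.reshape(1, -1))
    return knn.predict(reduced)[0]
```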

While the Tango clearly shares certain objectives with the Graspables, there are significant differences in their respective implementations. First, the grid structure of the capacitive sensors and the classification software of the Tango would not directly translate to other device geometries, severely limiting the number of objects it could represent. Also, since the Tango is actually attempting to infer general hand poses, it requires additional constraints, such as single hand use. In the end, while the sensing techniques and software analysis provide interesting references, the goals of the Tango require a significantly different approach than those of the Graspables.

When development began on the first version of the Bar of Soap, a similar study was being conducted by the Samsung Advanced Institute of Technology (SAIT) [2,7]. After receiving encouraging results from an initial study in which painted gloves were used to create image maps of grasp patterns, a prototype device was built for real-time grasp detection. The SAIT device contained a 3-axis accelerometer and 64 capacitive sensors. A user study was performed to try to classify 8 different use modes with the device.

The results from the SAIT study match up well with those of the initial Bar of Soap study [16], correctly classifying 75% to 90% of grasps across multiple users. The SAIT device uses non-binary capacitive sensors, a different sensor layout on a device of different physical dimensions, a unique set of use modes and different classification techniques from the Bar of Soap. In addition to these differences, our research goes beyond static grip recognition to explore how changing grasps and gestures can enhance interactions. We also look beyond common handheld electronics to see how grasp recognition can impact interactions with physical objects and virtual environments.

    DESIGN

The Graspables System is a hardware and software platform capable of detecting how a user is manipulating a device. The system needs to be flexible enough to accommodate distinct sensor layouts for objects with different physical geometries. It is also important that the system be able to process and transmit data in real-time.

    The Bar of Soap

The Bar of Soap was designed to explore how grasp-recognition could be of use in a variety of modern handheld devices. It also served as a test bed for the Graspables System's hardware.

The Bar of Soap prototype, shown in Figure 1, is an 11.5x7.6x3.3 cm rectangular box containing a 3-axis accelerometer and 72 capacitive sensors. The capacitive sensors are controlled by three Qprox QT60248 chips, which treat each one as a discrete, binary sensor. An Atmel Atmega644 microcontroller in the device samples these sensors and can communicate results to a PC via Bluetooth. Low-power cholesteric LCD screens on the two largest faces can provide user feedback. Transparent capacitive sensors were developed and placed over both screens. This allowed the display surfaces to also function as sensing surfaces, which in turn allowed the Bar of Soap to better emulate functional devices with interactive touchscreens. Having screens on both sides preserved the symmetry of the device and allowed the Bar of Soap to function as a generic rectangular sensing device with two customizable faces.
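
The paper does not specify the wire format, but a sketch of how a host PC might unpack such a stream is shown below (Python with pyserial; the 12-byte frame layout, the bit packing, and the port name are all hypothetical):

```python
import serial  # pyserial; the Bar of Soap streams over a Bluetooth serial port

FRAME_BYTES = 12  # assumed frame: 9 bytes of sensor bits + 3 accelerometer bytes

def read_frame(port):
    """Read one hypothetical frame: 72 binary capacitive readings packed into
    9 bytes, followed by one offset-binary byte per accelerometer axis."""
    raw = port.read(FRAME_BYTES)
    touch = [(raw[i // 8] >> (i % 8)) & 1 for i in range(72)]  # unpack sensor bits
    accel = tuple(b - 128 for b in raw[9:12])                  # signed axis values
    return touch, accel

# Usage (port name and baud rate are assumptions):
# with serial.Serial("/dev/rfcomm0", 115200) as port:
#     touch, accel = read_frame(port)
```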

The transparent sensors were created by placing thin film coated with Indium Tin Oxide (ITO) on opposite sides of a clear piece of acrylic. ITO is a transparent conductive material that fills the role of the interdigitated copper traces of the other capacitive sensors. We tested ITO sensors in a variety of patterns and sizes before choosing strips of approximately 5mm width. This design has the advantage of being relatively simple to construct while providing sensitivity to finger touches comparable to the copper traces used elsewhere. Settings in the QT60248 chip were able to amplify the responses of sensors located further along the ITO to compensate for its resistance. Response was further improved by lining the edges of the device with a grounded copper strip.

    Figure 1. The Bar of Soap

    The Ball of Soap

As the Bar of Soap evolved from an exploration of ways to improve multi-function handhelds into a more general platform to explore grasp-recognition, we began to consider the limitations of its physical form. While a small rectangular box provides an adequate representation of many handheld electronics, it has inherent limitations.

In order to explore interactions with different physical forms, the Ball of Soap was developed. Since a truly spherical object would create difficulties in laying out the capacitive sensors used in the Bar of Soap, we built the Ball of Soap as a small rhombicosidodecahedron. This 62-sided Archimedean solid provides flat surfaces near the quarter square inch size and surface density of the Bar of Soap's sensors when the overall diameter approaches three inches.

    Figure 2. The unassembled Ball of Soap

The surface structure of the Ball of Soap prevented the simple printing of interdigitated copper traces used as sensors on the Bar of Soap. Instead, adhesive copper pads were cut and attached to the faces with wires running to circuit boards inside the ball. We explored using a variety of trace shapes for each of the different face geometries, but found that only the smallest, triangular arrangement provided a consistent response across the capacitive sensors.

As can be seen in Figure 2, the small rhombicosidodecahedron shape also allowed the Ball of Soap to be separated into three sections for easier assembly. The two end pieces are identical and each contains a Qprox chip that controls the 23 capacitive sensors on its surface. The center piece has 16 faces, 15 of which have capacitive sensors and one that houses the power button and programming interface. Inside the Ball, attached to the center piece, is the main circuit board with the microcontroller, accelerometer, battery and Bluetooth chip. A grounding bracelet can be attached to improve sensor response.

    APPLICATIONS

In the process of developing the Graspables System, applications were always a consideration. While it is hoped that the grasp-recognition technique is general enough to be applied to many other scenarios, specific objectives strongly influenced the design of the prototypes. This section will discuss the applications that have been developed for the Graspables implementations.

    Multi-Function Handheld

As handheld electronics have become more powerful, it has become possible to add more functionality to individual devices. Whereas phones used to be just phones, most modern cell phones now take pictures, play music, browse the internet and more. One difficulty for designers has been how to lay out the appropriate affordances for these various functions given the small size of the devices. We noticed that many of the functions now provided by multi-function handhelds have been adopted from devices that people are accustomed to holding and operating in distinct ways. The Bar of Soap as a multi-function handheld application is designed to capitalize on people's previous experiences by inferring how they want to use the device based on how they are holding it.

This application provides a very self-contained demonstration of the potential of grasp-recognition. The device passively senses its orientation and the position of a user's hands, and then displays an interface corresponding to the most likely functionality mode. For demonstration purposes the sampling and classification routine is performed every three seconds, but it could easily be triggered by some sort of gesture. Figure 3 shows the Bar of Soap's multi-function handheld mode switching application in use.

    Figure 3. The Bar of Soap being held in a phone grasp

Currently, the application switches between five functionality modes: camera, gamepad, phone, personal data assistant (PDA), and remote control. These modes were chosen to represent common, pre-existing handheld devices that were assumed to have relatively distinct interaction methods. The data used to train the Bar of Soap to distinguish the different modes was gathered from users who were given a functionality mode (camera, gamepad, phone, PDA or remote) and asked to hold the device however they felt was appropriate given the mode.


Rubik's Cube

Increasingly, gesture recognition using accelerometers and other sensors is being incorporated into handheld device interfaces [13,6]. To demonstrate that the Graspables are not just limited to distinguishing a small number of static grasps, we designed a virtual Rubik's cube application. This application makes use of gesture recognition to control a virtual object using a tangible real-world interface.

Figure 4. The Bar of Soap controlling a virtual Rubik's cube

This application, shown in Figure 4, exists as a Matlab script that streams raw sensor data from the Bar of Soap via Bluetooth. A graphical version of a Rubik's cube is displayed on screen and is mapped to the Bar of Soap's orientation as determined by the accelerometer data. Rotating the ends of the virtual Rubik's cube is accomplished by sliding a finger across the sensors that are most closely mapped to that end on the Bar of Soap.

The Rubik's cube application demonstrates how grasp-recognition can be used to provide a more tangible interface to virtual environments. By mapping the orientation of the virtual object to the real world device, changing viewpoints is incredibly intuitive. Selecting different virtual objects can be as simple as picking up a different grasp-recognition equipped device. Lastly, this application shows how the sensors used for distinguishing static grasps can also be used to recognize dynamic manipulations and gestures.

    Pitch Selection

In baseball, subtle differences in the way the pitcher grips the ball have a profound effect on the outcome of the pitch. Thrown correctly, a slider can become nearly unhittable as it darts away from the batter at the last second. However, the slightest error in its delivery can see the pitch landing hundreds of feet in the wrong direction.

Given the importance of fine finger manipulations on pitching, this seems an ideal scenario for the Graspables System. A baseball that can detect how it's being held could be extremely useful in training players to throw certain pitches or diagnose potential delivery issues. On the other hand, baseball video games could use such a device to provide a method of pitch selection that is more realistic and engaging than pushing a button on a controller.

While individual grasps may vary slightly from pitcher to pitcher, in general the outcome (pitch type) is mapped to a certain grip relative to the baseball's seams. Given these previously defined grasps, training data was acquired by having a single user appropriately hold the Ball of Soap for a set of pitch types. For this application, the skin of a baseball was wrapped around the Ball of Soap. Due to the four-way symmetry of a baseball, each individual pitch type can be thrown with the ball in four unique absolute orientations. Since the sensors on the Ball of Soap are aligned to the absolute orientation of the Ball rather than relative to the baseball's seams, training data was collected for each pitch being held in the four different absolute orientations.

Figure 5. The Ball of Soap with a baseball cover as a pitch selector

The pitch selection application, shown in Figure 5, operates as a Matlab script. Upon activation, Matlab opens a serial port for communication with the Ball of Soap and a screen presents the user with a pitcher ready to throw. The user then grips the Ball appropriately to throw a fastball, a curveball, or a splitter and makes a throwing gesture. The acceleration of the Ball triggers the classification routine, which in turn triggers an animation taken from Nintendo's Mario Super Sluggers videogame to display the selected pitch.

    DATA ANALYSIS

In addition to building the Graspables hardware, we also had to develop methods for making sense of the data that it produced. A significant amount of work was put into creating appropriate data features and classification methods for each of the applications.


    Multi-Function Handheld Data

Each time the Bar of Soap is sampled, data are gathered from 75 independent sensors. In order to create a more manageable feature space, we needed a way to process these data before applying classification techniques.

While we were testing the first version of the Bar of Soap (which had no capacitive sensors on one of the largest faces) we developed a Matlab script to visualize the data. This visualization tool, shown in Figure 6, had checkboxes laid out to represent the capacitive buttons, a 3D plot of the accelerometer readings and an icon display that could either present a sample's class label or the result of a classifier.

Figure 6. A visualization of the data of a phone grasp. This visualization layout was designed for the Bar of Soap V1, which did not have sensors on the front face.

Using this visualization tool, we noticed that the capacitive sensors had a strong tendency to be activated in regional groups. Additionally, it became clear that treating the sensors as having an absolute location reduced the accuracy of classifiers. To account for these facts, we began reducing the dataset by counting the number of activated capacitive sensors along each face (splitting the largest two faces into two halves) and orienting these groups according to the accelerometer readings. This method outperforms datasets reduced using Principal Component Analysis and Fisher Linear Discriminants, and automatically adjusts when the device is rotated or flipped.
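
A sketch of this reduction appears below (Python; the sensor-to-group table and the np.roll reorientation rule are illustrative stand-ins for the actual grouping and the accelerometer-based reordering):

```python
import numpy as np

# Placeholder sensor-to-group table: ten groups (four side faces plus each
# large face split into two halves); the real assignment follows the layout.
SENSOR_GROUP = [i % 10 for i in range(72)]

def grasp_features(touch, accel):
    """Count active sensors per face group, then reorder the groups by the
    device's orientation. The np.roll step is a crude stand-in for the
    paper's accelerometer-based reorientation of the groups."""
    counts = np.zeros(10)
    for sensor, active in enumerate(touch):
        if active:
            counts[SENSOR_GROUP[sensor]] += 1
    axis = int(np.argmax(np.abs(accel)))  # dominant gravity axis (0, 1 or 2)
    flipped = int(accel[axis] < 0)        # right-side-up vs. upside-down
    return np.roll(counts, 2 * axis + flipped)
```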

Rubik's Cube Data

In order to control the virtual Rubik's cube, data is rapidly sampled via Bluetooth. The accelerometer data is smoothed and mapped to the orientation of the virtual cube.

Additionally, sliding a finger across different faces of the Bar of Soap triggers a rotation of the corresponding part of the Rubik's cube. While it might be possible to detect sliding gestures in a simpler manner, it was desired that the method be generalizable to more complex gestures. Thus, we chose to implement hidden Markov models [12] to detect the sliding gestures.

In order to train the hidden Markov models, data was collected and labeled as a single user slid his finger over the face of the Bar of Soap. The sliding gesture was recorded in both directions along each edge of the Bar of Soap and along the outermost rows and columns of sensors on the largest two faces. While the sliding gesture was being recorded on a specific side, no particular attention was placed on how the user was holding the Bar of Soap. This ensured that data about manipulations that were not sliding gestures was also recorded.

The sliding gestures were modeled using a left-right hidden Markov model. The states represented the position of the finger activating either a single capacitive sensor or two as it slides between them. The number of states in the model depended upon the number of capacitive sensors on the side that was being modeled. In addition to the left-right HMMs modeling the sliding gestures, ergodic models exist to model general, non-sliding interactions.

These models are trained using the raw sensor data as observation sequences. The sliding models are trained using the corresponding sliding gestures. The general ergodic model is trained using the data from sliding gestures that do not correspond to the modeled area. For example, the data set that represents a sliding gesture along the short edge of the Bar of Soap is used to train the ergodic model of the long side.

The time sequences of activated capacitive sensors are then broken up into observation sequences corresponding to the different gesture models. The trained models are used to calculate the probability of observing such a sequence. If a sequence has a higher probability of being observed given one of the sliding gesture models, a sliding event is triggered.
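
A minimal sketch of this decision rule follows, using the standard scaled forward algorithm to score a sequence under each model. The probability tables are assumed to come from training; this is not the paper's Matlab code.

```python
import numpy as np

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(observation sequence | HMM).
    obs is a sequence of discrete symbols (which sensor region is active);
    start, trans, emit are the model's (assumed pre-trained) numpy tables."""
    alpha = start * emit[:, obs[0]]
    log_p = 0.0
    for symbol in obs[1:]:
        alpha = (alpha @ trans) * emit[:, symbol]
        scale = alpha.sum()
        log_p += np.log(scale)  # accumulate in log space to avoid underflow
        alpha /= scale
    return log_p + np.log(alpha.sum())

def detect_slide(obs, slide_models, ergodic_model):
    """Trigger a slide event only when some left-right sliding model explains
    the sequence better than the general ergodic model does."""
    scores = [log_likelihood(obs, *m) for m in slide_models]
    best = int(np.argmax(scores))
    if scores[best] > log_likelihood(obs, *ergodic_model):
        return best  # index of the winning sliding gesture model
    return None      # treat as a general, non-sliding manipulation
```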

If the sliding gesture is performed on either of the largest faces, the rotation will occur in the direction of the sliding gesture and on the corresponding end of the virtual cube. To rotate an end along the axis that is mapped perpendicular to the front and back face, the user simply places their hand over one of the large faces and slides their finger along one of the edge faces in the direction of desired rotation. The virtual cube will interpret the covered side as the stationary side of the cube and rotate the opposite side.

    Pitch Selection Data

As with the Bar of Soap as a multi-function handheld, the capacitive sensors are grouped in order to reduce the size of the feature space. Instead of grouping by sides, as in the Bar of Soap, capacitive buttons are grouped around the 12 pentagonal faces of the small rhombicosidodecahedron. Each pentagonal face is surrounded by five square faces, shared with a single other pentagonal face, and five triangle faces, shared with two other pentagonal faces. These faces are weighted inversely to the number of groups they inhabit: an active pentagonal face receives a weight of 6, a square 3, and a triangle 2. Thus each of the twelve groups creates a feature with a value between 0 and 31 depending on the number of activated faces.
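
The weighting can be written down directly. In the sketch below the face-adjacency table is a placeholder, since the real one follows the solid's geometry, but the 6/3/2 weights and the 0-31 range match the description above:

```python
# Placeholder adjacency for the 62 faces (pentagons 0-11, squares 12-41,
# triangles 42-61); the real table comes from the solid's geometry.
PENT_GROUPS = [
    {"pent": p,
     "squares": [12 + (5 * p + i) % 30 for i in range(5)],
     "triangles": [42 + (5 * p + i) % 20 for i in range(5)]}
    for p in range(12)
]

def pitch_features(active_faces):
    """One feature per pentagonal group, in the range 0-31: an active pentagon
    contributes 6, each active neighboring square 3, and each neighboring
    triangle 2 (weights inverse to the number of groups a face belongs to)."""
    features = []
    for group in PENT_GROUPS:
        value = 6 * (group["pent"] in active_faces)
        value += 3 * sum(face in active_faces for face in group["squares"])
        value += 2 * sum(face in active_faces for face in group["triangles"])
        features.append(value)
    return features

# Usage sketch: faces 0, 12 and 42 currently touched.
# print(pitch_features({0, 12, 42}))
```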


The pitch recognition application for the Ball of Soap operates very similarly to the multi-function mode switching application for the Bar of Soap. The capacitive sensors are grouped and processed as discussed above, then Bayesian discriminants are calculated. Each of the four pitch orientations is treated as a separate class. Thus, for N pitch types, 4xN discriminants are calculated.

The classification routine is triggered by a throwing gesture. Thus, though the sensors are continually sampled, the discriminant functions are not calculated until the accelerometer values surpass a threshold. When the threshold is crossed, the discriminants are calculated and the most likely pitch is selected.
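
The paper does not state which form of Bayesian discriminant is used; a diagonal-Gaussian version with an accelerometer-magnitude trigger is sketched below as one plausible reading (the threshold value is invented). Since each pitch appears as four orientation classes, collapsing the winning class back to its pitch type is a final lookup.

```python
import numpy as np

class PitchClassifier:
    """Per-class Gaussian discriminants with diagonal covariance. The Gaussian
    form is an assumption; each (pitch type, orientation) pair is a class."""

    def fit(self, X, y):
        self.classes = np.unique(y)
        self.stats = {}
        for c in self.classes:
            Xc = X[y == c]
            self.stats[c] = (Xc.mean(axis=0),
                             Xc.var(axis=0) + 1e-6,     # avoid zero variance
                             np.log(len(Xc) / len(X)))  # log prior
        return self

    def predict(self, x):
        def discriminant(c):
            mu, var, log_prior = self.stats[c]
            return log_prior - 0.5 * np.sum(
                np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
        return max(self.classes, key=discriminant)

THROW_THRESHOLD = 2.0  # in g; an invented value for the throwing gesture

def maybe_select_pitch(model, features, accel):
    """Evaluate the discriminants only once the acceleration magnitude
    crosses the throw threshold; otherwise keep sampling."""
    if np.linalg.norm(accel) > THROW_THRESHOLD:
        return model.predict(np.asarray(features))
    return None
```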

    EVALUATION

One of the greatest difficulties our research faced was finding an effective method for evaluating the Graspables. On a general level, this is a difficulty facing any new method of interface. If it really is novel, then task completion comparisons with existing interfaces overlook new possibilities offered by the interface. On a more specific level, grasp-recognition is difficult to evaluate because of the lack of any ground truth. Even in situations where a grasp is universally recognized, such as the relations of fingers to baseball seams when throwing a four-seam fastball, the exact grip will vary from person to person based on hand size. The problem becomes even more challenging when dealing with less defined grasps. After all, who's to decide what the proper way to grasp a phone is?

This isn't to say that no evaluation can be done. Obviously, for grasp-recognition to be of any value, grasps must at least have enough meaning to be remembered and repeated by individuals. We conducted a two-part user study to examine, first, how reliably our system could recognize grasps, and second, how grasps associated with various devices vary across a population.

This study was conducted using the first version of the Bar of Soap's Multi-Function Handheld application. The procedure was the same for both parts of the study, the only difference being that the first part was conducted using a single user, whereas the second part was conducted with a total of thirteen individuals.

For this study, users were seated with the Bar of Soap in front of them on a table. They were told that they would be given a specific functionality mode (camera, gamepad, PDA, phone or remote control) and that they should then pick up the device and interact with it however they saw fit until instructed to set it back down.

After giving these instructions, we would begin recording data from the Bar of Soap and then verbally indicate what functionality mode the device should be treated as. Once the user had established a relatively stable pose with the device, we would label and save the data sample and have the user place the device back on the table. We then repeated this process with each user until we had data samples from each of the five tested functionality modes. For the single user part, we collected 39 sample grasps in each functionality mode for a total of 195 grasps. For the multiple users part, each of the 13 users provided three sample grasps per mode to create a matching data set.

Classifier     Single User Test   Multi-User Test
Templates           82.2%              75.4%
Neural Nets         92.5%              79.0%
KNN                 95.0%              75.8%
Parzen              95.4%              72.3%
Bayes               95.0%              79.0%
GLD                 87.5%              70.3%

Table 1. Recognition rates for different classification techniques for both single and multiple user datasets.

For this study we tested a wide range of classifiers including Templates, Neural Networks, Bayesian Classification, k-Nearest Neighbors, Parzen Windows and General Linear Discriminants [3]. To test the reliability of the system, we used the single user dataset. Each classifier was trained using a randomly selected set of 29 sample grasps from each mode, then tested on the remaining 10 samples. This process was repeated 10 times using different training sets. To see how consistently grasps are associated with devices across multiple users, we trained the classifiers using the entire single user dataset and tested them with the multiple user dataset. The recognition rates for both studies are shown in Table 1.
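
The single-user protocol maps naturally to a small evaluation loop. The sketch below assumes a scikit-learn-style fit/predict classifier and mirrors the 29/10 split repeated 10 times; the classifier factory is a placeholder, not the paper's implementation.

```python
import numpy as np

def evaluate(make_classifier, X, y, train_per_mode=29, repeats=10, seed=0):
    """Mean accuracy over random splits mirroring the paper's protocol:
    per functionality mode, train on 29 random grasps, test on the other 10."""
    rng = np.random.default_rng(seed)
    accuracies = []
    for _ in range(repeats):
        train = np.zeros(len(y), dtype=bool)
        for mode in np.unique(y):
            idx = rng.permutation(np.flatnonzero(y == mode))
            train[idx[:train_per_mode]] = True
        model = make_classifier().fit(X[train], y[train])
        accuracies.append(np.mean(model.predict(X[~train]) == y[~train]))
    return float(np.mean(accuracies))

# Usage sketch with a stand-in classifier (any fit/predict object works):
# from sklearn.naive_bayes import GaussianNB
# print(evaluate(GaussianNB, features, labels))
```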

We were pleased to find that multiple classification techniques were able to correctly identify a single user's grasps with over 90% accuracy. This led us to conclude that our sensor design was adequate for grasp-recognition. The relatively high rates of recognition across multiple users further encouraged us that, even without prompting or demonstration, people do have similar models of how to grasp things. In the end, we chose to implement the Bayesian classifier due to its high performance and minimal computational complexity.

    DISCUSSION

While we were generally pleased with the results of our user study, there are a few caveats that must be kept in mind. Our study group varied in hand size and handedness; however, all the participants were fairly homogeneous in age (18 to 36) and familiar with electronic devices. Additionally, the tested functionality modes were selected based on the assumption that they would have distinct grasps associated with them. A large source of error in the multi-user study arose when this assumption failed. For example, when users held the phone as though they were looking up a number rather than up to their ear as though speaking on it, the grasp was often classified as a PDA.

Another problem the Graspables encountered is that the system tends to show a significant bias regarding hand size. While this can be overcome by explicitly training the classifiers for different users or by implementing a learning algorithm, it would be desirable to have a more universal response.

There is also the common question of how many different grasps the system is capable of recognizing. Unfortunately, we do not have a simple answer. From our experience, the limiting factor is not the sensing system as much as the application. Theoretically, the binary capacitive sensors on the Bar of Soap can distinguish 2^72 different grasps in a number of different orientations. While the anatomy of the hand would make that infeasible, the fact is there are not many applications where shifting a finger a fraction of an inch creates a meaningful difference in the user's mind. The Ball of Soap's pitch selection is an example where minor changes in grip could have a meaningful impact. However, even in that case it would be difficult to draw distinctions between minor grip changes without extensively measuring and studying live pitchers and ball trajectories.

While the limited number of meaningful ways to hold a rectangular box may make the Graspables System seem like overkill, we feel that our sensor density has other benefits. As the Rubik's cube application demonstrates, a grasp-recognition system can be used for more than just recognizing distinct grasps. Virtual environments could make use of more precise mappings of hand position. Another possibility is that a Twister-like finger dexterity game could make use of far more grasp combinations than would naturally be used.

Aside from the study used to test the accuracy of the grasp-recognition, many users have informally interacted with the Graspables. It is interesting how quickly users respond to the idea of grasp-recognition. For the Bar of Soap as a mode switching handheld in particular, users are quick to adjust how they are holding the device and seem to enjoy trying to figure out how the grasps have been trained in the device. Similarly, many users of the Ball of Soap as a pitch selector begin experimenting with various grasps even before the trained pitch grips are explained. Whether this enthusiasm would transfer to real implementations of grasp-recognition is questionable, but it does seem to indicate that making better use of people's senses of touch and proprioception could provide better interfaces.

    FUTURE WORK

The prototypes discussed in this paper represent only a small fraction of the potential implementations of the Graspables System. Another implementation that was discussed and would be worth developing is a stylus prototype. In the graphic arts world alone, pencils, paintbrushes, erasers and wands could all be represented by different ways of grasping a stylus.

For more of a departure from the work in this paper, implementing Graspables into existing devices would be interesting. Using the handheld device mode switching that was demonstrated by the Bar of Soap in a fully functional handheld would be worth studying. Questions about when to trigger the classification algorithms, what error rates would be acceptable to users, and the general effectiveness of the natural mode switching would be better explored by longer studies with fully functional devices.

Applying what has been learned from the existing applications to other scenarios also has potential. Can the Graspables System be used as a safety check to ensure that power tools are being operated properly? What could be gained by expanding the scale of the system from handheld objects to whole body-sized arrays?

There is also room to perform further tests to improve the reliability and robustness of the system. Optimizing sensor densities could be valuable. Exploring how environmental factors such as humidity impact the capacitive sensors could improve system reliability. Exploring additional inputs such as pressure sensors could be beneficial. Lastly, the software and classifiers could always benefit from more training data.

    CONCLUSIONS

The Graspables demonstrate how grasp-recognition can provide a unique and intuitive user interface. We presented a design rationale, our system design, a variety of application scenarios, and a discussion of our experiences with the system. The system we developed can be implemented in multiple geometries to provide a better representation of different objects in virtual environments. It can also provide devices with additional awareness of users' intentions via their manipulations. As mobile devices continue to grow in power and popularity, new interaction methods will need to be developed to accommodate them. We feel that grasp-recognition has the potential to provide significant enhancements to current interfaces.

    ACKNOWLEDGEMENTS

We thank Jeevan Kalanithi, Daniel Smalley and Matt Adcock for their work on the classification techniques. Thanks also to Quinn Smithwick, James Barabas and Ana Luisa Santos for their assistance. This work was supported by the CELab, Digital Life, and Things That Think consortia at the MIT Media Lab.

    REFERENCES

1. Ängeslevä, J., Oakley, I. and Hughes, S. Body Mnemonics: Portable Device Interaction Design Concept. In Proc. of Info. Vis. (2003).

2. Chang, W., Kim, K.E., Lee, H., Cho, J.K., Soh, B.S., Shim, J.H., Yang, G., Cho, S. and Park, J. Recognition of Grip-Patterns by Using Capacitive Touch Sensors. IEEE ISIE 2006, 4 (2006), 2936-2941.

3. Duda, R., Hart, P., Stork, D. Pattern Classification. John Wiley & Sons, Inc., 2001.

4. Fitzmaurice, G.W., Ishii, H. and Buxton, W. Bricks: Laying the Foundations for Graspable User Interfaces. CHI 1995: Proc. of the SIGCHI Conf. on Human Factors in Computing Systems. ACM Press (1995), 442-449.

5. Fitzmaurice, G.W. and Buxton, W. An Empirical Evaluation of Graspable User Interfaces: Towards Specialized, Space-Multiplexed Input. CHI 1997: Proc. of the SIGCHI Conf. on Human Factors in Computing Systems. ACM Press (1997), 43-50.

6. Harrison, B.L., Fishkin, K.P., Gujar, A., Mochon, C., and Want, R. Squeeze Me, Hold Me, Tilt Me! In Proc. of the SIGCHI Conf. on Human Factors in Computing Systems. ACM Press (1998), 17-24.

7. Kim, K.E., Chang, W., Cho, S., Shim, J., Lee, H., Park, J., Lee, Y., Kim, S. Hand Grip Pattern Recognition for Mobile User Interfaces. In Proc. of AAAI 2006. (2006), 1789-1794.

8. Kry, P.G. and Pai, D.K. Grasp Recognition and Manipulation with the Tango. Int. Symposium on Experimental Robotics. (2006), 551-559.

9. Mäntylä, V.M., Mäntyjärvi, J., Seppänen, T. and Tuulari, E. Hand Gesture Recognition of a Mobile Device User. IEEE Int. Conf. on Multimedia and Expo. 1 (2000), 281-284.

10. MacKenzie, C.L., Iberall, T. The Grasping Hand. Elsevier, 1994.

11. Pai, D.K., VanDerLoo, E.W., Sadhukhan, S. and Kry, P.G. The Tango: A Tangible Tangoreceptive Whole-Hand Human Interface. In Proc. of World Haptics. (2005), 141-147.

12. Rabiner, L.R. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. In Proc. of the IEEE. Vol. 77, no. 2, (1989), 257-286.

13. Rekimoto, J. Tilting Operations for Small Screen Interfaces. In Proc. of 9th ACM Symposium on User Interface Software and Technology. (1996), 167-168.

14. Stiehl, W.D. and Breazeal, C. Affective Touch for Robotic Companions. At 1st Int. Conf. on Affective Computing and Intelligent Interaction. (2005).

15. Stiehl, W.D., Lieberman, J., Breazeal, C., Basel, L., Lalla, L., and Wolf, M. Design of a Therapeutic Robotic Companion for Relational, Affective Touch. 2005 IEEE Int. Workshop on Robot and Human Interactive Communication. (2005), 408-415.

16. Taylor, B.T. and Bove, V.M. The Bar of Soap: A Grasp Recognition System Implemented in a Multi-Functional Handheld Device. CHI 2008: Extended Abstracts on Human Factors in Computing Systems. (2008), 3459-3464.

17. Weiser, M. The Computer for the 21st Century. Scientific American. Vol. 265, no. 3, Sept. 1991, 94-104.
