


IEEE SENSORS JOURNAL, VOL. 16, NO. 12, JUNE 15, 2016 4955

Event-Driven Low-Power Gesture Recognition Using Differential Capacitance

Gurashish Singh, Member, IEEE, Alexander Nelson, Student Member, IEEE, Sheung Lu, Student Member, IEEE, Ryan Robucci, Member, IEEE, Chintan Patel, Member, IEEE, and Nilanjan Banerjee

Abstract— Individuals with mobility impairments represent a large portion of the population. These impairments are often the sequelae of strokes, spinal cord injury, or diseases such as amyotrophic lateral sclerosis and Guillain–Barré syndrome. Many of these diagnoses manifest in upper-extremity mobility impairment, which may prohibit, or make difficult, the actuation of devices within the home. While solutions exist for smart-home automation and control, few are approachable by those with mobility impairments. To address this issue, we designed Inviz, a touchless low-power assistive gesture recognition system that utilizes an array of textile capacitive sensors to control a smart-home automation system. We present two novel research contributions: the use of flexible textile-based capacitive arrays as differential proximity sensors, and a low-power event-driven hierarchical signal processing algorithm for capacitor sensor arrays. A fully functional prototype system is implemented and evaluated in the context of an end-to-end home automation system using intuitive swipe and hover gestures that are comfortable and approachable for users with mobility impairments. The evaluation demonstrates that the system achieves high accuracy (>90%) while maintaining low latency (<1 s) and low energy consumption (2475 µW). In addition, we evaluate the system on a subject with upper-extremity mobility impairment to verify its usage as an accessibility device.

Index Terms— Capacitors, sensor arrays, assistive technology, assistive devices, gesture recognition, wearable sensors.

I. INTRODUCTION

SMART home systems and cyber-physical interfaces have permeated the technological landscape in recent years. Despite this uptick in bridging the physical world to a more universally accessible sphere, there remain very few interfaces which consider people with upper-extremity (UE) mobility impairment. This population represents a large portion of society. In fact, an estimated 1.5 million individuals in the United States alone are admitted to hospitals each year because of injuries (spinal-cord, traumatic brain) and stroke [1]–[3].

Manuscript received December 25, 2015; accepted January 31, 2016. Date of publication February 16, 2016; date of current version May 17, 2016. This work was supported in part by the National Science Foundation under Grant CNS-1305099 and Grant CNS-1308723, and in part by the National Science Board under Grant IIS-1406626. This is an expanded paper from the IEEE Pervasive Computing and Communications 2016 Conference. The associate editor coordinating the review of this paper and approving it for publication was Prof. Ravinder S. Dahiya. (Gurashish Singh and Alexander Nelson contributed equally to this work.)

The authors are with the Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, MD 21250 USA (e-mail: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; [email protected]).

Digital Object Identifier 10.1109/JSEN.2016.2530805

These diagnoses can often result in weakness, paresis, or paralysis to a broad region because of damage to the nervous system. Changes to healthcare have resulted in shorter hospital stays, even though these circumstances often require extensive rehabilitation to achieve meaningful physical recovery [4], [5]. Assistive technologies which interact with the user to maximize independence early and efficiently can ease the physical and economic burden that results from this transitory period. For injuries which have long-term or permanent effects, integrated assistive technologies embedded into the individual's environment can create an accessible living space, allowing autonomy within their own home.

Gesture recognition is fast becoming a common method for device control, and is approachable by individuals with UE motor disabilities. A 2014 interview study of individuals who have motor impairments concludes that an intuitive, reliable, easy-to-set-up system is important for continued use [6]. Gesture recognition has been performed in the past using inertial sensors, vision systems, resistive sensing, and other approaches which capture body motions [7]–[12]. UE mobility impairments, however, present fundamental challenges which the above approaches often fail to address. Vision-based systems, such as eye-tracking or body-gesture systems, are typically rigidly fixed, having a single orientation in which they will recognize a gesture. If the camera is in the environment, then the user's body may occlude the system. Cameras fixed to the user are obtrusive, calling unnecessary attention to the disability, which often leads to abandonment [13]. Physical-contact systems such as tactile switches and resistive sensing devices can cause skin-contact injuries. Similarly, evoked-potential systems, which use electrodes and often an electrolytic gel, can cause skin irritation and abrasion. These skin injuries can have a deleterious effect if unnoticed due to diminished sensation in the extremities, which is common for individuals with UE disabilities. Moreover, existing solutions are often not suitable for those with impairments, as they assume certain motions which a person may not be able to complete.

Capacitive-Sensor Arrays (CSAs) utilize a change in capacitance to detect remote bodies. Using differential capacitance on a fixed geometry, movement gradients can be determined by the combinatorial change in capacitance between pairs of capacitors. These sensors are made from metal-infused textiles (thread and cloth) that can be worn or otherwise integrated easily into the environment of persons with UE mobility impairments (e.g., wheelchairs, bed-sheets), rendering the sensing modality effectively invisible.

1558-1748 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


Fig. 1. This figure shows all the components of our assistive gesture recognition system. The fabric capacitor-sensor arrays (CSAs) are sewn into denim fabric in a geometrically defined grid, as shown in the 4-plate and 12-plate sensors. We measure, collect, and analyze the capacitor data on our custom-designed wireless module, which combines application-specific ICs, an MSP430 micro-controller, and a Bluetooth Low Energy (BLE) communication module. This is all worn as a pad, or integrated into clothing which is worn by users for home-automation access.

We have implemented and evaluated textile CSAs in a fixed geometry as an assistive gesture recognition system for UE motor impairment. Our solution, Inviz, is responsive and accurate, as demonstrated by our evaluation. Moreover, we integrate the sensors into clothing material and perform the gesture recognition on a small embedded controller, such that the entirety of the system is minimally obtrusive; obtrusiveness is a major deterrent to long-term use [6]. The fabric sensors are subject to the same wear-and-tear as traditional fabrics, but are themselves inexpensive and replaceable, as the computation module is detachable and reusable. Figure 1 shows the prototype system which we have developed using fabric capacitive plates and conductive threads sewn into denim to emulate a system built into jeans. The sensors are sensitive to small movements within a reasonable range (<10 cm) of the sensors. The data from the sensors is collected and analyzed using a low-power hierarchical signal processing algorithm which converts the differential capacitive signals into gestures. We have connected this to a smart-home automation system to control appliances. The system supports two types of gestures: a "swipe," which involves moving the hand from one area to another, and a "hover," which involves placing the hand above a single area and then removing it from the array. These gesture types were established as typically approachable and comfortable for people with UE motor impairments in conversations with medical professionals (e.g., physical therapists) and individuals with partial paralysis (though capabilities within this population vary widely depending on diagnosis, therapy, and the individual). The use of CSAs allows for non-contact gestures, which prevent skin-contact injuries, and imprecise gestures, which can be common in those with tremor or extended weakness.

II. RELATED WORK

The system and the ideas presented herein build upon the body of work in capacitive sensing, gesture recognition, and signal processing for gesture recognition. Our work is compared with relevant literature below.

A. Capacitive Sensing

Our system utilizes and builds on knowledge from capacitive sensing [14] applied to many areas, including industrial, automotive, and healthcare applications [15], [16]. Additionally, these sensors have been used for positioning [17], [18]; humidity sensing [19], [20]; tilt sensing [21], [22]; pressure sensing [23]; and MEMS-based sensing [23]. Capacitor sensors have been extended to proximity-sensing applications for robotics, industrial monitoring, and healthcare [24]–[26]. More recently, products such as Microchip's GestIC [27] have been developed which allow for 3D touchless gesture tracking through the use of a rigid array of capacitor sensors. Our system extends the use of capacitor sensors in two distinct fashions. First, the use of textiles as a capacitor sensor array embedded into objects of everyday use for remote gesture sensing is, to the best of our knowledge, novel within the field. This particular application faces several unique challenges, especially as applied to users with UE mobility impairments; that the sensors are not rigid increases uncertainty within the system. Second, Inviz utilizes an intelligent hierarchical signal processing algorithm for ambient gesture recognition within the context of flexible capacitor sensors, allowing for a responsive and low-power solution. We outline an approach which allows for customization both of the recognition scheme and in terms of personalization to the user's space.

B. Gesture Recognition Systems

Gesture recognition systems have emerged as an intuitive and easily approachable solution for environmental control. Many such systems utilize vision for gesture recognition [28], which allows environmental control without necessitating contact with a physical device. However, vision-based systems either require blanket coverage of the home, which can cost thousands of dollars, or a user-mounted camera. Assistive home-automation solutions for persons with disabilities have been developed to enable greater autonomy in terms of environmental control. Examples of such systems include voice-activated systems [29], head- and eye-tracking systems [30], and inertial-sensor-based systems [12]. The ability to customize these systems is imperative, as many diagnoses can express in a wide range of severity. For UE impairments, the system should be able to function in any space; it should be responsive and approachable regardless of the orientation or position of the user. Inviz allows for subtle, arbitrary gestures within the user's own reference frame, as it is built into the environment of the user, notably into clothes, bed sheets, or as a pad.

C. Signal Processing for Gesture Recognition

Signal processing as applied to gesture recognition is an established field with a large body of work.


Hidden Markov Models [31], Bayesian inference [32], and decision trees [33] have all been applied to gesture recognition to convert data from sensors into movements and gestures. Our system utilizes this higher-level learning as a classification tool. The innovation within Inviz is the combination of feature extraction and learning techniques within these different power tiers to responsively process flexible CSA data with minimal energy consumption.

D. Previous Work

We extend our own work in this area through the analysis and characterization of the fabric capacitor sensors, the analysis of the system as it applies to individuals with limited mobility, and the addition of virtualized combinatorial differential sensors [34]. A separate parallel work utilized a similar sensing platform with multiple single-ended measurements of capacitance for a detailed position-tracking and classification system for the recognition of complex drawn gestures [35]. In that work, training and test gesture data were preprocessed to ignore position, scale, and rotation in order to improve classification performance for prescribed complex gestures. The work newly presented herein instead exploits position, scale, and rotation features in the design of sensing and processing to provide a simpler gesture interface, especially useful to those with injuries that co-occur with traumatic brain injury. The sensors themselves are differentiated by the utilization of geometric differential connections, and the sensing modality and processing scheme use event-driven differential capacitive sensing and recognition.

III. CAPACITIVE SENSOR ARRAYS FOR GESTURE RECOGNITION

In order to address the holes left by existing assistive systems in regards to touch, precision, occlusion, and obtrusion, we implemented a wearable, effectively invisible sensing platform through the integration of textile capacitive sensors. While many capacitive-sensor applications utilize touch-based sensing, we utilize capacitive sensing for remote-body detection as a touchless gesture recognition system.

The capacitor sensors can be touchless because of the mode in which the sensors detect changes in the electric field. The use of sensor plates separated by air as the medium creates a capacitor where the capacitance is relative to the distance between the plates and the dielectric, which in the nominal case is air. When the user places their hand into the area around the plates, the perturbation of the electric field is detected as a change in the effective dielectric. This method of sensing allows for versatile remote movement detection, as compared to inertial sensors, which measure only movement relative to the body to which they are attached. Figure 2 is a representation of the capacitor sensors being worn by a user who is performing gestures above the array. As the user moves their hand close to the array, the capacitance Cb increases, and therefore the inverse of Cb can be used to localize the distance of the hand with respect to a single plate. Figure 2 also shows the cross-section of the sensor array, which is composed of a grid of fabric capacitive sensors connected via conductive threads.
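As first-order intuition (an idealized parallel-plate approximation we add for illustration, not a model from the paper), the hand-to-plate coupling scales inversely with separation, which is why the inverse of Cb tracks hand distance:

```latex
C_b \approx \frac{\varepsilon A}{d}
\quad\Longrightarrow\quad
d \propto \frac{1}{C_b}
```

Here A is the effective overlap area between hand and plate and ε the permittivity of the intervening medium; both vary during a gesture, so the relation supports relative rather than absolute localization.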

Fig. 2. This figure demonstrates two design decisions within our prototype system. The left image depicts the equivalent electrical circuit which allows for measurement of remote bodies using capacitor sensors. Coupling the user into the ground allows for more sensitive measurement of motion in the active sensing region. The right image shows the construction of the sensor array. A layered solution allows the top layer to sense only bodies above the array through the use of the middle shield layer. The bottom layer allows for coupling the user into the system ground. Conductive threads act as the wires connecting the sensor to the data-collection module.

Fig. 3. (a) The top image is a depiction of two different types of sensor construction. The left is a sensor plate made by embroidering into denim with multi-ply conductive thread. The right is a sensor plate made of conductive fabric ironed onto denim. (b) This plot gives a representation of the relative sensitivity of the sensor plates as a function of the diameter and type of conductive plate.

We use a shielding plane to restrict capacitance measurement to above the array and to minimize parasitic capacitance and noise coupling. The last layer is a ground plane which couples the human body to the ground of the sensor, providing a common reference for the capacitance measurements.

A. Sensor Characterization

A textile capacitor plate can be fabricated using two methods. The first method uses fabric capacitor plates (see Figure 3(a), right) that are cut from metal-infused fabric sheets and ironed and sewn into their proper positions on some fabric (e.g., denim). Routes are drawn to the plates using metal-infused threads sewn into the fabric and attached to the sensor board using vampire-clip adapters. Alternatively, one may use embroidery (see Figure 3(a), left) to create these plates from metal-infused threads, thereby automating the process of creating Inviz sensors using CNC sewing and embroidery machines.


We characterize the range of both types of capacitor plates in Figure 3(b). In the experiment, we fabricate circular capacitor plates of varying radius and then measure the capacitance as the distance between a user's hand and the plate is varied. We experimented with three capacitor-plate diameters (35 mm, 45 mm, 60 mm). From Figure 3(b) we can draw two conclusions. First, we find that both methods of fabricating the sensors have similar exponential capacitance–distance characteristics; however, the embroidered sensor plates have lower overall capacitance. This is because the density of the metal in the embroidered plates is less than in the fabric plates. A larger capacitance can be achieved by increasing the size of the plate or by creating thicker weaves. Second, we find that the smallest plates have a range of close to 4 inches, after which the capacitance does not change with the distance between the hand and the plate. These values are consistent with those found by Rus et al. [36], who perform an analysis of capacitor-sensor sensitivity in multiple forms. This range is sufficient to prevent accidental touch and skin abrasion, and can be adapted, as the range varies with plate size and shape. We have developed two sensors, with 4 and 12 sensor plates respectively, using textile fabric plates (see Figure 1).
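One convenient parameterization of the measured curves (our illustration; the paper reports only the qualitative exponential shape and the roughly 4-inch useful range) is

```latex
C(d) \approx C_\infty + \Delta C \, e^{-d/\lambda}
```

where C_infinity is the far-field baseline, Delta C the near-contact swing, and lambda a length scale that grows with plate diameter; beyond a few lambda the reading is indistinguishable from the baseline, consistent with the observed range limit.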

B. Capacitive Sensor Arrays

This system uses an array of textile capacitive sensor plates, granting multiple advantages compared to a single-plate system. First, the use of differential capacitive measurements aids in minimizing the noise due to stray capacitance and movements within the array. Additionally, the geometric properties of the array, in combination with the differential capacitance measurements, can be used to extract rich motion artifacts such as velocity, which can be useful in the determination of different gestures. Figure 4(a) is a demonstration of one of these features. The figure displays the analog difference between two capacitor plates as a person performs a "swipe" gesture, moving their hand from one plate to another. The differential capacitance is graphed as a peak, followed by a zero crossing, followed by a valley. The peak-to-valley time can be utilized as the inverse of the velocity in the direction specified by the angle of the sensor pair, with the direction of the vector specified by the relative ordering of the peak and valley. Figure 4(a) also graphs the analog difference of a "hover" gesture, demonstrating the differential capacitance between plates when the user maintains their hand above a single plate. In this case, the width of the peak corresponds to the time in which the "hover" gesture took place.
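A minimal sketch of this velocity recovery (illustrative, not the authors' firmware; the function name and parameters are our own):

```python
# Illustrative sketch: recovering a signed velocity estimate for a
# "swipe" from the differential trace. The peak-to-valley interval is
# the inverse of speed along the axis joining the sensor pair, and the
# ordering (peak first vs. valley first) gives the direction of travel.

def swipe_velocity(t_peak_s: float, t_valley_s: float,
                   plate_pitch_m: float) -> float:
    """Signed speed in m/s along the axis joining the two plates.

    plate_pitch_m is the center-to-center plate spacing (assumed known
    from the sewn geometry).
    """
    dt = t_valley_s - t_peak_s
    if dt == 0.0:
        raise ValueError("peak and valley coincide; no swipe detected")
    # A valley arriving before the peak yields a negative dt, flipping
    # the sign and hence the reported direction.
    return plate_pitch_m / dt
```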

The examples above serve as demonstrations of how an array of capacitors can be used to capture high-level movement features such as time, velocity, and the relative position of a foreign body with respect to the array. The use of an array of these sensors and their combinatorial sensor pairs provides multiple vantage points of each movement and can increase the reliability of gesture recognition. Our system converts data from the CSA into reliable gestures through a hierarchical signal processing algorithm which we detail in Section IV.

Fig. 4. (a) This figure depicts two different gestures, a "swipe" (hand moved from one sensor to another) and a "hover" (hand rests above a single plate), and the characteristic responses gathered from the raw capacitance data (i, iii). The capacitance is measured against a high and a low threshold, which is used to make a ternary classification of each data frame. This data minimization is later used to infer features, which are used in gesture classification. (b) This figure illustrates one of the primary challenges of gesture recognition with fabric CSAs. The three iterations are each the same gesture type performed by the same subject. The data from the CSA has a high amount of irregularity due to perturbations in movement by the user. Our system mitigates the effect of this irregularity through the use of higher-level feature extraction and machine-learning classification.

The intuition behind some of the signal processing is shown in Figure 4(b), which illustrates one of the fundamental challenges in processing the raw differential capacitance data. Pictured in the figure is the same gesture performed three times by the same user, with regions i and ii demonstrating the variability and irregularity of the raw capacitance.

IV. HIERARCHICAL SIGNAL PROCESSING

We utilize a hierarchical design to distribute the capacitive data processing into several computational tiers, including observation, feature extraction, and machine learning, as depicted in Figure 5(a). The observation tier itself utilizes hardware-invoked low- and high-power tiers to remain sensitive when a user is approaching the array, but to stay in a low-power tier while the array is in disuse.


Fig. 5. (a) Our system utilizes a hierarchical approach to gesture recognition. This organizational chart shows the building blocks and data flow which make up the end-to-end system. The Inviz prototype contains two power tiers, which allow for low-power, responsive gesture recognition. (b) This chart is a depiction of some of the important features within our signal-processing hierarchy. Examples of filtering rejection rules are shown in the insets. The main data flow demonstrates how each gesture is processed, filtered, and classified to determine the intended motion.

The data flow moves from low-power special-purpose hardware to a general-purpose micro-controller which is woken as needed when pertinent features are collected. Figure 5(b) illustrates the conceptual blocks for transforming raw data from the capacitor plates into gestures. This design allows the system to be responsive and accurate while using a minimal amount of energy. The details of each of the different processing tiers in our hierarchy are illustrated in Figure 5(b).

Though our evaluation utilizes a gesture set composed of only "hover" and "swipe" type gestures, the signal processing for the system is intended to be a general framework through which arbitrary gestures can be recognized from an array of wearable fabric capacitors while consuming very little energy (<3 mW). This architecture can potentially be extended to support a more complex gesture set.

A. Observation Calculation

The input observations are calculated at the lowest tier of the hierarchy as linear combinations of the capacitance from each of the sensors of the array. The observations {y_1, .., y_k} are created through the observation model

y_k = Σ_{i=1}^{n} W_{i,k} · c_i, where W_{i,k} ∈ {0, 1, −1},

c_i represents the analog capacitance between a plate and ground, and n is the total number of capacitor sensors (4 for our system). Our system utilizes a single capacitance-to-digital conversion unit, and therefore the observation vector is not taken concurrently. Instead, the measurements are recorded periodically in a sequential pattern to compose a single observation vector [y_1, .., y_k]. The linear combinations provided to the capacitance-to-digital converter for each observation are controlled through the use of low-power analog multiplexers. The exact ordering of the measurements, in fact, does not have an effect on recognition provided that gesture motions are slow compared to the sampling rate. The impact of the ordering is further diminished in our hierarchy through the use of machine learning at the top computational level.

Much of the transient environmental noise is rejected through the use of analog differential measurements, including any common noise among the plates. Differential measurements trade off some potential measurement distance for accuracy: the differential measurements form a receptive field which is most sensitive to motions in the proximity of the plates, but is less sensitive to motions at a distance as compared to a single-ended measurement. This is acceptable, and may be preferential, as the differential measurements can cancel noise due to stray movements far from the sensor array while capturing more subtle movements closer to the array.
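A minimal sketch of the observation model in software (in the prototype this arithmetic happens in analog hardware, with multiplexers feeding one capacitance-to-digital converter; read_capacitance() is a hypothetical stand-in for that shared conversion unit):

```python
# Sketch of the observation model y_k = sum_i W[i][k] * c_i with
# W[i][k] in {-1, 0, 1}.

import numpy as np

# Example weights for n = 4 plates and k = 2 differential observations:
# y_1 = c_0 - c_1 (one plate pair), y_2 = c_2 - c_3 (the other pair).
W = np.array([[ 1,  0],
              [-1,  0],
              [ 0,  1],
              [ 0, -1]])  # shape: (n_plates, n_observations)

def read_capacitance(plate: int) -> float:
    """Hypothetical driver call returning one plate-to-ground capacitance."""
    raise NotImplementedError

def sample_observation_vector() -> np.ndarray:
    # A single conversion unit means the plates are read sequentially,
    # not concurrently; the ordering is unimportant as long as gestures
    # are slow compared to the sampling rate.
    c = np.array([read_capacitance(i) for i in range(W.shape[0])])
    return c @ W  # the observation vector [y_1, ..., y_k]
```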

In Figure 4(a-i) we demonstrate a characteristic response from the differential capacitor sensors during a "swipe" gesture. This gesture has the hand passing successively over the two sensors. Figure 4(a-iii) similarly shows the analog differential during a "hover" gesture, in which the hand only passes above a single sensor. These two responses are captured in the system utilizing a pair of thresholds and detectors which capture events, illustrated in Figure 4 as the high (green) and low (red) thresholds. Figure 4(a-ii and a-iv) demonstrate what is captured in the system as two binary responses, green showing when the signal is above the high threshold and red when the signal is below the low threshold. Event thresholds are established relative to a baseline measurement for each observation channel, {y_1, .., y_k}, and the baseline is continually recalculated when there is no activity near the sensor (i.e., when there is a minimal change in capacitance ΔC). Experimentation was used to determine the minimum separation of the thresholds from the baseline, which is dependent upon the size of the plates that make up the CSA. The binary threshold-event values are collected using an ultra-low-power capacitance-to-digital conversion IC which provides threshold-crossing detection and manual offsets. Threshold events are often used to determine when a user has touched a capacitor sensor, but we have exploited the measurement to drive the event generation for the higher-tier calculations.


This requires a much lower and more noise-prone threshold, which must be accounted for in higher tiers. Further, irregularity of the sensors (including potential degradation of signal quality with wear) and irregularity between similar gestures necessitate more sophisticated higher-tier processing than what is required for simple touch-based gestures. Additionally, the CSA measures more complex signals due to the combination of the user's hand, forearm, and wrist, which is not present when using a single-point finger touch. Below, we discuss how the binary outputs coming from the digital threshold detectors are used to build higher-level features which are passed into the final stages of machine-learning-based classification.
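A sketch of the per-channel baseline tracking and threshold generation (assumed logic consistent with the description above, not the authors' firmware; all constants are illustrative):

```python
# The baseline is updated only while the channel is quiet, and the
# thresholds TU_k / TL_k sit at a fixed experimental offset from it.

ACTIVITY_BOUND = 0.05  # "minimal change in capacitance" guard (assumed value)
OFFSET = 0.5           # threshold separation from baseline (plate-size dependent)
ALPHA = 0.01           # smoothing factor for the running baseline

class ChannelThresholds:
    def __init__(self, y_initial: float):
        self.baseline = y_initial

    def update(self, y: float) -> tuple[bool, bool]:
        # Recalculate the baseline only when there is no activity nearby.
        if abs(y - self.baseline) < ACTIVITY_BOUND:
            self.baseline += ALPHA * (y - self.baseline)
        tu = self.baseline + OFFSET  # upper threshold TU_k
        tl = self.baseline - OFFSET  # lower threshold TL_k
        return y > tu, y < tl        # (BP_k[n], BN_k[n])
```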

B. Event Message Generators

The next level of our processing hierarchy utilizes the thresholding signals for two important purposes. First, the binary signals themselves are the only representation of the actual signal that is passed into the event detection. This is a compact representation of the signal, which minimizes the memory requirements of the system for capturing the signal history. This simplicity also enables real-time recognition when using higher-level feature extraction and machine learning. Additionally, the binary signals serve as interrupt signals which wake up the higher-level processor. This wake-up threshold, corresponding to approximately 12 cm, was experimentally determined to be sensitive in our laboratory experiments, though experimentation during in-situ trials would be needed to determine the relevance of this threshold in terms of false-positive and false-negative rates. For each of the observations (y_k) there is a defined upper and lower threshold, TU_k and TL_k respectively, which are calculated as an offset from the baseline determined by the measurement IC. Based on these thresholds and binary inputs, we define two signals, BP_k and BN_k, which are the positive-peak and negative-peak binary signals respectively, collected by the processor according to the following formulas:

BP_k[n] = TRUE if y_k[n] > TU_k, FALSE otherwise    (1)

BN_k[n] = TRUE if y_k[n] < TL_k, FALSE otherwise    (2)

where n is the sample number. Event detectors utilize these binary threshold signals to determine characteristics which are extracted as event features to form an "event message."

An event is defined in our system as a period of continuous TRUE values for either BP_k or BN_k for a given observation channel. Three event features are created for each event, defined as: (1) arrival time: the number of samples from the first threshold crossing to the beginning of the event; (2) duration: the number of samples for which the particular binary signal is TRUE; (3) event polarity: a binary variable indicating which of BP_k or BN_k is TRUE. At the end of each event, an indicator is sent to the higher-level stage to prepare to process the events. A sketch of this extraction is given below.
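The sketch (illustrative; the prototype derives events from the IC's threshold-crossing interrupts, and references arrival time to the gesture's first threshold crossing rather than to a buffer origin):

```python
# An event is a maximal run of TRUE samples on BP_k or BN_k.

from dataclasses import dataclass

@dataclass
class EventMessage:
    channel: int
    arrival: int    # sample index where the run of TRUE values begins
    duration: int   # number of samples the binary signal stayed TRUE
    positive: bool  # True for a BP_k event, False for a BN_k event

def extract_events(bp: list[bool], bn: list[bool],
                   channel: int) -> list[EventMessage]:
    events = []
    for signal, positive in ((bp, True), (bn, False)):
        start = None
        for n, active in enumerate(signal):
            if active and start is None:
                start = n                      # run begins
            elif not active and start is not None:
                events.append(EventMessage(channel, start, n - start, positive))
                start = None                   # run ends
        if start is not None:                  # run still open at buffer end
            events.append(EventMessage(channel, start, len(signal) - start, positive))
    return events
```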

Gesture recognition is processed either on-line or in post-processing.

Algorithm 1 CaptureEventMessages (ΔT = 1 / sampling rate)

    E1 = first event
    gesture_timeout = event_duration(E1)
    for every ΔT seconds:
        if gesture_timeout == 0:
            break
        end if
        decrement gesture_timeout
        if an event Ei is collected from observation i:
            gesture_timeout += event_duration(Ei)
        end if
    end for
    return TRUE

On-line recognition requires a considerable amount of processing, as the classification is performed and updated during the gesture until it reaches a confidence level. This is intractable for a low-power embedded processor. For post-processed gestures, one must determine the end of a gesture, which involves an important tradeoff. Ending the gesture early could result in a missed event, which could create a misclassification. On the other hand, the latency and power consumption of the system are relative to the length of the gesture. We maintain a countdown timer which is incremented by the duration of events, and terminate the gesture when the timer reaches zero.

The event duration is indicative of two parameters: (1) the speed of a gesture, and (2) when events could be co-located across multiple observation channels. The intuition behind Algorithm 1 is to prevent a timeout while the gesture may still be in progress by extending the timeout proportionally to the length of these events. Once all events are collected, they are propagated to the aggregation and filtering module.
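A Python rendering of Algorithm 1 (wait_tick() and poll_new_event() are hypothetical stand-ins for the sampling-rate timer and the event queue fed by the threshold detectors):

```python
# Gesture-end detection via an extending countdown timer.

def capture_event_messages(wait_tick, poll_new_event,
                           first_event_duration: int) -> bool:
    # Seed the timeout with the first event's duration; every later
    # event extends it, so the gesture cannot time out mid-motion.
    gesture_timeout = first_event_duration
    while gesture_timeout > 0:
        wait_tick()                      # sleep for delta_T = 1 / sampling_rate
        gesture_timeout -= 1
        new_duration = poll_new_event()  # an event duration, or None
        if new_duration is not None:
            gesture_timeout += new_duration
    return True                          # timer expired: gesture complete
```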

C. Event Filtering

During evaluation of our prototype, several instances occurred in which spurious or atypical events were created, either due to irregularity in the way gestures are performed or in the sensor data itself.

Figure 5(b) gives examples of these spurious events, which can be a product of movement irregularity and must be filtered using per-channel filtering rules. There are four per-channel rules, defined as: (1) any event which has a much larger duration than all previous durations in a given observation channel removes all previous event messages from that channel's queue; (2) when there exist two events with the same polarity within a single channel, only the longer of the two events is kept; (3) if consecutive events arrive which have a sizable delay in arrival time compared to an earlier event, the earlier message is removed from the channel queue; (4) if any two event messages have a large difference in their duration, then the shorter message is removed from the channel queue.

After the per-channel filtering, the events are aggregated across all observation channels to apply another set of cross-channel filtering rules. An example of these rules is shown in Figure 5(b).


The rules for cross-channel filtering are defined as: (1) if any channel's total duration of messages is considerably shorter than the average total, then it is assumed that this channel is receiving noise, and it is removed from the event set; (2) if the sum of all durations across the observation channels is small, then the entire gesture is filtered out and reported as a null gesture. After filtering, each observation channel will have either no events, a single event, or a pair of complementary events. If there are no events, then the channel is labeled NAE, for not-an-event. If there is a single event, then the channel is labeled by the polarity of the event (P for positive, N for negative). Finally, if a channel contains a complementary pair of events, i.e., a positive and a negative event, the message is labeled PN if the channel had a positive followed by a negative event, or NP in the opposite case. In the case of a PN or NP channel, the events are combined and have a single duration calculated as the sum of the pair of durations. In the case of NAE, the duration and arrival time are both zero.
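A sketch of this post-filtering channel labeling (assumed data shapes; the paper gives the rules in prose):

```python
# Each channel reduces to NAE, a single P or N event, or a merged PN/NP pair.

def label_channel(events: list[tuple[bool, int, int]]) -> tuple[str, int, int]:
    """events: filtered (positive, arrival, duration) triples for one
    channel, in arrival order. Returns (label, arrival, duration)."""
    if not events:
        return "NAE", 0, 0           # not-an-event: zero duration and arrival
    if len(events) == 1:
        positive, arrival, duration = events[0]
        return ("P" if positive else "N"), arrival, duration
    # Complementary pair: polarity order distinguishes PN from NP, and
    # the merged event's duration is the sum of the pair's durations.
    (p1, a1, d1), (_, _, d2) = events[0], events[1]
    return ("PN" if p1 else "NP"), a1, d1 + d2
```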

D. Event Characteristics

The features calculated from all of the events are then passed to the gesture classifier. Each observation channel reports the following information, called "event characteristics": event label (P, N, PN, NP, NAE), arrival time with respect to the first event, and duration. Each gesture contains a unique set of event signatures which are important for distinguishing between gestures. For instance, the combination of the duration and arrival time encodes the velocity of the gesture (both speed and direction of motion), which can be used to distinguish gestures, and potentially between separate users. The combination of all of the features is used by the machine learning algorithm to infer the gesture classification. All signal processing occurs in real time on the micro-controller.

E. Machine Learning Algorithm

The highest level of our signal processing hierarchy is supervised machine-learning classification. The classifier takes the filtered message bundles as its observation model and gives the classification as output. To obtain the supervised training data, each subject is asked to perform each of the gestures from the set a number of times, and each instance is inserted into the training data set with its correct gesture label. To determine the classifier to use for the real-time system, we experimented with several low-level classifiers, including Nearest Neighbor, Decision Trees, and Naive Bayes. The relative accuracies and complexity trade-offs are compared in Section VI. When the classifier has finished and determined the gesture output, a Bluetooth Low Energy (BLE) module is woken and transmits the gesture type to a base station which is used to control the appliances of a custom smart-home automation system.
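A sketch of the top-tier classifier (the label encoding and squared-Euclidean metric are our assumptions; the paper specifies only that a 1NN classifier over the event characteristics is used in the real-time system):

```python
# 1-nearest-neighbor lookup over fixed-length vectors built from the
# per-channel event characteristics.

LABEL_CODE = {"NAE": 0, "P": 1, "N": 2, "PN": 3, "NP": 4}

def to_feature_vector(channels: list[tuple[str, int, int]]) -> list[float]:
    # One (label, arrival, duration) triple per observation channel.
    vec: list[float] = []
    for label, arrival, duration in channels:
        vec.extend([LABEL_CODE[label], arrival, duration])
    return vec

def classify_1nn(codebook: list[tuple[list[float], str]],
                 x: list[float]) -> str:
    # codebook holds (feature_vector, gesture_name) pairs collected
    # during the per-subject training sets.
    def sq_dist(a: list[float], b: list[float]) -> float:
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(codebook, key=lambda entry: sq_dist(entry[0], x))[1]
```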

V. PROTOTYPE IMPLEMENTATION

For the development and evaluation of this system, we have implemented an end-to-end, fully functional prototype cyber-physical system for home automation, which is illustrated in Figure 1.

The classified gestures are transmitted over a BLE connection to a personal computer which controls the smart-home appliances over a Z-Wave connection through the Micasaverde Vera gateway. Our prototype is composed of a custom-designed PCB which is stocked with: a low-power capacitance-measurement IC which can perform the measurement, observation calculation, and thresholding; an MSP430 micro-controller; and a BLE module for wireless communication. The system is powered by a 1 Ah Li-Ion rechargeable battery. The CSA is created from metal-infused fabric which is sewn into denim, and routed to the measurement circuit with 4-ply conductive thread which has a linear resistance of approximately 50 Ω/m.

During this implementation we faced two specific design challenges in the integration of textile-based wearable capacitor sensors. First, the threads used for the routing are created by weaving silver-plated/spun-stainless-steel threads together with non-conductive thread. This composition, however, has a tendency to fray at the ends, which can cause microscopic shorts between capacitor plates of adjacent threads. This problem is particularly difficult to diagnose, as the two sensors may appear to be functioning because their sensor values are a combination of the two plates. This is mitigated by maintaining enough space between threads and coating the connections to prevent fraying. Connecting the threads to the measurement circuit is also difficult, as the threads combust with too much heat and therefore resist soldering. We mitigated this problem through the use of vampire-tap connectors which clip onto the thread.

For the sensor array, we used two different array configurations, one with four sensor plates and the other with twelve sensor plates, as shown in Figure 1. Increasing the number of capacitor plates on the sensor allows us to explore the fidelity of gesture recognition as the spatial resolution of the sensors is increased. The use of the four-plate sensor is straightforward: each physical sensor plate is considered as the sole component of a positive or negative terminal, with directionality established within the two-dimensional plane. The 12-plate sensor was constructed to maximize the capability of our measurement IC. In the 12-plate sensor, even though we increased the number of plates, we maintained the definitions of the swipe and hover gestures defined originally on the 4-plate sensor. To this end, we define multi-plate sub-regions according to their location with respect to the locations of the original plates in the 4-plate sensor. However, we are afforded the flexibility to dynamically reshape the sub-regions most appropriate for each particular differential measurement. For consistency in nomenclature in discussing the two sensors (4-plate and 12-plate), we define dynamic "virtual sensors," 0 through 3, of the 12-plate sensor as the dynamic sub-regions described. For instance, in the 12-plate sensor a stage = C1 + C2 − C3 − C4 would imply that plates 1 and 2 are assigned to the positive terminal and plates 3 and 4 are assigned to the negative terminal. For the above stage, C1 + C2 would comprise one virtual sensor and C3 + C4 would comprise a second virtual sensor. In our evaluation, we use two configurations of virtual sensors for the 12-plate sensor, shown in Figure 6(a) and Figure 6(b).
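In code terms, a stage is simply a column of the observation weight matrix (the plate groupings below are illustrative, not the paper's exact configurations):

```python
# A stage groups physical plates into a positive and a negative terminal.

def stage_weights(positive_plates: set[int], negative_plates: set[int],
                  n_plates: int = 12) -> list[int]:
    # e.g. stage = C1 + C2 - C3 - C4  ->  stage_weights({1, 2}, {3, 4})
    return [1 if i in positive_plates else -1 if i in negative_plates else 0
            for i in range(n_plates)]

# Because the grouping lives in software (mux settings), the virtual
# sensors can be reshaped per measurement, e.g., corner regions
# (configuration 1) vs. edge regions (configuration 2), without
# rewiring the textile array.
```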


Fig. 6. (a) This figure demonstrates sensor configuration 1 for the 12-patch system. This configuration places the virtual sensors in the corners so that it can use more sensor plates per stage, creating a greater coverage area. Each of the six stages used for the signal processing is identified by the plates connected to the positive (green) and negative (red) terminals. (b) This figure demonstrates sensor configuration 2 for the 12-patch system. This configuration places the virtual sensors along the edges, thereby reducing the number of "reaching" gestures. This most closely approximates the processing used for the 4-patch system. Each of the six stages used for the signal processing is identified by the plates connected to the positive (green) and negative (red) terminals.

The configurations in Figure 6(a) place the virtual sensors towards the corners of the sensor array, and in Figure 6(b) the virtual sensors are placed towards the edges. We chose these two configurations because our conversations with subjects with limited mobility have revealed that some users can perform gestures by moving their hands to the edges of the sensor array, while others are more comfortable performing gestures that span the corners of the sensor array. The sub-figures in Figure 6(a) and Figure 6(b) illustrate the stages. The + sign depicts the capacitor plates that are used as positive terminals, while the − sign depicts the plates that are negative terminals.

VI. SYSTEM EVALUATION

The main goal of this system is to provide a low-power, accurate, and real-time method of controlling a smart home through gesture recognition for assistive purposes, especially for those with limited UE mobility. To demonstrate this, we have focused the evaluation of this system on the following questions:

(1) Does the system accurately classify gestures across all subjects? (2) What is the average energy consumption of the system, and how does it compare to a baseline system which does not utilize a hierarchical signal-processing architecture? (3) What trade-offs exist between recognition accuracy and the type and extent of training that must occur? In answering these questions, we present a set of micro-benchmarks which give a more detailed representation of the power consumption and latency of the different subsystems within the prototype system.

A. Experimental Setup

The experimentation was performed using five adult able-bodied subjects and a single subject with a C6 spinal cord injury.

TABLE I

MICRO-BENCHMARKS OF POWER CONSUMPTION OF EACH OF THE DIFFERENT COMPONENTS ON THE DATA-COLLECTION MODULE

TABLE II

LATENCY OF EACH COMPONENT

The five able-bodied subjects act as a baseline for evaluating the accuracy of our gesture recognition system. Our evaluation on the limited-mobility subject helps us understand the limitations of our system when applied to the target population group. We discuss the results from our limited-mobility subject separately in Section VI-A. The experimental setup consisted of each subject wearing the CSA on their thigh and performing a set of "swipe" and "hover" gestures. Each of the subjects performed an average of 180 gestures, which consisted of between 9 and 12 gesture sets, with each set containing each of the 16 gestures defined below. Using the four plates in the initial prototype illustrated in Figure 1, the "swipe" gestures are performed as the combinatorial movement from each plate to each other plate. More formally, all combinations of i → j, where i ≠ j, and i and j are the plates numbered from 0 through 3, representing a full 12 gestures. The gesture set included 4 "hover" gestures, which are denoted by the plate numbers {0, 1, 2, 3}. Note that for the twelve-plate CSA, we used the configurations defined in Section V to design the four virtual sensors. Each subject was allowed time to get accustomed to the system with practice gestures before training began. All of the accuracy results for the able-bodied subjects come from this single session. To accomplish this, the training and testing were performed off-line using cross-validation. Below, we present our micro-benchmarks from the prototype system, followed by accuracy results from the four-plate prototype system, and system trade-offs. In our experiments with the limited-mobility subject, we also compare the accuracy of recognizing the gesture when using the four-plate and twelve-plate configurations.

B. Micro-Benchmarks

Tables I and II present an evaluation of the energy consumption and latency of the different subsystems within the prototype system. The need for a hierarchical signal processing architecture is apparent based on the energy consumption. On average, the Bluetooth module consumes an order of magnitude more energy than the micro-controller, which in turn consumes roughly four times more than the capacitance measurement IC in its low-power measurement mode.


Fig. 7. (a) This graph depicts the relative accuracy of three different classification methods (Nearest Neighbor Classifier, Decision Tree Classifier, and Naive Bayesian Classifier) across our five subjects. Although the difference is slight, the 1NN algorithm maintained a higher overall accuracy than the other two. Error bars are included as the standard deviation of accuracy per algorithm. (b) A confusion matrix demonstrating the accuracy of the 1NN algorithm across all gestures and subjects. (c) This graph demonstrates the classification accuracy as a function of the size of the 1NN codebook. The accuracy increases asymptotically with the number of training sets. Error bars are included as the standard deviation of accuracy per user. (d) This plot demonstrates the accuracy of each subject for aggregate training (data from all subjects is considered) against personalized training (data only from that particular subject is considered). Personalization results in slightly increased accuracy and decreased variance on a per-user basis.

This motivates the need to duty-cycle the micro-controller and BLE module as often as possible, waking each piece of the hierarchy only as needed. However, this design is useful only if the transition latency from the low- to high-power tiers is reasonable for a real-time system. The latency reported in Table II demonstrates that the major component of delay in the system is in performing the machine learning algorithm, which takes on average a little more than a quarter of a second on a 1 MHz micro-controller, demonstrating the efficiency of the implementation.

C. System Accuracy

This first accuracy evaluation focuses on the initial trial, which compares the classification accuracy of the five able-bodied subjects using the 4-plate prototype. Figure 7(a) demonstrates the relative classification accuracy for three machine learning algorithms (Nearest Neighbor Classifier, Decision Tree Classifier, and Naive Bayesian Classifier) across the five subjects. These three classifiers were chosen because they represent a wide range of computational needs. The Bayesian classifier and decision tree classifier require an amount of training that may not be feasible on a micro-controller. Each of these classifiers, however, is computationally efficient enough to operate on the micro-controller after its training phase. Nearest Neighbor can be implemented entirely on the micro-controller so long as the number of gestures in the codebook is not large enough to consume the memory. The accuracy of each of the three algorithms is around 90%, but the 1NN algorithm performs slightly better for our subjects, with an average accuracy of approximately 93%. Therefore, for the real-time classification in our prototype system we implemented 1-Nearest-Neighbor classification. A confusion matrix is presented in Figure 7(b) for all sixteen gestures. The confusion matrix demonstrates that accuracy is diminished in "swipe" gestures in which plate 1 was involved. In the experimental setup, each of the subjects strapped the prototype to their right leg and performed gestures with their right hand.

Fig. 8. This graph plots the current consumption (measured at 3.3 V) as a function of gesture frequency for two systems: the Inviz system with hierarchical signal processing and wake-up scheme, and a similar sensing system which transmits all data to a base station for processing. The consumption of each system is broken down into its respective components.

This orientation of the sensor plates (illustrated in Figure 1) causes the subject to pass their hand over other plates for any "swipe" gesture that includes plate 1, which can cause spurious data points and potentially result in misclassification if the filtering does not remove the accidental events. A change in the orientation would create misclassification on other plates, so rotation does not solve this problem. More filtering and more sophisticated data collection could be used to remove these errors. Despite this interference, our system is able to perform gesture classification with an average accuracy of close to 93%.

D. Energy Consumption

Figure 8 demonstrates the energy consumption of two different systems: one which utilizes a hierarchical wake-up and data processing scheme, and a second (termed the baseline) which processes all data on the micro-controller and uses the BLE only to send gestures. The baseline system consumes much more energy for the observation and measurement IC because the chip does not process the raw data, and cannot make decisions about whether to be in its low- or high-power state.


always on state, consuming a near constant 500 uA, where thehierarchical system is able to duty cycle the micro-controllerto reduce energy consumption by over 80% compared to thebaseline.

The figure illustrates the average power consumption of the prototype at different gesturing frequencies, demonstrating that the energy consumption is partially dependent upon the frequency but asymptotically converges to a baseline consumption near 500 μA. From these measurements we draw three conclusions. (1) The absolute power consumption of the system is very low for a gesture recognition system: close to 525 μA (1.7 mW) when gestures occur once every 2 minutes, which is a very high rate for the given application. Given a 1000 mAh battery, our prototype could theoretically last for 2000 hours (83 days) on a single charge, though this would be diminished by battery self-discharge. (2) Our hierarchical signal processing allows the system to consume approximately one-fourth the power of the baseline system, which itself uses low-power components. (3) The primary energy consumers in the system are the BLE module and the observation and measurement IC. Both can be addressed through application-specific hardware. The amount of information transferred wirelessly is a single byte at most every 3-5 seconds, which would allow other wireless technologies, such as ambient backscatter communication [37], to reduce or nearly eliminate the wireless communication energy. The measurement IC could also be implemented with low-power analog sub-threshold circuits to considerably reduce its energy consumption.
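The battery-life claim can be checked with back-of-the-envelope arithmetic. The snippet below uses only the figures quoted above (a 1000 mAh capacity and the roughly 500 μA asymptotic draw) and idealizes away self-discharge and regulator losses:

```c
/* Back-of-the-envelope battery-life estimate for the figures quoted
 * above; an ideal battery with no self-discharge is assumed. */
#include <stdio.h>

int main(void)
{
    const double capacity_mAh = 1000.0;  /* quoted battery capacity   */
    const double avg_mA       = 0.5;     /* ~500 uA asymptotic draw   */
    double hours = capacity_mAh / avg_mA;
    printf("%.0f hours (~%.0f days)\n", hours, hours / 24.0);
    /* prints: 2000 hours (~83 days), matching the estimate above */
    return 0;
}
```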

E. System Trade-Offs

This section investigates trade-offs between training-set size and accuracy, especially the effect of personalized versus aggregate gesture sets. Figure 7(c) demonstrates the first of these trade-offs, showing the accuracy for each user as the training size increases. Across each of the five subjects, accuracy increases with training size but saturates after approximately five training sets. This is beneficial, as it limits both the amount of training a user must perform and the amount of processing required for the 1NN calculation; a rough cost sketch follows. The second trade-off, aggregate versus personalized training, is illustrated in Figure 7(d). Personalized training (where only training data from that particular subject is used for classification) has a slightly higher average accuracy than aggregate training (where training data from all subjects is used). Additionally, the variance in accuracy for subjects 1 and 4 under aggregate training is considerably larger than under personalized training. These two subjects have forearm and leg proportions that differ from the other subjects, which could introduce motion artifacts not captured in the other subjects' data. Personalized training allows the system to fit the individual, resulting in better overall accuracy, which represents a fundamental design principle for our system.
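The saturation at roughly five training sets also bounds the 1NN codebook. As a rough cost model (the 16-gesture count is from the paper; the 16-element int16 feature vector is the same assumption used in the earlier sketch):

```c
/* Rough memory/compute cost of the 1NN codebook as training grows,
 * illustrating why saturation at ~5 sets is convenient on an MCU. */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    const int gestures = 16, feat_len = 16;
    for (int sets = 1; sets <= 10; sets++) {
        int examples = sets * gestures;
        size_t bytes = (size_t)examples * (feat_len * sizeof(int16_t) + 1);
        /* 1NN compares the query against every stored example */
        printf("sets=%2d  examples=%3d  codebook=%5zu B  distance calls=%3d\n",
               sets, examples, bytes, examples);
    }
    return 0;
}
```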

Fig. 9. (a) Accuracy of the gesture recognition system on an individual with incomplete quadriplegia across three different sensor configurations. (b) Confusion matrix of the 16 gestures performed by the mobility-impaired subject.

Fig. 10. (a) Time-sequence image of a gesture in which our mobility-impaired subject extends his arm while moving from the bottom plate to the top plate. This motion causes poor peak detection because the user's arm stays in proximity of the bottom plate. (b) Time-sequence image of a gesture in which the subject pronates his arm while moving from right to left. This motion keeps the arm near the array throughout the gesture.

1) User Study on a Subject With Impaired Mobility: To validate the performance of our system on subjects belonging to the target population, we performed an evaluation on a single subject who has an upper-extremity mobility impairment. This individual has a C6 spinal cord injury, which has resulted in total loss of mobility below the waist and limited mobility in the fingers, wrists, and arms. Our initial discussion and evaluation with this individual established that the hovers and swipes that comprise our gesture set are both within the individual's ability and comfortable to perform over a long period of time. During the evaluation session, the individual was trained on the system by performing two sets of each gesture and receiving verbal instructive feedback. The individual then performed ten of each gesture on the 4-plate system, for a total of 160 gestures. After a short break, the subject performed five of each gesture on each of the two 12-plate configurations (see Section V for a description of the configurations), for a total of another 160 gestures (80 per configuration). The collected data was divided into 5 sets and randomly partitioned into training and testing sets of 3 and 2 sets, respectively, as sketched below.
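The random 3-train/2-test partition over the 5 collected sets can be expressed with a standard Fisher-Yates shuffle; the fixed seed below is only so the sketch is reproducible and is not from the paper.

```c
/* Sketch of the random 3-train / 2-test split over 5 collected sets. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int sets[5] = {0, 1, 2, 3, 4};
    srand(42);                       /* arbitrary seed for the sketch */
    for (int i = 4; i > 0; i--) {    /* Fisher-Yates shuffle */
        int j = rand() % (i + 1);
        int t = sets[i]; sets[i] = sets[j]; sets[j] = t;
    }
    printf("train: %d %d %d   test: %d %d\n",
           sets[0], sets[1], sets[2], sets[3], sets[4]);
    return 0;
}
```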

Figure 9(a) graphs the accuracy of Inviz in recognizing gestures with the 4-plate and 12-plate sensors. For the 12-plate sensor, accuracy results are shown for both configurations described in §V. We can make two observations from the figure. First, the accuracy of gesture recognition improves as more plates are used; for instance, the accuracy of recognizing gestures on the 12-plate sensor is 80 ± 3%. Second, for the 4-plate sensor, the accuracy of Inviz is lower than for the healthy subjects (≈70%). Using the timestamped videos of the subject performing the gestures, we delve deeper into the causes of these inaccuracies and derive three major artifacts, specific to this quadriplegic individual, that caused them.

Fig. 11. This figure demonstrates two different hover gestures. The first, a "Hover 3" gesture, is performed correctly: the differentials corresponding to plate 3 break their thresholds toward plate 3. The second, a "Hover 1" gesture, is performed incorrectly as part of the extension issue raised above. Two of the differentials for "Hover 1" properly break their thresholds, but the one corresponding to the differential between sensor 0 and sensor 3 (the near sensor) does not. Additionally, the differential for sensor plates 3 and 2 does break its threshold, which confuses the machine learning algorithm because the feature extraction did not perform properly.

First, when considering extension motions such as moving the hand from the bottom plate to the top plate, the ideal movement keeps the forearm farther from the sensor plates than the hand. This subject's forearm, however, stays near the array relative to the hand because of his reduced ability to perform wrist extension. The phenomenon is demonstrated in Figure 10(a). This type of movement biases the capacitance readings of a near/far sensor pair toward the near sensor, potentially unbalancing the thresholds that the recognition algorithm requires. This is further illustrated in the confusion matrix for the subject in Figure 9(b): classification was diminished for gestures that extend over the top of other pads (e.g., hovering over plates 0 and 1, or moving from plate 1 to either 0 or 2). An example data stream demonstrating this bias is shown in Figure 11, which illustrates two expected streams and the errors created by sensor readings biased by the presence of the forearm.
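The failure mode in Figure 11 can be phrased as a per-pair threshold test. The sketch below is an illustrative reconstruction of that test, not the paper's feature-extraction code; the pairing of differentials, the sign convention, and the threshold values are assumptions.

```c
/* Sketch of the per-pair differential threshold test implied by the
 * discussion of Figure 11; conventions here are assumptions. */
#include <stdint.h>
#include <stdbool.h>

/* A hover toward plate A should drive the A-minus-B differential of
 * every pair containing A past a positive threshold. A forearm lying
 * near plate B biases that differential back toward B, so the crossing
 * for the near-sensor pair may never occur, as seen for "Hover 1". */
static bool pair_crossed(int16_t diff_a_minus_b, int16_t threshold)
{
    return diff_a_minus_b > threshold;
}

bool hover_detected(const int16_t *diffs, const int16_t *thresholds, int n_pairs)
{
    for (int i = 0; i < n_pairs; i++)
        if (!pair_crossed(diffs[i], thresholds[i]))
            return false;   /* one biased pair is enough to lose the gesture */
    return true;
}
```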

Second, the subject has small thighs, in part because of atrophy of the thigh muscles from disuse. This causes the sensor array to contour the leg in a more convex shape than on the other, healthy subjects. To facilitate correct left-to-right and right-to-left motions, the individual pronates and supinates his hand using muscle groups near the elbow to keep the hand close to the array. This motion is demonstrated in Figure 10(b). Some individuals with upper-extremity mobility impairment cannot perform this type of arm rotation because of upper-arm rigidity; they therefore have difficulty tracking the contour of the leg and must mimic this motion by propagating shoulder movements, creating more variation in peak amplitude and potentially reducing the accuracy of moving above the correct plates.

Finally, the subject tended to face the side of his hand opposite the thumb toward the sensor array, while our healthy users all faced their palms toward the array. In discussion with this individual it was revealed that this was to facilitate moving through each gesture so that the opposite leg would not be a burden. A user who does not have the ability to retract fingers through tenodesis may also choose to maintain this position so that their fingers do not drag along the array. This position has a thinner profile than a downward-facing palm, creating a modest increase in peak separation and decrease in peak width, which can introduce error into the feature vectors and potentially cause the gesture recognition algorithm to misclassify the gesture.

The above artifacts point to two limitations of our prototype Inviz system. First, the sensor must be built in proportion to the user: a sensor array larger or smaller than required causes gestures to be performed incorrectly. Second, the observations point to the need for a more robust statistical learning approach, such as Hidden Markov Models [31], for gesture recognition. We are pursuing the first avenue as future work and have explored a different mechanism for the second approach [35]. Furthermore, we plan to investigate the impact of the distance of the hand from the sensors and the influence of environmental disturbances on gesture recognition accuracy.

VII. CONCLUSION

In this paper we present an event-driven, low-power, wearable capacitive gesture recognition system as an assistive device for individuals with upper-extremity mobility impairments. The use of textile-based capacitive sensors integrated into items of everyday living allows the system to be used without issues of occlusion or obtrusion. Our system uses a replicable hierarchical sensor-processing algorithm to break computation into different power tiers and maximize responsiveness at a very low average energy consumption (<750 μA). We have implemented and evaluated a fully functional end-to-end prototype in terms of latency, power consumption, and accuracy as an assistive home-automation cyber-physical system, demonstrating that the system can accurately (>90%) classify gestures in real time (<1 s latency) for home use.

ACKNOWLEDGMENT

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF or Microsoft.


REFERENCES

[1] M. J. Hall, S. Levant, and C. J. DeFrances, "Hospitalization for stroke in U.S. hospitals, 1989–2009," Diabetes, vol. 18, no. 23, p. 23, 2012.

[2] J. A. Langlois, W. Rutland-Brown, and M. M. Wald, "The epidemiology and impact of traumatic brain injury: A brief overview," J. Head Trauma Rehabil., vol. 21, no. 5, pp. 375–378, Sep./Oct. 2006.

[3] National Spinal Cord Injury Statistical Center, Facts and Figures at a Glance, Univ. Alabama Birmingham, Birmingham, AL, USA, 2015.

[4] S. Steinbaum, J. D. Harviel, J. H. Jaffin, and M. H. Jordan, "Lightning strike to the head: Case report," J. Trauma Acute Care Surgery, vol. 36, no. 1, pp. 113–115, Jan. 1994.

[5] M. S. Workinger and R. Netsell, "Restoration of intelligible speech 13 years post-head injury," Brain Injury, vol. 6, no. 2, pp. 183–187, 1992.

[6] S. K. Fager and J. M. Burnfield, "Patients' experiences with technology during inpatient rehabilitation: Opportunities to support independence and therapeutic engagement," Disab. Rehabil., Assistive Technol., vol. 9, no. 2, pp. 121–127, 2013.

[7] H. Junker, O. Amft, P. Lukowicz, and G. Tröster, "Gesture spotting with body-worn inertial sensors to detect user activities," Pattern Recognit., vol. 41, no. 6, pp. 2010–2024, Jun. 2008.

[8] A. Y. Benbasat and J. A. Paradiso, "An inertial measurement framework for gesture recognition and applications," in Gesture and Sign Language in Human-Computer Interaction. Heidelberg, Germany: Springer, 2002, pp. 9–20.

[9] S.-J. Cho et al., "Magic wand: A hand-drawn gesture input device in 3-D space with inertial sensors," in Proc. IEEE IWFHR, Oct. 2004, pp. 106–111.

[10] Y. Wu and T. S. Huang, "Vision-based gesture recognition: A review," in Gesture-Based Communication in Human-Computer Interaction. Heidelberg, Germany: Springer, 1999, pp. 103–115.

[11] Y. Xia, Z.-M. Yao, X.-J. Yang, S.-Q. Xu, X. Zhou, and Y.-N. Sun, "A footprint tracking method for gait analysis," Biomed. Eng., Appl., Basis Commun., vol. 26, no. 1, 2014.

[12] A. Nelson et al., "Wearable multi-sensor gesture recognition for paralysis patients," in Proc. IEEE SENSORS, Nov. 2013, pp. 1–4.

[13] R. E. Cowan, B. J. Fregly, M. L. Boninger, L. Chan, M. M. Rodgers, and D. J. Reinkensmeyer, "Recent trends in assistive technology for mobility," J. NeuroEng. Rehabil., vol. 9, p. 20, Dec. 2012.

[14] G. Cohn et al., "An ultra-low-power human body motion sensor using static electric field sensing," in Proc. ACM UbiComp, 2012, pp. 99–102.

[15] J. Cheng, O. Amft, and P. Lukowicz, "Active capacitive sensing: Exploring a new wearable sensing modality for activity recognition," in Pervasive Computing. Heidelberg, Germany: Springer, 2010, pp. 319–336.

[16] H. K. Ouh, J. Lee, S. Han, H. Kim, I. Yoon, and S. Hong, "A programmable mutual capacitance sensing circuit for a large-sized touch panel," in Proc. ISCAS, May 2012, pp. 1395–1398.

[17] Y. Piao, X. Xiao, X. Wang, Q. Zhou, K. Fan, and P. Liu, "Conditioning circuit for capacitive position sensor with nano-scale precision based on AC excitation principle," in Proc. IEEE EIT, May 2011, pp. 1–6.

[18] P. Ripka and A. Tipek, Eds., Modern Sensors Handbook. New York, NY, USA: Wiley, 2013.

[19] "Capacitive sensor operation and optimization," Lion Precision, St. Paul, MN, USA, Tech. Rep. LT03-0020, 2006.

[20] P. Gründler, "Conductivity sensors and capacitive sensors," in Chemical Sensors: An Introduction for Scientists and Engineers. Heidelberg, Germany: Springer, 2007, pp. 123–132.

[21] L. Zhao and E. M. Yeatman, "Micro capacitive tilt sensor for human body movement detection," in Proc. BSN, 2007, pp. 195–200.

[22] E. Terzic, R. Nagarajah, and M. Alamgir, "A neural network approach to fluid quantity measurement in dynamic environments," Mechatronics, vol. 21, no. 1, pp. 145–155, 2011.

[23] J. N. Palasagaram and R. Ramadoss, "MEMS-capacitive pressure sensor fabricated using printed-circuit-processing techniques," IEEE Sensors J., vol. 6, no. 6, pp. 1374–1375, Dec. 2006.

[24] P.-H. Lo, C. Hong, S.-H. Tseng, J.-H. Yeh, and W. Fang, "Implementation of vertical-integrated dual mode inductive-capacitive proximity sensor," in Proc. IEEE MEMS, Jan./Feb. 2012, pp. 640–643.

[25] P.-H. Lo, C. Hong, S.-C. Lo, and W. Fang, "Implementation of inductive proximity sensor using nanoporous anodic aluminum oxide layer," in Proc. 16th Int. Solid-State Sens., Actuators Microsyst. Conf. (TRANSDUCERS), Jun. 2011, pp. 1871–1874.

[26] S. D. Min, J. K. Kim, H. S. Shin, Y. H. Yun, C. K. Lee, and M. Lee, "Noncontact respiration rate measurement system using an ultrasonic proximity sensor," IEEE Sensors J., vol. 10, no. 11, pp. 1732–1739, Nov. 2010.

[27] Single-Zone 3D Tracking and Gesture Controller Data Sheet, Microchip Technology, Chandler, AZ, USA, Nov. 2013.

[28] P. Premaratne and Q. Nguyen, "Consumer electronics control system based on hand gesture moment invariants," IET Comput. Vis., vol. 1, no. 1, pp. 35–41, Mar. 2007.

[29] H. Jiang, Z. Han, P. Scucces, S. Robidoux, and Y. Sun, "Voice-activated environmental control system for persons with disabilities," in Proc. Bioeng. Conf., Apr. 2000, pp. 167–168.

[30] S. B. Kang, "Hands-free interface to a virtual reality environment using head tracking," U.S. Patent 6 009 210, Dec. 28, 1999.

[31] A. D. Wilson and A. F. Bobick, "Parametric hidden Markov models for gesture recognition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 21, no. 9, pp. 884–900, Sep. 1999.

[32] P. Domingos and M. Pazzani, "On the optimality of the simple Bayesian classifier under zero-one loss," Mach. Learn., vol. 29, nos. 2–3, pp. 103–130, 1997.

[33] G. Fang, W. Gao, and D. Zhao, "Large vocabulary sign language recognition based on hierarchical decision trees," in Proc. 5th Int. Conf. Multimodal Interfaces, 2003, pp. 125–131.

[34] G. Singh, A. Nelson, R. Robucci, C. Patel, and N. Banerjee, "Inviz: Low-power personalized gesture recognition using wearable textile capacitive sensor arrays," in Proc. IEEE Int. Conf. Pervasive Comput. Commun. (PerCom), Mar. 2015, pp. 198–206.

[35] A. Nelson, G. Singh, R. Robucci, C. Patel, and N. Banerjee, "Adaptive and personalized gesture recognition using textile capacitive sensor arrays," IEEE Trans. Multi-Scale Comput. Syst., vol. 1, no. 2, pp. 62–75, Apr./Jun. 2015.

[36] S. Rus, M. Sahbaz, A. Braun, and A. Kuijper, "Design factors for flexible capacitive sensors in ambient intelligence," in Ambient Intelligence. Cham, Switzerland: Springer, 2015, pp. 77–92.

[37] V. Liu, A. Parks, V. Talla, S. Gollakota, D. Wetherall, and J. R. Smith, "Ambient backscatter: Wireless communication out of thin air," SIGCOMM Comput. Commun. Rev., vol. 43, no. 4, pp. 39–50, Oct. 2013.

Gurashish Singh (M'13) received the B.S. degree in computer engineering from the University of Maryland, Baltimore County (UMBC), Baltimore, MD, in 2014, where he is currently pursuing the M.S. degree in computer engineering, with research on sensors, biomedical systems, embedded systems, and very large-scale integration. He has held an internship with Altera Corporation, San Jose, CA, exploring heterogeneous computing, and was a Teaching Assistant at UMBC. He received the Runner-Up Best Demo Award at IEEE PerCom 2015.

Alexander Nelson (S'13) received the B.S. and M.S. degrees in computer engineering from the University of Arkansas, Fayetteville, AR, in 2012 and 2013, respectively. He is currently pursuing the Ph.D. degree at the University of Maryland, Baltimore County, Baltimore, MD, with research on embedded real-time systems and pervasive computing. He is a member of IEEE-Eta Kappa Nu and Tau Beta Pi, and he received the Runner-Up Best Demo Award at IEEE PerCom 2015 and an NSF GRFP Honorable Mention Award.

Sheung Lu (S'15) is currently pursuing the B.S. degree in computer engineering at the University of Maryland, Baltimore County (UMBC), Baltimore, MD. He was a Teaching Assistant at UMBC and has held internships at the MIT Lincoln Laboratory and Texas State University–San Marcos. He joined the ECLIPSE Research Cluster in 2014. His research interests include embedded systems, sensors, and very large-scale integration.


Ryan Robucci (S'00–M'10) received the B.S. degree in computer engineering from the University of Maryland, Baltimore County (UMBC), in 2002, and the M.S.E.E. and Ph.D. degrees from the Georgia Institute of Technology, Atlanta, in 2004 and 2009, respectively. He is currently an Assistant Professor with the Department of Computer Science and Electrical Engineering, UMBC. His current research interests include low-power, real-time signal processing with reconfigurable analog and mixed-mode very large-scale integration circuits; CMOS image sensors; sensor interfacing and networking; image processing; bio-inspired systems; and computer-aided design and analysis of complex mixed-signal systems and hardware security. He received the ISCAS Best Sensors Paper Award in 2005 and the Runner-Up Best Demo Award at IEEE PerCom 2015.

Chintan Patel (S'99–M'05) received the M.S. degree in computer science and the Ph.D. degree in computer engineering from the University of Maryland, Baltimore County (UMBC), in 2001 and 2004, respectively. He is currently an Assistant Professor with the Department of Computer Science and Electrical Engineering, UMBC. He specializes in very large-scale integration design and test. Specifically, he is involved in power-supply modeling, noise estimation, current-measurement circuits, and hardware security. He received the Runner-Up Best Demo Award at IEEE PerCom.

Nilanjan Banerjee received the B.S. degree in computer science and engineering from the Indian Institute of Technology (IIT) Kharagpur, and the M.S. and Ph.D. degrees from the University of Massachusetts, Amherst. He is currently an Associate Professor of Computer Science and Electrical Engineering with the University of Maryland, Baltimore County, where he directs the Mobile, Pervasive, and Sensor Systems Laboratory. He received the Best Undergraduate Thesis Award from IIT Kharagpur in 2004 and the Yahoo! Outstanding Thesis Award from the University of Massachusetts, Amherst, in 2009. He is a recipient of an NSF CAREER Award and a Microsoft Research Software Engineering Innovations Award.