Low Power Embedded Gesture Recognition Using Novel Short-Range Radar Sensors
Michele Magno, Emanuel Eggimann, Jonas Erb, Philipp Mayer, Luca Benini
Integrated Systems Laboratory, ETH Zurich
Gesture Recognition Based on Short-Range Radar
– Increasing research on radar for gesture recognition [1,2,3,4]
– Google developed a micro-radar for gesture recognition
– Good results on difficult hand-gestures: 90% accuracy on 11 gestures and 10 people
Contribution of This Work
Conclusion
This work presented a high-accuracy, low-power hand-gesture recognition system based on short-range radar. Two large datasets with 11 challenging hand-gestures performed by 26 different people, containing a total of 20,210 gesture instances, were recorded; on these the final algorithm reaches an accuracy of up to 92%. The model size is only 91kB, and the implementation on GAP8 shows that live prediction is feasible with a power consumption of only 21mW for the prediction network.
[1] Soli: Ubiquitous Gesture Sensing with Millimeter Wave Radar, 2016.
[2] Interacting with Soli: Exploring Fine-Grained Dynamic Gesture Recognition in the Radio-Frequency Spectrum, 2016.
[3] Sparsity-based Dynamic Hand Gesture Recognition Using Micro-Doppler Signatures, 2017.
[4] A Hand Gesture Recognition Sensor Using Reflected Impulses, 2017.
– Implement radar-based hand-gesture recognition in an embedded system
– Create a dataset with fine-grained hand-gestures: at least 1000 samples per class and 20 users
– Algorithm suitable for embedded systems: less than 1MB, at least 700x smaller than Interacting with Soli
– Achieve similar accuracy to Interacting with Soli: 85% (single-user), 87% (10 people) on 11 gestures [1]
– Implement the algorithm on the GAP8 PULP processor and experimentally evaluate its efficiency (power, run-time)
Short Range Radar
– Classical airplane-detection radar: 1-12 GHz (wavelength 2.5-30cm)
– Acconeer radar: 60GHz (wavelength 5mm)
– Data from sensor:
  − Sweeps @ 160Hz, each over the full measured range
  − Time- and range-discrete signal S[t, r]
Machine Learning Features:
– Range Frequency Doppler Map
– Signal Energy
– Signal Variation
– Center of Mass of the Envelope
[Figure: radar output as a matrix of sweeps over time and range; range resolution 0.483mm]
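The three scalar features listed above can be sketched in NumPy over one frame S[t, r]. The frame size (32 sweeps x 492 range bins) and the exact feature definitions below are illustrative assumptions, not the poster's implementation:

```python
import numpy as np

# Toy frame S[t, r]: 32 sweeps x 492 range bins (sizes are assumptions).
rng = np.random.default_rng(0)
S = rng.standard_normal((32, 492))

# Signal energy: total power over the frame.
energy = np.sum(S ** 2)

# Signal variation: how much consecutive sweeps differ.
variation = np.sum(np.abs(np.diff(S, axis=0)))

# Center of mass of the envelope: amplitude-weighted mean range bin,
# i.e. roughly where the reflecting hand sits in range.
envelope = np.abs(S).mean(axis=0)
bins = np.arange(envelope.size)
center_of_mass = np.sum(bins * envelope) / np.sum(envelope)
```

All three are cheap to compute per frame, which matters for the embedded budget discussed later.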
Range Frequency Doppler Map:
Radar output: range- and time-discrete signal S[t, r]
Doppler Map:
– S[f, r] = Σ_{t=0}^{L_doppler} S[t, r] · e^{−2πift/L_doppler} = FFT(S[t, r])
A strong feature based on research [1,2]
[1] Interacting with Soli: Exploring Fine-Grained Dynamic Gesture Recognition in the Radio-Frequency Spectrum, 2016.
[2] Short-Range FMCW Monopulse Radar for Hand-Gesture Sensing, 2015.
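The Doppler map is just a discrete Fourier transform over the sweep (time) axis, applied per range bin. A minimal NumPy sketch, with the window length and range-bin count chosen arbitrarily for illustration:

```python
import numpy as np

L_doppler = 32   # sweeps per Doppler window (assumed)
n_range = 492    # range bins (assumed)

# Complex time- and range-discrete radar signal S[t, r]
rng = np.random.default_rng(0)
S = (rng.standard_normal((L_doppler, n_range))
     + 1j * rng.standard_normal((L_doppler, n_range)))

# S[f, r] = sum_t S[t, r] * exp(-2*pi*i*f*t / L_doppler):
# an FFT over the time (sweep) axis for each range bin.
doppler_map = np.fft.fft(S, axis=0)

# Shift zero Doppler frequency to the center for display.
doppler_map = np.fft.fftshift(doppler_map, axes=0)
print(doppler_map.shape)  # (32, 492)
```

Each column of the result shows the velocity (Doppler) spectrum of whatever reflects at that range, which is what makes it a strong gesture feature.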
Idea: combine information from multiple time steps
→ up to 98.9% accuracy for 5 gestures and 2 sensors!
Execution Length @100MHz     | Cycles
Network + FFT                | 5.877 million
2D CNN                       | 5.079 million
TCN                          | 0.458 million
Fully Connected Layers       | 0.086 million
FFT                          | 0.177 million
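As a sanity check on the cycle budget, the listed components account for nearly all of the reported total; attributing the small remainder to overhead is our inference, not a figure from the poster:

```python
# Per-component cycle counts reported for GAP8 @100MHz
components = {
    "2D CNN": 5.079e6,
    "TCN": 0.458e6,
    "FC layers": 0.086e6,
    "FFT": 0.177e6,
}
total_reported = 5.877e6

subtotal = sum(components.values())   # 5.800 million cycles
share = subtotal / total_reported     # ~0.99 of the reported total
```

So the 2D CNN dominates the run-time by far, which is why shrinking it matters most for the power budget.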
Power consumption:
▪ 5Hz prediction rate: only 21mW!
▪ Much margin: up to 50Hz possible
Comparison with Cortex-M7 (STM32F746ZG):
▪ Same calculation consumes 147mW-588mW
▪ Running at its limit (@216MHz, needing 850ms for 5 predictions)
@100MHz with 8 cores running: total energy per frame 4.2mJ → 5 frames/s: 21mW
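The 21mW figure follows directly from the measured energy per frame, since average power is energy per frame times frame rate. A quick check of the arithmetic, using the numbers from the poster:

```python
# Numbers reported on the poster (GAP8 @100MHz, 8 cores)
energy_per_frame = 4.2e-3   # J per prediction frame
prediction_rate = 5         # predictions per second

# Average power = energy per frame x frame rate
avg_power_w = energy_per_frame * prediction_rate
print(round(avg_power_w * 1000, 1))  # 21.0 (mW)
```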
Per-Frame Accuracy on 11-Gesture Datasets

                                                        | Soli    | This Work
Single-User Leave-One-Out (Session) Cross-Validation    | 85.75%  | 92%
Multiple-Users, randomly shuffled                       | 87.17%  | 81.52%
Multiple-Users Leave-One-Out (Person) Cross-Validation  | 79.06%  | 73.66%
Other Properties                      | Soli   | This Work
Model Size                            | 689MB  | 91kB
Dataset: Total Instances per Gesture  | 500    | 1610
Dataset: People                       | 10     | 26
Embedded Implementation               | No     | Yes
Network Power Consumption             | -      | 21mW
Sensor Power Consumption              | 300mW  | <190mW
GAP8 vs. ARM Cortex-M7 Implementation: Run-Time, Power Consumption, FFT
This work proposes a low-power, high-accuracy embedded hand-gesture recognition system targeting battery-operated wearable devices, using low-power short-range radar sensors. A 2D Convolutional Neural Network (CNN) using range frequency Doppler features is combined with a Temporal Convolutional Neural Network (TCN) for time-sequence prediction. The final algorithm has a model size of only 45,723 parameters, yielding a memory footprint of only 91kB. Two datasets containing 11 challenging hand-gestures performed by 26 different people have been recorded, containing a total of 20,210 gesture instances. On the 11 hand-gestures, accuracies of 87% (26 users) and 92% (single user) have been achieved. Furthermore, reducing the gestures to 5, we achieved up to 98.9% accuracy. Finally, the prediction algorithm has been implemented on the GAP8 Parallel Ultra-Low-Power processor by GreenWaves Technologies, showing that live prediction is feasible with only 21mW of power consumption for the full gesture-prediction neural network, excluding the sensor's consumption.
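The stated 91kB footprint is consistent with storing 45,723 parameters at 16 bits each; the 16-bit assumption is ours, since the poster gives only the total size:

```python
params = 45_723
bytes_per_param = 2          # assuming 16-bit fixed-point weights (our assumption)
footprint_kb = params * bytes_per_param / 1000
print(round(footprint_kb))   # 91
```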
[Figure: model pipeline: per-time-step FFT and 2D CNN feature extraction feeding a TCN, which outputs the gesture]
1) 2D Convolutional Neural Network (CNN) scaling down the range frequency Doppler input map of size 32x492x2 to a representation vector of length 384.
2) Temporal Convolutional Neural Network (TCN) taking as input a time sequence of the stacked length-384 representation vectors, leveraging temporal information for more accurate predictions. The length-32 output representations produced by the TCN are then fed into three fully connected layers, which produce the probability distribution used to classify the observed gesture.
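A shape-level sketch of this pipeline in NumPy. Random matrices stand in for the trained layers; only the interface sizes (32x492x2 input, length-384 CNN output, length-32 TCN output, 11 classes) come from the text, while the hidden sizes and layer internals are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def cnn_encode(frame):
    # Stand-in for the 2D CNN: 32x492x2 Doppler map -> length-384 vector
    assert frame.shape == (32, 492, 2)
    W = rng.standard_normal((frame.size, 384)) * 0.001
    return frame.reshape(-1) @ W

def tcn(seq):
    # Stand-in for the TCN: (T, 384) sequence -> length-32 representation
    W = rng.standard_normal((384, 32)) * 0.01
    return np.tanh(seq @ W).mean(axis=0)

def classify(rep, n_classes=11):
    # Three fully connected layers -> probability distribution over gestures
    h = np.maximum(0.0, rep @ rng.standard_normal((32, 64)))
    h = np.maximum(0.0, h @ rng.standard_normal((64, 64)))
    logits = h @ rng.standard_normal((64, n_classes))
    e = np.exp(logits - logits.max())
    return e / e.sum()

frames = rng.standard_normal((5, 32, 492, 2))    # 5 time steps of Doppler maps
seq = np.stack([cnn_encode(f) for f in frames])  # (5, 384)
probs = classify(tcn(seq))
print(probs.shape)  # (11,)
```

The point of the two-stage split is visible in the shapes: the expensive 2D CNN runs once per time step, while the cheap TCN and classifier fuse the per-step vectors into one prediction.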