5 track kinect@bicocca - gesture

KINECT Programming Ing. Matteo Valoriani [email protected]

Upload: matteo-valoriani

Post on 28-Jan-2015


Page 2: 5   track kinect@Bicocca - gesture

Gesture

• What is a gesture?
  • An action intended to communicate feelings or intentions
• What is “Gesture Detection” or “Gesture Recognition”?
  • A computer’s ability to understand human gestures as input
  • First used in 1963 with a pen-based input device
• What is it used for?
  • Mouse movements, handwriting recognition, sign language recognition, touch screen input, Kinect

Page 3: 5   track kinect@Bicocca - gesture

Interaction metaphors

• Cursors (hand tracking): target an object
• Avatars (body tracking): interact with a virtual space
• The right metaphor depends on the task
• An important aspect of UI design

Page 4: 5   track kinect@Bicocca - gesture

The shadow/mirror effect

• Shadow effect
  • I see the back of my avatar
  • Problems with Z movements
• Mirror effect
  • I see the front of my avatar
  • Problems with mapping left/right movements

Page 6: 5   track kinect@Bicocca - gesture

User Interaction

Game mindset ≠ UI mindset

• Game mindset: challenging = fun
• UI mindset: challenging = easy and effective

Page 7: 5   track kinect@Bicocca - gesture

Gesture semantically fits user task

(Figure: scale from Abstract to Meaningful)

Page 8: 5   track kinect@Bicocca - gesture

User action fits UI reaction

System’s UI feedback relates to the user’s physical movement

Page 10: 5   track kinect@Bicocca - gesture

Gestures family-up

Each gesture feels related and cohesive with the entire gesture set

Page 11: 5   track kinect@Bicocca - gesture

Handed gestures

Different gestures depending on the hand: for example, only the left hand can do gesture A

Page 12: 5   track kinect@Bicocca - gesture

Repeating Gesture?

Will users want/need to perform the proposed gesture repeatedly?

Page 14: 5   track kinect@Bicocca - gesture

Number of Hands

One-handed gestures are preferred

Page 15: 5   track kinect@Bicocca - gesture

Symmetrical two-handed gesture

Two-handed gestures should be symmetrical

Page 16: 5   track kinect@Bicocca - gesture

Gesture payoff

Interactions requiring more work and effort should have a higher payoff

Page 17: 5   track kinect@Bicocca - gesture

Fatigue kills gesture

Fatigue is the start of a downward spiral that kills the gesture:

Fatigue → increased messiness → poor performance → frustration → bad UX

Page 18: 5   track kinect@Bicocca - gesture

Gorilla Arm problem

Try holding your hand up for 10 minutes…

Page 19: 5   track kinect@Bicocca - gesture

Comfortable positions

Page 20: 5   track kinect@Bicocca - gesture

User Posture

User posture may affect the design of a gesture

Page 21: 5   track kinect@Bicocca - gesture


The challenges

• Physical variable

• Environment

• Recognizing intent

• Input variability

Page 23: 5   track kinect@Bicocca - gesture

Heuristics

• Experience-based techniques for problem solving, learning, and discovery
• Cost effective
• Help reconstruct missing information
• Help compute the outcome of a gesture

(Figure: cost versus gesture complexity, comparing heuristics and machine learning)

Page 24: 5   track kinect@Bicocca - gesture

Define What Constitutes a Gesture

• Some players have more energy (or enthusiasm) than others
• Some players will “optimize” their gestures
• Most players will not perform the gesture precisely as intended

Page 25: 5   track kinect@Bicocca - gesture

Select the Right Triggers

• Use the skeleton view to analyze whole-skeleton behavior
• Use the joint view to isolate and analyze the behavior of specific joints and axes
• Use the data sheet view to get the real numbers
• Not all joints are needed
• Player location in the play area can cause some joints to become occluded

Page 26: 5   track kinect@Bicocca - gesture

Define Key Stages of a Gesture

• Determine
  • When the gesture begins
  • When the gesture ends
• Determine other key stages
  • Changes in motion direction
  • Pauses
  • …
• You could simply signal that the gesture has been completed, or
• You could keep track of progress, or
• You could use distinct states
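
The three reporting options above (completion signal, progress, distinct states) can be sketched together in one small tracker. This is only an illustrative sketch, not the deck's implementation: `GestureStage`, `StageTracker`, and the "hold the pose for N frames" rule are hypothetical names and assumptions.

```csharp
using System;

// Hypothetical stages for a "hold a pose" gesture.
public enum GestureStage { Idle, Started, Completed }

public class StageTracker
{
    private readonly int framesRequired; // assumed: frames the pose must be held
    private int framesHeld;

    public GestureStage Stage { get; private set; }

    // Progress in [0, 1], usable for UI feedback while the gesture is under way.
    public double Progress
    {
        get { return Math.Min(1.0, (double)framesHeld / framesRequired); }
    }

    public StageTracker(int framesRequired)
    {
        if (framesRequired <= 0) throw new ArgumentOutOfRangeException("framesRequired");
        this.framesRequired = framesRequired;
        Stage = GestureStage.Idle;
    }

    // Feed one frame: poseHeld says whether the trigger condition holds in this frame.
    public GestureStage Update(bool poseHeld)
    {
        if (!poseHeld)
        {
            // The gesture ends (or is cancelled) as soon as the pose is lost.
            framesHeld = 0;
            Stage = GestureStage.Idle;
            return Stage;
        }
        framesHeld++;
        Stage = framesHeld >= framesRequired ? GestureStage.Completed : GestureStage.Started;
        return Stage;
    }
}
```

A caller that only needs the completion signal checks for `GestureStage.Completed`; a caller that wants feedback reads `Progress` every frame.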

Page 27: 5   track kinect@Bicocca - gesture

Determine the Type of Outcome

• Definite gesture
  • Contact or release point
  • Direction
  • Initial velocity
• Continuous gesture
  • Frequency
  • Amplitude

Page 28: 5   track kinect@Bicocca - gesture


Run a Detection Filter Only When Necessary

• Define clear context for when a gesture is expected

• Provide clear feedback to the player

• Run the gesture filter when the context warrants it

• Cancel the gesture if context changes
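
A possible way to apply these points is to wrap a detection filter so that it only runs while the application says the gesture is expected. The sketch below is an assumption for illustration: `ContextGatedFilter`, `SetContext`, and the single-sample filter signature are hypothetical and do not come from the Kinect SDK.

```csharp
using System;

// Hypothetical wrapper that runs a detection filter only while its context is active.
public class ContextGatedFilter
{
    private readonly Func<float, bool> filter; // assumed filter: one sample in, "detected" out
    private bool active;

    // How many times the filter actually ran (useful when profiling).
    public int Evaluations { get; private set; }

    public ContextGatedFilter(Func<float, bool> filter)
    {
        if (filter == null) throw new ArgumentNullException("filter");
        this.filter = filter;
    }

    // Called by the game or UI when the gesture becomes expected or stops being expected.
    public void SetContext(bool gestureExpected)
    {
        active = gestureExpected;
    }

    // Returns false without running the filter when the context does not warrant it.
    public bool Process(float sample)
    {
        if (!active) return false;
        Evaluations++;
        return filter(sample);
    }
}
```

Calling `SetContext(false)` acts as the cancel step: later samples are ignored until the context becomes active again.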

Page 29: 5   track kinect@Bicocca - gesture

Causes of Missing Information

• Self occlusion
  • Side poses
  • Player’s position in play space
• Obstacles
  • Other players
  • Furniture
• Outside the camera’s field of view
  • Left or right (easy to fix)
  • Top or bottom (hard to avoid)

Page 31: 5   track kinect@Bicocca - gesture

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using Microsoft.Kinect;

    class GestureRecognizer {

        // Recent history of every tracked joint: one list of samples per joint type.
        public Dictionary<JointType, List<Joint>> skeletonSerie = new Dictionary<JointType, List<Joint>>() {
            { JointType.AnkleLeft, new List<Joint>() }, { JointType.AnkleRight, new List<Joint>() },
            { JointType.ElbowLeft, new List<Joint>() }, { JointType.ElbowRight, new List<Joint>() },
            { JointType.FootLeft, new List<Joint>() }, { JointType.FootRight, new List<Joint>() },
            { JointType.HandLeft, new List<Joint>() }, { JointType.HandRight, new List<Joint>() },
            { JointType.Head, new List<Joint>() }, { JointType.HipCenter, new List<Joint>() },
            { JointType.HipLeft, new List<Joint>() }, { JointType.HipRight, new List<Joint>() },
            { JointType.KneeLeft, new List<Joint>() }, { JointType.KneeRight, new List<Joint>() },
            { JointType.ShoulderCenter, new List<Joint>() }, { JointType.ShoulderLeft, new List<Joint>() },
            { JointType.ShoulderRight, new List<Joint>() },
            { JointType.Spine, new List<Joint>() },
            { JointType.WristLeft, new List<Joint>() },
            { JointType.WristRight, new List<Joint>() }
        };

        // Timestamp of each buffered frame; initialized here so that Add() does not throw.
        protected List<DateTime> timeList = new List<DateTime>();

        private static List<JointType> typesList = new List<JointType>() {
            JointType.AnkleLeft, JointType.AnkleRight, JointType.ElbowLeft, JointType.ElbowRight,
            JointType.FootLeft, JointType.FootRight, JointType.HandLeft, JointType.HandRight,
            JointType.Head, JointType.HipCenter, JointType.HipLeft, JointType.HipRight,
            JointType.KneeLeft, JointType.KneeRight, JointType.ShoulderCenter, JointType.ShoulderLeft,
            JointType.ShoulderRight, JointType.Spine, JointType.WristLeft, JointType.WristRight
        };

        //... continue
    }

Key         Value
AnkleLeft   <Vt1, Vt2, Vt3, Vt4, ..>
AnkleRight  <Vt1, Vt2, Vt3, Vt4, ..>
ElbowLeft   <Vt1, Vt2, Vt3, Vt4, ..>

Page 32: 5   track kinect@Bicocca - gesture

    const int bufferLenght = 10;

    public void Recognize(JointCollection jointCollection, DateTime date) {
        timeList.Add(date);
        // Trim the timestamps too, so that timeList[i] stays aligned with the joint buffers.
        if (timeList.Count > bufferLenght) {
            timeList.RemoveAt(0);
        }
        foreach (JointType type in typesList) {
            skeletonSerie[type].Add(jointCollection[type]);
            // Keep only the most recent bufferLenght samples per joint.
            if (skeletonSerie[type].Count > bufferLenght) {
                skeletonSerie[type].RemoveAt(0);
            }
        }
        startRecognition();
    }

    List<Gesture> gesturesList = new List<Gesture>();

    private void startRecognition() {
        gesturesList.Clear();
        gesturesList.Add(HandOnHeadReconizerRT(JointType.HandLeft, JointType.ShoulderLeft));
        // Do ...
    }

Page 33: 5   track kinect@Bicocca - gesture

    Boolean isHOHRecognitionStarted;
    DateTime StartTimeHOH = DateTime.Now;

    // HandOnHeadMinimalDuration (milliseconds) is not shown on the slide; it must be defined elsewhere in the class.
    // Last() requires "using System.Linq;".
    private Gesture HandOnHeadReconizerRT(JointType hand, JointType shoulder) {
        // Correct position: the hand is more than 0.2 m above the shoulder
        if (skeletonSerie[hand].Last().Position.Y > skeletonSerie[shoulder].Last().Position.Y + 0.2f) {
            if (!isHOHRecognitionStarted) {
                // First frame in the pose: start timing
                isHOHRecognitionStarted = true;
                StartTimeHOH = timeList.Last();
            }
            else {
                double totalMilliseconds = (timeList.Last() - StartTimeHOH).TotalMilliseconds;
                // time ok?
                if (totalMilliseconds >= HandOnHeadMinimalDuration) {
                    isHOHRecognitionStarted = false;
                    return Gesture.HandOnHead;
                }
            }
        }
        else { // Incorrect position: reset
            if (isHOHRecognitionStarted) {
                isHOHRecognitionStarted = false;
            }
        }
        return Gesture.None;
    }

Alternative: count the number of occurrences
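
One reading of the "count occurrences" alternative is to count how many buffered frames satisfy the pose condition, instead of timing how long it holds. The sketch below assumes plain Y coordinates in meters rather than Kinect `Joint` objects; `OccurrenceCounter`, `IsHandOnHead`, and `minOccurrences` are hypothetical names for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class OccurrenceCounter
{
    // Counts the frames in which the hand is more than `offset` above the shoulder.
    // handY and shoulderY hold one Y coordinate (meters) per buffered frame.
    public static int CountHandAboveShoulder(IList<float> handY, IList<float> shoulderY, float offset)
    {
        if (handY.Count != shoulderY.Count)
            throw new ArgumentException("Both buffers must cover the same frames.");
        int count = 0;
        for (int i = 0; i < handY.Count; i++)
        {
            if (handY[i] > shoulderY[i] + offset) count++;
        }
        return count;
    }

    // Hypothetical acceptance rule: enough frames in the buffer match the pose.
    public static bool IsHandOnHead(IList<float> handY, IList<float> shoulderY, float offset, int minOccurrences)
    {
        return CountHandAboveShoulder(handY, shoulderY, offset) >= minOccurrences;
    }
}
```

Compared with the timer version, a counting rule like this tolerates an occasional noisy frame, because one bad sample lowers the count instead of resetting the recognition.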

Page 34: 5   track kinect@Bicocca - gesture

How to notify a gesture?

• Synchronous solution
  • Return gesturesList to the GUI
• Asynchronous solution
  • Use an event

    public delegate void HandOnHeadHadler(object sender, EventArgs e);
    public event HandOnHeadHadler HandOnHead;

    private Gesture HandOnHeadReconizerRTWithEvent(JointType hand, JointType shoulder) {
        Gesture g = HandOnHeadReconizerRT(hand, shoulder);
        if (g == Gesture.HandOnHead) {
            // Raise the event only if someone subscribed
            if (HandOnHead != null)
                HandOnHead(this, EventArgs.Empty);
        }
        return g;
    }

Page 36: 5   track kinect@Bicocca - gesture

    const float SwipeMinimalLength = 0.08f;
    const float SwipeMaximalHeight = 0.02f;
    const int SwipeMinimalDuration = 200;
    const int SwipeMaximalDuration = 1000;
    const int MinimalPeriodBetweenGestures = 0;

    // lastGestureDate is not declared on the slide; it is assumed to be a DateTime field of the class.
    private Gesture HorizzontalSwipeRecognizer(List<Joint> positionList) {
        int start = 0;
        for (int index = 0; index < positionList.Count - 1; index++) {
            // ∆x too small or ∆y too big: shift the start
            if ((Math.Abs(positionList[0].Position.Y - positionList[index].Position.Y) > SwipeMaximalHeight) ||
                Math.Abs(positionList[index].Position.X - positionList[index + 1].Position.X) < 0.01f) {
                start = index;
            }
            // ∆x > minimal length
            if (Math.Abs(positionList[index].Position.X - positionList[start].Position.X) > SwipeMinimalLength) {
                double totalMilliseconds = (timeList[index] - timeList[start]).TotalMilliseconds;
                // ∆t in the accepted range
                if (totalMilliseconds >= SwipeMinimalDuration && totalMilliseconds <= SwipeMaximalDuration) {
                    if (DateTime.Now.Subtract(lastGestureDate).TotalMilliseconds > MinimalPeriodBetweenGestures) {
                        lastGestureDate = DateTime.Now;
                        if (positionList[index].Position.X - positionList[start].Position.X < 0)
                            return Gesture.SwipeRightToLeft;
                        else
                            return Gesture.SwipeLeftToRight;
                    }
                }
            }
        }
        return Gesture.None;
    }

Page 37: 5   track kinect@Bicocca - gesture

    public delegate void SwipeHadler(object sender, GestureEventArgs e);
    public event SwipeHadler Swipe;

    private Gesture HorizzontalSwipeRecognizer(JointType jointType) {
        Gesture g = HorizzontalSwipeRecognizer(skeletonSerie[jointType]);
        switch (g) {
            case Gesture.None:
                break;
            case Gesture.SwipeLeftToRight:
                if (Swipe != null)
                    Swipe(this, new GestureEventArgs("SwipeLeftToRight"));
                break;
            case Gesture.SwipeRightToLeft:
                if (Swipe != null)
                    Swipe(this, new GestureEventArgs("SwipeRightToLeft"));
                break;
            default:
                break;
        }
        return g;
    }

    ...

    // Personalized EventArgs
    public class GestureEventArgs : EventArgs {
        public string text;
        public GestureEventArgs(string text) { this.text = text; }
    }

Page 38: 5   track kinect@Bicocca - gesture

Performance

• Skeleton processing is an expensive operation
• Use the VS2010 Performance Tool

Page 40: 5   track kinect@Bicocca - gesture

Pros & Cons

PROs
• Easy to understand
• Easy to implement (for simple gestures)
• Easy to debug

CONs
• Challenging to choose the best values for parameters
• Doesn’t scale well for variants of the same gesture
• Gets challenging for complex gestures
• Challenging to compensate for latency

Recommendation: use for simple gestures
• Hand wave
• Head movement

Page 42: 5   track kinect@Bicocca - gesture

Gesture Definition

Define gesture as weighted network
• Simple neural network
• Simple algorithmic gestures as input nodes
• Use fuzzy logic, i.e. probabilities, not Booleans

(Figure: input nodes HeadAboveBaseLine, LeftKneeAboveBaseLine and RightKneeAboveBaseLine, each connected by a weighted link (labeled 1, 2, 3) to the output node “Jump?”)

Page 43: 5   track kinect@Bicocca - gesture

Abstract Neuron

Inputs $x_1, x_2, \dots, x_n$ with weights $w_1, w_2, \dots, w_n$ feed an activation function $f$:

$$f\left(\sum_{i=1}^{n} w_i x_i\right)$$

Page 44: 5   track kinect@Bicocca - gesture

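Perceptron

• Simple network using weighted threshold elements

Inputs $P_1, P_2, \dots, P_n$ with weights $w_1, w_2, \dots, w_n$ are combined into a weighted sum, which the element compares against its threshold:

$$\sum_{i=1}^{n} w_i P_i$$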

Page 45: 5   track kinect@Bicocca - gesture

Example

HandAboveElbow AND HandInFrontOfShoulder

• Input node HandAboveElbow: compares Hand.y with Elbow.y
• Input node HandInFrontOfShoulder: compares Hand.z with Shoulder.z
• Both weights are 1; the threshold is 2

(HandAboveElbow * 1) + (HandInFrontOfShoulder * 1) >= 2

Page 46: 5   track kinect@Bicocca - gesture

Example

HandAboveElbow OR HandInFrontOfShoulder

• Input node HandAboveElbow: compares Hand.y with Elbow.y
• Input node HandInFrontOfShoulder: compares Hand.z with Shoulder.z
• Both weights are 1; the threshold is 1

(HandAboveElbow * 1) + (HandInFrontOfShoulder * 1) >= 1
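
The AND and OR examples can be reproduced with one threshold element that only differs in its threshold. A minimal sketch, assuming Boolean inputs encoded as 1 and 0; the `Perceptron` class and its method names are illustrative, not taken from the deck.

```csharp
using System;

public static class Perceptron
{
    // Fires when the weighted sum of the inputs reaches the threshold.
    public static bool Fire(double[] inputs, double[] weights, double threshold)
    {
        if (inputs.Length != weights.Length)
            throw new ArgumentException("inputs and weights must have the same length.");
        double sum = 0.0;
        for (int i = 0; i < inputs.Length; i++)
        {
            sum += inputs[i] * weights[i];
        }
        return sum >= threshold;
    }

    // AND: both weights are 1, threshold 2, so both inputs must be true.
    public static bool And(bool a, bool b)
    {
        return Fire(new[] { a ? 1.0 : 0.0, b ? 1.0 : 0.0 }, new[] { 1.0, 1.0 }, 2.0);
    }

    // OR: both weights are 1, threshold 1, so one true input is enough.
    public static bool Or(bool a, bool b)
    {
        return Fire(new[] { a ? 1.0 : 0.0, b ? 1.0 : 0.0 }, new[] { 1.0, 1.0 }, 1.0);
    }
}
```

For instance, `Perceptron.And(handAboveElbow, handInFrontOfShoulder)` mirrors the first example and `Perceptron.Or(...)` the second.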

Page 47: 5   track kinect@Bicocca - gesture

Network Definition for Detector

• Similar to perceptron
• Normalize using weights
• Use probabilities, not Booleans

$$\frac{\sum_{i=1}^{n} w_i P_i}{\sum_{i=1}^{n} w_i}$$

with input probabilities $P_1, P_2, \dots, P_n$ and weights $w_1, w_2, \dots, w_n$.

Page 48: 5   track kinect@Bicocca - gesture

Surely This Will Suffice?

• But due to noise, still many false positives
• How can we reduce false positives?

(Figure: “Jump?” node labeled 0.8, with the weighted inputs below)

Input node                  Weight
HeadAboveBaseLine           0.3
LeftKneeAboveBaseLine       0.1
RightKneeAboveBaseLine      0.1
LegsStraightPreviouslyBent  0.5
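
A node of this kind can be evaluated as a weighted average of its input probabilities. The sketch below is an assumption-laden illustration: it reads the diagram labels as weights 0.3, 0.1, 0.1 and 0.5 for the four inputs and treats 0.8 as the decision threshold, neither of which the transcript states outright; `WeightedNode` is a hypothetical name.

```csharp
using System;

public static class WeightedNode
{
    // Weighted average of input probabilities: sum(w_i * P_i) / sum(w_i).
    public static double Evaluate(double[] probabilities, double[] weights)
    {
        if (probabilities.Length != weights.Length)
            throw new ArgumentException("probabilities and weights must have the same length.");
        double weighted = 0.0;
        double total = 0.0;
        for (int i = 0; i < weights.Length; i++)
        {
            weighted += weights[i] * probabilities[i];
            total += weights[i];
        }
        if (total == 0.0) throw new ArgumentException("Weights must not sum to zero.");
        return weighted / total;
    }

    // Assumed decision rule: report the gesture when the node output reaches the threshold.
    public static bool Detect(double[] probabilities, double[] weights, double threshold)
    {
        return Evaluate(probabilities, weights) >= threshold;
    }
}
```

With weights `{ 0.3, 0.1, 0.1, 0.5 }`, head and knees fully above the baseline but no "legs straight, previously bent" evidence gives an output of about 0.5, which stays under an assumed threshold of 0.8.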

Page 49: 5   track kinect@Bicocca - gesture

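And We’re Done!

(Figure: the “Jump?” network extended with logic nodes)

• “Jump?” node (labeled 0.8) with inputs HeadAboveBaseLine (0.3), LeftKneeAboveBaseLine (0.1), RightKneeAboveBaseLine (0.1), LegsStraightPreviouslyBent (0.5)
• OR node (threshold 1, all input weights 1) over HeadBelowBaseLine, LeftKneeBelowBaseLine, RightKneeBelowBaseLine, LeftAnkleBelowBaseLine, RightAnkleBelowBaseLine, BodyFaceUpwards
• NOT node (threshold 0, input weight -1)
• AND node (threshold 2, input weights 1 and 1)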

Page 50: 5   track kinect@Bicocca - gesture

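But Wait, If We Know For Sure…

(Figure: the “Jump?” network with its OR, NOT and AND nodes)

• “Jump?” node (labeled 0.8) with inputs HeadAboveBaseLine (0.3), LeftKneeAboveBaseLine (0.1), RightKneeAboveBaseLine (0.1), LegsStraightPreviouslyBent (0.5)
• OR node (threshold 1, all input weights 1) over HeadBelowBaseLine, LeftKneeBelowBaseLine, RightKneeBelowBaseLine, LeftAnkleBelowBaseLine, RightAnkleBelowBaseLine, BodyFaceUpwards
• NOT node (threshold 0, input weight -1)
• AND node (threshold 2, input weights 1 and 1)
• Additional OR node (threshold 1, input weights 1 and 1) that takes HeadFarAboveBaseLine as an input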

Page 51: 5   track kinect@Bicocca - gesture


Implementation Overview

• Update height baseline values

• Update input nodes, i.e. algorithmic gestures

• Evaluate each node in network

• Calculate probability of gesture

Page 52: 5   track kinect@Bicocca - gesture

Pros

• Neural networks are well understood
  • Introduced in the 1940s
• A learning algorithm can be used to find the optimum
  • Parameters, weights, and thresholds
• Complex gestures can be detected
• Scales well for variants of the same gesture
• Nodes can be reused in different gestures
• Easy to visualize as a node graph
• Good CPU performance
  • 0.095 ms to execute the Jump Detector

Page 53: 5   track kinect@Bicocca - gesture

Cons

• Lots of parameters, weights, and thresholds
• Small changes can cause dramatic changes in results
• Very time consuming to choose manually
• Not easy to debug
• Is the code wrong, or are the parameters not optimal?
• Challenging to compensate for latency

Page 54: 5   track kinect@Bicocca - gesture

Recommendation

• Use for more complex gestures
  • Jump, duck, punch
• Break complex gestures into a collection of simple gestures
• Use a learning algorithm
• Debug visualization is essential

Page 56: 5   track kinect@Bicocca - gesture


Gesture Definition

• Define gesture as pre-recorded animations

• Motion capture animations

• Record different people doing same gesture

• Each person doing same gesture multiple times

Page 57: 5   track kinect@Bicocca - gesture


Exemplar

• Definition: ideal example to compare against

• Pre-recorded animations are exemplars

Page 58: 5   track kinect@Bicocca - gesture

Exemplar Matching

• Need to compare skeleton frames
• Define error metric for skeleton
• Angular difference for each joint in local space
• Peak Signal to Noise Ratio for whole skeleton

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{Distance}_i^{2}$$

$$\mathrm{PSNR} = 10 \cdot \log_{10}\left(\frac{\mathrm{MAX}^{2}}{\mathrm{MSE}}\right)$$
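
The error metric can be computed directly from per-joint distances between a live frame and an exemplar frame. This is a sketch under assumptions: the distances are taken as already computed (for example angular differences per joint), and `max` is an assumed parameter for the largest distance considered possible; `SkeletonError` is a hypothetical name.

```csharp
using System;

public static class SkeletonError
{
    // Mean squared error over the per-joint distances of one frame comparison.
    public static double Mse(double[] distances)
    {
        if (distances.Length == 0)
            throw new ArgumentException("At least one distance is required.");
        double sum = 0.0;
        foreach (double d in distances)
        {
            sum += d * d;
        }
        return sum / distances.Length;
    }

    // Peak signal-to-noise ratio in decibels: 10 * log10(max^2 / MSE).
    // A higher value means the frame matches the exemplar more closely.
    public static double Psnr(double[] distances, double max)
    {
        double mse = Mse(distances);
        if (mse == 0.0) return double.PositiveInfinity; // identical frames
        return 10.0 * Math.Log10(max * max / mse);
    }
}
```

For example, two joints that are each 0.1 away from the exemplar, with `max` set to 1.0, give an MSE of about 0.01 and a PSNR of about 20 dB.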

Page 59: 5   track kinect@Bicocca - gesture

Exemplar Matching

• Search for best matching frames
• Best matching frame has strongest signal
• Different classifiers can be used
  • K-Nearest
  • Dynamic Time Warping (DTW)
  • Hidden Markov Models (HMM)
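
To make the DTW option concrete, here is the textbook dynamic time warping distance on two one-dimensional sequences. It is a simplified sketch: a real exemplar matcher would compare whole skeleton frames with an error metric instead of the scalar `|a - b|` cost assumed here, and the `Dtw` class name is illustrative.

```csharp
using System;

public static class Dtw
{
    // Classic dynamic time warping distance between two sequences,
    // using |a - b| as the local cost of aligning two samples.
    public static double Distance(double[] a, double[] b)
    {
        int n = a.Length, m = b.Length;
        if (n == 0 || m == 0) throw new ArgumentException("Sequences must not be empty.");

        // cost[i, j] = cheapest alignment of the first i samples of a with the first j of b.
        var cost = new double[n + 1, m + 1];
        for (int i = 0; i <= n; i++)
            for (int j = 0; j <= m; j++)
                cost[i, j] = double.PositiveInfinity;
        cost[0, 0] = 0.0;

        for (int i = 1; i <= n; i++)
        {
            for (int j = 1; j <= m; j++)
            {
                double local = Math.Abs(a[i - 1] - b[j - 1]);
                // Extend the cheapest of: advance a only, advance b only, advance both.
                double best = Math.Min(cost[i - 1, j], Math.Min(cost[i, j - 1], cost[i - 1, j - 1]));
                cost[i, j] = local + best;
            }
        }
        return cost[n, m];
    }
}
```

Because samples may be repeated during alignment, a slow performance such as `{ 0, 0, 1, 1, 2, 2 }` matches the exemplar `{ 0, 1, 2 }` with distance 0, which is why DTW copes with the same gesture done at different speeds.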

Page 60: 5   track kinect@Bicocca - gesture

Exemplar Matching

(Figure: chart of PSNR values; vertical axis 0 to 25, horizontal axis 1 to 8)

Page 61: 5   track kinect@Bicocca - gesture


Pros

• Works well for context-sensitive gesture detection

• Works well for animation blending

• Very complex gestures can be detected

• DTW allows for different speeds

• Can compensate for latency

• Can scale for variants of same gesture

• Just need more resources

• Easy to visualize exemplar matching

Page 62: 5   track kinect@Bicocca - gesture

Cons

• Requires lots of resources to be robust
  • Multiple recordings of multiple people for one gesture
  • i.e. requires lots of CPU and memory
• K-Nearest
  • 1.5 ms for 16 exemplar matches
• DTW
  • 5 ms for 16 exemplar matches

Page 63: 5   track kinect@Bicocca - gesture

Example

• 10 gestures × 10 people × 5 times = 500 exemplars
• K-Nearest
  • 46 ms
• DTW
  • 156 ms
• Weighted network
  • 1 ms

(Figure: bar chart of execution time in ms, axis 0 to 180, for K-Nearest, DTW and WeightedNetwork)

Page 64: 5   track kinect@Bicocca - gesture

Recommendation

• Use for context-sensitive gesture detection
• Use for complex gestures
  • Dancing, fitness exercises
• Use when reducing latency is critical
• Optimize by reducing exemplar matches
  • Preprocess exemplar data with key frames
  • Use context of game
  • Use another fast method first
• Implement debug visualization

Page 66: 5   track kinect@Bicocca - gesture

Building Great Gesture Detection

Data Collection → Development → Testing

Page 67: 5   track kinect@Bicocca - gesture

Data Collection

Steps:
1. Identify Gestures
2. Record Gestures
3. Tag Gesture Recordings
4. Verify Gesture Tagging
5. Backup & Share

Notes:
• Example gestures: Jump, Punch
• Recording types: 1. Exemplar 2. Sequence of same gesture 3. General (actual game play)
• Record at least depth & skeleton
• Record varied people: old, young, male, female, overweight, handedness
• Metadata per recording; tag start/stop events for each gesture
• Use a custom tool, or export to Excel
• Someone other than the tagger should verify correctness

Page 68: 5   track kinect@Bicocca - gesture

Development

(Figure: development loop)

• Input: Tagged Gesture Recordings
• Preprocessing: Filter Joints, Normalize Skeleton
• Gesture Detector, configured by Parameters, Weights, Thresholds
• Result Verification, producing an Error
• Machine Learning Algorithm
• Debug Visualization
• Phase 1 – Exemplar Data, Phase 2 – Sequence Data, Phase 3 – General Data

Page 69: 5   track kinect@Bicocca - gesture

Testing

(Figure: testing loop)

• Input: Tagged Gesture Recordings
• Preprocessing: Filter Joints, Normalize Skeleton
• Gesture Detector, configured by Parameters, Weights, Thresholds
• Result Verification, producing an Error
• Live Camera Stream with Human Verification
• Feels robust? If no, go back to Data Collection

Page 70: 5   track kinect@Bicocca - gesture

Takeaways

• A system, not just a detector
  • Detector is a small component
  • Invest equally in other components
• Manage data
  • You’ll have lots of it!
  • Most valuable component
  • Tagging correctly is essential
• Collect real user data
