Gesture Recognition in a Class Room Environment
Michael Wallick
CS766
Virtual Videography
Place cameras in an environment
Automatically edit video off-line
Output should look like the work of a professional editor
Our Implementation
Looking at the classroom domain
Recorded one semester of CS559 (Computer Graphics)
Computer Vision in Virtual Videography
Understand what is happening on the chalkboard: writing on the board
Understand what the professor is doing: location, actions
Chalkboard…
Partition the board into regions
Regions are semantically related groups of writing
Regions can be approximated using computer vision
Let’s treat this as a black box … it just happens
Gesture Recognition
Understand gestures or actions by a performer
Generally used as an input to a computer
Understand what the professor is doing: pointing, writing, reaching
Writing can be confused with Pointing and Reaching
Template Matching for Gesture Recognition
Generate templates of known gestures
Match an unknown frame with a template matching algorithm: Sum of Squared Difference, Cross Correlation, Image Difference, …
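The three similarity measures named above can be sketched as follows. This is a minimal illustration, not the implementation used in the talk; it assumes the frame and template are NumPy arrays of the same shape, and the function names are hypothetical. The cross-correlation shown is the zero-mean normalized variant, which is an assumption, since the slide does not say which form was used.

```python
import numpy as np

def ssd(frame, template):
    """Sum of squared differences: a lower value means a closer match."""
    d = frame.astype(np.float64) - template.astype(np.float64)
    return float(np.sum(d * d))

def cross_correlation(frame, template):
    """Zero-mean normalized cross-correlation in [-1, 1]: higher is closer."""
    f = frame.astype(np.float64) - frame.mean()
    t = template.astype(np.float64) - template.mean()
    denom = np.sqrt(np.sum(f * f) * np.sum(t * t))
    # A constant image has zero variance; report "no correlation" in that case.
    return float(np.sum(f * t) / denom) if denom > 0 else 0.0

def image_difference(frame, template):
    """Sum of absolute pixel differences: a lower value means a closer match."""
    d = frame.astype(np.float64) - template.astype(np.float64)
    return float(np.sum(np.abs(d)))
```

Note that SSD and image difference are distances (smaller is better), while cross-correlation is a similarity (larger is better), so their scores cannot be compared directly.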
Implementation of Gesture Recognition
The user selects several template images: pointing, reaching
Format the templates
Separate the lecturer
Crop the image
Resize the images to 256x256
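The crop and resize steps might look like the sketch below. It assumes the lecturer has already been separated into a binary silhouette (nonzero pixels are "on"); how that separation is done is not covered here. The function name `format_silhouette` and the choice of nearest-neighbor resampling are assumptions made for illustration.

```python
import numpy as np

def format_silhouette(binary, size=256):
    """Crop a binary silhouette to its bounding box, then resize to size x size."""
    binary = np.asarray(binary) > 0
    rows = np.flatnonzero(binary.any(axis=1))  # rows containing an "on" pixel
    cols = np.flatnonzero(binary.any(axis=0))  # columns containing an "on" pixel
    if rows.size == 0:
        # Nothing was segmented: return an empty image of the target size.
        return np.zeros((size, size), dtype=np.uint8)
    crop = binary[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
    # Nearest-neighbor resampling: map each output index to a source index.
    r_idx = np.arange(size) * crop.shape[0] // size
    c_idx = np.arange(size) * crop.shape[1] // size
    return crop[np.ix_(r_idx, c_idx)].astype(np.uint8)
```

Resizing to a square stretches the silhouette and does not preserve its aspect ratio; every template and every test frame gets the same treatment, so the comparison stays consistent.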
Build the Recognition Mask
Load each template into the mask
For each “on” pixel, increment the mask at that location
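Building the mask amounts to counting, per pixel, how many templates are "on" there. A minimal sketch, assuming each template is a formatted binary image of identical shape (the name `build_mask` is hypothetical):

```python
import numpy as np

def build_mask(templates):
    """Accumulate binary templates into a per-pixel count mask."""
    mask = np.zeros(np.asarray(templates[0]).shape, dtype=np.int64)
    for t in templates:
        # Each "on" pixel increments the mask at that location by one.
        mask += (np.asarray(t) > 0).astype(np.int64)
    return mask
```

Pixels that are "on" in many templates end up with high values, so they carry more weight when a new frame is scored against the mask.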
Recognizing Gestures
Separate the lecturer from foreground
Crop and resize
For every “on” pixel, increment the “Score” by that value in the mask
Compute Confidence as (float) (Score/Mask_Total)
Compute Confidence for all gestures
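The scoring step can be sketched as below. The slide does not define Mask_Total; this sketch assumes it is the sum of all values in the gesture's mask, so a frame whose silhouette covers every nonzero mask pixel scores 1.0. The function name `confidence` is hypothetical.

```python
import numpy as np

def confidence(silhouette, mask):
    """Score a formatted binary frame against one gesture's recognition mask."""
    on = np.asarray(silhouette) > 0
    score = int(mask[on].sum())   # add the mask value at every "on" pixel
    total = int(mask.sum())       # assumed meaning of Mask_Total
    return score / total if total else 0.0
```

Calling this once per gesture mask gives the set of confidences that the matching rule then compares.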
A Gesture Matches if Confidence is:
Under 50% but much larger than other gestures
Over 50% and not too close to other gestures
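One way to turn this rule into code is sketched below. The slide gives no numbers for "much larger" or "not too close", so the margins `big_margin` and `small_margin` are made-up placeholder values, and the function name `pick_gesture` is hypothetical. A confidence of exactly 50% is treated as "under" here, which is also an assumption.

```python
def pick_gesture(confidences, threshold=0.5, small_margin=0.05, big_margin=0.20):
    """Return the matching gesture name, or None if no gesture matches.

    confidences maps gesture name -> confidence in [0, 1].
    """
    if not confidences:
        return None
    ranked = sorted(confidences.items(), key=lambda kv: kv[1], reverse=True)
    best_name, best = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    # Over the threshold a small lead is enough; under it, demand a large lead.
    margin = small_margin if best > threshold else big_margin
    return best_name if best - runner_up >= margin else None
```

For example, `pick_gesture({"pointing": 0.7, "reaching": 0.4})` returns `"pointing"`, while two low and nearly equal confidences return `None`.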
Example: Ground State
Example: Pointing
Example: Reaching
Mistakes
Overall the results are good
Sometimes individual frames are not correct
Solution
For each frame, look at surrounding frames
Label frame with gesture of the majority
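The majority-vote correction can be sketched as a sliding window over the per-frame labels. The window half-width `radius` is a hypothetical parameter, since the slides do not say how many surrounding frames are examined.

```python
from collections import Counter

def smooth_labels(labels, radius=2):
    """Relabel each frame with the most common label in its neighborhood."""
    smoothed = []
    for i in range(len(labels)):
        # The window is clipped at the start and end of the sequence.
        window = labels[max(0, i - radius): i + radius + 1]
        smoothed.append(Counter(window).most_common(1)[0][0])
    return smoothed
```

For instance, `smooth_labels(["ground", "ground", "pointing", "ground", "ground"], radius=1)` relabels the isolated `"pointing"` frame as `"ground"`. Ties are resolved in favor of the label that appears first in the window.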
Where to go from here…
Use the regions to validate the gestures and determine what is being pointed at
Incorporate the writing information with the gestures
Write paper and webpage!
Conclusions
We want to use gesture recognition for Virtual Videography
Gestures can be used to drive the camera model
Find gestures by template matching
For each frame, take the “average” around a region of frames to correct errors
Thank You!
Questions/ Comments?