Human Action Recognition Using 3D Joint Information and Pyramidal HOOFD Features

MSc Thesis by Barış Can Üstündağ
Thesis Advisor: Prof. Dr. Mustafa Ünel

(Images will be added here)

Outline

Introduction to Human Action Recognition
    Motivation, Applications
    Related Work
Human Action Recognition Using 3D Joint Information and HOOFD Features
    Acquiring Depth Data
    Feature Extraction
        3D Joints
        HOOFD
    Feature Representation
    Classification
Experiments
    Datasets
        MSR Action 3D Dataset
        MSR Action Pairs Dataset
        MSRC-12 Gesture Dataset
Conclusions & Future Work

Motion Perception
Gunnar Johansson [1971]
Sequence of images for Human Motion Analysis
Moving Light Displays enable identification of people and gender

Motion Capture [2014]
Dawn of the Planet of the Apes

Motivation

Vast amount of Data

Motivation

Video Categorization
How many human-pixels are there?
Movies: 35%
TV: 34%
YouTube: 40%

Motivation

Rehabilitation

15M people suffer from stroke every year
Automated systems
Gamification

Motivation - Application

Release of Low-cost Depth Cameras
    Kinect (2010)
    Leap Motion (2013)
    Google Tango (developers only, 2014)
Effective and robust performance given
    Complex background
    Challenging viewpoints
    Occlusions

Motivation - Why depth?

Google Tango, Leap Motion

Related Work

Related Work

Intensity Based
Extraction of Cuboids, Dollar et al. [CVPR, 2005]
Motion History Images, Motion Energy Images, Gorelick et al. [PAMI, 2007]

Related Work

Depth Map Based
Histogram of Oriented 4D Normals (HON4D), Oreifej et al. [CVPR, 2013]
Depth Motion Maps, Yang et al. [JRTIP, 2012]

Related Work

Skeletal Data Based
Sequence of Most Informative Joints (SMIJ), Ofli et al. [CVIU, 2013]
View Invariant Human Action Recognition Using Histogram of 3D Joints, Xia et al. [CVPR, 2012]


Outline

Human Action Recognition Using 3D Joint Information and HOOFD Features
    Depth Acquisition
        Formation of shadows
        Eliminating the noise
    3D Joints
    HOOFD
        Signal Warping
        Pyramidal HOOFD Features
    Naive Bayes
    Support Vector Machines

Kinect
Depth data acquisition is accomplished using the Light Coding method

Before the depth data can be processed in any application, two issues must be handled:
    Formation of shadows
    Eliminating the noise

Shadows
Generated by the foreground objects

Noise
Rough object boundaries cause gaps and holes in the depth data

Bilateral Filter

Space term, Range term
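As a rough illustration of how the space term and the range term combine, here is a minimal, unoptimized sketch of a bilateral filter on a depth map. The function name, the Gaussian form of both terms, and the parameter values (`radius`, `sigma_space`, `sigma_range`) are assumptions made for this example, not the settings used in the thesis.

```python
import numpy as np

def bilateral_filter(depth, radius=2, sigma_space=2.0, sigma_range=30.0):
    """Naive bilateral filter for a 2D depth map (illustrative, not optimized)."""
    depth = np.asarray(depth, dtype=np.float64)
    height, width = depth.shape
    # Space term: Gaussian weight on pixel distance, identical for every window.
    offsets = np.arange(-radius, radius + 1)
    dy, dx = np.meshgrid(offsets, offsets, indexing="ij")
    space_weights = np.exp(-(dx**2 + dy**2) / (2.0 * sigma_space**2))
    # Replicate border pixels so every window has full size.
    padded = np.pad(depth, radius, mode="edge")
    filtered = np.empty_like(depth)
    for y in range(height):
        for x in range(width):
            window = padded[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            # Range term: Gaussian weight on depth difference to the centre pixel.
            range_weights = np.exp(-((window - depth[y, x]) ** 2) / (2.0 * sigma_range**2))
            weights = space_weights * range_weights
            filtered[y, x] = np.sum(weights * window) / np.sum(weights)
    return filtered
```

Because the range term suppresses neighbours whose depth differs strongly from the centre pixel, flat regions are smoothed while depth discontinuities at object boundaries are largely preserved.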

Joint Features
20 joints are provided by the Kinect SDK

10 joint angles and their derivatives are calculated
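A joint angle and its temporal derivative could be computed along the following lines. This is a sketch only: the slides do not list which 10 angles are used, so the shoulder/elbow/wrist triple in the usage note and the helper names `joint_angle` and `angle_derivative` are hypothetical.

```python
import numpy as np

def joint_angle(parent, joint, child):
    """Angle in radians at `joint` between the bones joint->parent and joint->child."""
    a = np.asarray(parent, dtype=float) - np.asarray(joint, dtype=float)
    b = np.asarray(child, dtype=float) - np.asarray(joint, dtype=float)
    cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.arccos(np.clip(cosine, -1.0, 1.0)))

def angle_derivative(angles):
    """Finite-difference derivative of an angle sequence, one value per frame."""
    return np.gradient(np.asarray(angles, dtype=float))
```

For example, `joint_angle(shoulder, elbow, wrist)` evaluated on every frame gives an elbow angle sequence, and `angle_derivative` of that sequence approximates its angular velocity in radians per frame.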

Joint Features
Mapped to spherical coordinates

Origin is aligned to the hip center
Radius parameter is discarded
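The mapping described above can be sketched as follows. The axis convention (azimuth measured in the x-y plane, inclination measured from the z axis) and the function name are assumptions for illustration; the Kinect SDK's own axes may call for a different convention.

```python
import numpy as np

def joints_to_spherical_angles(joints, hip_center):
    """Return (azimuth, inclination) per joint, with the origin at the hip center."""
    rel = np.asarray(joints, dtype=float) - np.asarray(hip_center, dtype=float)
    x, y, z = rel[:, 0], rel[:, 1], rel[:, 2]
    radius = np.linalg.norm(rel, axis=1)
    azimuth = np.arctan2(y, x)
    # Avoid dividing by zero for a joint that coincides with the hip center.
    safe_radius = np.where(radius > 0, radius, 1.0)
    inclination = np.arccos(np.clip(z / safe_radius, -1.0, 1.0))
    # The radius is deliberately not returned, which removes the dependence on limb length.
    return np.stack([azimuth, inclination], axis=1)
```

Discarding the radius means two subjects of different size performing the same pose produce the same angular features.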

Histogram of Oriented Optical Flows from Depth (HOOFD)

Optical Flow from Depth Data
Mapping of depth data to an intensity image
Depth values (z) are represented as intensity (I)
The resulting optical flow field is invariant to sudden changes in brightness
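One simple way to represent depth values as intensities is a linear mapping into the 8-bit range, sketched below. The working range (`near_mm`, `far_mm`) and the function name are hypothetical values chosen for the example, not figures from the thesis.

```python
import numpy as np

def depth_to_intensity(depth_mm, near_mm=800.0, far_mm=4000.0):
    """Linearly map depth in [near_mm, far_mm] to an 8-bit intensity image."""
    # Values outside the assumed working range are clipped to its limits.
    depth = np.clip(np.asarray(depth_mm, dtype=np.float64), near_mm, far_mm)
    scaled = (depth - near_mm) / (far_mm - near_mm)
    return np.round(255.0 * scaled).astype(np.uint8)
```

Since the intensity now encodes distance rather than reflected light, a change in scene illumination does not alter the image on which optical flow is computed.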

Optical Flow
2D displacement of pixel patches on the image plane

Brightness Constancy Equation

Linearize, assuming small (u, v), using a Taylor series expansion
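In the standard formulation, the brightness constancy equation and its first-order linearization can be written as below. The notation is assumed for this sketch: I(x, y, t) is the intensity (here, the mapped depth), (u, v) is the displacement between consecutive frames, and I_x, I_y, I_t are the partial derivatives of I.

```latex
% Brightness constancy: a point keeps its intensity as it moves by (u, v).
I(x + u,\; y + v,\; t + 1) = I(x, y, t)

% First-order Taylor expansion around (x, y, t), valid for small (u, v):
I(x + u,\; y + v,\; t + 1) \approx I(x, y, t) + I_x u + I_y v + I_t

% Combining the two gives the linearized constraint:
I_x u + I_y v + I_t = 0
```

This is a single equation in two unknowns per pixel, which is why an additional assumption (a local patch or global smoothness) is needed to solve for (u, v).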

Histogram of Oriented Optical Flows from Depth (HOOFD)

Brightness values of individual pixels on a local patch are preserved.

By linearizing the equation around I(x,y,t) using a Taylor series expansion, we obtain the second equation.

Optical Flow: Lucas-Kanade Method
Apply it within a local patch

Minimize using the least-squares method

Even though we assumed that the equation is equal to 0, in practice it is not.

We then discretize the equation, apply it within a local patch, and obtain this cost function.

Minimizing this function using least squares gives us the optical flow vectors.
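The least-squares step can be sketched for a single patch as follows. This is a minimal illustration under stated assumptions: every pixel in the patch is weighted equally, gradients come from `np.gradient`, and the function name is hypothetical. It estimates one flow vector for the whole patch.

```python
import numpy as np

def lucas_kanade_patch(prev_patch, next_patch):
    """Estimate one flow vector (u, v) for a patch by least squares."""
    prev_patch = np.asarray(prev_patch, dtype=np.float64)
    next_patch = np.asarray(next_patch, dtype=np.float64)
    # Spatial gradients (axis 0 is y, axis 1 is x) and the temporal difference.
    grad_y, grad_x = np.gradient(prev_patch)
    grad_t = next_patch - prev_patch
    # One linearized constraint per pixel: I_x * u + I_y * v = -I_t.
    A = np.stack([grad_x.ravel(), grad_y.ravel()], axis=1)
    b = -grad_t.ravel()
    # Least-squares solution of the overdetermined system A @ [u, v] = b.
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow
```

If all gradients in the patch point in the same direction, the system is rank deficient (the aperture problem) and `lstsq` returns the minimum-norm solution, so textured patches give more reliable estimates.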

Optical Flow: Horn-Schunck Method
Assumption: global smoothness in the flow over the whole image

Smoothness error:

Error in brightness constancy equation

Minimize:
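In the standard Horn-Schunck formulation these three quantities take the form below. The symbols E_s, E_c, and the weight alpha are names assumed for this sketch; I_x, I_y, I_t are the partial derivatives of the intensity and (u, v) is the flow.

```latex
% Smoothness error: squared magnitude of the flow gradient.
E_s^2 = \left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2
      + \left(\frac{\partial v}{\partial x}\right)^2 + \left(\frac{\partial v}{\partial y}\right)^2

% Error in the brightness constancy equation.
E_c = I_x u + I_y v + I_t

% Total energy to minimize over the whole image; \alpha weights the smoothness term.
E = \iint \left( E_c^2 + \alpha^2 E_s^2 \right) \, dx \, dy
```

A larger alpha favors smoother flow fields at the cost of fidelity to the brightness constancy constraint.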

However, the literature also offers another method, proposed by Horn and Schunck, which introduces a global smoothness constraint over the whole image.

This is a useful method for correcting errors that are caused by the gaps and holes in the depth data.

Smoothness is introduced by penalizing spatial variation in the velocities, i.e. the optical flow vectors.
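A compact sketch of the iterative Horn-Schunck update is given below. It is illustrative only: it uses a plain four-neighbour average in place of the weighted kernel from the original paper, and the function names and the values of `alpha` and `iterations` are assumptions for the example.

```python
import numpy as np

def neighbour_mean(field):
    """Average of the four direct neighbours, with replicated borders."""
    p = np.pad(field, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4.0

def horn_schunck(prev_img, next_img, alpha=1.0, iterations=100):
    """Dense optical flow (u, v) with a global smoothness constraint."""
    prev_img = np.asarray(prev_img, dtype=np.float64)
    next_img = np.asarray(next_img, dtype=np.float64)
    grad_y, grad_x = np.gradient(prev_img)
    grad_t = next_img - prev_img
    u = np.zeros_like(prev_img)
    v = np.zeros_like(prev_img)
    for _ in range(iterations):
        u_avg = neighbour_mean(u)
        v_avg = neighbour_mean(v)
        # Jacobi-style update: pull the local average toward the brightness constraint.
        residual = (grad_x * u_avg + grad_y * v_avg + grad_t) / (
            alpha**2 + grad_x**2 + grad_y**2
        )
        u = u_avg - grad_x * residual
        v = v_avg - grad_y * residual
    return u, v
```

Pixels with no usable gradient, such as holes in the depth map, receive their flow from neighbouring pixels through the averaging step, which is what makes the global constraint attractive for noisy depth data.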

Histogram of Oriented Optical Flow from Depth
Binning according to:
    Primary angle between the flow vector and the horizontal axis
    Magnitude of the flow vector

Orientation & Magnitude images

Histogram Binning example with bin size = 4
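The binning can be sketched as follows, assuming "bin size = 4" means four orientation bins. The sketch follows the common HOOF convention in which the primary angle lies in [-pi/2, pi/2], so that left- and right-pointing vectors with the same slope share a bin, and each vector contributes its magnitude to its bin. Whether the thesis uses exactly this convention is an assumption, and the function name is hypothetical.

```python
import numpy as np

def hoof_histogram(flow_u, flow_v, num_bins=4):
    """Magnitude-weighted histogram of primary flow angles, normalized to sum to 1."""
    u = np.asarray(flow_u, dtype=np.float64).ravel()
    v = np.asarray(flow_v, dtype=np.float64).ravel()
    magnitude = np.hypot(u, v)
    # Primary angle with the horizontal axis, in [-pi/2, pi/2].
    angle = np.arctan2(v, np.abs(u))
    bin_width = np.pi / num_bins
    bins = np.floor((angle + np.pi / 2) / bin_width).astype(int)
    # An angle of exactly pi/2 would index one past the end, so clip it into the last bin.
    bins = np.clip(bins, 0, num_bins - 1)
    hist = np.bincount(bins, weights=magnitude, minlength=num_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist
```

Normalizing the histogram makes the descriptor independent of how many pixels the patch contains.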

Signal Warping
If it is a longer action instance -> Discard frames
If it is a shorter action instance -> Replicate and insert frames
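One simple way to realize this warping is uniform index resampling, sketched below. The function name and the choice of evenly spaced indices are assumptions for illustration; the thesis may select frames differently.

```python
def warp_to_length(frames, target_length):
    """Resample a frame sequence to target_length by dropping or repeating frames."""
    if not frames or target_length <= 0:
        return []
    source_length = len(frames)
    # Evenly spaced source indices: some repeat when stretching a short sequence,
    # some are skipped when shrinking a long one.
    indices = [
        min(source_length - 1, (i * source_length) // target_length)
        for i in range(target_length)
    ]
    return [frames[i] for i in indices]
```

For example, `warp_to_length(list(range(6)), 3)` keeps every second frame, while `warp_to_length(list(range(3)), 6)` repeats each frame twice, so every action instance ends up with the same number of frames.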

Pyramidal HOOFD Features
Histogram of Oriented Optical Flow from Depth
After obtaining the optical flow fields:
1. Patches are extracted around each joint
2. HOOFDs are calculated in a pyramidal fashion

Level 1, Level 2, Level 3
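The two steps could be sketched as below for a single joint. The slides do not specify how each pyramid level subdivides the patch, so this example assumes a spatial pyramid with 1x1, 2x2, and 4x4 cells at levels 1 to 3. The patch size, the bin count, and both function names are likewise hypothetical.

```python
import numpy as np

def flow_histogram(u, v, num_bins=4):
    """Magnitude-weighted histogram of primary flow angles, normalized to sum to 1."""
    magnitude = np.hypot(u, v).ravel()
    angle = np.arctan2(v, np.abs(u)).ravel()
    bins = np.floor((angle + np.pi / 2) / (np.pi / num_bins)).astype(int)
    bins = np.clip(bins, 0, num_bins - 1)
    hist = np.bincount(bins, weights=magnitude, minlength=num_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist

def pyramidal_hoofd(flow_u, flow_v, joint_xy, patch_size=16, levels=3, num_bins=4):
    """Concatenate cell histograms from every pyramid level of a patch around one joint."""
    half = patch_size // 2
    x, y = joint_xy
    # Zero-pad so that patches near the image border keep their full size.
    padded_u = np.pad(np.asarray(flow_u, dtype=np.float64), half, mode="constant")
    padded_v = np.pad(np.asarray(flow_v, dtype=np.float64), half, mode="constant")
    patch_u = padded_u[y:y + patch_size, x:x + patch_size]
    patch_v = padded_v[y:y + patch_size, x:x + patch_size]
    features = []
    for level in range(levels):
        cells = 2 ** level  # 1x1, 2x2, 4x4 cells (assumed subdivision)
        rows_u = np.array_split(patch_u, cells, axis=0)
        rows_v = np.array_split(patch_v, cells, axis=0)
        for row_u, row_v in zip(rows_u, rows_v):
            cols_u = np.array_split(row_u, cells, axis=1)
            cols_v = np.array_split(row_v, cells, axis=1)
            for cell_u, cell_v in zip(cols_u, cols_v):
                features.append(flow_histogram(cell_u, cell_v, num_bins))
    return np.concatenate(features)
```

With these defaults each joint yields 4 x (1 + 4 + 16) = 84 values; the coarse level captures the overall motion around the joint and the finer levels capture where in the patch that motion occurs.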



Supervised learning methods
Training examples are attached to known classes
Example application: spam filtering in an e-mail client
Examples: Naive Bayes, Support Vector Machines

Naive Bayes Classifier
Independence assumption between features
For example: given a red car with 17-inch wheels, each of these features contributes independently to the probability that the car is a Volkswagen
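The independence assumption can be made concrete with a small Gaussian Naive Bayes sketch. Modelling each feature with a per-class Gaussian is an assumption made for this example; the slides do not state which likelihood model the thesis uses, and the class name is hypothetical.

```python
import numpy as np

class GaussianNaiveBayes:
    """Minimal Gaussian Naive Bayes classifier for illustration."""

    def fit(self, features, labels):
        features = np.asarray(features, dtype=np.float64)
        labels = np.asarray(labels)
        self.classes_ = np.unique(labels)
        self.log_priors_ = np.array([np.log(np.mean(labels == c)) for c in self.classes_])
        self.means_ = np.array([features[labels == c].mean(axis=0) for c in self.classes_])
        # A small constant keeps every variance strictly positive.
        self.variances_ = np.array(
            [features[labels == c].var(axis=0) for c in self.classes_]
        ) + 1e-9
        return self

    def predict(self, features):
        features = np.asarray(features, dtype=np.float64)
        diff = features[:, None, :] - self.means_[None, :, :]
        # Independence: the class log-likelihood is a sum of per-feature terms.
        log_likelihood = -0.5 * np.sum(
            np.log(2.0 * np.pi * self.variances_)[None] + diff**2 / self.variances_[None],
            axis=2,
        )
        scores = log_likelihood + self.log_priors_[None, :]
        return self.classes_[np.argmax(scores, axis=1)]
```

Each feature contributes its own additive term to the class score, which is exactly the sense in which colour and wheel size contribute independently in the car example.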

Support Vector Machines
Finds the optimal (maximum-margin) hyperplane that defines the decision boundary between two classes
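For intuition, a linear SVM can be trained by subgradient descent on the regularized hinge loss, as sketched below. This is a teaching sketch, not the solver used in the thesis; the function names and the hyperparameters (`reg`, `learning_rate`, `epochs`) are assumptions, and labels are expected to be -1 or +1.

```python
import numpy as np

def train_linear_svm(features, labels, reg=0.01, learning_rate=0.1, epochs=200):
    """Fit a hyperplane w.x + b = 0 by subgradient descent on the hinge loss."""
    X = np.asarray(features, dtype=np.float64)
    y = np.asarray(labels, dtype=np.float64)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        # Only samples inside the margin or misclassified contribute to the loss.
        violating = margins < 1.0
        grad_w = reg * w - (y[violating][:, None] * X[violating]).sum(axis=0) / len(y)
        grad_b = -y[violating].sum() / len(y)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def predict_linear_svm(w, b, features):
    """Return +1 or -1 depending on the side of the hyperplane."""
    scores = np.asarray(features, dtype=np.float64) @ w + b
    return np.where(scores >= 0, 1, -1)
```

The regularization term keeps the weight vector small, which corresponds to widening the margin between the two classes.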


Datasets
MSR Action 3D: 10 subjects, 20 actions
MSR Action Pairs: 10 subjects, 12 actions
MSRC-12 Gesture: 30 subjects, 12 actions

Experiments

Experiment - 1

Settings
Dataset: MSRC-12 Gesture
Feature: Joint Features
Ratio:
    Leave-one-subject-out cross-validation
    50% Training, 50% Test
    75% Training, 25% Test
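Leave-one-subject-out cross-validation holds out all sequences of one subject for testing and trains on the rest, repeating this for every subject. A minimal sketch of the split, with a hypothetical function name and a plain list of per-sequence subject IDs as input:

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test
```

Because no subject appears in both the training and the test set of a fold, the reported accuracy reflects generalization to unseen people rather than to unseen recordings of known people.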


Experiment - 2

Settings
Feature: HOOFD Features
Dataset: MSR Action 3D
Ratio: 50% Training, 50% Test


HON4D: To make the descriptor more discriminative, the authors quantize the 4D space using the vertices of a polychoron.
Dictionary Learning: group sparsity and geometric constraint with temporal pyramid matching.


Smash Action, Forward Punch Action

Experiment - 3

Settings
Feature: HOOFD Features
Dataset: MSR Action Pairs
Ratio: 50% Training, 50% Test

Conclusion & Future Work

We developed a novel human action recognition framework by fusing 3D joint information and HOOFD features.

We proposed a new feature called Histogram of Oriented Optical Flow from Depth (HOOFD)

Several experiments with publicly available datasets were conducted to assess the performance of the proposed technique.

Comparisons with state-of-the-art algorithms demonstrate the effectiveness of our algorithm.

As future work:

The potential of HOOFD will be explored further

Different popular classification approaches will be employed (Bag of Words, Random Forest, Boosted Trees)

Thank You

Questions?
