research on sports training action recognition based on

Research ArticleResearch on Sports Training Action Recognition Based onDeep Learning

Peng Wang

College of Physical Education Zhengzhou University Zhengzhou 450000 China

Correspondence should be addressed to Peng Wang lytzznueducn

Received 28 April 2021 Revised 31 May 2021 Accepted 9 June 2021 Published 29 June 2021

Academic Editor Shah Nazir

Copyright copy 2021 PengWangis is an open access article distributed under the Creative Commons Attribution License whichpermits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

With the rapid development of science and technology in todayrsquos society various industries are pursuing information digitizationand intelligence and pattern recognition and computer vision are also constantly carrying out technological innovationComputer vision is to let computers cameras and other machines receive information like human beings analyze and processtheir semantic information andmake coping strategies As an important research direction in the field of computer vision humanmotion recognition has new solutions with the gradual rise of deep learning Human motion recognition technology has a highmarket value and it has broad application prospects in the fields of intelligent monitoring motion analysis human-computerinteraction and medical monitoring is paper mainly studies the recognition of sports training action based on deep learningalgorithm Experimental work has been carried out in order to show the validity of the proposed research

1 Introduction

In recent years humanmotion recognition has become a hotissue in the field of the application system and academicresearch As early as 1973 a psychologist named Johanssoncarried out the motion perception experiment of movinglight spot which is the first modern research on humanmotion recognition Since then until the 1990s peoplebegan to pay more attention to this field So far many re-searchers around the world have done a lot of research onhuman motion recognition technology e traditional re-search on humanmotion recognition can be divided into thefollowing parts representation of motion information andrecognition and classification of motion information

Computer vision is the field of artificial intelligencewhich is mostly used for creating systems to preparecomputer for understanding and eliminating issue withartificial images and sense [1] Human action recognition ishaving subtask of collective activity recognition for whichthe available datasets are commonly inadequate e studyhas been presented to look into the issues presenting thecollective sports dataset containing multitask recognition forsports and collective activity categories A novel protocol of

evaluation called unseen sports is presented in which thetraining and test are carried out on disjoint sets of sportscategories [2] e study proposed human action recogni-tion through deep multimodal feature fusion algorithm [3]e research fuses visual feature probability maps skeletonand audio signal into hybrid feature utilized for representinghuman action Categories of human and nonhuman areclassified through the use of convolutional neural network[4] Research has been done for action recognition ofswimming sports based on wireless sensor and field pro-grammable gate array [5]

In recent ten years deep learning has become a researchhotspot in the field of artificial intelligence and variousresearch results based on deep learning methods have beenapplied to practicee sudden boom of deep learning is notaccidental but the reward of decades of intensive work ofresearchers in this field From the 1940s to 1960s [6ndash9] therudiment of deep learning began to appear in cybernetics In1958 Rosenblatt designed neuron perceptron and relized thetraining of a single neuron In the 1990s the emergence ofthe backpropagation algorithm made it possible to trainneural networks with one or two hidden neurons It was notuntil 2006 that the concept of deep learning was formally

HindawiScientific ProgrammingVolume 2021 Article ID 3396878 8 pageshttpsdoiorg10115520213396878

established by Hinton et al which set off the third wave ofdeep learning

Following are the main contributions of the study

(i) To study the existing approaches in the context ofsports training recognition

(ii) To study the recognition of sports training actionbased on deep learning algorithm

(iii) Experimental work has been carried out in order toshow the validity of the proposed research

2 Human Motion Video Image and MotionInformation Representation

21 Motion History Image e motion history image wasfirst proposed by Davis and Bobrick Before that they firstproposed a binary motion energy image which is thepredecessor of the motion history image So let us take alook at the motion energy image e motion energy imagemainly describes how the object moves and space changes torecognize the moving object It can describe the outline ofthe object movement and the spatial distribution of theenergy [10ndash12]

As shown in Figure 1 we take the action of sitting downas an example e upper line is the keyframe of the actionand the next line shows the binary motion image accu-mulated from the start frame to the corresponding frameWe can observe that the blank area in the image is the targetmotion area By observing the shape of the moving area ofthe target the occurrence of the movement and the ob-servation angle is judged [13ndash15]

We call the accumulated binary motion image themotion energy graph as shown in the following equation

Eτ (x y t) 1113944τminus1

i0D(x y t minus i) (1)

where Eτ (x y t)is the binary motion energy imageD (x y t)is the frame difference between the t frame andthe tminus 1 frame and the motion energy image E (x y t)isthe cumulative sum of the frame differences

Although a motion energy map can reflect the spatialinformation of motion it cannot reflect its temporal in-formationerefore the motion image emerges as the timesrequirement based on the motion energy image By calcu-lating the pixel changes in the same position at a certaintime it presents the target motion in the form of imagebrightness is method belongs to the template methodbased on vision e gray value of each pixel in the motionhistory image shows the motion of position pixels in thevideo sequence If the last moving time of the pixel is closerto the current frame the higher the gray value Comparedwith the motion energy image it can not only show thesequence of action but also contain more details ereforethe motion history image can represent the movement of thehuman body in a movement process which makes it widelyused in the field of motion recognition Let s be the intensityvalue of pixels in the motion history map and Sτ (x y t)

is the update function

Sτ(x y t) τ if Ψ(x y t) 1

max 0 Sτ(x y t minus 1) minus δ( 1113857 otherwise1113896

(2)

where (x y) represents the position of the pixel and t is thetime t is the duration which determines the time range ofmotion from the angle of frame number 8 is the attenuationparameter e update function S(x y t) can be defined byoptical flow interframe difference or image difference andthe interframe difference method is the most commonlyused Its application is shown in formulas (3) and (4)

Ψ(x y t) 1 if D(x y t)ge ξ

0 otherwise1113896 (3)

where

D (x y t) |I(x y t) minus I(x y t plusmn Δ)| (4)

where I(x y t) is the intensity value of the coordinate (x y)pixel in the T frame of the video image sequence Δ is theinterframe distance and ξis a difference threshold given by ahuman which can be adjusted with the change of videoscene

Figure 2 shows the effect pictures of motion historyimages corresponding to different T values It can be seenfrom Figures 2(a) and 2(b) that when the value of T is toosmall the whole motion trajectory of the action cannot beobtained completely As shown in Figure 2(d) when thevalue of R is too large the change of the intensity value of themotion track in the captured motion history map is notobvious which leads to the loss of the information of theaction time dimension We cannot distinguish that the valueof t must be considered in the motion history map obtainedbecause the value is too small As for the differencethreshold if the value is too small the acquired motionhistory map will exhibit a lot of messy noise As shown inFigure 2(e) the obtained image cannot distinguish theforeground from the background well if the value is toolarge the area with a smaller pixel intensity value willdisappear and empty holes will appear resulting in loss ofaction information With the increase of the value the voidarea will be larger and larger until the final motion historyimage only contains the contour edge rough the ex-periment the optimal value is t= 50 ξ 40 which canobtain the most sufficient and effective motion trajectoryinformation

22 Rainbow Coding e pseudocolor processing of theimage can transform the image information into a form thatis easier to recognize by humans or machines and enhancethe useful information in the image Pseudocolor processingrefers to the technical process of converting a black andwhite gray image or multiband image into a color toneimagee commonly used pseudocolor codingmethods aredensity segmentation filtering and gray level colortransformation

e density segmentation method is mainly used to dealwith the image with discontinuous hue which is the simplest

2 Scientific Programming

method of pseudocolor enhancement It divides the graylevel of a gray image from 0 to 255 intom intervals g I= 1 2 M and then assigns a specific color C to each interval acolor image is obtained from a gray image However thedisadvantage of this method is that the change of hue is notcontinuous and the image has obvious blocks and thenumber of colors is not rich

e filtering method is a method based on the frequencydomain It does not rely on the gray level of the image togenerate pseudocolor but is determined by the differentspatial frequencies of the gray image As shown in Figure 3

the gray image is first transformed into the frequency do-main by Fourier transform and then it is separated intothree independent variables by using three filters with dif-ferent characteristics in the frequency domain en threesingle-channel images with different frequency componentsare obtained by inverse Fourier transform of these threevariables en they are processed such as histogramequalization Finally we synthesize our pseudocolor imagesas RGB tricolor components

ere are many color transformation methods based ongray levels such as gray mapping rainbow coding and so

Frame 1 Frame 13 Frame 20 Frame 30 Frame 40

Figure 1 Keyframe and motion energy diagram of action ldquosit downrdquo

(a) (b) (c)

(d) (e)

Figure 2 Motion history of different parameters (a) τ = 10 ξ = 40 (b) τ = 20 ξ = 40 (c) τ = 50 ξ = 40 (d) τ = 100 ξ = 40 (e) τ = 50 ξ = 10

Scientific Programming 3

on But the central idea is based on the principle of coloraccording to different coding formulas the gray value of theimage is generated into three-channel values of red greenand blue and then the color is synthesized In RGB colorspace any color can be composed of red green and blue indifferent proportions erefore what we need to set is thetransformation function of the three color channels ecolor matching equation is shown in formulas (5)ndash(7)

R(x y) TR f(x y)1113864 1113865 (5)

G(x y) TG f(x y)1113864 1113865 (6)

B(x y) TB f(x y)1113864 1113865 (7)

where R(x y) G(x y) and B(x y) are the values of redgreen and blue respectively f (x y) is the gray value of (x y)points on the gray image and TR TG and TB are thecorresponding mapping functions e pseudocolor imagewe need can be obtained by driving the color display with thethree-channel values It can be seen that the red green andblue mapping functions are very important which deter-mine the quality of pseudocolor after transformation Dif-ferent mapping functions will result in different pseudocolorimages

23 Improved Motion History Image It is not effective toextract motion history images from RGB video and send it tothe network for training In this paper we propose a humanmotion recognition method based on the improved motionhistory image mainly from the following aspects

231 Removing Redundant Motion Sequences In the ex-periment it is found that the performer usually has a re-action time of about 1 second when the action executioncommand is issued Similarly after the execution of theaction there is a period of static time which means thatthere are useless still frames at the beginning and end of thedataset video ese frames contain useless redundant in-formation and even cover the important information ofkeyframes which directly affects the quality of the extractedmotion history and then the image erefore beforeextracting the motion history image from the video the first

step is to remove 10 frames of each video and then themotion history image is obtained

232 Applying Rainbow Coding According to Abidi et al inthe report better perceptual quality and more informationcan be obtained by encoding gray texture with humanperceptible color Inspired by this in this paper we use therainbow coding to enhance the motion model of the motionimage e larger the gray value the closer it is to redotherwise the smaller the gray value the closer it is to bluee motion history image encoded by the rainbow has a richcolor e distribution of color reflects the level of motionand the information of the time dimension which can moreeffectively represent the motion information of action

3 Overview of Deep Learning

31 Deep Learning Method Deep learning is a method oflearning data representation in machine learning roughlearning multilevel combination we can get the recognizablefeature representation and finally map the feature repre-sentation to the task target As a kind of machine learningdeep learning is superior to machine learning in that it canautomatically learn the feature representation of data Asshown in Figure 4 deep learning avoids the trouble ofmanual feature design in machine learning in the traditionalmachine learning process data are extracted from input tomanual feature extraction and then the extracted featuresare mapped to learning objectives Deep learning simplifiesthis process Using the end-to-end idea the deep learningmodel can directly convert the input to the output and theprocess of feature extraction and feature mapping to thetarget output is automatically completed by the modelwhich eliminates many complicated intermediate processesin traditional machine learning

Like other machine learning methods the essence ofdeep learning is to use algorithms to learn knowledge from alarge number of data but it is called ldquodepthrdquo On the onehand the depth of the deep learning model is the stack ofmultiple layers of modules and the number of layers is largee data from input to target output need multilayertransformation and the model is deeper on the other handthe feature extraction of deep learning is a process of ab-straction and fusion from generalization features to semanticfeatures e shallow features are some basic patterns the

Grayscaleimage

Fouriertransform

Filter 1

Filter 2

Filter 3

Fourier transform

Fourier transform

Fourier transform

Histogramequalization



Color display

Figure 3 Flowchart of the filtering method


middle-level features begin to have some fuzzy semanticsand the deep-seated features are recognizable semanticfeatures which can be mapped to the target output thedeep-seated features have the following features the obviousprogressive process from shallow to deep and what themodel finally learns is this deep feature representationmethod

Since the development of deep learning it includesmathematical analysis linear algebra probability theorymathematical statistics optimization theory and numericalcalculation It also includes regularization methods such asrandom deactivation and batch standardization to ensurethe generalization performance of themodel model learningmethod combining backpropagation and random gradientdescent and distributed representation strategy in the ab-sorption representation learning field ese have injectedstrong vitality into the deep learning method

311 Forward Propagation and Backward PropagationIn deep learning feedforward neural networks have a for-ward propagation process When the input informationpasses through a layer of hidden units after layer by layerconversion the final output is generated Such a flow ofinformation is called forward propagation forward propa-gation can be regarded as a process of network input pro-cessing When the network is training the input data flowthrough the network and produce output and calculate theloss function with the target output Combined with thebackpropagation algorithm the network is updated whenthe network weight parameters are fixed the networktraining has been completed and the forward propagation isa prediction process and the final prediction results areobtained by the forward propagation of the input

In contrast to forwarding propagation backpropagationis a process in which the information of cost function flowsforward through the network and calculates the gradiente nonlinearity of the deep learning model makes thelearning of the model nonconvex optimization Generallythe gradient-based method is used to iterate the trainingmodel so that the cost function of the model converges tothe minimum value e backpropagation algorithm pointsout a way to calculate the gradient e gradient points outthe optimization direction and then combines the stochasticgradient descent algorithm to update the weight parametersof the model e core idea of backpropagation is to re-cursively calculate the gradient of the cost function

concerning hidden layer output and weight using chain ruleFirstly the gradient of the cost function about the output ofthe last hidden layer and the gradient of the output of the lastlayer concerning the weight parameters are calculated andthen the gradient of the cost function concerning the weightparameters of this layer can be obtained by using the chainrule and then the gradient of the output of this layerconcerning the input is calculated and multiplied

Based on the gradient of the previous cost function onthe output of this layer the derivative of the cost functionconcerning the output of the penultimate hidden layer isobtained and so on until the hidden layer of the lowestlayer to obtain the gradient of the cost function concerningthe weight parameters of all layers

For the deep learning model with L-layer hidden layerthe weight parameter of each layer is w and the input of eachlayer is X To calculate simply it is assumed that the outputof the former layer is the input of the latter layer X is theoutput of the last hidden layer and the cost function is J

Calculate (zJzXL+1) and keep it as the next operationFor each layer l l L L minus 1 1 the calculation processis as follows

calculatezXi+1

zWi

andzJ

zXi+1 then

zJ

zWi

zJ

zXi+1

zXi+1

zWi

calculatezXi+1

zXi

thenzJ

zXi

zJ

zXi+1

zXi+1

zXi

reservezJ

zXi

for the next operation

(8)

In the above operation process the propagation processis simplified In practice in addition to the gradient of theweight parameters if there are bias terms and regular termsthe gradient of the cost function concerning bias and reg-ularity also needs to be calculated Moreover before thehidden layer output there is generally an activation processand the gradient of the cost function on the layer outputneeds to be converted to the gradient before the activation

e backpropagation algorithm is not only used tocalculate the gradient of cost function about parameters butalso used to calculate the gradient of other outputs on pa-rameters to analyze the model e backpropagation algo-rithm can be used to calculate the gradient of any functionwhich is a very practical method to calculate the gradient

Machinelearning

Deeplearning

Data input

Data input

Manual featuredesign

Deep feature design

Target mapping

Target mapping

Target output

Target output

Figure 4 Machine learning method and deep learning method flow


e backpropagation algorithm combined with randomgradient descent has always been the most commonly usedlearning method for deep learning model law

312 Distributed Representation When it comes to deeplearning we have to mention distributed representation Asa kind of representation learning deep learning is unique inthat it can automatically learn the distributed feature rep-resentation of data according to different learning tasks asan important tool of representation learning distributedrepresentation is a representation of concepts expressed by acombination of multiple separated features

In the deep learning model neural network neuron tothe semantic concept is many-to-many mapping e se-mantic concept may be represented by activation patternsdistributed in different neurons and a neuron can partici-pate in the representation of different semantic concepts Forexample the semantic concept ldquocatrdquo can be represented bythe combination of features such as ldquoearrdquo ldquofour legsrdquo andldquofurrdquo In convolutional neural networks these features arethe activation modes of several convolutional neurons andthe feature of ldquofurrdquo can also be a local feature of the semanticconcept ldquodogrdquo ldquoleopardrdquo or ldquotigerrdquo e neurons thatgenerate this activation mode can also participate in therepresentation of these semantic concepts the advantage ofthis feature is that fewer learning samples can be used toachieve the same as nondistributed tables

It shows the same learning effect For example for inputsamples such as ldquowhite catrdquo ldquoblack catrdquo ldquowhite dogrdquo andldquoblack dogrdquo when distributed representation is not appliedfour separate neurons are needed to learn the concepts ofcolor and category at the same time After using distributedrepresentation only two kinds of neurons are needed one isused to describe categories one is used to describe colorsand the other is to describe colors Color neurons can learncolor concepts from input samples of ldquocatrdquo and ldquodogrdquo in-stead of using specific neurons to learn from specifiedsamples

4 Analysis and Recognition of Sports Video inthe Process of Sports Training

41 Adaptive 7reshold Moving Object Separation Based onParticle Filter Prediction e separation of moving objectsin sports training videos can collect and process the movingobjects from the dynamic background which is the basis ofsports video analysis Sports training video sequence sepa-ration in the sports target is the athlete in the video theadaptive threshold moving object separation algorithmbased on particle filter prediction is used to enhance theaccuracy of moving object acquisitione specific process isas follows firstly the foreground image in the video isseparated by the three-frame difference method and thebackground is projected into adjacent video framesaccording to the camera steady motion model and thebackground separation map of each frame is obtained emethod of background subtraction is used to further sep-arate moving objects Because of the similarity between the

foreground image and the background image to avoid theseparation of the moving objects in the video foregroundimage mistakenly fused into the background image it isnecessary to separate the coordinate range of the foregroundimage and obtain the frame threshold of the backgroundimage which is not in the coordinate range of the foregroundimage according to the particle filter method to completethe adaptive threshold separation of the moving object

e foreground target obtained by the three-framedifference method is interfered with by noise which will leadto false separation After filtering it can be used as theoperation standard of adaptive threshold separation enthrough the prediction scheme of the particle filter theforeground coordinate interval of other frames is predictedaccording to the result of frame separation at this time eoffset state between the image pixel and the foregroundcoordinate interval is added which is set as the foregroundseparation probability of the pixel in other frames Based onthis probability the adaptive separation threshold can becalculated

(1) e probability of separating background points bythe three-frame difference method is as follows

PLj

01 the pixel belongs to the foreground

05 the background of the pixel is not clear

09 the pixel belongs to the background

⎧⎪⎪⎨

⎪⎪⎩

(9)

e probability of pixel separation into the back-ground can be obtained by formula (9) For pixelsadjacent to the image boundary it is not clear thatthey are the background so a median value of 05needs to be set e mean filtering method can re-strain the bad invasion of noise After filtering thecollected image pixels with a 3times 3 filter the newbackground separation probability is obtained asfollows

Pf(i j) 19

1113944

1

mminus11113944

1

nminus1P(i + m j + m) (10)

(2) e probability of background points is obtained byparticle filter From the schematic diagram of particlefilter prediction described in Figure 5 it can be seenthat the particle set composed of weighted particlescan be regarded as the foreground range of themoving target and the prediction of foregroundrange is completed e particle types in the particleset are vectors (including the x-coordinate and y-coordinate of the upper left corner of the movingobject the x-coordinate and y-coordinate of thelower right-hand corner of the moving object andthe horizontal and vertical movement speed of theupper left-hand corner and the lower right-handcorner of the moving target) After separating theforeground coordinate interval the actual weights ofvarious types of particles are calculated according tothe foreground coordinate interval and the particle


samples are extracted the particles with higherweight are more likely to be output as samples enew particle set is formed by sampling the particlesand then the foreground range of subsequent framesis predicted to obtain the probability of differentpixels as background in the subsequent frames If themoving object pixel in the moving video falls in thebackground probability area the point belongs to thebackground otherwise it belongs to the movingforeground and finally the moving object separa-tion is completed

42MovingObjectTracking inMovingVideo e purpose ofsports target separation in sports training video is to ac-curately track the sports target and the purpose of trackingis to collect the motion parameters of athletesrsquo body joints

from the sports training video e adaptive particle filteralgorithm is used to track the sports video in the process ofsports training and the human skeleton model is created asshown in Figure 6

e human motion model can predict the immediatemotion state according to the motion state of the previousframes In the process of sports training the sports trend andthe number of sports have regularity

5 Conclusion

e key to the success of human motion recognition is tocapture the spatiotemporal motion patterns of all parts ofthe human body at the same time In this paper a deeplearning method based on the motion characteristics oflocal joints is proposed to recognize the motion samplesen the effectiveness of the method is evaluated on twodatasets Aiming at the shortcomings of the model theaverage error of the model is reduced and the spatialconfiguration information is introduced to improve therecognition accuracy of the model e work of this paperis summarized as follows after the in-depth investigationon the related tasks of motion recognition the researchprospect and application significance of the deep learningmethod are pointed out and the recent research statusbased on the deep learning method is introducedaccording to the data mode used which provides the basisfor the later work of action recognition based on localmotion features

Data Availability

e data used to support the findings of this study are in-cluded within the article

Conflicts of Interest

e author declares that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

is study was supported by the Outstanding YoungTeachers Project in Colleges and Universities of HenanProvince (no 2017GGJS164)

Previous framesegmentation result Particle initialization

Get the predictiontarget value of each

particle

Calculate the weightof each particle Resampling

Predict the probability that the pixels in thenext frame belong to the background

Figure 5 Schematic diagram of particle filter prediction

L11

J16

J6

J5

J4

J3

J2

J1

J16

J9

L8

L7

L15

J7

J8

J16

J16

L5L9 L12

L4

L3 L6

J16 J13

J11

J12 J15

J14

L14

L13

Figure 6 Two-dimensional nodes of human skeleton model


References

[1] X Xie ldquoImage recognition of sports training based on openIoT and embedded wearable devicesrdquo Microprocessors andMicrosystems vol 82 Article ID 103914 2021

[2] C Zalluhoglu and N Ikizler-Cinbis ldquoCollective sports Amulti-task dataset for collective activity recognitionrdquo Imageand Vision Computing vol 94 Article ID 103870 2020

[3] E Zhou and H Zhang ldquoHuman action recognition towardmassive-scale sport sceneries based on deep multi-modelfeature fusionrdquo Signal Processing Image Communicationvol 84 Article ID 115802 2020

[4] W Yang Y Gao and F Zhai ldquoSimulation of sports actionpicture recognition based on FPGA and convolutional neuralnetworkrdquo Microprocessors and Microsystems vol 80 ArticleID 103593 2021

[5] T Liang ldquoSwimming sports action recognition based onwireless sensor and FPGArdquo Microprocessors and Micro-systems Article ID 103433 2020 In press

[6] C Gong H Wang R Li et al ldquoVideo moving object tra-jectory tracking algorithm based on state-dependent detec-tionrdquo Modern Electronic Technology vol 39 no 7 pp 51ndash562016

[7] H Song F Wang X Liu et al ldquoGroup motion analysis andabnormal behavior detection in a geographical environmentrdquoGeography and Geographic Information Science vol 31 no 4pp 1ndash5 2015

[8] M Zhu ldquoResearch and application of a three-axis accelerationsensor system in the field of sports technology analysisrdquo Iceand Snow Sports vol 37 no 2 pp 89ndash96 2015

[9] F Wei ldquoLoose design of key parts of childrenrsquos trousers undersports conditionrdquo Xirsquoan University of Technology Acta Sinicavol 29 no 5 pp 550ndash554 2015

[10] H Guo ldquoTable tennis decision-making system based on dual-channel target motion detectionrdquo Journal of Guangzhou In-stitute of Physical Education vol 34 no 6 pp 67ndash70 2014

[11] D Sui and D Hou ldquoResearch and simulation of dynamicimage tracking optimization algorithmrdquo Modern ElectronicTechnology vol 39 no 6 pp 98ndash100 2016

[12] B Li F Liu D Li et al ldquoAbnormal behavior detection andrecognition of moving objects in campus intelligent videosurveillancerdquo Software Guide vol 15 no 2 pp 134ndash136 2016

[13] M Wu and L Lin ldquoResearch on video recognition andcomparison system of competitive sportsrdquo Journal ofGuangzhou Institute of Physical Education vol 34 no 4pp 59ndash61 2014

[14] C Li and X Lian ldquoResearch on video analysis and intelligentdiagnosis system of national traditional sports antagonismrdquoJournal of Guangzhou Institute of Physical Education vol 36no 1 pp 52ndash56 2016

[15] W Wei K Wu L Guo et al ldquoVideo tracking of gradientexpansion template motion of memory watershed disc smallrdquoJournal of System Simulation vol 28 no 2 pp 462ndash466 2016


established by Hinton et al which set off the third wave ofdeep learning

Following are the main contributions of the study

(i) To study the existing approaches in the context ofsports training recognition

(ii) To study the recognition of sports training actionbased on deep learning algorithm

(iii) Experimental work has been carried out in order toshow the validity of the proposed research

2 Human Motion Video Image and MotionInformation Representation

21 Motion History Image e motion history image wasfirst proposed by Davis and Bobrick Before that they firstproposed a binary motion energy image which is thepredecessor of the motion history image So let us take alook at the motion energy image e motion energy imagemainly describes how the object moves and space changes torecognize the moving object It can describe the outline ofthe object movement and the spatial distribution of theenergy [10ndash12]

As shown in Figure 1 we take the action of sitting downas an example e upper line is the keyframe of the actionand the next line shows the binary motion image accu-mulated from the start frame to the corresponding frameWe can observe that the blank area in the image is the targetmotion area By observing the shape of the moving area ofthe target the occurrence of the movement and the ob-servation angle is judged [13ndash15]

We call the accumulated binary motion image themotion energy graph as shown in the following equation

Eτ (x y t) 1113944τminus1

i0D(x y t minus i) (1)

where Eτ (x y t)is the binary motion energy imageD (x y t)is the frame difference between the t frame andthe tminus 1 frame and the motion energy image E (x y t)isthe cumulative sum of the frame differences

Although a motion energy map can reflect the spatialinformation of motion it cannot reflect its temporal in-formationerefore the motion image emerges as the timesrequirement based on the motion energy image By calcu-lating the pixel changes in the same position at a certaintime it presents the target motion in the form of imagebrightness is method belongs to the template methodbased on vision e gray value of each pixel in the motionhistory image shows the motion of position pixels in thevideo sequence If the last moving time of the pixel is closerto the current frame the higher the gray value Comparedwith the motion energy image it can not only show thesequence of action but also contain more details ereforethe motion history image can represent the movement of thehuman body in a movement process which makes it widelyused in the field of motion recognition Let s be the intensityvalue of pixels in the motion history map and Sτ (x y t)

is the update function

Sτ(x y t) τ if Ψ(x y t) 1

max 0 Sτ(x y t minus 1) minus δ( 1113857 otherwise1113896

(2)

where (x y) represents the position of the pixel and t is thetime t is the duration which determines the time range ofmotion from the angle of frame number 8 is the attenuationparameter e update function S(x y t) can be defined byoptical flow interframe difference or image difference andthe interframe difference method is the most commonlyused Its application is shown in formulas (3) and (4)

Ψ(x y t) 1 if D(x y t)ge ξ

0 otherwise1113896 (3)

where

D (x y t) |I(x y t) minus I(x y t plusmn Δ)| (4)

where I(x y t) is the intensity value of the coordinate (x y)pixel in the T frame of the video image sequence Δ is theinterframe distance and ξis a difference threshold given by ahuman which can be adjusted with the change of videoscene

Figure 2 shows the effect pictures of motion historyimages corresponding to different T values It can be seenfrom Figures 2(a) and 2(b) that when the value of T is toosmall the whole motion trajectory of the action cannot beobtained completely As shown in Figure 2(d) when thevalue of R is too large the change of the intensity value of themotion track in the captured motion history map is notobvious which leads to the loss of the information of theaction time dimension We cannot distinguish that the valueof t must be considered in the motion history map obtainedbecause the value is too small As for the differencethreshold if the value is too small the acquired motionhistory map will exhibit a lot of messy noise As shown inFigure 2(e) the obtained image cannot distinguish theforeground from the background well if the value is toolarge the area with a smaller pixel intensity value willdisappear and empty holes will appear resulting in loss ofaction information With the increase of the value the voidarea will be larger and larger until the final motion historyimage only contains the contour edge rough the ex-periment the optimal value is t= 50 ξ 40 which canobtain the most sufficient and effective motion trajectoryinformation

22 Rainbow Coding e pseudocolor processing of theimage can transform the image information into a form thatis easier to recognize by humans or machines and enhancethe useful information in the image Pseudocolor processingrefers to the technical process of converting a black andwhite gray image or multiband image into a color toneimagee commonly used pseudocolor codingmethods aredensity segmentation filtering and gray level colortransformation

e density segmentation method is mainly used to dealwith the image with discontinuous hue which is the simplest








(a) (b) (c)

(d) (e)




R(x y) TR f(x y)1113864 1113865 (5)

G(x y) TG f(x y)1113864 1113865 (6)

B(x y) TB f(x y)1113864 1113865 (7)









Grayscaleimage

Fouriertransform

Filter 1

Filter 2

Filter 3

Fourier transform

Fourier transform

Fourier transform




Color display











calculatezXi+1

zWi

andzJ

zXi+1 then

zJ

zWi

zJ

zXi+1

zXi+1

zWi

calculatezXi+1

zXi

thenzJ

zXi

zJ

zXi+1

zXi+1

zXi

reservezJ

zXi


(8)



Machinelearning

Deeplearning

Data input

Data input


Deep feature design

Target mapping

Target mapping

Target output

Target output












PLj




⎧⎪⎪⎨

⎪⎪⎩

(9)


Pf(i j) 19

1113944

1

mminus11113944

1








5 Conclusion


Data Availability




Acknowledgments




particle




L11

J16

J6

J5

J4

J3

J2

J1

J16

J9

L8

L7

L15

J7

J8

J16

J16

L5L9 L12

L4

L3 L6

J16 J13

J11

J12 J15

J14

L14

L13



References























(a) (b) (c)

(d) (e)




R(x y) TR f(x y)1113864 1113865 (5)

G(x y) TG f(x y)1113864 1113865 (6)

B(x y) TB f(x y)1113864 1113865 (7)









Grayscaleimage

Fouriertransform

Filter 1

Filter 2

Filter 3

Fourier transform

Fourier transform

Fourier transform




Color display











calculatezXi+1

zWi

andzJ

zXi+1 then

zJ

zWi

zJ

zXi+1

zXi+1

zWi

calculatezXi+1

zXi

thenzJ

zXi

zJ

zXi+1

zXi+1

zXi

reservezJ

zXi


(8)



Machinelearning

Deeplearning

Data input

Data input


Deep feature design

Target mapping

Target mapping

Target output

Target output












PLj




⎧⎪⎪⎨

⎪⎪⎩

(9)


Pf(i j) 19

1113944

1

mminus11113944

1








5 Conclusion


Data Availability




Acknowledgments




particle




L11

J16

J6

J5

J4

J3

J2

J1

J16

J9

L8

L7

L15

J7

J8

J16

J16

L5L9 L12

L4

L3 L6

J16 J13

J11

J12 J15

J14

L14

L13



References


















R(x y) TR f(x y)1113864 1113865 (5)

G(x y) TG f(x y)1113864 1113865 (6)

B(x y) TB f(x y)1113864 1113865 (7)









Grayscaleimage

Fouriertransform

Filter 1

Filter 2

Filter 3

Fourier transform

Fourier transform

Fourier transform




Color display











calculatezXi+1

zWi

andzJ

zXi+1 then

zJ

zWi

zJ

zXi+1

zXi+1

zWi

calculatezXi+1

zXi

thenzJ

zXi

zJ

zXi+1

zXi+1

zXi

reservezJ

zXi


(8)



Machinelearning

Deeplearning

Data input

Data input


Deep feature design

Target mapping

Target mapping

Target output

Target output












PLj




⎧⎪⎪⎨

⎪⎪⎩

(9)


Pf(i j) 19

1113944

1

mminus11113944

1








5 Conclusion


Data Availability




Acknowledgments




particle




L11

J16

J6

J5

J4

J3

J2

J1

J16

J9

L8

L7

L15

J7

J8

J16

J16

L5L9 L12

L4

L3 L6

J16 J13

J11

J12 J15

J14

L14

L13



References

























calculatezXi+1

zWi

andzJ

zXi+1 then

zJ

zWi

zJ

zXi+1

zXi+1

zWi

calculatezXi+1

zXi

thenzJ

zXi

zJ

zXi+1

zXi+1

zXi

reservezJ

zXi


(8)



Machinelearning

Deeplearning

Data input

Data input


Deep feature design

Target mapping

Target mapping

Target output

Target output












PLj




⎧⎪⎪⎨

⎪⎪⎩

(9)


Pf(i j) 19

1113944

1

mminus11113944

1








5 Conclusion


Data Availability




Acknowledgments




particle




L11

J16

J6

J5

J4

J3

J2

J1

J16

J9

L8

L7

L15

J7

J8

J16

J16

L5L9 L12

L4

L3 L6

J16 J13

J11

J12 J15

J14

L14

L13



References


























PLj




⎧⎪⎪⎨

⎪⎪⎩

(9)


Pf(i j) 19

1113944

1

mminus11113944

1








5 Conclusion


Data Availability




Acknowledgments




particle




L11

J16

J6

J5

J4

J3

J2

J1

J16

J9

L8

L7

L15

J7

J8

J16

J16

L5L9 L12

L4

L3 L6

J16 J13

J11

J12 J15

J14

L14

L13



References





















5 Conclusion


Data Availability




Acknowledgments




particle




L11

J16

J6

J5

J4

J3

J2

J1

J16

J9

L8

L7

L15

J7

J8

J16

J16

L5L9 L12

L4

L3 L6

J16 J13

J11

J12 J15

J14

L14

L13



References

















References

















research on sports training action recognition based on

Documents