
Research Article
Dense Trajectories and DHOG for Classification of Viewpoints from Echocardiogram Videos

Liqin Huang, Xiangyu Zhang, and Wei Li

School of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, China

Correspondence should be addressed to Liqin Huang; lqhuangfzu163com

Received 15 October 2015; Revised 19 January 2016; Accepted 31 January 2016

Academic Editor: Syoji Kobashi

Copyright © 2016 Liqin Huang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

In echocardiographic computer-aided diagnosis, an important step is to automatically classify echocardiography videos acquired from different angles and different regions. We propose an echocardiography video classification algorithm based on dense trajectories and difference histograms of oriented gradients (DHOG). First, we sample feature points on a dense grid in each frame of the echocardiography sequence and then track these points with dense optical flow. To cope with the rapid and irregular motion in echocardiography videos and obtain more robust tracking results, we also design a trajectory description algorithm that uses the derivative of the optical flow to capture motion trajectory information and combines several descriptors (e.g., the trajectory shape, DHOG, HOF, and MBH) with the structural information of a spatiotemporal pyramid. To avoid the "curse of dimensionality," we apply the Fisher vector to reduce the dimension of the feature description, followed by a linear SVM classifier to improve the final classification result. The average accuracy of echocardiography video classification is 77.12% for all eight viewpoints and 100% for the three primary viewpoints.

1. Introduction

Echocardiography video still plays an important role in modern medical diagnosis. It supports analysis of the heart by providing cardiac structural and motion information. Echocardiography captures detailed 3D anatomical and functional information of the heart from eight standard views, which are usually acquired with an ultrasound transducer at three primary positions (Apical Angles (AA), Parasternal Long Axis (PLA), and Parasternal Short Axis (PSA)). To choose the most helpful echocardiography video for disease diagnosis or research, we need to categorize the eight kinds of echocardiography videos, so classification is the first step. In clinical practice, because of the limits of ultrasound imaging, such as the high noise and low contrast of echocardiography video, the sonographer has to classify these videos manually. This greatly reduces working efficiency, and the recognition results depend on the sonographer's experience and the image resolution. Therefore, automatic classification of echocardiography video by computer is a crucial step in current echocardiographic research.

In recent years, many researchers have focused on this area. For example, Kumar et al. [1] concentrate on representing the spatial motion of the main anatomical structures. Zhou et al. [2] and Beymer et al. [3] describe echocardiogram classification based on the spatial relations of multiple objects. Shalbaf et al. [4] and Sera et al. [5] then add motion information, extracting and detecting features by tracking the movements and patterns of the contour shape of the heart. Wang et al. [6] extract features by analyzing the factor in video. Guo et al. [7] propose an echocardiography video classification method based on sparse representation and greatly improve the recognition accuracy. An echocardiography video classification method based on the 3D SIFT feature description is proposed in [8]: the authors detect echocardiac features with a cuboid detector, represent these features as 3D SIFT descriptors, and encode them with Bag of Words before classification with an SVM. After that, Rap and Chacko [9] and Espinola-Zavaleta et al. [10] propose echocardiography classification methods based on low-dimensional features and improve the classification result by combining the kernel function with a movement decomposition algorithm.

Hindawi Publishing Corporation, Computational and Mathematical Methods in Medicine, Volume 2016, Article ID 9610192, 7 pages, http://dx.doi.org/10.1155/2016/9610192

Figure 1: Echocardiography video classification flow based on the dense trajectory. Training and test videos each pass through dense sampling, dense trajectory extraction, DHOG trajectory description, and feature coding; a linear SVM classifier then produces the classification results of the echocardiography video.

Precisely classifying echocardiography video is challenging because of its strong noise and low resolution. In this paper, we propose a method based on dense trajectories [11] and difference histograms of oriented gradients (DHOG). Dense trajectory tracking, which samples video points densely, ensures that the feature points cover the whole video. It also helps separate the foreground from the background of the scene and improves the efficiency of recognition. Moreover, dense optical flow improves the quality of the feature-point trajectories. To describe the dense trajectories, this paper also proposes a method based on motion boundary and structure descriptors (DHOG, HOF, and MBH), which gives better results. The motion boundary histogram is more robust than other descriptors based on optical flow, which has a large impact on the stability of the classification system. Finally, the Fisher vector and a linear SVM classifier are applied to classify echocardiography videos.

2. Methodology

Dense trajectories are obtained by densely sampling points in each frame and tracking these points in a dense optical flow field. Dense sampling ensures full coverage of the feature points of the video, and dense optical flow improves the trajectory properties, helps distinguish the foreground and background of the echocardiography video, and improves the efficiency of recognition. The flow chart of the dense trajectory recognition method is shown in Figure 1. The following sections describe each step of the flow chart.

2.1. Dense Sampling. Feature points are densely sampled on a grid with a spacing of $W$ pixels. Sampling is carried out on each spatial scale separately, so that feature points equally cover all spatial positions and scales. According to previous experimental results [7], a sampling step of $W = 5$ pixels is dense enough to give good results over all data. A smaller sampling step brings little further improvement and increases the amount of computation. There are at most 8 spatial scales in total, depending on the resolution of the video. The spatial scale increases by a factor of $1/\sqrt{2}$.
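The sampling scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the grid origin at pixel (0, 0), and the `min_side` stopping rule are hypothetical, since the text only says that the number of scales depends on the video resolution.

```python
import math


def spatial_scales(width, height, max_scales=8, min_side=32):
    """Return the (width, height) of each scale, shrinking by 1/sqrt(2) per step.

    min_side is a hypothetical stopping rule, not a value from the paper.
    """
    scales = []
    factor = 1.0
    for _ in range(max_scales):
        w, h = round(width * factor), round(height * factor)
        if min(w, h) < min_side:
            break
        scales.append((w, h))
        factor /= math.sqrt(2)
    return scales


def grid_points(width, height, step=5):
    """Sample points on a regular grid with spacing W = step pixels."""
    return [(x, y) for y in range(0, height, step) for x in range(0, width, step)]
```

For example, `grid_points(10, 10)` returns the four points `(0, 0)`, `(5, 0)`, `(0, 5)`, and `(5, 5)`, and the grid is regenerated independently at every scale returned by `spatial_scales`.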

Figure 2: The diagram of dense trajectory extraction (frames $t$, $t+1$, $t+2$, ..., $t+L$, $t+L+1$, $t+L+2$).

Because sampling in homogeneous areas is of no use for detection, we remove points in these areas using Zhang's criterion [19]. According to this criterion, a point is removed when the eigenvalue of its local autocorrelation matrix is less than a certain threshold. In this paper, the threshold $T$ is

$$T = 0.001 \times \max_{i \in I_t} \min\left(\lambda_i^1, \lambda_i^2\right), \tag{1}$$

where $(\lambda_i^1, \lambda_i^2)$ are the eigenvalues of point $i$ in the image $I_t$. Experimental results showed that a value of 0.001 represents a good compromise between saliency and density of the sampled points.
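A minimal sketch of the thresholding in (1) follows. It assumes that the entries of each point's 2 x 2 autocorrelation matrix have already been computed; the helper names are hypothetical and do not come from any library.

```python
import math


def min_eigenvalue(a, b, c):
    """Smaller eigenvalue of the symmetric 2x2 matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return mean - radius


def filter_points(points, min_eigs, ratio=0.001):
    """Keep points whose smaller eigenvalue reaches T = ratio * max_i min(l1_i, l2_i)."""
    threshold = ratio * max(min_eigs)
    return [p for p, e in zip(points, min_eigs) if e >= threshold]
```

Because the threshold is relative to the strongest point in the frame, the same ratio of 0.001 adapts to frames with different overall contrast.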

2.2. Dense Trajectory Extraction. Dense trajectories are extracted by tracking feature points in each spatial scale separately, as shown in Figure 2.

For each frame $I_t$, its dense optical flow field $\omega_t = (u_t, v_t)$ is computed with respect to the next frame $I_{t+1}$, where $u_t$ and $v_t$ are the horizontal and vertical components of the optical flow. Given a point $P_t = (x_t, y_t)$ in frame $I_t$, its tracked position in frame $I_{t+1}$ is smoothed by applying a median filter:

$$P_{t+1} = (x_{t+1}, y_{t+1}) = (x_t, y_t) + (M * \omega_t)\big|_{(x_t, y_t)}, \tag{2}$$

where $M$ is the median filtering kernel of size $3 \times 3$ pixels. The median filter avoids smoothing out the trajectories of points located on motion boundaries [20].
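The update in (2) can be illustrated with a small pure-Python sketch. It assumes that the flow components are stored as row-major 2D lists, that the point is rounded to the nearest pixel before the filter is applied, and that border pixels are clamped; these choices and the function name are assumptions made for illustration.

```python
from statistics import median


def track_point(point, flow_u, flow_v):
    """Advance a point by the 3x3 median-filtered flow at its rounded position.

    flow_u and flow_v hold the horizontal and vertical flow components as
    row-major 2D lists; border pixels are clamped (an assumption).
    """
    x, y = point
    xi, yi = int(round(x)), int(round(y))
    height, width = len(flow_u), len(flow_u[0])
    us, vs = [], []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            yy = min(max(yi + dy, 0), height - 1)
            xx = min(max(xi + dx, 0), width - 1)
            us.append(flow_u[yy][xx])
            vs.append(flow_v[yy][xx])
    return (x + median(us), y + median(vs))
```

A single outlier in the flow field at the tracked pixel is ignored by the median, which is the robustness property the text attributes to the filter.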

Once the dense optical flow field is computed, points can be tracked very densely without additional cost. Another advantage of the dense optical flow field is its smoothness constraints, which allow relatively robust tracking of fast and irregular motion patterns. In this paper, we use Li's algorithm [21], which embeds a translation motion model between the neighborhoods of two consecutive frames and employs polynomial expansion to approximate pixel intensities in the neighborhood. This method not only works well with optical flow and improves its accuracy, but is also simple in principle and has low computational complexity. In our research, we use the implementation from the OpenCV library to compute the dense optical flow field.

Figure 3: The robustness of the compression direction on background. (a) Original image. (b) Projection in $[0, 2\pi]$. (c) Projection in $[0, \pi]$. The histogram panels plot value against bin.

The shape descriptor of a trajectory encodes the local motion of the tracked point. Given a trajectory of length $L$, we describe its shape by a sequence $(\Delta P_t, \ldots, \Delta P_{t+L-1})$ of displacement vectors $\Delta P_t = (P_{t+1} - P_t) = (x_{t+1} - x_t, y_{t+1} - y_t)$. The resulting vector is normalized by the sum of the displacement vector magnitudes:

$$T = \frac{(\Delta P_t, \ldots, \Delta P_{t+L-1})}{\sum_{j=t}^{t+L-1} \left\| \Delta P_j \right\|}. \tag{3}$$

In the following, we refer to this vector as the trajectory. As we use trajectories with a fixed length of $L = 15$ frames, we obtain a 30-dimensional descriptor.
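As a worked illustration of (3), the sketch below turns a list of tracked positions into the normalized shape descriptor. The function name and the all-zero fallback for a static point are assumptions, not part of the paper.

```python
import math


def trajectory_shape(points):
    """Normalized displacement descriptor for a tracked point sequence.

    points holds L + 1 positions (x, y); the result has 2 * L values.
    """
    deltas = [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(points, points[1:])]
    total = sum(math.hypot(dx, dy) for dx, dy in deltas)
    if total == 0:
        return [0.0] * (2 * len(deltas))  # static point: hypothetical fallback
    return [c / total for d in deltas for c in d]
```

With 16 positions, that is $L = 15$ displacements, the descriptor has 30 values, matching the dimensionality stated in the text.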

2.3. Motion and Structure Descriptors. Besides the trajectory shape information, we also design descriptors that embed appearance and motion information: difference histograms of oriented gradients (DHOG), histograms of optical flow (HOF), and motion boundary histograms (MBH). These features are combined into a motion model of the local feature descriptor. Over $L$ consecutive video frames, we extract the features in a local area of $N \times N$ pixels. To embed structural information, we divide the $N \times N \times L$ spatiotemporal cube into a grid of $n_\sigma \times n_\sigma \times n_\tau$ cells and compute the local descriptor for each cell. Finally, we combine the descriptors into the final vector.
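The grid subdivision can be made concrete with a short sketch that maps an offset inside the spatiotemporal cube to a cell index. The default values $N = 32$, $L = 15$, $n_\sigma = 2$, and $n_\tau = 3$ are those reported in the experimental setup; the function name and the cell numbering order are arbitrary choices for illustration.

```python
def cell_index(dx, dy, dt, n=32, length=15, n_sigma=2, n_tau=3):
    """Map an offset inside the N x N x L volume to its grid cell index.

    dx, dy are in [0, n) and dt is in [0, length); cells are numbered
    temporal-major, which is an arbitrary choice for this sketch.
    """
    cx = dx * n_sigma // n
    cy = dy * n_sigma // n
    ct = dt * n_tau // length
    return (ct * n_sigma + cy) * n_sigma + cx
```

With these defaults there are 2 x 2 x 3 = 12 cells, so each per-cell histogram is repeated 12 times in the final descriptor.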

2.3.1. Difference Histograms of Oriented Gradients. The HOG feature [22] describes an image through local statistics; its main idea is that the appearance and shape of a local target can be described by the distribution of gradient or edge directions. First, the image is divided into several nonoverlapping cells of size $N \times N$; the gradients in each cell are calculated and summarized as a histogram. Merging the histograms of multiple cells into one vector gives the HOG feature.

Although the traditional HOG feature is robust to a certain amount of illumination change, its expressive ability is still limited. Since the gradient direction can be reversed by the relative brightness of the background, the same target can generate different HOG features, as shown in Figure 3. Projecting into the interval $[0, \pi]$ leaves the gradient direction histogram unchanged, as shown in Figure 3(c).

However, this projection can discard information that distinguishes some targets. Objects of different shapes whose gradients have opposite directions but the same magnitude and number produce the same histogram, so the HOG feature cannot distinguish these objects.

To improve the expressive ability of the HOG feature for echocardiography video trajectories, we propose an improved HOG descriptor. We divide the interval $[0, 2\pi]$ into $L$ bins (here $L$ denotes the number of bins, not the trajectory length) and compute the gradient direction histogram $H_g$, where $L$ is even and each pixel in the cell casts a vote. The HOG is obtained by

$$H_{og}(i) = H_g(i) + H_g\left(i + \frac{L}{2}\right), \quad 1 \le i \le \frac{L}{2}, \tag{4}$$

where $H_{og}(i)$ and $H_g(i)$ are the $i$th elements of the HOG and of $H_g$, respectively. In this paper, we define a new histogram $H_{ng}$, which has the same length as the HOG and is placed after it, as shown in Figure 4. Each of its elements is

$$H_{ng}(i) = \left| H_g(i) - H_g\left(i + \frac{L}{2}\right) \right|, \quad 1 \le i \le \frac{L}{2}. \tag{5}$$

Figure 4: Comparison of HOG and DHOG. The histogram panels plot value against bin (bins 1 to 9).

We call the new gradient direction histogram the difference HOG (DHOG). To simplify the calculation, we extract the descriptors in a 32 × 32 pixel block around the dense trajectory, and each block is divided into 2 × 2 cells. In conventional HOG, the range 0 to 180° is divided into nine bins; to obtain a better description that accounts for differences between opposite directions, DHOG extends the range to $[0, 2\pi]$: we divide the gradient direction range of 0 to 360° into 18 bins and calculate the gradient direction histogram in each cell. Then we divide the $L = 15$ frames of the dense trajectory into three parts and sum the DHOG features in the corresponding bins. The feature vectors are also normalized with the $L_2$-norm. Thus the DHOG feature of a dense trajectory with $L = 15$ frames has dimensionality $2 \times 2 \times 18 \times 3 = 216$.
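The folding in (4) and the difference in (5) can be sketched as follows. The function name and the two toy 4-bin histograms are hypothetical, and treating the per-cell descriptor as the HOG part followed by the difference part is this sketch's reading of the text, not a confirmed detail of the authors' implementation.

```python
def dhog_from_histogram(h_g):
    """Fold an L-bin [0, 2*pi) gradient histogram into HOG and difference parts.

    Returns (h_og, h_ng): the sum and the absolute difference of opposite
    bins, following (4) and (5).
    """
    if len(h_g) % 2:
        raise ValueError("the number of bins L must be even")
    half = len(h_g) // 2
    h_og = [h_g[i] + h_g[i + half] for i in range(half)]
    h_ng = [abs(h_g[i] - h_g[i + half]) for i in range(half)]
    return h_og, h_ng


# Two toy cells: all gradients in one direction versus gradients
# split evenly between two opposite directions.
one_sided = [4, 0, 0, 0]
two_sided = [2, 0, 2, 0]
```

Both toy cells fold to the same HOG part `[4, 0]`, but their difference parts are `[4, 0]` and `[0, 0]`, which is the extra discrimination the DHOG is meant to provide. With 18 values per cell, 2 x 2 cells, and 3 temporal parts, the descriptor length is 2 * 2 * 18 * 3 = 216.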

2.3.2. Gradient and Optical Flow Histograms. Unlike the DHOG descriptor, which extracts static shape information, HOF uses optical flow to describe motion information. HOF is computed in the same way as DHOG. It takes the sum of the moduli of the horizontal and vertical components of the optical flow as the flow amplitude and the arctangent of the ratio of the vertical to the horizontal component as the flow direction angle. The flow direction is then projected into the adjacent bins, weighted by the flow amplitude. For pixels whose flow amplitude is small (below a certain threshold), we add a zero bin to the HOF feature that holds the number of such pixels. The HOF descriptor is also normalized with the $L_2$-norm. The other parameters are the same as for DHOG, so the dimensionality of HOF is $2 \times 2 \times 9 \times 3 = 108$.
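A per-cell HOF histogram along these lines might look like the sketch below. Several details are assumptions rather than values from the paper: the split of the 9 values into 8 orientation bins plus 1 zero bin, the `min_amplitude` threshold, and the use of hard assignment to a single bin instead of interpolation between adjacent bins.

```python
import math


def hof_histogram(flow, n_orient=8, min_amplitude=0.4):
    """Histogram of optical flow for one cell, with a trailing zero bin.

    flow is an iterable of (u, v) pairs. n_orient and min_amplitude are
    assumptions chosen so that each cell yields 9 values.
    """
    hist = [0.0] * (n_orient + 1)
    for u, v in flow:
        amplitude = abs(u) + abs(v)  # sum of moduli, as described in the text
        if amplitude < min_amplitude:
            hist[n_orient] += 1.0  # zero bin counts near-static pixels
            continue
        angle = math.atan2(v, u) % (2 * math.pi)
        bin_index = int(angle / (2 * math.pi) * n_orient) % n_orient
        hist[bin_index] += amplitude
    norm = math.sqrt(sum(h * h for h in hist))
    return [h / norm for h in hist] if norm > 0 else hist
```

The returned vector is L2-normalized, so concatenating it over the 2 x 2 x 3 cells gives the 108 values stated above.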

Table 1: The number of each type of video.

Video     A2C  A3C  A4C  A5C  PLA  PSAB  PSAM  PSAP  Sum
Number     45   31   36    7   42    11    39    17  228
Patients    9    6    7    1    8     2     8     4   45
Normal     36   25   29    6   34     9    31    13  183

2.3.3. Motion Boundary Histograms. Motion boundary histograms (MBH) were proposed by Yao et al. in 2014 for human detection [23]. MBH computes derivatives separately for the horizontal and vertical components of the optical flow. Because MBH is the gradient of the optical flow, it removes uniform motion and keeps the changes in the optical flow field. To a certain extent, it complements DHOG and HOF by removing camera motion. Because MBH is derived from the optical flow that is already computed, its computational cost is small.
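The claim that the gradient of the optical flow cancels uniform motion can be checked with a small sketch. It uses central differences on one flow component stored as a row-major 2D list; the function name and the zeroed borders are simplifications made here, not details from the paper.

```python
def flow_derivatives(component):
    """Central-difference spatial derivatives of one flow component.

    component is a row-major 2D list; borders are left at zero for brevity.
    Returns (d_dx, d_dy) with the same shape.
    """
    height, width = len(component), len(component[0])
    d_dx = [[0.0] * width for _ in range(height)]
    d_dy = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            d_dx[y][x] = (component[y][x + 1] - component[y][x - 1]) / 2.0
            d_dy[y][x] = (component[y + 1][x] - component[y - 1][x]) / 2.0
    return d_dx, d_dy
```

A constant flow field yields zero derivatives everywhere, whereas a step between two regions moving at different speeds produces a response along the boundary. The derivatives of each component can then be binned into orientation histograms in the same way as image gradients.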

2.4. The Selection of Feature Encoding and Classifier. In this paper, we encode the features with the improved Fisher vector [24, 25] (signed square-rooting followed by $L_2$ normalization), using a GMM with 256 visual words. The Fisher vector, which can be seen as an extension of the bag-of-visual-words (BOV) model, is an intermediate representation of the image. Because it combines the advantages of generative and discriminative models within the Fisher kernel framework, it not only reflects the frequency of each visual word but also encodes how the local features differ from the visual words. It therefore represents richer image features than the bag-of-words model. A high-dimensional feature combined with a simple and effective linear classifier can achieve good classification results, so we use a linear SVM classifier, which performs well in classification and generalization and is simple and convenient to use in practice.
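The two normalization steps named above, signed square-rooting followed by L2 normalization, are simple enough to sketch directly. The function name is hypothetical, and the input is assumed to be an already computed Fisher vector given as a flat list of floats.

```python
import math


def improved_normalize(vector):
    """Signed square-rooting followed by L2 normalization."""
    rooted = [math.copysign(math.sqrt(abs(v)), v) for v in vector]
    norm = math.sqrt(sum(r * r for r in rooted))
    return [r / norm for r in rooted] if norm > 0 else rooted
```

For instance, the vector `[9.0, -16.0]` becomes `[3.0, -4.0]` after the signed square root and `[0.6, -0.8]` after L2 normalization. The square root dampens large components so that a few bursty features do not dominate the linear classifier.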

3. Experimental Results and Analysis

3.1. Data. In this experiment, 228 different echocardiography videos were collected from 72 different patients (58 normal individuals and 14 patients with cardiac wall motion abnormalities); the data were provided by Tsinghua University First Hospital. Each echocardiography video is 1 to 2 seconds long, and the videos are stored at a resolution of 434 × 636 in 26-frame DICOM (Digital Imaging and Communications in Medicine) format. Table 1 shows the number of videos of each of the eight kinds.

3.2. Experiment Setup and Result Analysis. In this experiment, feature points are sampled on a dense grid with a spacing of 5 pixels, which gave the best result. The tracking length is set to 15 frames. For the motion boundary description, the default parameters are $N = 32$, $n_\sigma = 2$, and $n_\tau = 3$, which gave the best results in our experimental tests. For feature encoding and classification, we use the vlfeat-0.9.19 software (downloaded from http://www.vlfeat.org). To select training and testing videos, we use the "half and half" method: half of the echocardiography videos are randomly selected as training videos and the other half as testing videos. We then analyze the classification results.
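A random half-and-half split can be sketched as below. The function name and the fixed seed are assumptions, and the text does not say whether the split was stratified by view class, so this sketch splits the pooled list of videos.

```python
import random


def half_split(video_ids, seed=0):
    """Randomly assign half of the videos to training and the rest to testing."""
    ids = list(video_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]
```

Calling `half_split(range(228))` returns two disjoint lists of 114 video indices each; fixing the seed makes the split reproducible across runs.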

We use the confusion matrix in Table 2 to show the classification accuracy for the eight kinds of echocardiography video; the accuracy rate is calculated as in (6). Overall, the average classification accuracy over all classes is 77.12%. The classification accuracy differs between classes. In particular, the accuracy for A5C, PSAB, and PSAP is lower than for the others. The main reason is the shortage of experimental data for these three kinds of echocardiography video, which leads to a lack of training samples:

$$\text{accuracy rate} = \frac{\text{number of correct videos}}{\text{number of videos}}. \tag{6}$$
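The accuracy rate in (6), applied row by row to a confusion matrix, can be sketched as follows. The 3-class matrix `toy` is a made-up example and not data from the paper; the function names are likewise hypothetical.

```python
def per_class_accuracy(confusion):
    """Row-wise accuracy: correct videos divided by videos of that class."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(confusion)]


def mean_accuracy(confusion):
    """Unweighted mean of the per-class accuracies."""
    accs = per_class_accuracy(confusion)
    return sum(accs) / len(accs)


# Hypothetical 3-class confusion matrix (rows: true class, columns: prediction).
toy = [[9, 1, 0],
       [0, 5, 5],
       [0, 0, 10]]
```

For `toy`, the per-class accuracies are 0.9, 0.5, and 1.0, and their unweighted mean is 0.8. Averaging over classes rather than over videos gives small classes the same weight as large ones, which is why a poorly represented class can pull the average down sharply.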

Because the above result indicates that the classification accuracy is affected by the number of sample videos, in the following experiment we exclude the three kinds of video with the least data (A5C, PSAB, and PSAP), repeat the experiment with the remaining five kinds, and compare the results (Table 3).

Comparing the eight-class and five-class results, we find from Table 3 that the classification accuracy improves significantly, from 77.12% to 98.36%.

We also run a classification test on echocardiography videos of the three primary locations (AA, PLA, and PSA). The average classification accuracy is 100%, which satisfies the basic needs of medical echocardiography video classification.

Finally, compared with previously proposed echocardiography video classification algorithms, our experimental result is considerably better (Table 4), and for the three primary locations our method has the best accuracy. Therefore, echocardiography video classification based on dense trajectory tracking and DHOG shows a clear improvement in accuracy over the previous algorithms.

Table 2: Confusion matrix for eight kinds of echocardiography video classification (rows: class; columns: classified result).

Class  A2C  A3C  A4C  A5C  PLA  PSAB  PSAM  PSAP  Accuracy
A2C     22    1    0    0    0     0     0     0      0.96
A3C      1   15    0    0    0     0     0     0      0.95
A4C      0    0   17    1    0     0     0     0      0.95
A5C      2    0    1    0    0     0     0     0      0.00
PLA      0    0    0    0   21     0     0     0      1.00
PSAB     0    0    0    0    1     3     0     2      0.50
PSAM     0    0    0    0    0     0    20     0      1.00
PSAP     0    0    0    0    2     0     0     7      0.81

Table 3: Confusion matrix for five kinds of echocardiography video classification (rows: class; columns: classified result).

Class  A2C  A3C  A4C  PLA  PSAM  Accuracy
A2C     22    1    0    0     0      0.96
A3C      1   15    0    0     0      0.95
A4C      0    0   18    0     0      1.00
PLA      0    0    0   21     0      1.00
PSAM     0    0    0    0    20      1.00

Table 4: The accuracy of the algorithm in this paper compared with other algorithms.

Method      Accuracy, eight classes (%)  Accuracy, three classes (%)
[8]         72                           90
[12]        70.4                         86
[13]        74                           91
[14]        76                           93
[15]        76.5                         93
[16]        75                           92
[17]        73.2                         95
[18]        71.3                         94
Our method  77.12                        100

4. Summary and Discussion

This paper addresses the classification of echocardiography video with dense trajectory tracking and the DHOG algorithm. First, we use dense trajectory tracking and DHOG to describe the features of each kind of echocardiography video, and then apply the VLFeat software to encode the features with the Fisher vector. Finally, we use a linear SVM to obtain the classification results. The experimental results show that, compared with the previous echocardiography video classification method based on 3D SIFT, the classification accuracy is improved significantly. However, the final average classification accuracy is still not high because of the lack of echocardiography video data. We therefore removed the three kinds of video with little data and repeated the classification experiment. After addressing the sample problem, the average classification accuracy of the remaining five classes increased by nearly 20.9% compared with that of the eight classes, and for the three primary positions the average classification accuracy is 100%. Therefore, the method of this paper can preliminarily meet the accuracy requirement of medical echocardiography video classification. For the classification of echocardiography video, the challenge is not only the video noise caused by the low resolution but also how to balance computational complexity and time cost. In future work, we will gradually enlarge the database of the eight kinds of echocardiography video and adopt or propose newer and more effective methods to improve the classification accuracy and reduce the time required for classification.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Fund in China under Grants nos. 61471124 and 61473090. Also, this research forms part of the WIDTH project that is financially funded by the EC under the FP7 programme with Grant no. PIRSES-GA-2010-269124.

References

[1] R. Kumar, F. Wang, D. Beymer, and T. Syeda-Mahmood, "Cardiac disease detection from echocardiogram using edge filtered scale-invariant motion features," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW '10), pp. 162–169, IEEE, San Francisco, Calif, USA, June 2010.

[2] S. K. Zhou, J. H. Park, B. Georgescu, C. Simopoulos, J. Otsuki, and D. Comaniciu, "Image-based multiclass boosting and echocardiographic view classification," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '06), vol. 2, pp. 1559–1565, New York, NY, USA, June 2006.

[3] D. Beymer, T. Syeda-Mahmood, and F. Wang, "Exploiting spatio-temporal information for view recognition in cardiac echo videos," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW '08), pp. 1–8, IEEE, Anchorage, Alaska, USA, June 2008.

[4] A. Shalbaf, H. Behnam, Z. Alizade-Sani, and M. Shojaifard, "Automatic classification of left ventricular regional wall motion abnormalities in echocardiography images using nonrigid image registration," Journal of Digital Imaging, vol. 26, no. 5, pp. 909–919, 2013.

[5] F. Sera, T. S. Kato, M. Farr et al., "Left ventricular longitudinal strain by speckle-tracking echocardiography is associated with treatment-requiring cardiac allograft rejection," Journal of Cardiac Failure, vol. 20, no. 5, pp. 359–364, 2014.


[6] J. Y. Wang, Y. Li, Y. Zhang et al., "Bag-of-features based medical image retrieval via multiple assignment and visual words weighting," IEEE Transactions on Medical Imaging, vol. 30, no. 11, pp. 1996–2011, 2011.

[7] Y. Guo, Y. Wang, D. Kong, and X. Shu, "Automatic classification of intracardiac tumor and thrombi in echocardiography based on sparse representation," IEEE Journal of Biomedical and Health Informatics, vol. 19, no. 2, pp. 601–611, 2015.

[8] Y. Qian, L. Wang, C. Wang, and X. Gao, "The synergy of 3D SIFT and sparse codes for classification of viewpoints from echocardiogram video," in Medical Content-Based Retrieval for Clinical Decision Support: Third MICCAI International Workshop, MCBR-CDS 2012, Nice, France, October 1, 2012, Revised Selected Papers, vol. 7723 of Lecture Notes in Computer Science, pp. 68–79, Springer, Berlin, Germany, 2013.

[9] M. Rap and A. Chacko, "Optimising the use of transoesophageal echocardiography in diagnosing suspected infective endocarditis," Acta Cardiologica, vol. 70, no. 4, pp. 487–491, 2015.

[10] N. Espinola-Zavaleta, M. E. Soto, A. Romero-Gonzalez et al., "Prevalence of congenital heart disease and pulmonary hypertension in Down's syndrome: an echocardiographic study," Journal of Cardiovascular Ultrasound, vol. 23, no. 2, pp. 72–77, 2015.

[11] H. Wang, A. Klaser, C. Schmid, and C.-L. Liu, "Dense trajectories and motion boundary descriptors for action recognition," International Journal of Computer Vision, vol. 103, no. 1, pp. 60–79, 2013.

[12] I. Laptev, M. Marszałek, C. Schmid, and B. Rozenfeld, "Learning realistic human actions from movies," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, Anchorage, Alaska, USA, June 2008.

[13] J. Liu, J. Luo, and M. Shah, "Recognizing realistic actions from videos in the wild," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '09), pp. 1996–2003, IEEE, Miami, Fla, USA, June 2009.

[14] A. Gilbert, J. Illingworth, and R. Bowden, "Fast realistic multi-action recognition using mined dense spatio-temporal features," in Proceedings of the 12th International Conference on Computer Vision, pp. 925–931, IEEE, Kyoto, Japan, October 2009.

[15] W. Brendel and S. Todorovic, "Activities as time series of human postures," in Computer Vision – ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5–11, 2010, Proceedings, Part II, vol. 6312, pp. 721–734, Springer, Berlin, Germany, 2010.

[16] A. Kovashka and K. Grauman, "Learning a hierarchy of discriminative space-time neighborhood features for human action recognition," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '10), pp. 2046–2053, San Francisco, Calif, USA, June 2010.

[17] Q. V. Le, W. Y. Zou, S. Y. Yeung, and A. Y. Ng, "Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 3361–3368, June 2011.

[18] B. Li, M. Ayazoglu, T. Mao, O. I. Camps, and M. Sznaier, "Activity recognition using dynamic subspace angles," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 3193–3200, IEEE, Providence, RI, USA, June 2011.

[19] G. C. Zhang and P. A. Vela, "Good features to track for visual SLAM," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '15), pp. 1373–1382, IEEE, Boston, Mass, USA, June 2015.

[20] H Wang and C Schmid ldquoAction recognition with improvedtrajectoriesrdquo in Proceedings of the 14th IEEE International Con-ference on Computer Vision (ICCV rsquo13) pp 3551ndash3558 SydneyAustralia December 2013

[21] X Li G Cui W Yi and L Kong ldquoA fast maneuvering targetmotion parameters estimation algorithmbased onACCFrdquo IEEESignal Processing Letters vol 22 no 3 pp 270ndash274 2015

[22] Y Ji M Yang Z Lu and C Wang ldquoIntegrating visual selectiveattention model with HOG features for traffic light detectionand recognitionrdquo in Proceedings of the IEEE Intelligent VehiclesSymposium (IV rsquo15) pp 280ndash285 Seoul Republic of KoreaJune-July 2015

[23] B Z Yao B X Nie Z Liu and S-C Zhu ldquoAnimated posetemplates for modeling and detecting human actionsrdquo IEEETransactions on Pattern Analysis and Machine Intelligence vol36 no 3 pp 436ndash452 2014

[24] D Veitch and P Tune ldquoOptimal skampling for the flow sizedistributionrdquo IEEE Transactions on Information Theory vol 61no 6 pp 3075ndash3099 2015

[25] B Klein G Lev G Sadeh and L Wolf ldquoAssociating neuralword embeddings with deep image representations using FisherVectorsrdquo in Proceedings of the IEEE Conference on ComputerVision and Pattern Recognition (CVPR rsquo15) pp 4437ndash4446IEEE Boston Mass USA June 2015



2 Computational and Mathematical Methods in Medicine

Figure 1: Echocardiography video classification flow based on the dense trajectory. (Flowchart: the training videos and the test video each pass through dense sampling, dense trajectory extraction, DHOG trajectory description, and feature coding; a linear SVM classifier then produces the classification results of the echocardiography video.)

and improve the classification result by combining the kernel function with the movement decomposition algorithm.

Precisely classifying echocardiography videos is a challenge because they have strong noise and low resolution. In this paper, we propose a method based on the dense trajectory [11] and difference histograms of oriented gradients (DHOG). Dense trajectory tracking, which samples the video points densely in our experiment, ensures that the feature points cover the video completely. In addition, it can differentiate the foreground from the background of the scene and improve the efficiency of recognition. Moreover, dense optical flow improves the trajectory characteristics of the feature points. In order to describe the dense trajectory, this paper also proposes a method based on motion boundaries and structure descriptors (DHOG, HOF, and MBH), which gives better results. The motion boundary histogram is more robust than other descriptors based on the optical flow algorithm, which has a great impact on the stability of the classification system. Finally, the Fisher vector and a linear SVM classifier are applied to classify echocardiography videos.

2. Methodology

Dense trajectories are obtained by densely sampling points in each frame and tracking these points in the dense optical flow field. Dense sampling ensures complete coverage of the feature points of the video, and dense optical flow can improve the properties of the trajectories, distinguish the foreground and background of the echocardiography video, and promote the efficiency of recognition. The flow chart of the dense trajectory recognition method is shown in Figure 1. We describe each step of the flow chart in the following sections.

2.1. Dense Sampling. Feature points are densely sampled on a grid spaced by $W$ pixels. Sampling is carried out on each spatial scale separately, which means that feature points equally cover all spatial positions and scales. According to previous experimental results [7], a sampling step size of $W = 5$ pixels is dense enough to give good results over all data. A smaller sampling step brings no further improvement and increases the amount of calculation. There are at most 8 spatial scales in total, depending on the resolution of the video. The spatial scale increases by a factor of $1/\sqrt{2}$.
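As a rough illustration of the sampling scheme described above, the following sketch enumerates grid points with a step of $W = 5$ pixels on up to 8 scales related by a factor of $1/\sqrt{2}$. The function names, the half-step offset of the grid, the `min_side` cut-off, and the choice of 636 as the frame width are assumptions made for illustration only; the paper does not specify them.

```python
import math

def dense_grid(width, height, step=5):
    """Return (x, y) sample positions on a regular grid with the given step."""
    return [(x, y)
            for y in range(step // 2, height, step)
            for x in range(step // 2, width, step)]

def scale_sizes(width, height, max_scales=8, factor=1 / math.sqrt(2), min_side=32):
    """Frame sizes of the spatial pyramid; min_side is a hypothetical cut-off."""
    sizes = []
    w, h = float(width), float(height)
    for _ in range(max_scales):
        if min(w, h) < min_side:
            break
        sizes.append((int(round(w)), int(round(h))))
        w, h = w * factor, h * factor
    return sizes

# Assumed frame size: 636 pixels wide, 434 pixels high.
sizes = scale_sizes(636, 434)
points_per_scale = [len(dense_grid(w, h)) for w, h in sizes]
```

Each coarser scale contributes fewer points, so most of the sampling cost comes from the original resolution.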

Because sampling in homogeneous areas makes no sense for detection, we remove points in these areas using Zhang's criterion [19]. When the smaller eigenvalue of the local autocorrelation matrix is less than a certain threshold, the point is removed according to the criterion. In this paper, the threshold $T$ is as follows:

$$T = 0.001 \times \max_{i \in I_t} \min\left(\lambda_i^1, \lambda_i^2\right), \qquad (1)$$

where $(\lambda_i^1, \lambda_i^2)$ are the eigenvalues of point $i$ in the image $I_t$. Experimental results showed that the value of 0.001 represents a good compromise between saliency and density of the sampled points.

Figure 2: The diagram of dense trajectory extraction. (Frames $t$, $t+1$, $t+2$, ..., $t+L$, $t+L+1$, $t+L+2$.)
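The point-removal criterion in (1) can be sketched as follows. This is a minimal illustration, assuming each candidate point comes with the three distinct entries $(a, b, c)$ of its symmetric $2 \times 2$ autocorrelation matrix; the helper names and the toy values are hypothetical.

```python
import math

def min_eigenvalue(a, b, c):
    """Smaller eigenvalue of the symmetric 2x2 matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return mean - radius

def filter_points(tensors, ratio=0.001):
    """Apply T = ratio * max_i min(lambda_i^1, lambda_i^2) and keep points at or above T."""
    min_eigs = [min_eigenvalue(a, b, c) for (a, b, c) in tensors]
    threshold = ratio * max(min_eigs)
    kept = [i for i, lam in enumerate(min_eigs) if lam >= threshold]
    return threshold, kept

# Toy data: two textured points and one nearly homogeneous point (index 1).
tensors = [(400.0, 0.0, 900.0), (0.01, 0.0, 0.02), (250.0, 50.0, 250.0)]
threshold, kept = filter_points(tensors)
```

In this toy example the nearly homogeneous point falls below the threshold and is dropped, while the two textured points are kept.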

2.2. Dense Trajectory Extraction. Dense trajectories are extracted by tracking feature points on each spatial scale separately, as shown in Figure 2.

For each frame $I_t$, its dense optical flow field $\omega_t = (u_t, v_t)$ is computed with respect to the next frame $I_{t+1}$, where $u_t$ and $v_t$ are the horizontal and vertical components of the optical flow. Given a point $P_t = (x_t, y_t)$ in frame $I_t$, its tracked position in frame $I_{t+1}$ is smoothed by applying a median filter:

$$P_{t+1} = (x_{t+1}, y_{t+1}) = (x_t, y_t) + (M * \omega_t)\big|_{(x_t, y_t)}, \qquad (2)$$

where $M$ is the median filtering kernel, whose size is $3 \times 3$ pixels. This avoids smoothing out the trajectories of points located on motion boundaries [20].
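The tracking step in (2) can be illustrated with a small sketch. It assumes the flow field is stored as a list of rows of $(u, v)$ pairs, applies the $3 \times 3$ median to each component separately, rounds the point to the nearest pixel, and clamps at the image border; these details and the function names are illustrative assumptions rather than the authors' implementation.

```python
from statistics import median

def median_flow_at(flow, x, y):
    """Component-wise 3x3 median of the flow around pixel (x, y), clamped at borders."""
    h, w = len(flow), len(flow[0])
    us, vs = [], []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            yy = min(max(y + dy, 0), h - 1)
            xx = min(max(x + dx, 0), w - 1)
            u, v = flow[yy][xx]
            us.append(u)
            vs.append(v)
    return median(us), median(vs)

def track_point(point, flow):
    """P_{t+1} = P_t + (M * omega_t) evaluated at P_t."""
    x, y = point
    u, v = median_flow_at(flow, int(round(x)), int(round(y)))
    return (x + u, y + v)

# Uniform flow (1.0, 0.5) with a single outlier at the tracked pixel.
flow = [[(1.0, 0.5)] * 5 for _ in range(5)]
flow[2][2] = (9.0, -9.0)
next_point = track_point((2.0, 2.0), flow)
```

The outlier at the tracked pixel is ignored by the median, so the point moves by the surrounding flow.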

Once the dense optical flow field is computed, points can be tracked very densely without additional cost. Another advantage of using the dense optical flow field is the smoothness constraints, which allow relatively robust tracking of fast and irregular motion patterns. In this paper, we use Li's algorithm [21] to embed a translation motion model between neighborhoods of two consecutive frames, and polynomial expansion is employed to approximate pixel intensities in the neighborhood. This method not only cooperates effectively with optical flow and improves its accuracy, but its principle is also simple and it has low computational complexity. In our research, we use the implementation from the OpenCV library to compute the dense optical flow field.

Figure 3: The robustness of the compression direction on background. (a) Original image. (b) Projection in $[0, 2\pi]$. (c) Projection in $[0, \pi]$. (Panels (b) and (c) plot histogram value against bin.)

The shape descriptor of a trajectory encodes local motion information. Given a trajectory of length $L$, we describe its shape by a sequence $(\Delta P_t, \ldots, \Delta P_{t+L-1})$ of displacement vectors $\Delta P_t = (P_{t+1} - P_t) = (x_{t+1} - x_t, y_{t+1} - y_t)$. The resulting vector is normalized by the sum of the displacement vector magnitudes:

$$T = \frac{(\Delta P_t, \ldots, \Delta P_{t+L-1})}{\sum_{j=t}^{t+L-1} \left\|\Delta P_j\right\|}. \qquad (3)$$

In the following, we refer to this vector as the trajectory. As we use trajectories with a fixed length of $L = 15$ frames, we obtain a 30-dimensional descriptor.
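A minimal sketch of the shape descriptor in (3), assuming a trajectory is given as a list of $L + 1$ tracked positions; the function name and the handling of a stationary trajectory (all-zero output) are assumptions for illustration.

```python
import math

def trajectory_shape(points):
    """Concatenate the displacements and divide by the sum of their magnitudes."""
    disps = [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(points, points[1:])]
    total = sum(math.hypot(dx, dy) for dx, dy in disps)
    if total == 0:
        return [0.0] * (2 * len(disps))  # stationary point: assumed all-zero descriptor
    return [c / total for d in disps for c in d]

# 16 positions give L = 15 displacements, hence a 30-dimensional descriptor.
points = [(3.0 * k, 4.0 * k) for k in range(16)]
descriptor = trajectory_shape(points)
```

Because the normalization divides by the total path length, the descriptor depends on the shape of the motion but not on its overall speed.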

2.3. Motion and Structure Descriptors. Besides the trajectory shape information, we also design descriptors to embed appearance and motion information, including difference histograms of oriented gradients (DHOG), histograms of optical flow (HOF), and motion boundary histograms (MBH). These features are combined as a motion model of the local feature descriptor. Over $L$ consecutive video frames, we extract the features in a local area of $N \times N$ pixels. In order to embed structural information, we divide the $N \times N \times L$ spatiotemporal cube into a grid of $n_\sigma \times n_\sigma \times n_\tau$ cells and compute the local descriptor corresponding to each cell. Finally, we combine the descriptors into the final vector.

2.3.1. Difference Histograms of Oriented Gradients. The HOG feature [22] is a description based on local statistical characteristics; its main idea is that the local target appearance and shape in an image can be described by the density distribution of gradients or edge directions. First, the image is divided into several nonoverlapping cells of size $N \times N$, and the gradient of each cell is calculated and then represented using histogram statistics. By merging multiple cells into a vector, the HOG features are obtained.

Although the traditional HOG features are robust to a certain amount of illumination change, the expressive ability of the HOG feature is still limited. Since the gradient direction can be changed by the light and shade of the background, different HOG expressions can be generated for the same target, as shown in Figure 3. Projecting into the interval $[0, \pi]$ leaves the gradient direction histogram unchanged, as shown in Figure 3(c).

However, this projection can ignore difference information between some targets. Objects of different shapes whose gradients have opposite directions but the same magnitude and number produce the same histogram, so the HOG descriptor is identical for them and the HOG features cannot distinguish these objects.

In order to improve the expressive ability of the HOG feature for characteristic trajectories in echocardiography video, we propose an improved HOG descriptor algorithm. We divide the interval $[0, 2\pi]$ into $L$ bins and obtain the gradient direction histogram $H_g$, where $L$ is even and each pixel in the cell casts a vote. The HOG is then

$$H_{og}(i) = H_g(i) + H_g\left(i + \frac{L}{2}\right), \quad 1 \le i \le \frac{L}{2}, \qquad (4)$$

Figure 4: Comparison of HOG and DHOG. (Histogram panels plot value against bin for bins 1 to 9.)

where $H_{og}(i)$ and $H_g(i)$ are the $i$th element values of the HOG and $H_g$, respectively. In this paper, we append a new histogram $H_{ng}$, whose length is equal to that of the HOG, after the HOG, as shown in Figure 4; each element value is

$$H_{ng}(i) = \left|H_g(i) - H_g\left(i + \frac{L}{2}\right)\right|, \quad 1 \le i \le \frac{L}{2}. \qquad (5)$$
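Equations (4) and (5) can be illustrated with a short sketch. It assumes the full-circle histogram $H_g$ is given as a list of even length (indexed from 0 rather than 1) and that the DHOG is the folded HOG followed by the difference histogram; the function names and the two toy histograms are hypothetical.

```python
def fold_hog(h_g):
    """Equation (4): add each bin to the bin of the opposite direction."""
    half = len(h_g) // 2
    return [h_g[i] + h_g[i + half] for i in range(half)]

def difference_hog(h_g):
    """Equation (5): absolute difference between opposite-direction bins."""
    half = len(h_g) // 2
    return [abs(h_g[i] - h_g[i + half]) for i in range(half)]

def dhog(h_g):
    """Folded HOG followed by the difference histogram (same total length as h_g)."""
    if len(h_g) % 2:
        raise ValueError("the number of bins L must be even")
    return fold_hog(h_g) + difference_hog(h_g)

# Two cells with 18 bins: all gradient mass in one direction,
# versus the same mass split between two opposite directions.
same_dir = [0.0] * 18
same_dir[0] = 2.0
opposite = [0.0] * 18
opposite[0] = 1.0
opposite[9] = 1.0
```

Both toy cells fold to the same 9-bin HOG, but their difference histograms differ, which is the extra information the DHOG retains.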

In this paper, we call the new gradient direction histogram the difference HOG (DHOG). In order to simplify the calculation, we extract the HOG feature descriptors in blocks of $32 \times 32$ pixels along the dense trajectory. Each block is divided into $2 \times 2$ cells. In order to obtain the optimal description while considering the differences in opposite directions, the range 0° to 180° is divided into nine bins and then extended to $[0, 2\pi]$ by DHOG. That is, we divide the gradient direction range 0° to 360° into 18 bins and calculate the gradient direction histogram in each cell. Then we divide the $L = 15$ frames of the dense trajectory into three parts and sum the DHOG features in the corresponding bins. The feature vectors are also normalized with the $L_2$-norm. So, for a dense trajectory of $L = 15$ frames, the dimensionality of the DHOG feature is $2 \times 2 \times 18 \times 3 = 216$.

2.3.2. Gradient and Optical Flow Histograms. Unlike the DHOG descriptor, which extracts static shape information, the HOF uses optical flow to describe and extract motion information. The calculation method of HOF is the same as that of DHOG. HOF takes the sum of the moduli of the horizontal and vertical components of the optical flow as the flow amplitude and the arctangent of the ratio of the vertical to the horizontal component as the flow direction angle. The flow direction is then projected into the adjacent bin, weighted by the flow amplitude. For pixels whose flow amplitude is small (less than a certain threshold), we add a zero bin to the HOF feature, which holds the number of such pixels. The HOF descriptor is also normalized with the $L_2$-norm. The other parameters are the same as for the DHOG, so the dimensionality of HOF is $2 \times 2 \times 9 \times 3 = 108$.
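The HOF voting rule described above might look like the following sketch for a single cell. It assumes 8 orientation bins plus one zero bin (9 bins per cell, which matches the stated dimensionality), uses `atan2` to resolve the quadrant of the direction angle, and uses a hypothetical amplitude threshold of 0.4; none of these values is given in the paper.

```python
import math

def hof_histogram(flows, n_orient=8, min_amp=0.4):
    """Amplitude-weighted orientation histogram with an extra zero bin, L2-normalized."""
    hist = [0.0] * (n_orient + 1)
    bin_width = 2 * math.pi / n_orient
    for u, v in flows:
        amp = abs(u) + abs(v)           # sum of the moduli of the two components
        if amp < min_amp:
            hist[n_orient] += 1.0       # zero bin counts low-amplitude pixels
            continue
        angle = math.atan2(v, u) % (2 * math.pi)
        hist[int(angle / bin_width) % n_orient] += amp
    norm = math.sqrt(sum(h * h for h in hist))
    return [h / norm for h in hist] if norm > 0 else hist

# Four flow vectors: three moving pixels and one nearly static pixel.
flows = [(1.0, 0.2), (0.2, 2.0), (-3.0, 0.5), (0.1, 0.1)]
hist = hof_histogram(flows)
```

With $2 \times 2$ spatial cells and 3 temporal parts, 9 bins per cell give the 108 dimensions quoted above.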

Table 1: The number of each type of video.

Video     A2C  A3C  A4C  A5C  PLA  PSAB  PSAM  PSAP  Sum
Number     45   31   36    7   42    11    39    17  228
Patients    9    6    7    1    8     2     8     4   45
Normal     36   25   29    6   34     9    31    13  183

2.3.3. Motion Boundary Histograms. Motion boundary histograms (MBH) were proposed by Yao et al. for human detection in 2014 [23]. MBH computes derivatives separately for the horizontal and vertical components of the optical flow. Because MBH is the gradient of the optical flow, it removes uniform motion and keeps the changes of the optical flow field. To a certain extent, it complements the DHOG and HOF by removing the camera movement relative to the human. The optical flow is already generated for the other descriptors, which keeps the computational complexity of MBH small.
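The claim that the flow gradient removes uniform motion can be checked with a small sketch. It assumes one flow component is stored as a 2-D grid (at least two pixels in each direction) and uses central differences with one-sided differences at the borders; the function names and the toy fields are hypothetical.

```python
def gradient(field):
    """Spatial gradient of a 2-D grid (central differences, one-sided at borders)."""
    h, w = len(field), len(field[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            gx[y][x] = (field[y][x1] - field[y][x0]) / (x1 - x0)
            gy[y][x] = (field[y1][x] - field[y0][x]) / (y1 - y0)
    return gx, gy

def boundary_energy(field):
    """Sum of absolute gradient values over the grid."""
    gx, gy = gradient(field)
    return sum(abs(a) + abs(b) for ra, rb in zip(gx, gy) for a, b in zip(ra, rb))

# Horizontal flow component: uniform motion versus a motion boundary.
uniform_u = [[2.0] * 6 for _ in range(4)]
step_u = [[0.0, 0.0, 0.0, 2.0, 2.0, 2.0] for _ in range(4)]
```

The uniform field has zero gradient everywhere, whereas the step field responds only along the boundary between the static and moving regions.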

2.4. The Selection of Feature Encoding and Classifier. In this paper, we use the improved Fisher vector [24, 25] (signed square-rooting followed by $L_2$ normalization) with a GMM of 256 visual words for feature encoding. The Fisher vector, which can be seen as an extension of the bag-of-visual-words (BOV) model, is an intermediate representation of the image. Because the Fisher vector combines the advantages of generative and discriminative approaches within the Fisher kernel framework, it not only reflects the frequency of each visual word but also encodes how the local features differ from the visual words. It therefore represents richer image features than the bag-of-words model. Such a high-dimensional feature combined with a simple and effective linear classifier can achieve good classification results. So, in this paper, we use the linear SVM classifier, which performs well in classification and generalization and is simple and convenient to use in practice.
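The "improved" normalization mentioned above (signed square-rooting followed by $L_2$ normalization) is simple enough to sketch directly. The function name and the toy vector are illustrative; the Fisher vector itself would come from the encoding library.

```python
import math

def improved_fisher_normalize(vec):
    """Signed square root of each component, then division by the L2 norm."""
    rooted = [math.copysign(math.sqrt(abs(v)), v) for v in vec]
    norm = math.sqrt(sum(r * r for r in rooted))
    return [r / norm for r in rooted] if norm > 0 else rooted

# Toy "Fisher vector" with components of mixed sign.
normalized = improved_fisher_normalize([4.0, -9.0, 0.0, 1.0])
```

The square root dampens large components while keeping their sign, and the final normalization gives the vector unit length before it is passed to the linear classifier.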

3. Experimental Results and Analysis

3.1. Data. In this experiment, 228 different echocardiography videos were collected from 72 different patients (including 58 normal individuals and 14 patients with cardiac wall motion abnormalities); these data were provided by Tsinghua University First Hospital. Each echocardiography video lasts 1 to 2 seconds, and the videos are stored at a resolution of $434 \times 636$ with 26 frames in DICOM (Digital Imaging and Communications in Medicine) format. Table 1 shows the number of videos of each of the eight kinds.

3.2. Experiment Setup and Result Analysis. In this experiment, the dense sampling grid of feature points uses a step of 5 pixels, which gave the best result. The tracking length is set to 15 frames when tracking feature points. For the motion boundary description, the default parameters are $N = 32$, $n_\sigma = 2$, and $n_\tau = 3$, which gave the best results in our experimental testing. For feature encoding and classification, this experiment uses the vlfeat-0.9.19 software (downloaded from http://www.vlfeat.org). For the selection of training and testing videos, we use the "half and half" method (half of the echocardiography videos are randomly selected as training videos and the other half as testing videos), and we analyze the classification results.
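A "half and half" split could be sketched as follows. This is a plain random split over video identifiers with a fixed seed for repeatability; the function name and the seed are assumptions, and the paper does not state whether the split was drawn per class or over the whole data set.

```python
import random

def half_and_half_split(video_ids, seed=0):
    """Shuffle the identifiers and cut the list in the middle."""
    rng = random.Random(seed)
    shuffled = list(video_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# 228 videos, identified here by hypothetical integer ids.
train_ids, test_ids = half_and_half_split(range(228))
```

A per-class (stratified) variant would apply the same function to the videos of each viewpoint separately and concatenate the results.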

We use the confusion matrix to show the classification accuracy rate of the eight kinds of echocardiography video in Table 2; the accuracy rate is calculated as in (6). Overall, the average classification accuracy over all the classes is 77.12%. The classification accuracy differs between classes. In particular, the classification accuracy for A5C, PSAB, and PSAP is lower than for the others. The main reason is the shortage of experimental data for these three kinds of echocardiography video, which leads to a lack of training samples.

$$\text{accuracy rate} = \frac{\text{number of correct videos}}{\text{number of videos}}. \qquad (6)$$
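The accuracy rate in (6) can be computed per class from a confusion matrix and then averaged over classes. The sketch below uses a small hypothetical 3-class matrix (rows are true classes, columns are classified results) rather than the paper's data; the function names are illustrative.

```python
def per_class_accuracy(confusion):
    """Correct videos divided by all videos of that class, for each row."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(confusion)]

def average_accuracy(confusion):
    """Unweighted mean of the per-class accuracy rates."""
    acc = per_class_accuracy(confusion)
    return sum(acc) / len(acc)

# Hypothetical confusion matrix for three classes.
confusion = [[9, 1, 0],
             [0, 5, 5],
             [0, 0, 10]]
class_acc = per_class_accuracy(confusion)
mean_acc = average_accuracy(confusion)
```

Averaging over classes in this way gives every viewpoint the same weight, so a class with few videos and low accuracy pulls the average down noticeably.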

Since the above experimental result shows that the classification accuracy can be affected by the number of sample videos, in the following experiment we leave out the three kinds of video with the least experimental data (A5C, PSAB, and PSAP), repeat the experiment with the remaining five kinds, and compare the results (Table 3).

Comparing the results for eight kinds and five kinds of echocardiography video, Table 3 shows a significant improvement in classification accuracy, from the original 77.12% to 98.36%.

We also run a classification test on echocardiography videos of the three primary locations (AA, PLA, and PSA); the average classification accuracy is 100%, which can satisfy the basic needs of medical echocardiography video classification.

Finally, compared with previously proposed echocardiography video classification algorithms, our experimental results are considerably better (see Table 4), and for the echocardiography videos of the three primary locations, our experiment has the best accuracy. Therefore, the method of echocardiography video classification based on dense trajectory tracking and DHOG shows an obvious improvement in accuracy compared with the previous algorithms.

4. Summary and Discussion

This paper addresses the classification of echocardiography video with the dense trajectory tracking and DHOG algorithm. Firstly, we use the dense trajectory tracking and DHOG algorithm to describe and extract the features in each kind of echocardiography video and then apply the VLFeat software to encode the features as Fisher vectors. Finally, we use the linear SVM to get the classification results. The experimental results show that, compared with the previous method based on 3D-SIFT echocardiography video classification, the classification accuracy of the method adopted in this paper is improved significantly. However, the final average classification accuracy is still not high because of the lack of echocardiography video data. So, in this paper, we leave out the three kinds of video with little experimental data and run the classification experiment again. After solving the problem of the sample data, the average classification accuracy of the remaining five classes is nearly 20.9% higher than the average classification accuracy of eight classes, and for the three classes of primary positions of echocardiography video, the average classification accuracy is 100%. Therefore, the method of this paper can preliminarily meet the accuracy requirement of medical echocardiography video classification. In dealing with the classification of echocardiography video, the challenge is not just the video noise caused by the low resolution of the video but also how to balance computational complexity and time cost. In future work, we will gradually enlarge the database of the eight kinds of echocardiography video and adopt or propose newer and more effective methods to improve the classification accuracy and reduce the time required for classification.

Table 2: Confusion matrix for eight kinds of echocardiography video classification (rows: class; columns: classified result).

Class   A2C  A3C  A4C  A5C  PLA  PSAB  PSAM  PSAP  Accuracy
A2C      22    1    0    0    0     0     0     0      0.96
A3C       1   15    0    0    0     0     0     0      0.95
A4C       0    0   17    1    0     0     0     0      0.95
A5C       2    0    1    0    0     0     0     0      0.00
PLA       0    0    0    0   21     0     0     0      1.00
PSAB      0    0    0    0    1     3     0     2      0.50
PSAM      0    0    0    0    0     0    20     0      1.00
PSAP      0    0    0    0    2     0     0     7      0.81

Table 3: Confusion matrix for five kinds of echocardiography video classification (rows: class; columns: classified result).

Class   A2C  A3C  A4C  PLA  PSAM  Accuracy
A2C      22    1    0    0     0      0.96
A3C       1   15    0    0     0      0.95
A4C       0    0   18    0     0      1.00
PLA       0    0    0   21     0      1.00
PSAM      0    0    0    0    20      1.00

Table 4: The algorithm accuracy compared with other algorithms in this paper.

Method       Classification accuracy of eight classes (%)   Classification accuracy of three classes (%)
[8]          72                                             90
[12]         70.4                                           86
[13]         74                                             91
[14]         76                                             93
[15]         76.5                                           93
[16]         75                                             92
[17]         73.2                                           95
[18]         71.3                                           94
Our method   77.12                                          100

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Fund in China under Grants nos. 61471124 and 61473090. Also, this research forms part of the WIDTH project that is financially funded by the EC under the FP7 programme with Grant no. PIRSES-GA-2010-269124.

References

[1] R. Kumar, F. Wang, D. Beymer, and T. Syeda-Mahmood, "Cardiac disease detection from echocardiogram using edge filtered scale-invariant motion features," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW '10), pp. 162–169, IEEE, San Francisco, Calif, USA, June 2010.

[2] S. K. Zhou, J. H. Park, B. Georgescu, C. Simopoulos, J. Otsuki, and D. Comaniciu, "Image-based multiclass boosting and echocardiographic view classification," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '06), vol. 2, pp. 1559–1565, New York, NY, USA, June 2006.

[3] D. Beymer, T. Syeda-Mahmood, and F. Wang, "Exploiting spatio-temporal information for view recognition in cardiac echo videos," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW '08), pp. 1–8, IEEE, Anchorage, Alaska, USA, June 2008.

[4] A. Shalbaf, H. Behnam, Z. Alizade-Sani, and M. Shojaifard, "Automatic classification of left ventricular regional wall motion abnormalities in echocardiography images using nonrigid image registration," Journal of Digital Imaging, vol. 26, no. 5, pp. 909–919, 2013.

[5] F. Sera, T. S. Kato, M. Farr et al., "Left ventricular longitudinal strain by speckle-tracking echocardiography is associated with treatment-requiring cardiac allograft rejection," Journal of Cardiac Failure, vol. 20, no. 5, pp. 359–364, 2014.

[6] J. Y. Wang, Y. Li, Y. Zhang et al., "Bag-of-features based medical image retrieval via multiple assignment and visual words weighting," IEEE Transactions on Medical Imaging, vol. 30, no. 11, pp. 1996–2011, 2011.

[7] Y. Guo, Y. Wang, D. Kong, and X. Shu, "Automatic classification of intracardiac tumor and thrombi in echocardiography based on sparse representation," IEEE Journal of Biomedical and Health Informatics, vol. 19, no. 2, pp. 601–611, 2015.

[8] Y. Qian, L. Wang, C. Wang, and X. Gao, "The synergy of 3D SIFT and sparse codes for classification of viewpoints from echocardiogram video," in Medical Content-Based Retrieval for Clinical Decision Support: Third MICCAI International Workshop, MCBR-CDS 2012, Nice, France, October 1, 2012, Revised Selected Papers, vol. 7723 of Lecture Notes in Computer Science, pp. 68–79, Springer, Berlin, Germany, 2013.

[9] M. Rap and A. Chacko, "Optimising the use of transoesophageal echocardiography in diagnosing suspected infective endocarditis," Acta Cardiologica, vol. 70, no. 4, pp. 487–491, 2015.

[10] N. Espinola-Zavaleta, M. E. Soto, A. Romero-Gonzalez et al., "Prevalence of congenital heart disease and pulmonary hypertension in Down's syndrome: an echocardiographic study," Journal of Cardiovascular Ultrasound, vol. 23, no. 2, pp. 72–77, 2015.

[11] H. Wang, A. Klaser, C. Schmid, and C.-L. Liu, "Dense trajectories and motion boundary descriptors for action recognition," International Journal of Computer Vision, vol. 103, no. 1, pp. 60–79, 2013.

[12] I. Laptev, M. Marszałek, C. Schmid, and B. Rozenfeld, "Learning realistic human actions from movies," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, Anchorage, Alaska, USA, June 2008.

[13] J. Liu, J. Luo, and M. Shah, "Recognizing realistic actions from videos in the wild," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '09), pp. 1996–2003, IEEE, Miami, Fla, USA, June 2009.

[14] A. Gilbert, J. Illingworth, and R. Bowden, "Fast realistic multi-action recognition using mined dense spatio-temporal features," in Proceedings of the 12th International Conference on Computer Vision, pp. 925–931, IEEE, Kyoto, Japan, October 2009.

[15] W. Brendel and S. Todorovic, "Activities as time series of human postures," in Computer Vision – ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5–11, 2010, Proceedings, Part II, vol. 6312, pp. 721–734, Springer, Berlin, Germany, 2010.

[16] A. Kovashka and K. Grauman, "Learning a hierarchy of discriminative space-time neighborhood features for human action recognition," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '10), pp. 2046–2053, San Francisco, Calif, USA, June 2010.

[17] Q. V. Le, W. Y. Zou, S. Y. Yeung, and A. Y. Ng, "Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 3361–3368, June 2011.

[18] B. Li, M. Ayazoglu, T. Mao, O. I. Camps, and M. Sznaier, "Activity recognition using dynamic subspace angles," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 3193–3200, IEEE, Providence, RI, USA, June 2011.

[19] G. C. Zhang and P. A. Vela, "Good features to track for visual SLAM," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '15), pp. 1373–1382, IEEE, Boston, Mass, USA, June 2015.

[20] H. Wang and C. Schmid, "Action recognition with improved trajectories," in Proceedings of the 14th IEEE International Conference on Computer Vision (ICCV '13), pp. 3551–3558, Sydney, Australia, December 2013.

[21] X. Li, G. Cui, W. Yi, and L. Kong, "A fast maneuvering target motion parameters estimation algorithm based on ACCF," IEEE Signal Processing Letters, vol. 22, no. 3, pp. 270–274, 2015.

[22] Y. Ji, M. Yang, Z. Lu, and C. Wang, "Integrating visual selective attention model with HOG features for traffic light detection and recognition," in Proceedings of the IEEE Intelligent Vehicles Symposium (IV '15), pp. 280–285, Seoul, Republic of Korea, June-July 2015.

[23] B. Z. Yao, B. X. Nie, Z. Liu, and S.-C. Zhu, "Animated pose templates for modeling and detecting human actions," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 3, pp. 436–452, 2014.

[24] D. Veitch and P. Tune, "Optimal skampling for the flow size distribution," IEEE Transactions on Information Theory, vol. 61, no. 6, pp. 3075–3099, 2015.

[25] B. Klein, G. Lev, G. Sadeh, and L. Wolf, "Associating neural word embeddings with deep image representations using Fisher Vectors," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '15), pp. 4437–4446, IEEE, Boston, Mass, USA, June 2015.


Page 3: Research Article Dense Trajectories and DHOG for Classification …downloads.hindawi.com/journals/cmmm/2016/9610192.pdf · 2019-07-30 · Research Article Dense Trajectories and DHOG

Computational and Mathematical Methods in Medicine 3

Figure 3: The robustness of the compression direction on background. (a) Original image. (b) Projection in [0, 2π]. (c) Projection in [0, π]. Each histogram plots Value against Bin.

cooperate with optical flow and improve the accuracy of the optical flow, but also its principle is simple and it has low computational complexity. In our research, we use the implementation from the OpenCV library to calculate the dense optical flow field.

The shape descriptor of a trajectory encodes the local motion positions. Given a trajectory of length $L$, we describe its shape by a sequence $(\Delta P_t, \ldots, \Delta P_{t+L-1})$ of displacement vectors $\Delta P_t = (P_{t+1} - P_t) = (x_{t+1} - x_t, y_{t+1} - y_t)$. The resulting vector is normalized by the sum of the displacement vector magnitudes:

$$T = \frac{(\Delta P_t, \ldots, \Delta P_{t+L-1})}{\sum_{j=t}^{t+L-1} \left\| \Delta P_j \right\|}. \tag{3}$$

In the following, we refer to this vector as the trajectory. As we use trajectories with a fixed length of $L = 15$ frames, we obtain a 30-dimensional descriptor.
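The normalization in Equation (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `trajectory_descriptor` and the list-of-tuples input are hypothetical, and 16 tracked positions are assumed so that $L = 15$ displacements result.

```python
import math

def trajectory_descriptor(points):
    """Normalized displacement sequence of Eq. (3) for a list of (x, y) points."""
    # Displacements dP_t = P_{t+1} - P_t between consecutive tracked positions.
    deltas = [(x1 - x0, y1 - y0)
              for (x0, y0), (x1, y1) in zip(points, points[1:])]
    total = sum(math.hypot(dx, dy) for dx, dy in deltas)
    if total == 0.0:
        # A static track has no shape; return zeros instead of dividing by zero.
        return [0.0] * (2 * len(deltas))
    # Flatten to (dx_t, dy_t, dx_{t+1}, dy_{t+1}, ...) and divide by the sum
    # of the displacement magnitudes.
    return [c / total for dx, dy in deltas for c in (dx, dy)]

# 16 tracked positions give L = 15 displacements, hence 30 values.
track = [(float(t), 0.5 * t) for t in range(16)]
descriptor = trajectory_descriptor(track)
```

Because the displacements are divided by their summed magnitude, the descriptor depends on the shape of the path and not on how far the point travelled overall.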

2.3. Motion and Structure Descriptors. Besides the trajectory shape information, we also design descriptors that embed appearance and motion information, including difference histograms of oriented gradients (DHOG), histograms of optical flow (HOF), and motion boundary histograms (MBH). These features are combined into a motion model of the local feature descriptor. Over $L$ consecutive frames of video, we extract the features in a local area of $N \times N$ pixels. In order to embed structural information, we divide the $N \times N \times L$ spatiotemporal cube into a grid of $n_\sigma \times n_\sigma \times n_\tau$ cells and compute the local descriptor corresponding to each cell. Finally, we combine the descriptors into the final vector.
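As a rough sketch of this grid layout, the following Python maps a pixel offset inside the cube to the cell it falls in. The function name `cell_index` is hypothetical; the default values $N = 32$, $L = 15$, $n_\sigma = 2$, and $n_\tau = 3$ are the settings reported in the experimental section, and equal-sized cells are an assumption.

```python
N, LENGTH, N_SIGMA, N_TAU = 32, 15, 2, 3

def cell_index(dx, dy, dt, n=N, length=LENGTH, n_sigma=N_SIGMA, n_tau=N_TAU):
    """Map an offset inside the N x N x L cube to its (row, col, slice) cell."""
    if not (0 <= dx < n and 0 <= dy < n and 0 <= dt < length):
        raise ValueError("offset lies outside the spatiotemporal cube")
    col = dx * n_sigma // n          # spatial cell along x
    row = dy * n_sigma // n          # spatial cell along y
    tslice = dt * n_tau // length    # temporal cell along the trajectory
    return row, col, tslice

# Number of cells whose local descriptors are concatenated.
n_cells = N_SIGMA * N_SIGMA * N_TAU
```

With these settings there are 12 cells, so a per-cell histogram of $b$ bins yields a descriptor of $12b$ values.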

2.3.1. Difference Histograms of Oriented Gradients. The HOG feature [22] is a descriptor based on local statistical characteristics; its main idea is that the appearance and shape of a local target in an image can be described by the density distribution of gradient or edge directions. First, the image is divided into several nonoverlapping cells of size $N \times N$, and the gradients in each cell are calculated and then represented as a histogram. By merging multiple cells into a vector, the HOG features are obtained.

Although the traditional HOG features are strongly robust to a certain amount of illumination change, the expressive ability of the HOG feature is still limited. Since the gradient direction can be changed by the light and shade of the background, it is possible to generate different expressions for the HOG feature, as shown in Figure 3. The projection in the interval [0, π] leaves the gradient direction histogram unchanged, as shown in Figure 3(c).

This can cause the difference information of some targets to be ignored. Objects of different shapes whose gradients have opposite directions but the same magnitude and number produce the same histogram, so the HOG descriptor is identical for them. The HOG features therefore cannot distinguish these objects.

In order to improve the expressive ability of the HOG feature for characteristic trajectories in echocardiography video, we propose an improved HOG descriptor algorithm. We divide the interval $[0, 2\pi]$ into $L$ bins and get the gradient direction histogram $H_g$, where $L$ is even and each pixel in the cell casts a vote. The HOG is then

$$H_{og}(i) = H_g(i) + H_g\left(i + \frac{L}{2}\right), \quad 1 \le i \le \frac{L}{2}, \tag{4}$$

where $H_{og}(i)$ and $H_g(i)$ are, respectively, the $i$th element values of HOG and $H_g$. In this paper, we append a new series of histogram values $H_{ng}$, whose length is equal to that of the HOG, after the HOG, as shown in Figure 4; each element value is

$$H_{ng}(i) = \left| H_g(i) - H_g\left(i + \frac{L}{2}\right) \right|, \quad 1 \le i \le \frac{L}{2}. \tag{5}$$

Figure 4: Comparison of HOG and DHOG. Each histogram plots Value against Bin (bins 1 to 9).

In this paper, we call the new gradient direction histogram the difference HOG (DHOG). In order to simplify the calculation, we extract the HOG feature descriptors in a block of 32 × 32 pixels around the dense trajectory. Each block is divided into 2 × 2 cells. In order to obtain the optimal description while considering the differences between opposite directions, the range 0–180° is divided into nine bins and then extended to $[0, 2\pi]$ by the DHOG: we divide the gradient direction range of 0–360° into 18 bins and calculate the gradient direction histogram in each cell. Then we divide the $L = 15$ frames of the dense trajectory into three parts and sum the DHOG features in the corresponding bins. The feature vectors are also normalized with the $L_2$-norm. So, in this paper, the DHOG feature of a dense trajectory with $L = 15$ frames has a dimensionality of 2 × 2 × 18 × 3 = 216.
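Equations (4) and (5) can be illustrated for a single cell with a short Python sketch. The function names are hypothetical, the magnitude-weighted vote is an assumption (the text only says that each pixel votes), indices are 0-based rather than 1-based, and the temporal summation and $L_2$ normalization are left out.

```python
import math

def direction_histogram(gradients, n_bins=18):
    """Histogram H_g of gradient directions over [0, 2*pi), one vote per pixel."""
    hist = [0.0] * n_bins
    for gx, gy in gradients:
        magnitude = math.hypot(gx, gy)
        if magnitude == 0.0:
            continue  # a flat pixel has no direction to vote for
        angle = math.atan2(gy, gx) % (2 * math.pi)
        index = min(int(angle / (2 * math.pi) * n_bins), n_bins - 1)
        hist[index] += magnitude  # assumption: votes are weighted by magnitude
    return hist

def dhog_cell(gradients, n_bins=18):
    """Concatenate the folded HOG (Eq. 4) and the difference part (Eq. 5)."""
    h_g = direction_histogram(gradients, n_bins)
    half = n_bins // 2
    h_og = [h_g[i] + h_g[i + half] for i in range(half)]       # Eq. (4)
    h_ng = [abs(h_g[i] - h_g[i + half]) for i in range(half)]  # Eq. (5)
    return h_og + h_ng

# All gradients point right, versus half pointing right and half pointing left.
uniform = dhog_cell([(1.0, 0.0)] * 16)
mixed = dhog_cell([(1.0, 0.0)] * 8 + [(-1.0, 0.0)] * 8)
```

In this toy case the first nine values (the folded HOG) are identical for both cells, while the difference part separates them: its first entry is 16 for the uniform cell and 0 for the mixed one.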

2.3.2. Gradient and Optical Flow Histograms. Unlike the DHOG descriptor, which extracts static shape information, the HOF uses optical flow to describe and extract motion information. The calculation method of HOF is the same as that of DHOG. HOF takes the sum of the absolute values of the horizontal and vertical components of the optical flow as the amplitude of the flow and takes the arctangent of the ratio of the vertical to the horizontal component as the direction angle of the flow. Then the flow direction is projected into the adjacent bins, weighted by the amplitude of the flow. To account for pixels whose flow amplitude is small (less than a certain threshold), we add a zero bin to the HOF feature, which holds the number of such pixels. The HOF descriptor is also normalized with the $L_2$-norm. The other parameters are the same as for the DHOG. So the dimensionality of HOF is 2 × 2 × 9 × 3 = 108.
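A minimal per-cell sketch of this histogram is given below. Several details are assumptions rather than the authors' settings: the nine bins are taken to be eight orientation bins plus the zero bin, the threshold of 0.1 is a placeholder, and each pixel votes for a single bin instead of being shared between adjacent bins. The name `hof_cell` is hypothetical.

```python
import math

def hof_cell(flow, n_orient=8, threshold=0.1):
    """Histogram of optical flow for one cell, with a trailing zero bin."""
    hist = [0.0] * (n_orient + 1)
    for u, v in flow:
        amplitude = abs(u) + abs(v)  # sum of absolute components, as in the text
        if amplitude < threshold:
            hist[n_orient] += 1.0    # zero bin counts nearly static pixels
            continue
        angle = math.atan2(v, u) % (2 * math.pi)
        index = min(int(angle / (2 * math.pi) * n_orient), n_orient - 1)
        hist[index] += amplitude     # simplification: hard vote into one bin
    norm = math.sqrt(sum(h * h for h in hist))
    return [h / norm for h in hist] if norm > 0.0 else hist

# One pixel moving to the right and three static pixels.
cell = hof_cell([(1.0, 0.0), (0.0, 0.0), (0.0, 0.0), (0.0, 0.0)])
```

Here the moving pixel lands in the first orientation bin and the three static pixels are counted in the zero bin before the $L_2$ normalization.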

Table 1: The number of each type of video.

Video      A2C  A3C  A4C  A5C  PLA  PSAB  PSAM  PSAP  Sum
Number      45   31   36    7   42    11    39    17  228
Patients     9    6    7    1    8     2     8     4   45
Normal      36   25   29    6   34     9    31    13  183

2.3.3. Motion Boundary Histograms. Motion boundary histograms (MBH) were proposed by Yao et al. for human detection in 2014 [23]. MBH computes derivatives separately for the horizontal and vertical components of the optical flow. Since MBH is the gradient of the optical flow, it removes uniform motion and keeps the changes in the optical flow field. To a certain extent, it complements the DHOG and HOF by suppressing camera motion relative to the human. Because the optical flow is already computed for the other descriptors, the additional computational complexity of MBH is small.
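The claim that the gradient of the optical flow removes uniform motion can be checked with a small Python sketch. The function `flow_gradient` is a hypothetical helper that differentiates one flow component with central differences; the histogram step that would follow for each component is not shown.

```python
def flow_gradient(component):
    """Spatial derivatives (d/dx, d/dy) of one flow component given as a 2D list."""
    rows, cols = len(component), len(component[0])
    grad = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Indices are clamped at the border, so border values are
            # halved one-sided differences.
            left = component[r][max(c - 1, 0)]
            right = component[r][min(c + 1, cols - 1)]
            up = component[max(r - 1, 0)][c]
            down = component[min(r + 1, rows - 1)][c]
            row.append(((right - left) / 2.0, (down - up) / 2.0))
        grad.append(row)
    return grad

# Uniform motion (for example a constant shift of the whole image).
uniform = [[2.0] * 4 for _ in range(3)]
# A motion boundary: the left half is static, the right half moves.
boundary = [[0.0, 0.0, 1.0, 1.0] for _ in range(3)]

uniform_grad = flow_gradient(uniform)
boundary_grad = flow_gradient(boundary)
```

The uniform field produces zero derivatives everywhere, whereas the second field responds only at the columns next to the boundary.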

2.4. The Selection of Feature Encoding and Classifier. In this paper, we use the improved Fisher vector [24, 25] (signed square-rooting followed by $L_2$ normalization) with a GMM of 256 visual words for feature encoding. The Fisher vector, which can be seen as an extension of the bag-of-visual-words (BOV) model, is an intermediate representation of the image. Because the Fisher kernel framework combines the advantages of generative and discriminative approaches, the Fisher vector not only reflects the frequency of each visual word but also encodes how the local features differ from the visual words. It can therefore represent richer image features than the bag-of-words model. Such a high-dimensional feature, combined with a simple and effective linear classifier, can achieve good classification results. So, in this paper, we use a linear SVM classifier, which performs well in classification and generalization and is simple and convenient to use in practice.
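The "improved" normalization mentioned above (signed square-rooting followed by $L_2$ normalization) is simple enough to sketch in Python. The function name `improved_normalize` is hypothetical, and the GMM fitting and the computation of the raw Fisher vector, for which the paper relies on VLFeat, are not shown.

```python
import math

def improved_normalize(fisher_vector):
    """Signed square-rooting followed by L2 normalization of a raw Fisher vector."""
    # sign(v) * sqrt(|v|) compresses large entries while keeping their sign.
    rooted = [math.copysign(math.sqrt(abs(v)), v) for v in fisher_vector]
    norm = math.sqrt(sum(v * v for v in rooted))
    return [v / norm for v in rooted] if norm > 0.0 else rooted

# Toy three-component vector; a real Fisher vector is far longer.
encoded = improved_normalize([4.0, -9.0, 0.0])
```

After this step the vector has unit length, which suits the linear SVM that follows.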

3. Experimental Results and Analysis

3.1. Data. In this experiment, 228 different echocardiography videos were collected from 72 different patients (including 58 normal individuals and 14 patients with cardiac wall motion abnormalities); these data were provided by Tsinghua University First Hospital. Each echocardiography video is 1 to 2 seconds long, and the videos are stored at a resolution of 434 × 636 in 26-frame DICOM format (Digital Imaging and Communications in Medicine). Table 1 shows the detailed number of the eight kinds of video.

3.2. Experiment Setup and Result Analysis. In this experiment, the feature points are densely sampled on a grid with a step of 5 pixels, which gave the best result. The tracking length is set to 15 frames when tracking the feature points. For the motion boundary description, the default parameters are $N = 32$, $n_\sigma = 2$, and $n_\tau = 3$; these parameters had the best effect in our experimental tests. For feature encoding and classification, this experiment adopts the vlfeat-0.9.19 software (downloaded from http://www.vlfeat.org). For the selection of training and testing videos, we use the “half and half” method (half of the echocardiography videos are randomly selected as training videos and the other half as testing videos), and we analyze the classification results.
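The “half and half” protocol can be sketched as a random split in Python. The function name and the fixed seed are assumptions made for reproducibility of the illustration, and the split here is not stratified by class, since the text does not say whether the authors balanced the classes.

```python
import random

def half_and_half(items, seed=0):
    """Randomly split items into two halves (training, testing)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    middle = len(shuffled) // 2
    return shuffled[:middle], shuffled[middle:]

# 228 video indices, matching the size of the data set in Section 3.1.
train_ids, test_ids = half_and_half(range(228))
```

With 228 videos each half contains 114 items, and every video appears in exactly one of the two halves.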

We use the confusion matrix in Table 2 to show the classification accuracy rate for the eight kinds of echocardiography video; the calculation of the accuracy rate is given in Equation (6). Overall, the average classification accuracy over all the classes is 77.12%. The classification accuracy differs between classes. In particular, the classification accuracy for A5C, PSAB, and PSAP is lower than for the others. The main reason is the shortage of experimental data for these three kinds of echocardiography videos, which leads to a lack of training samples.

$$\text{accuracy rate} = \frac{\text{number of correct videos}}{\text{number of videos}}. \tag{6}$$
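Equation (6) and the reported average can be worked through in Python. The function name `accuracy_rate` is hypothetical; the counts come from the A2C row of Table 2, and the list holds the per-class values printed in the Accuracy column of that table.

```python
def accuracy_rate(n_correct, n_videos):
    """Eq. (6): fraction of the videos of a class that were classified correctly."""
    if n_videos <= 0:
        raise ValueError("the class has no test videos")
    return n_correct / n_videos

# A2C row of Table 2: 22 of the 23 test videos are classified as A2C.
a2c = accuracy_rate(22, 23)

# Per-class accuracies as printed in the Accuracy column of Table 2.
printed = [0.96, 0.95, 0.95, 0.00, 1.00, 0.50, 1.00, 0.81]
average = sum(printed) / len(printed)
```

The A2C value rounds to 0.96, and the mean of the printed per-class values is about 0.771, in line with the reported average of 77.12%.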

Since the above result suggests that the classification accuracy is affected by the number of sample videos, in the following experiment we remove the three kinds of video with the least experimental data (A5C, PSAB, and PSAP), repeat the experiment with the remaining five kinds, and compare the results (Table 3).

Comparing the results for eight kinds and five kinds of echocardiography video (Table 3), we find that the classification accuracy improves significantly, from the original 77.12% to 98.36%.

Our experiments also include a classification test for echocardiography videos of the three primary locations (AA, PLA, and PSA); the average classification accuracy is 100%, which can satisfy the basic needs of medical echocardiography video classification.

Finally, compared with previously proposed echocardiography video classification algorithms, our experimental result is considerably higher (see Table 4). For the echocardiography videos of the three primary locations, our method has the best accuracy. Therefore, the echocardiography video classification method based on dense trajectory tracking and DHOG shows an obvious improvement in accuracy over previous algorithms.

Table 2: Confusion matrix for eight kinds of echocardiography video classification (rows: class; columns: classified result).

Class   A2C  A3C  A4C  A5C  PLA  PSAB  PSAM  PSAP  Accuracy
A2C      22    1    0    0    0     0     0     0      0.96
A3C       1   15    0    0    0     0     0     0      0.95
A4C       0    0   17    1    0     0     0     0      0.95
A5C       2    0    1    0    0     0     0     0      0.00
PLA       0    0    0    0   21     0     0     0      1.00
PSAB      0    0    0    0    1     3     0     2      0.50
PSAM      0    0    0    0    0     0    20     0      1.00
PSAP      0    0    0    0    2     0     0     7      0.81

Table 3: Confusion matrix for five kinds of echocardiography video classification (rows: class; columns: classified result).

Class   A2C  A3C  A4C  PLA  PSAM  Accuracy
A2C      22    1    0    0     0      0.96
A3C       1   15    0    0     0      0.95
A4C       0    0   18    0     0      1.00
PLA       0    0    0   21     0      1.00
PSAM      0    0    0    0    20      1.00

Table 4: The algorithm accuracy compared with other algorithms in this paper.

Method       Classification accuracy    Classification accuracy
             of eight classes (%)       of three classes (%)
[8]          72                         90
[12]         70.4                       86
[13]         74                         91
[14]         76                         93
[15]         76.5                       93
[16]         75                         92
[17]         73.2                       95
[18]         71.3                       94
Our method   77.12                      100

4. Summary and Discussion

This paper addresses the classification of echocardiography videos with the dense trajectory tracking and DHOG algorithm. First, we use dense trajectory tracking and the DHOG algorithm to describe the features in each kind of echocardiography video, and then we apply the VLFeat software to encode the features as Fisher vectors. Finally, we use a linear SVM to obtain the classification results. The experimental results show that, compared with the previous method based on 3D-SIFT echocardiography video classification, the classification accuracy of the method adopted in this paper is improved significantly. But the final average classification accuracy is still not high because of the lack of echocardiography video data. So, in this paper, we removed the three kinds of video with the least experimental data and repeated the classification experiment. After addressing the problem of the sample data, the average classification accuracy for the remaining five classes increased by nearly 20.9% compared with the average classification accuracy for eight classes. For the three classes of primary positions of echocardiography video, the average classification accuracy is 100%. Therefore, the method of this paper can preliminarily meet the accuracy requirement of medical echocardiography video classification. In echocardiography video classification, the challenge is not just the video noise caused by the low resolution of the video but also how to balance the computational complexity and time cost. In future work, we will gradually enlarge the database of the eight kinds of echocardiography video and adopt or propose newer and more effective methods to improve the classification accuracy and reduce the time required for classification.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Fund in China under Grants nos. 61471124 and 61473090. Also, this research forms part of the WIDTH project, which is financially funded by the EC under the FP7 programme with Grant no. PIRSES-GA-2010-269124.

References

[1] R. Kumar, F. Wang, D. Beymer, and T. Syeda-Mahmood, “Cardiac disease detection from echocardiogram using edge filtered scale-invariant motion features,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW ’10), pp. 162–169, IEEE, San Francisco, Calif, USA, June 2010.

[2] S. K. Zhou, J. H. Park, B. Georgescu, C. Simopoulos, J. Otsuki, and D. Comaniciu, “Image-based multiclass boosting and echocardiographic view classification,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’06), vol. 2, pp. 1559–1565, New York, NY, USA, June 2006.

[3] D. Beymer, T. Syeda-Mahmood, and F. Wang, “Exploiting spatio-temporal information for view recognition in cardiac echo videos,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW ’08), pp. 1–8, IEEE, Anchorage, Alaska, USA, June 2008.

[4] A. Shalbaf, H. Behnam, Z. Alizade-Sani, and M. Shojaifard, “Automatic classification of left ventricular regional wall motion abnormalities in echocardiography images using nonrigid image registration,” Journal of Digital Imaging, vol. 26, no. 5, pp. 909–919, 2013.

[5] F. Sera, T. S. Kato, M. Farr, et al., “Left ventricular longitudinal strain by speckle-tracking echocardiography is associated with treatment-requiring cardiac allograft rejection,” Journal of Cardiac Failure, vol. 20, no. 5, pp. 359–364, 2014.

[6] J. Y. Wang, Y. Li, Y. Zhang, et al., “Bag-of-features based medical image retrieval via multiple assignment and visual words weighting,” IEEE Transactions on Medical Imaging, vol. 30, no. 11, pp. 1996–2011, 2011.

[7] Y. Guo, Y. Wang, D. Kong, and X. Shu, “Automatic classification of intracardiac tumor and thrombi in echocardiography based on sparse representation,” IEEE Journal of Biomedical and Health Informatics, vol. 19, no. 2, pp. 601–611, 2015.

[8] Y. Qian, L. Wang, C. Wang, and X. Gao, “The synergy of 3D SIFT and sparse codes for classification of viewpoints from echocardiogram video,” in Medical Content-Based Retrieval for Clinical Decision Support: Third MICCAI International Workshop, MCBR-CDS 2012, Nice, France, October 1, 2012, Revised Selected Papers, vol. 7723 of Lecture Notes in Computer Science, pp. 68–79, Springer, Berlin, Germany, 2013.

[9] M. Rap and A. Chacko, “Optimising the use of transoesophageal echocardiography in diagnosing suspected infective endocarditis,” Acta Cardiologica, vol. 70, no. 4, pp. 487–491, 2015.

[10] N. Espinola-Zavaleta, M. E. Soto, A. Romero-Gonzalez, et al., “Prevalence of congenital heart disease and pulmonary hypertension in Down’s syndrome: an echocardiographic study,” Journal of Cardiovascular Ultrasound, vol. 23, no. 2, pp. 72–77, 2015.

[11] H. Wang, A. Kläser, C. Schmid, and C.-L. Liu, “Dense trajectories and motion boundary descriptors for action recognition,” International Journal of Computer Vision, vol. 103, no. 1, pp. 60–79, 2013.

[12] I. Laptev, M. Marszałek, C. Schmid, and B. Rozenfeld, “Learning realistic human actions from movies,” in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’08), pp. 1–8, Anchorage, Alaska, USA, June 2008.

[13] J. Liu, J. Luo, and M. Shah, “Recognizing realistic actions from videos in the wild,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’09), pp. 1996–2003, IEEE, Miami, Fla, USA, June 2009.

[14] A. Gilbert, J. Illingworth, and R. Bowden, “Fast realistic multi-action recognition using mined dense spatio-temporal features,” in Proceedings of the 12th International Conference on Computer Vision, pp. 925–931, IEEE, Kyoto, Japan, October 2009.

[15] W. Brendel and S. Todorovic, “Activities as time series of human postures,” in Computer Vision – ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5–11, 2010, Proceedings, Part II, vol. 6312, pp. 721–734, Springer, Berlin, Germany, 2010.

[16] A. Kovashka and K. Grauman, “Learning a hierarchy of discriminative space-time neighborhood features for human action recognition,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’10), pp. 2046–2053, San Francisco, Calif, USA, June 2010.

[17] Q. V. Le, W. Y. Zou, S. Y. Yeung, and A. Y. Ng, “Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’11), pp. 3361–3368, June 2011.

[18] B. Li, M. Ayazoglu, T. Mao, O. I. Camps, and M. Sznaier, “Activity recognition using dynamic subspace angles,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’11), pp. 3193–3200, IEEE, Providence, RI, USA, June 2011.

[19] G. C. Zhang and P. A. Vela, “Good features to track for visual SLAM,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’15), pp. 1373–1382, IEEE, Boston, Mass, USA, June 2015.

[20] H. Wang and C. Schmid, “Action recognition with improved trajectories,” in Proceedings of the 14th IEEE International Conference on Computer Vision (ICCV ’13), pp. 3551–3558, Sydney, Australia, December 2013.

[21] X. Li, G. Cui, W. Yi, and L. Kong, “A fast maneuvering target motion parameters estimation algorithm based on ACCF,” IEEE Signal Processing Letters, vol. 22, no. 3, pp. 270–274, 2015.

[22] Y. Ji, M. Yang, Z. Lu, and C. Wang, “Integrating visual selective attention model with HOG features for traffic light detection and recognition,” in Proceedings of the IEEE Intelligent Vehicles Symposium (IV ’15), pp. 280–285, Seoul, Republic of Korea, June-July 2015.

[23] B. Z. Yao, B. X. Nie, Z. Liu, and S.-C. Zhu, “Animated pose templates for modeling and detecting human actions,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 3, pp. 436–452, 2014.

[24] D. Veitch and P. Tune, “Optimal skampling for the flow size distribution,” IEEE Transactions on Information Theory, vol. 61, no. 6, pp. 3075–3099, 2015.

[25] B. Klein, G. Lev, G. Sadeh, and L. Wolf, “Associating neural word embeddings with deep image representations using Fisher Vectors,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’15), pp. 4437–4446, IEEE, Boston, Mass, USA, June 2015.


4 Computational and Mathematical Methods in Medicine

1 2 3 4 5 6 7 8 9

1 2 3 4 5 6 7 8 9

1 2 3 4 5 6 7 8 9

0

02

04

06

08

1

Valu

eVa

lue

Valu

eBin

Bin

Bin

Bin

0

01

02

03

04

05

0

01

02

03

04

05

Bin

Bin

1 2 3 4 5 6 7 8 9

1 2 3 4 5 6 7 8 9

1 2 3 4 5 6 7 8 9

0

02

04

06

08

1

Valu

eVa

lue

Valu

e

Bin

Bin

Bin

Bin

0

01

02

03

04

05

0

01

02

03

04

05

Bin

Bin

Figure 4 Comparison of HOG and DHOG

where119867119900119892(119894) and119867

119892(119894) are respectively 119894th element values of

HOG and119867119892 In this paper we set a new series of histograms

119867119899119892

whose length is equal to the HOG after the HOG asshown in Figure 4 and each element value is

119867119899119892 (119894) =

1003816100381610038161003816100381610038161003816119867119892 (119894) minus 119867119892 (119894 +

119871

2)

1003816100381610038161003816100381610038161003816 1 le 119894 le

119871

2 (5)

In this paper we call the new gradient direction his-togram as difference HOG (DHOG) in order to simplify thecalculation we extract theHOG feature descriptors in 32times 32pixels of dense trajectory Each block is divided into 2times 2 cellsIn order to obtain the optimal description while consideringthe differences in the opposite direction in this paper the area(0sim180∘) is divided into nine bins and then extended to [02120587] by DHOG We divide the gradient direction area of 0sim360∘ into 18 bins and calculate gradient direction histogramin each cell Then we divide the 119871 = 15 framesrsquo densetrajectory into three parts and sum the DHOG features withthe corresponding bin The eigenvectors also are normalizedwith the 1198712-norm So in this paper there is a dense trajectoryof HOG feature with 119871 = 15 frames whose dimensionality is2 times 2 times 18 times 3 = 216

232 Gradient and Optical Flow Histograms Different fromtheDHOGdescriptor extracting the static shape informationthe HOF uses optical flow to describe and extract themotional informationThe calculation method of HOF is thesame as DHOG HOF makes the sum of modulus value oflight flowrsquos horizontal component and vertical componentas the amplitude of the light flow and makes the arctangentvalue of the ratio of vertical and horizontal component as thedirection angle of the light flowThen the light flow directionwill be projected into the adjacent bin which is based onthe amplitude of light flow Considering that the light flowamplitude is small (less than a certain threshold) we willadd a zero bin in the HOF feature which is used to hold thenumber of pixels HOF descriptor also is normalized with the1198712-norm The other parameters are the same as the DHOGSo the dimensionality of HOF is 2 times 2 times 9 times 3 = 108

233 Motion Boundary Histograms Motion boundary his-tograms (MBH) are proposed by Yao at el for detection ofthe human in 2014 [23] It computes derivatives separatelyfor the horizontal and vertical components MBH is thegradient of light flow it can get rid of uniform motion

Computational and Mathematical Methods in Medicine 5

Table 1 The number of each type of video

Video A2C A3C A4C A5C PLA PSAB PSAM PSAP SumNumber 45 31 36 7 42 11 39 17 228Patients 9 6 7 1 8 2 8 4 45Normal 36 25 29 6 34 9 31 13 183

and keep the change of optical flow field In a certainextent it is a complement to the DHOG and HOF that isremoving the camera movement to the human The DHOGand HOF are generated from the optical flow which makesthe computational complexity of MBH small

24 The Selection of Feature Encoding and Classifier In thispaper we use the Fisher vector [24 25] with the improved256 visual words of GMM (signed square-rooting followedby 1198712-norm) for features encoding Fisherrsquos vector seen as theexpansion of the BOV word bag model is the intermediateexpression of image Because Fisherrsquos vector fuses the advan-tages of production and is discriminant of Fisherrsquos kernelarchitecture it can not only reflect the frequency of eachvisual word but also encode the differences of local featurein visual words information In addition it can representthe richer image feature compared to the word bag modelThe high dimensional feature that combined with simpleand effective linear classifier can achieve good classificationeffect So in this paper we use the linear SVM classifierwhich has the excellent performance of machine learning inclassification and generalization So linear SVM classifier hasa lot of advantages compared to other learning algorithms inpractice and is simple and convenient to use Its performanceis better in classification

3 Experimental Results and Analysis

31 Data In this experiment 228 different echocardiographyvideos are collected from 72 different patients (including58 normal individuals and 14 patients with cardiac wallmotion abnormalities) these data were provided by TsinghuaUniversity First Hospital All the length of echocardiographyvideo is 1 to 2 seconds and the videos are stored as 434times 636 resolution and in 26-frame DICOM format (DigitalImaging and Communications in Medicine) Table 1 showsthe detailed number of eight kinds of video

32 Experiment Setup and Result Analysis In this experimentthe dense sampling grid of feature points is selected with 5pixels in the process of dense sampling and it can guaranteethe best result Then the tracking frame length is set to15 frames in the process of tracking of feature points Inthe process of motion boundary description the defaultparameters are119873 = 32 119899

120590= 2 and 119899

120591= 3 in our experiment

and these parameters have the best effect through the exper-iments testing In terms of selecting the feature encodingand classifier this experiment adopts vlfeat-0919 softwarefor processing (download from httpwwwvlfeatorg) Forthe selection of training video and testing video we use the

method of ldquohalf and halfrdquo (the total echocardiography videodata are randomly selected half as the training video and theother half as the testing video) And we analyze its classifiedresults

We use the confusion matrix to show the classificationaccuracy rate of the eight kinds of echocardiography videoin Table 2 and the calculation method of accuracy rate isshowed in function 31 Overall the average classificationaccuracy of all the classes is 7712 We can find that theclassification accuracy for each class is different In particularthe classification accuracy for these classes such as A5CPSAB and PSAP is lower than the others The main reasonis shortness of experiment data corresponding to these threekinds of echocardiography videos It leads to lack of trainingsamples

accuracy rate = number of correct videosnumber of videos

(6)

In view of the above experimental result that the clas-sification accuracy can be affected by the number of thesample videos in the following experiment we cut off the lessexperimental data of three kinds of video (A5C PSAB andPSAP) and make further experiment with the rest of the fivekinds of experimental data and compare the results (Table 3)

Comparing the eight-class and five-class results (Table 3), we find that the classification accuracy improves significantly, from 77.12% to 98.36%.

We also ran a classification test on echocardiography videos of the three primary positions (AA, PLA, and PSA). The average classification accuracy is 100%, which satisfies the basic needs of medical echocardiography video classification.
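One way to see why the three-position task is easier is to collapse view labels into position labels, so that confusions between views of the same position no longer count as errors. The sketch below is illustrative only: the view-to-position mapping is an assumption inferred from the view names, the labels are hypothetical, and the source does not state how the three-class experiment was set up.

```python
# Assumed mapping from the eight views to the three primary positions.
VIEW_TO_POSITION = {
    "A2C": "AA", "A3C": "AA", "A4C": "AA", "A5C": "AA",
    "PLA": "PLA",
    "PSAB": "PSA", "PSAM": "PSA", "PSAP": "PSA",
}


def to_position_labels(view_labels):
    """Replace each view label by the primary position it is assumed to belong to."""
    return [VIEW_TO_POSITION[view] for view in view_labels]


def accuracy(true_labels, predicted_labels):
    """Fraction of videos whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical true and predicted view labels for five test videos.
true_views = ["A2C", "A5C", "PLA", "PSAB", "PSAP"]
pred_views = ["A2C", "A2C", "PLA", "PSAP", "PSAP"]

view_level = accuracy(true_views, pred_views)
position_level = accuracy(to_position_labels(true_views), to_position_labels(pred_views))
print(view_level, position_level)
```

In this toy case the two view-level errors (A5C classified as A2C, PSAB classified as PSAP) both stay within one position, so they disappear at the position level. Most of the off-diagonal entries in Table 2 are of this within-position kind.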

Finally, compared with previously proposed classification algorithms, our method achieves the highest accuracy in both settings (Table 4): 77.12% against at most 76.5% for eight classes, and 100% against at most 95% for the three primary positions. The echocardiography video classification method based on dense trajectory tracking and DHOG therefore improves on the accuracy of the previous algorithms.

Table 2: Confusion matrix for eight kinds of echocardiography video classification (rows: true class; columns: classified result).

| Class | A2C | A3C | A4C | A5C | PLA | PSAB | PSAM | PSAP | Accuracy |
|-------|-----|-----|-----|-----|-----|------|------|------|----------|
| A2C   | 22  | 1   | 0   | 0   | 0   | 0    | 0    | 0    | 0.96     |
| A3C   | 1   | 15  | 0   | 0   | 0   | 0    | 0    | 0    | 0.95     |
| A4C   | 0   | 0   | 17  | 1   | 0   | 0    | 0    | 0    | 0.95     |
| A5C   | 2   | 0   | 1   | 0   | 0   | 0    | 0    | 0    | 0.00     |
| PLA   | 0   | 0   | 0   | 0   | 21  | 0    | 0    | 0    | 1.00     |
| PSAB  | 0   | 0   | 0   | 0   | 1   | 3    | 0    | 2    | 0.50     |
| PSAM  | 0   | 0   | 0   | 0   | 0   | 0    | 20   | 0    | 1.00     |
| PSAP  | 0   | 0   | 0   | 0   | 2   | 0    | 0    | 7    | 0.81     |

Table 3: Confusion matrix for five kinds of echocardiography video classification (rows: true class; columns: classified result).

| Class | A2C | A3C | A4C | PLA | PSAM | Accuracy |
|-------|-----|-----|-----|-----|------|----------|
| A2C   | 22  | 1   | 0   | 0   | 0    | 0.96     |
| A3C   | 1   | 15  | 0   | 0   | 0    | 0.95     |
| A4C   | 0   | 0   | 18  | 0   | 0    | 1.00     |
| PLA   | 0   | 0   | 0   | 21  | 0    | 1.00     |
| PSAM  | 0   | 0   | 0   | 0   | 20   | 1.00     |

Table 4: Classification accuracy of our method compared with other algorithms.

| Method     | Classification accuracy of eight classes (%) | Classification accuracy of three classes (%) |
|------------|----------------------------------------------|----------------------------------------------|
| [8]        | 72                                           | 90                                           |
| [12]       | 70.4                                         | 86                                           |
| [13]       | 74                                           | 91                                           |
| [14]       | 76                                           | 93                                           |
| [15]       | 76.5                                         | 93                                           |
| [16]       | 75                                           | 92                                           |
| [17]       | 73.2                                         | 95                                           |
| [18]       | 71.3                                         | 94                                           |
| Our method | 77.12                                        | 100                                          |

4. Summary and Discussion

This paper addresses the classification of echocardiography videos using dense trajectory tracking and the DHOG algorithm. First, we use dense trajectory tracking and DHOG to describe the features of each kind of echocardiography video and then use the VLFeat software to encode these features as Fisher vectors. Finally, we use a linear SVM to obtain the classification results. The experimental results show that, compared with the previous echocardiography video classification method based on 3D SIFT, the classification accuracy is improved significantly. However, the average classification accuracy over all eight classes is still not high because of the lack of echocardiography video data. We therefore removed the three kinds of video with the least data and repeated the classification experiment. With the sample-size problem removed, the average classification accuracy for the remaining five classes is nearly 20.9% higher than that for eight classes. For the three primary positions, the average classification accuracy is 100%. The method can therefore preliminarily meet the accuracy requirements of medical echocardiography video classification. In classifying echocardiography videos, the challenge is not only the noise caused by the low resolution of the video but also how to balance computational complexity and time cost. In future work, we will gradually enlarge the database of the eight kinds of echocardiography video and adopt or propose newer and more effective methods to improve the classification accuracy and reduce the time required for classification.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Fund in China under Grants nos. 61471124 and 61473090. This research also forms part of the WIDTH project, which is funded by the EC under the FP7 programme with Grant no. PIRSES-GA-2010-269124.

References

[1] R. Kumar, F. Wang, D. Beymer, and T. Syeda-Mahmood, “Cardiac disease detection from echocardiogram using edge filtered scale-invariant motion features,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW ’10), pp. 162–169, IEEE, San Francisco, Calif, USA, June 2010.

[2] S. K. Zhou, J. H. Park, B. Georgescu, C. Simopoulos, J. Otsuki, and D. Comaniciu, “Image-based multiclass boosting and echocardiographic view classification,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’06), vol. 2, pp. 1559–1565, New York, NY, USA, June 2006.

[3] D. Beymer, T. Syeda-Mahmood, and F. Wang, “Exploiting spatio-temporal information for view recognition in cardiac echo videos,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW ’08), pp. 1–8, IEEE, Anchorage, Alaska, USA, June 2008.

[4] A. Shalbaf, H. Behnam, Z. Alizade-Sani, and M. Shojaifard, “Automatic classification of left ventricular regional wall motion abnormalities in echocardiography images using nonrigid image registration,” Journal of Digital Imaging, vol. 26, no. 5, pp. 909–919, 2013.

[5] F. Sera, T. S. Kato, M. Farr, et al., “Left ventricular longitudinal strain by speckle-tracking echocardiography is associated with treatment-requiring cardiac allograft rejection,” Journal of Cardiac Failure, vol. 20, no. 5, pp. 359–364, 2014.


[6] J. Y. Wang, Y. Li, Y. Zhang, et al., “Bag-of-features based medical image retrieval via multiple assignment and visual words weighting,” IEEE Transactions on Medical Imaging, vol. 30, no. 11, pp. 1996–2011, 2011.

[7] Y. Guo, Y. Wang, D. Kong, and X. Shu, “Automatic classification of intracardiac tumor and thrombi in echocardiography based on sparse representation,” IEEE Journal of Biomedical and Health Informatics, vol. 19, no. 2, pp. 601–611, 2015.

[8] Y. Qian, L. Wang, C. Wang, and X. Gao, “The synergy of 3D SIFT and sparse codes for classification of viewpoints from echocardiogram video,” in Medical Content-Based Retrieval for Clinical Decision Support: Third MICCAI International Workshop, MCBR-CDS 2012, Nice, France, October 1, 2012, Revised Selected Papers, vol. 7723 of Lecture Notes in Computer Science, pp. 68–79, Springer, Berlin, Germany, 2013.

[9] M. Rap and A. Chacko, “Optimising the use of transoesophageal echocardiography in diagnosing suspected infective endocarditis,” Acta Cardiologica, vol. 70, no. 4, pp. 487–491, 2015.

[10] N. Espinola-Zavaleta, M. E. Soto, A. Romero-Gonzalez, et al., “Prevalence of congenital heart disease and pulmonary hypertension in Down’s syndrome: an echocardiographic study,” Journal of Cardiovascular Ultrasound, vol. 23, no. 2, pp. 72–77, 2015.

[11] H. Wang, A. Klaser, C. Schmid, and C.-L. Liu, “Dense trajectories and motion boundary descriptors for action recognition,” International Journal of Computer Vision, vol. 103, no. 1, pp. 60–79, 2013.

[12] I. Laptev, M. Marszałek, C. Schmid, and B. Rozenfeld, “Learning realistic human actions from movies,” in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’08), pp. 1–8, Anchorage, Alaska, USA, June 2008.

[13] J. Liu, J. Luo, and M. Shah, “Recognizing realistic actions from videos in the wild,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’09), pp. 1996–2003, IEEE, Miami, Fla, USA, June 2009.

[14] A. Gilbert, J. Illingworth, and R. Bowden, “Fast realistic multi-action recognition using mined dense spatio-temporal features,” in Proceedings of the 12th International Conference on Computer Vision, pp. 925–931, IEEE, Kyoto, Japan, October 2009.

[15] W. Brendel and S. Todorovic, “Activities as time series of human postures,” in Computer Vision – ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5–11, 2010, Proceedings, Part II, vol. 6312, pp. 721–734, Springer, Berlin, Germany, 2010.

[16] A. Kovashka and K. Grauman, “Learning a hierarchy of discriminative space-time neighborhood features for human action recognition,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’10), pp. 2046–2053, San Francisco, Calif, USA, June 2010.

[17] Q. V. Le, W. Y. Zou, S. Y. Yeung, and A. Y. Ng, “Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’11), pp. 3361–3368, June 2011.

[18] B. Li, M. Ayazoglu, T. Mao, O. I. Camps, and M. Sznaier, “Activity recognition using dynamic subspace angles,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’11), pp. 3193–3200, IEEE, Providence, RI, USA, June 2011.

[19] G. C. Zhang and P. A. Vela, “Good features to track for visual SLAM,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’15), pp. 1373–1382, IEEE, Boston, Mass, USA, June 2015.

[20] H. Wang and C. Schmid, “Action recognition with improved trajectories,” in Proceedings of the 14th IEEE International Conference on Computer Vision (ICCV ’13), pp. 3551–3558, Sydney, Australia, December 2013.

[21] X. Li, G. Cui, W. Yi, and L. Kong, “A fast maneuvering target motion parameters estimation algorithm based on ACCF,” IEEE Signal Processing Letters, vol. 22, no. 3, pp. 270–274, 2015.

[22] Y. Ji, M. Yang, Z. Lu, and C. Wang, “Integrating visual selective attention model with HOG features for traffic light detection and recognition,” in Proceedings of the IEEE Intelligent Vehicles Symposium (IV ’15), pp. 280–285, Seoul, Republic of Korea, June-July 2015.

[23] B. Z. Yao, B. X. Nie, Z. Liu, and S.-C. Zhu, “Animated pose templates for modeling and detecting human actions,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 3, pp. 436–452, 2014.

[24] D. Veitch and P. Tune, “Optimal skampling for the flow size distribution,” IEEE Transactions on Information Theory, vol. 61, no. 6, pp. 3075–3099, 2015.

[25] B. Klein, G. Lev, G. Sadeh, and L. Wolf, “Associating neural word embeddings with deep image representations using Fisher Vectors,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’15), pp. 4437–4446, IEEE, Boston, Mass, USA, June 2015.



31 Data In this experiment 228 different echocardiographyvideos are collected from 72 different patients (including58 normal individuals and 14 patients with cardiac wallmotion abnormalities) these data were provided by TsinghuaUniversity First Hospital All the length of echocardiographyvideo is 1 to 2 seconds and the videos are stored as 434times 636 resolution and in 26-frame DICOM format (DigitalImaging and Communications in Medicine) Table 1 showsthe detailed number of eight kinds of video

32 Experiment Setup and Result Analysis In this experimentthe dense sampling grid of feature points is selected with 5pixels in the process of dense sampling and it can guaranteethe best result Then the tracking frame length is set to15 frames in the process of tracking of feature points Inthe process of motion boundary description the defaultparameters are119873 = 32 119899

120590= 2 and 119899

120591= 3 in our experiment

and these parameters have the best effect through the exper-iments testing In terms of selecting the feature encodingand classifier this experiment adopts vlfeat-0919 softwarefor processing (download from httpwwwvlfeatorg) Forthe selection of training video and testing video we use the

method of ldquohalf and halfrdquo (the total echocardiography videodata are randomly selected half as the training video and theother half as the testing video) And we analyze its classifiedresults

We use the confusion matrix to show the classificationaccuracy rate of the eight kinds of echocardiography videoin Table 2 and the calculation method of accuracy rate isshowed in function 31 Overall the average classificationaccuracy of all the classes is 7712 We can find that theclassification accuracy for each class is different In particularthe classification accuracy for these classes such as A5CPSAB and PSAP is lower than the others The main reasonis shortness of experiment data corresponding to these threekinds of echocardiography videos It leads to lack of trainingsamples

accuracy rate = number of correct videosnumber of videos

(6)

In view of the above experimental result that the clas-sification accuracy can be affected by the number of thesample videos in the following experiment we cut off the lessexperimental data of three kinds of video (A5C PSAB andPSAP) and make further experiment with the rest of the fivekinds of experimental data and compare the results (Table 3)

From Table 3 comparing with the eight kinds of echocar-diography video and five kinds of the echocardiographyvideo we can find that the classification accuracy showsa significant improvement The classification accuracy ispromoted from original 7712 to 9836

Our experiments also make a classification test forechocardiography video of the primal three kinds of locations(AA PLA and PSA) its average classification accuracy is100 and it can satisfy the basic needs of medical echocar-diography video classification

Finally compared with the other echocardiography videoclassification algorithm which is previously proposed ourexperimental result is increased a lot (illustrated in Table 4)And for the echocardiography video of the three kindsof primal locations our experiment has the best accuracyTherefore the method of echocardiography videorsquos classifi-cation based on dense trajectory tracking and DHOG hasan obvious improvement in accuracy compared with theprevious algorithm

4 Summary and Discussion

This paper is based on the classification of the echocardiog-raphy video with the dense trajectory tracking and DHOGalgorithm Firstly we use the dense trajectory tracking andDHOG algorithm to describe the feature in each kind ofechocardiography video and get the features and then applythe VLfeat software for Fisher vector to encode the featuresFinally we use the linear SVM to get the classification resultsFrom the experimental results while the method adopted inthis paper is compared with the previous method which isbased on the 3D-SIFT echocardiography video classificationthe classification accuracy is improved significantly But thefinal average classification accuracy is still not high because

6 Computational and Mathematical Methods in Medicine

Table 2 Confusion matrix for eight kinds of echocardiography video classification

Classified result AccuracyA2C A3C A4C A5C PLA PSAB PSAM PSAP

Class

A2C 22 1 0 0 0 0 0 0 096A3C 1 15 0 0 0 0 0 0 095A4C 0 0 17 1 0 0 0 0 095A5C 2 0 1 0 0 0 0 0 000PLA 0 0 0 0 21 0 0 0 100PSAB 0 0 0 0 1 3 0 2 050PSAM 0 0 0 0 0 0 20 0 100PSAP 0 0 0 0 2 0 0 7 081

Table 3 Confusionmatrix for five kinds of echocardiography videoclassification

Classified result AccuracyA2C A3C A4C PLA PSAM

Class

A2C 22 1 0 0 0 096A3C 1 15 0 0 0 095A4C 0 0 18 0 0 100PLA 0 0 0 21 0 100PSAM 0 0 0 0 20 100

Table 4 The algorithm accuracy compared with other algorithmsin this paper

The comparison of resultClassification accuracy

of eight classesClassification accuracy

of three classes[8] 72 90[12] 704 86[13] 74 91[14] 76 93[15] 765 93[16] 75 92[17] 732 95[18] 713 94Our method 7712 100

of the lack of the echocardiography video data So in thispaper we get rid of the small amount of three kinds ofexperimental data and make the classification experimentagain After solving the problem of the sample data theaverage classification accuracy of the rest of five classes ofdata compared to the average classification accuracy of eightclasses has increased nearly 209 And in three classesof primal position of echocardiography video the averageclassification accuracy is 100 Therefore the method ofthis paper can preliminarily achieve the requirement of themedical echocardiography video classification accuracy Fordealing with the classification of echocardiography videothe challenge is not just about the video noise caused bythe low resolution of the video but also how to balance thecomputational complexity and time cost In the future work

we will gradually increase the database of eight kinds ofechocardiography video and adopt or propose newer andmore effective method to improve the classification accuracyand reduce the required time of the classification as well

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work is supported by the National Natural Science Fundin China under Grants nos 61471124 and 61473090 Alsothis research forms part of WIDTH project that is financiallyfunded by EC under FP7 programmewith Grant no PIRSES-GA-2010-269124

References

[1] R Kumar F Wang D Beymer and T Syeda-Mahmood ldquoCar-diac disease detection from echocardiogram using edge filteredscale-invariant motion featuresrdquo in Proceedings of the IEEEComputer Society Conference on Computer Vision and PatternRecognition Workshops (CVPRW rsquo10) pp 162ndash169 IEEE SanFrancisco Calif USA June 2010

[2] S K Zhou J H Park B Georgescu C Simopoulos J OtsukiandD Comaniciu ldquoImage-basedmulticlass boosting and echo-cardiographic view classificationrdquo in Proceedings of the IEEEComputer Society Conference on Computer Vision and PatternRecognition (CVPR rsquo06) vol 2 pp 1559ndash1565 New York NYUSA June 2006

[3] D Beymer T Syeda-Mahmood and F Wang ldquoExploitingspatio-temporal information for view recognition in cardiacecho videosrdquo in Proceedings of the IEEE Computer Society Con-ference on Computer Vision and Pattern Recognition Workshops(CVPRW rsquo08) pp 1ndash8 IEEE Anchorage Alaska USA June2008

[4] A Shalbaf H Behnam Z Alizade-Sani and M ShojaifardldquoAutomatic classification of left ventricular regional wallmotionabnormalities in echocardiography images using nonrigidimage registrationrdquo Journal of Digital Imaging vol 26 no 5 pp909ndash919 2013

[5] F Sera T S Kato M Farr et al ldquoLeft ventricular longitudinalstrain by speckle-tracking echocardiography is associated withtreatment-requiring cardiac allograft rejectionrdquo Journal of Car-diac Failure vol 20 no 5 pp 359ndash364 2014

Computational and Mathematical Methods in Medicine 7

[6] J Y Wang Y Li Y Zhang et al ldquoBag-of-features basedmedical image retrieval via multiple assignment and visualwords weightingrdquo IEEE Transactions on Medical Imaging vol30 no 11 pp 1996ndash2011 2011

[7] Y Guo YWang D Kong and X Shu ldquoAutomatic classificationof intracardiac tumor and thrombi in echocardiography basedon sparse representationrdquo IEEE Journal of Biomedical andHealth Informatics vol 19 no 2 pp 601ndash611 2015

[8] Y Qian L Wang C Wang and X Gao ldquoThe synergy of 3DSIFT and sparse codes for classification of viewpoints fromechocardiogram videordquo in Medical Content-Based Retrieval forClinical Decision Support Third MICCAI International Work-shop MCBR-CDS 2012 Nice France October 1 2012 RevisedSelected Papers vol 7723 of Lecture Notes In Computer Sciencepp 68ndash79 Springer Berlin Germany 2013

[9] MRap andAChacko ldquoOptimising the use of transoesophagealechocardiography in diagnosing suspected infective endocardi-tisrdquo Acta Cardiologica vol 70 no 4 pp 487ndash491 2015

[10] N Espinola-Zavaleta M E Soto A Romero-Gonzalez et alldquoPrevalence of congenital heart disease and pulmonary hyper-tension in Downrsquos syndrome an echocardiographic studyrdquoJournal of Cardiovascular Ultrasound vol 23 no 2 pp 72ndash772015

[11] H Wang A Klaser C Schmid and C-L Liu ldquoDense trajecto-ries and motion boundary descriptors for action recognitionrdquoInternational Journal of Computer Vision vol 103 no 1 pp 60ndash79 2013

[12] I LaptevMMarszałek C Schmid and B Rozenfeld ldquoLearningrealistic human actions frommoviesrdquo in Proceedings of the 26thIEEE Conference on Computer Vision and Pattern Recognition(CVPR rsquo08) pp 1ndash8 Anchorage Alaska USA June 2008

[13] J Liu J Luo and M Shah ldquoRecognizing realistic actions fromvideos in the wildrdquo in Proceedings of the IEEE Computer SocietyConference on Computer Vision and Pattern Recognition (CVPRrsquo09) pp 1996ndash2003 IEEE Miami Fla USA June 2009

[14] A Gilbert J Illingworth and R Bowden ldquoFast realisticmulti-action recognition using mined dense spatio-temporalfeaturesrdquo in Proceedings of the 12th International Conferenceon Computer Vision pp 925ndash931 IEEE Kyoto Japan October2009

[15] W Brendel and S Todorovic ldquoActivities as time series of humanposturesrdquo inComputer VisionmdashECCV 2010 11th European Con-ference on Computer Vision Heraklion Crete Greece September5ndash11 2010 Proceedings Part II vol 6312 pp 721ndash734 SpringerBerlin Germany 2010

[16] A Kovashka and K Grauman ldquoLearning a hierarchy ofdiscriminative space-time neighborhood features for humanaction recognitionrdquo in Proceedings of the IEEE Computer SocietyConference on Computer Vision and Pattern Recognition (CVPRrsquo10) pp 2046ndash2053 San Francisco Calif USA June 2010

[17] Q V Le W Y Zou S Y Yeung and A Y Ng ldquoLearning hierar-chical invariant spatio-temporal features for action recognitionwith independent subspace analysisrdquo in Proceedings of the IEEEConference on Computer Vision and Pattern Recognition (CVPRrsquo11) pp 3361ndash3368 June 2011

[18] B Li M Ayazoglu T Mao O I Camps and M SznaierldquoActivity recognition using dynamic subspace anglesrdquo in Pro-ceedings of the IEEE Conference on Computer Vision and PatternRecognition (CVPR rsquo11) pp 3193ndash3200 IEEE Providence RIUSA June 2011

[19] G C Zhang and P A Vela ldquoGood features to track for visualSLAMrdquo in Proceedings of the IEEE Conference on Computer

Vision and Pattern Recognition (CVPR rsquo15) pp 1373ndash1382 IEEEBoston Mass USA June 2015

[20] H Wang and C Schmid ldquoAction recognition with improvedtrajectoriesrdquo in Proceedings of the 14th IEEE International Con-ference on Computer Vision (ICCV rsquo13) pp 3551ndash3558 SydneyAustralia December 2013

[21] X Li G Cui W Yi and L Kong ldquoA fast maneuvering targetmotion parameters estimation algorithmbased onACCFrdquo IEEESignal Processing Letters vol 22 no 3 pp 270ndash274 2015

[22] Y Ji M Yang Z Lu and C Wang ldquoIntegrating visual selectiveattention model with HOG features for traffic light detectionand recognitionrdquo in Proceedings of the IEEE Intelligent VehiclesSymposium (IV rsquo15) pp 280ndash285 Seoul Republic of KoreaJune-July 2015

[23] B Z Yao B X Nie Z Liu and S-C Zhu ldquoAnimated posetemplates for modeling and detecting human actionsrdquo IEEETransactions on Pattern Analysis and Machine Intelligence vol36 no 3 pp 436ndash452 2014

[24] D Veitch and P Tune ldquoOptimal skampling for the flow sizedistributionrdquo IEEE Transactions on Information Theory vol 61no 6 pp 3075ndash3099 2015

[25] B Klein G Lev G Sadeh and L Wolf ldquoAssociating neuralword embeddings with deep image representations using FisherVectorsrdquo in Proceedings of the IEEE Conference on ComputerVision and Pattern Recognition (CVPR rsquo15) pp 4437ndash4446IEEE Boston Mass USA June 2015

Submit your manuscripts athttpwwwhindawicom

Stem CellsInternational

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MEDIATORSINFLAMMATION

of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Behavioural Neurology

EndocrinologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Disease Markers

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

BioMed Research International

OncologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Oxidative Medicine and Cellular Longevity

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

PPAR Research

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Immunology ResearchHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

ObesityJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational and Mathematical Methods in Medicine

OphthalmologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Diabetes ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Research and TreatmentAIDS

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Gastroenterology Research and Practice

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Parkinsonrsquos Disease

Evidence-Based Complementary and Alternative Medicine

Volume 2014Hindawi Publishing Corporationhttpwwwhindawicom

Page 6: Research Article Dense Trajectories and DHOG for Classification …downloads.hindawi.com/journals/cmmm/2016/9610192.pdf · 2019-07-30 · Research Article Dense Trajectories and DHOG

6 Computational and Mathematical Methods in Medicine

Table 2 Confusion matrix for eight kinds of echocardiography video classification

Classified result AccuracyA2C A3C A4C A5C PLA PSAB PSAM PSAP

Class

A2C 22 1 0 0 0 0 0 0 096A3C 1 15 0 0 0 0 0 0 095A4C 0 0 17 1 0 0 0 0 095A5C 2 0 1 0 0 0 0 0 000PLA 0 0 0 0 21 0 0 0 100PSAB 0 0 0 0 1 3 0 2 050PSAM 0 0 0 0 0 0 20 0 100PSAP 0 0 0 0 2 0 0 7 081

Table 3 Confusionmatrix for five kinds of echocardiography videoclassification

Classified result AccuracyA2C A3C A4C PLA PSAM

Class

A2C 22 1 0 0 0 096A3C 1 15 0 0 0 095A4C 0 0 18 0 0 100PLA 0 0 0 21 0 100PSAM 0 0 0 0 20 100

Table 4 The algorithm accuracy compared with other algorithmsin this paper

The comparison of resultClassification accuracy

of eight classesClassification accuracy

of three classes[8] 72 90[12] 704 86[13] 74 91[14] 76 93[15] 765 93[16] 75 92[17] 732 95[18] 713 94Our method 7712 100

of the lack of the echocardiography video data So in thispaper we get rid of the small amount of three kinds ofexperimental data and make the classification experimentagain After solving the problem of the sample data theaverage classification accuracy of the rest of five classes ofdata compared to the average classification accuracy of eightclasses has increased nearly 209 And in three classesof primal position of echocardiography video the averageclassification accuracy is 100 Therefore the method ofthis paper can preliminarily achieve the requirement of themedical echocardiography video classification accuracy Fordealing with the classification of echocardiography videothe challenge is not just about the video noise caused bythe low resolution of the video but also how to balance thecomputational complexity and time cost In the future work

we will gradually increase the database of eight kinds ofechocardiography video and adopt or propose newer andmore effective method to improve the classification accuracyand reduce the required time of the classification as well

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work is supported by the National Natural Science Fundin China under Grants nos 61471124 and 61473090 Alsothis research forms part of WIDTH project that is financiallyfunded by EC under FP7 programmewith Grant no PIRSES-GA-2010-269124

References

[1] R. Kumar, F. Wang, D. Beymer, and T. Syeda-Mahmood, “Cardiac disease detection from echocardiogram using edge filtered scale-invariant motion features,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW ’10), pp. 162–169, IEEE, San Francisco, Calif, USA, June 2010.

[2] S. K. Zhou, J. H. Park, B. Georgescu, C. Simopoulos, J. Otsuki, and D. Comaniciu, “Image-based multiclass boosting and echocardiographic view classification,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’06), vol. 2, pp. 1559–1565, New York, NY, USA, June 2006.

[3] D. Beymer, T. Syeda-Mahmood, and F. Wang, “Exploiting spatio-temporal information for view recognition in cardiac echo videos,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW ’08), pp. 1–8, IEEE, Anchorage, Alaska, USA, June 2008.

[4] A. Shalbaf, H. Behnam, Z. Alizade-Sani, and M. Shojaifard, “Automatic classification of left ventricular regional wall motion abnormalities in echocardiography images using nonrigid image registration,” Journal of Digital Imaging, vol. 26, no. 5, pp. 909–919, 2013.

[5] F. Sera, T. S. Kato, M. Farr, et al., “Left ventricular longitudinal strain by speckle-tracking echocardiography is associated with treatment-requiring cardiac allograft rejection,” Journal of Cardiac Failure, vol. 20, no. 5, pp. 359–364, 2014.

[6] J. Y. Wang, Y. Li, Y. Zhang, et al., “Bag-of-features based medical image retrieval via multiple assignment and visual words weighting,” IEEE Transactions on Medical Imaging, vol. 30, no. 11, pp. 1996–2011, 2011.

[7] Y. Guo, Y. Wang, D. Kong, and X. Shu, “Automatic classification of intracardiac tumor and thrombi in echocardiography based on sparse representation,” IEEE Journal of Biomedical and Health Informatics, vol. 19, no. 2, pp. 601–611, 2015.

[8] Y. Qian, L. Wang, C. Wang, and X. Gao, “The synergy of 3D SIFT and sparse codes for classification of viewpoints from echocardiogram video,” in Medical Content-Based Retrieval for Clinical Decision Support: Third MICCAI International Workshop, MCBR-CDS 2012, Nice, France, October 1, 2012, Revised Selected Papers, vol. 7723 of Lecture Notes in Computer Science, pp. 68–79, Springer, Berlin, Germany, 2013.

[9] M. Rap and A. Chacko, “Optimising the use of transoesophageal echocardiography in diagnosing suspected infective endocarditis,” Acta Cardiologica, vol. 70, no. 4, pp. 487–491, 2015.

[10] N. Espinola-Zavaleta, M. E. Soto, A. Romero-Gonzalez, et al., “Prevalence of congenital heart disease and pulmonary hypertension in Down’s syndrome: an echocardiographic study,” Journal of Cardiovascular Ultrasound, vol. 23, no. 2, pp. 72–77, 2015.

[11] H. Wang, A. Klaser, C. Schmid, and C.-L. Liu, “Dense trajectories and motion boundary descriptors for action recognition,” International Journal of Computer Vision, vol. 103, no. 1, pp. 60–79, 2013.

[12] I. Laptev, M. Marszałek, C. Schmid, and B. Rozenfeld, “Learning realistic human actions from movies,” in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’08), pp. 1–8, Anchorage, Alaska, USA, June 2008.

[13] J. Liu, J. Luo, and M. Shah, “Recognizing realistic actions from videos in the wild,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’09), pp. 1996–2003, IEEE, Miami, Fla, USA, June 2009.

[14] A. Gilbert, J. Illingworth, and R. Bowden, “Fast realistic multi-action recognition using mined dense spatio-temporal features,” in Proceedings of the 12th International Conference on Computer Vision, pp. 925–931, IEEE, Kyoto, Japan, October 2009.

[15] W. Brendel and S. Todorovic, “Activities as time series of human postures,” in Computer Vision – ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5–11, 2010, Proceedings, Part II, vol. 6312, pp. 721–734, Springer, Berlin, Germany, 2010.

[16] A. Kovashka and K. Grauman, “Learning a hierarchy of discriminative space-time neighborhood features for human action recognition,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’10), pp. 2046–2053, San Francisco, Calif, USA, June 2010.

[17] Q. V. Le, W. Y. Zou, S. Y. Yeung, and A. Y. Ng, “Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’11), pp. 3361–3368, June 2011.

[18] B. Li, M. Ayazoglu, T. Mao, O. I. Camps, and M. Sznaier, “Activity recognition using dynamic subspace angles,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’11), pp. 3193–3200, IEEE, Providence, RI, USA, June 2011.

[19] G. C. Zhang and P. A. Vela, “Good features to track for visual SLAM,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’15), pp. 1373–1382, IEEE, Boston, Mass, USA, June 2015.

[20] H. Wang and C. Schmid, “Action recognition with improved trajectories,” in Proceedings of the 14th IEEE International Conference on Computer Vision (ICCV ’13), pp. 3551–3558, Sydney, Australia, December 2013.

[21] X. Li, G. Cui, W. Yi, and L. Kong, “A fast maneuvering target motion parameters estimation algorithm based on ACCF,” IEEE Signal Processing Letters, vol. 22, no. 3, pp. 270–274, 2015.

[22] Y. Ji, M. Yang, Z. Lu, and C. Wang, “Integrating visual selective attention model with HOG features for traffic light detection and recognition,” in Proceedings of the IEEE Intelligent Vehicles Symposium (IV ’15), pp. 280–285, Seoul, Republic of Korea, June-July 2015.

[23] B. Z. Yao, B. X. Nie, Z. Liu, and S.-C. Zhu, “Animated pose templates for modeling and detecting human actions,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 3, pp. 436–452, 2014.

[24] D. Veitch and P. Tune, “Optimal skampling for the flow size distribution,” IEEE Transactions on Information Theory, vol. 61, no. 6, pp. 3075–3099, 2015.

[25] B. Klein, G. Lev, G. Sadeh, and L. Wolf, “Associating neural word embeddings with deep image representations using Fisher Vectors,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’15), pp. 4437–4446, IEEE, Boston, Mass, USA, June 2015.
