Context-dependent Detection of Unusual Events in Videos by Geometric Analysis of Video Trajectories
Longin Jan Latecki (latecki@[email protected])
Computer and Information Sciences
Temple University, Philadelphia
Nilesh Ghubade and Xiangdong Wen (nileshg@[email protected])
Agenda
• Introduction
• Mapping of video to a trajectory
• Relation: motion trajectory → video trajectory
• Discrete curve evolution
• Polygon simplification
• Key frames
• Unusual events in surveillance videos
• Results
Main Tools
• Mapping the video sequence to a polyline in a multi-dimensional space.
• The automatic extraction of relevant frames from videos is based on polygon simplification by discrete curve evolution.
Mapping of video to a trajectory
Mapping of the image stream to a trajectory (polyline) in a feature space.
Representing each frame as:
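[Figure: feature matrix with one row per frame (Frame 0 to Frame N) and one column group per bin (Bin 0 to Bin n). Each bin stores three values: x-coordinate of the bin’s centroid, y-coordinate of the bin’s centroid, and the bin’s frequency count.]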
Used in our experiments
Red-Green-Blue (RGB) bins
Each frame is a 24-bit color image (8 bits per color channel):
• Bin 0 = color intensities from 0-31
• Bin 1 = color intensities from 32-63
• ...
• Bin 7 = color intensities from 224-255
Three attributes per bin:
• Row of the bin’s centroid
• Column of the bin’s centroid
• Frequency count of the bin
(8 bins per color channel * 3 attributes per bin) * 3 color channels = 72 features
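The 72-dimensional frame representation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `frame_features` and `video_trajectory` are hypothetical, and mapping an empty bin to three zeros is an assumption, since the slides do not say how empty bins are handled.

```python
import numpy as np

def frame_features(frame, bins_per_channel=8):
    """Map an H x W x 3 uint8 RGB frame to a feature vector.

    For every color channel and every intensity bin, three values are stored:
    centroid row, centroid column, and pixel count.
    """
    height, width, channels = frame.shape
    bin_width = 256 // bins_per_channel           # 32 intensities per bin for 8 bins
    rows, cols = np.indices((height, width))      # pixel coordinates
    features = []
    for c in range(channels):
        bin_index = frame[:, :, c].astype(np.int64) // bin_width
        for b in range(bins_per_channel):
            mask = bin_index == b
            count = int(mask.sum())
            if count:
                features += [rows[mask].mean(), cols[mask].mean(), count]
            else:
                features += [0.0, 0.0, 0.0]       # assumption: empty bins map to zeros
    return np.array(features, dtype=float)

def video_trajectory(frames):
    """Stack per-frame feature vectors into an (n_frames, n_features) polyline."""
    return np.stack([frame_features(f) for f in frames])
```

Each row of the array returned by `video_trajectory` is one vertex of the video polyline; with the default 8 bins per channel a row has 8 * 3 * 3 = 72 entries.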
Theoretical Results: Motion trajectory → Video trajectory
Consider a video in which an object (a set of pixels) is moving on a uniform background. The object is visible in all frames and moves with constant speed on a linear trajectory. Then the video trajectory in the feature space is a straight line.
If n objects are moving with constant speeds on linear trajectories, then the video trajectory is a straight line in the feature space.
Consider a video in which an object (a set of pixels) is moving on a uniform background. Then the trajectory vectors are contained in a plane.
If n objects are moving, then the dimension of the trajectory is at most 2n.
If a new object suddenly appears in the movie, the dimension of the trajectory increases by at least 1 and at most 3.
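The first statement can be checked numerically on a synthetic video. The sketch below is only an illustration under simplifying assumptions: it uses a binary frame with two bins (background and object) instead of the 72 RGB features, and the helper names `two_bin_features` and `moving_square_video` are hypothetical. If the video trajectory is a straight line, the displacement vectors from the first frame all point in one direction, so their matrix has rank 1.

```python
import numpy as np

def two_bin_features(frame):
    """Centroid row, centroid column, and pixel count for background (0) and object (1)."""
    rows, cols = np.indices(frame.shape)
    feats = []
    for value in (0, 1):
        mask = frame == value
        feats += [rows[mask].mean(), cols[mask].mean(), mask.sum()]
    return np.array(feats, dtype=float)

def moving_square_video(n_frames=20, size=64, side=4, step=(1, 2)):
    """A square moving with constant velocity `step` on a uniform background."""
    frames = []
    for t in range(n_frames):
        frame = np.zeros((size, size), dtype=np.uint8)
        r, c = 2 + step[0] * t, 2 + step[1] * t
        frame[r:r + side, c:c + side] = 1         # the object stays fully visible
        frames.append(frame)
    return frames

trajectory = np.stack([two_bin_features(f) for f in moving_square_video()])
displacements = trajectory[1:] - trajectory[0]    # vectors from the first vertex
rank = np.linalg.matrix_rank(displacements, tol=1e-8)
print("rank of the displacement matrix:", rank)
```

The background centroid also moves linearly in time, because it equals the centroid of the whole frame minus the contribution of the object pixels, which is why all six features stay on one line.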
MovingDotMovieWithAdditionalDot.avi
Robust Rank Computation
Using singular value decomposition, based on:
• C. Rao, A. Yilmaz, and M. Shah. View-Invariant Representation and Recognition of Actions. Int. J. of Computer Vision 50, 2002.
• M. Seitz and C. R. Dyer. View-invariant analysis of cyclic motion. Int. J. of Computer Vision 16, 1997.
$$\mathrm{err}^2(M) = \sum_{i=3}^{n} \sigma_i^2$$
where $\sigma_i$ denotes the i-th singular value of M.
We compute err in a window of 11 consecutive frames in our experiments.
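A windowed err computation might look like the sketch below. It assumes that err is the square root of the sum of the squared singular values from the third one onward, and that the matrix M for a window holds the trajectory vectors relative to the first frame of that window. The slides do not spell out how M is built, so that construction and the name `window_err` are assumptions.

```python
import numpy as np

def window_err(trajectory, window=11):
    """err for each window of `window` consecutive frames.

    trajectory: (n_frames, n_features) array, one feature vector per frame.
    Returns an array of length n_frames - window + 1.
    """
    trajectory = np.asarray(trajectory, dtype=float)
    errs = []
    for start in range(len(trajectory) - window + 1):
        block = trajectory[start:start + window]
        # Assumption: M holds the trajectory vectors relative to the window's first frame.
        M = block[1:] - block[0]
        sigma = np.linalg.svd(M, compute_uv=False)      # singular values, descending
        errs.append(np.sqrt(np.sum(sigma[2:] ** 2)))    # sigma_3, sigma_4, ...
    return np.array(errs)
```

Under these assumptions err stays near zero while the trajectory vectors of a window lie in a plane, and it rises in windows where an additional dimension appears, for example when a new object enters the scene.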
[Plot: MovingDotMovieWithAdditionalDotBins, graph of norm dist for a window of 11 frames versus frame number. x-axis: Frame Number (0 to 160); y-axis: Norm Dist for the window of 11 frames (0 to 8, x 10^-21).]
MovingDotMovieWithAdditionalDot.avi
Interpolation of video trajectory
MovingDotMovie_Clockwise.avi
MovingDotMovieWithAdditionalDot.avi
Polygon simplification
Frames with decreasing relevance:
Relevance Ranking    Frame Number
0                    1
1                    100
99                   5
98                   12
Discrete Curve Evolution
P = P_0, ..., P_m
P_{i+1} is obtained from P_i by deleting the vertices of P_i that have minimal relevance measure
K(v, P_i) = K(u,v,w) = |d(u,v) + d(v,w) - d(u,w)|
[Figure: vertex v with its neighboring vertices u and w.]
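The evolution step can be sketched in a few lines. This is a minimal illustration with some assumptions: d is taken to be the Euclidean distance, one vertex is deleted per step (the first one when several share the minimal relevance), and the two endpoints of the open polyline are never deleted. The function names are hypothetical.

```python
import numpy as np

def relevance(u, v, w):
    """K(u, v, w) = |d(u, v) + d(v, w) - d(u, w)| with Euclidean distance d."""
    d = lambda p, q: np.linalg.norm(np.asarray(p, float) - np.asarray(q, float))
    return abs(d(u, v) + d(v, w) - d(u, w))

def discrete_curve_evolution(points, n_keep):
    """Delete the least relevant interior vertex until n_keep vertices remain.

    Returns (kept_indices, removal_order). Endpoints are never deleted,
    which is an assumption for open polylines such as video trajectories.
    """
    points = np.asarray(points, dtype=float)
    kept = list(range(len(points)))
    removed = []
    while len(kept) > max(n_keep, 2):
        scores = [relevance(points[kept[i - 1]], points[kept[i]], points[kept[i + 1]])
                  for i in range(1, len(kept) - 1)]
        j = int(np.argmin(scores)) + 1            # position of the least relevant vertex
        removed.append(kept.pop(j))
    return kept, removed
```

Reading `removal_order` backwards lists the deleted vertices from most to least relevant, which is one way to obtain a relevance ranking of frames.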
Discrete Curve Evolution: preservation of position, no blurring
Discrete Curve Evolution: robustness with respect to noise
Discrete Curve Evolution: extraction of linear segments
Key Frame Extraction
Key frames and rank
Security1: Bins Matrix, Distance Matrix
[Plot: security1Bins, graph of norm dist for a window of 11 frames versus frame number. x-axis: Frame Number (0 to 400); y-axis: Norm Dist for the window of 11 frames (0 to 1, x 10^-3).]
err for security1 video
M. S. Drew and J. Au: http://www.cs.sfu.ca/~mark/ftp/AcmMM00/
Predictability of video parts: Local Curveness computation
We divide the video polygonal curve P into parts T_i. For videos with 25 fps, T_i contains 25 frames.
We apply discrete curve evolution to each T_i until three points remain: a, b, c. Curveness measure of T_i:
C(T_i, P) = |d(a, b) + d(b, c) - d(a, c)|
b is the most relevant frame in T_i and the first vertex of T_{i+1}.
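One possible reading of this procedure is sketched below. It assumes Euclidean distance d, that a and c are the first and last frames of a part (endpoints are never deleted), and that each new part starts at the frame b of the previous part and spans `part_len` frames. The names `reduce_to_three` and `local_curveness` are hypothetical.

```python
import numpy as np

def dist(p, q):
    return float(np.linalg.norm(p - q))

def reduce_to_three(part):
    """Run discrete curve evolution on one part until 3 vertices remain; return their indices."""
    kept = list(range(len(part)))
    while len(kept) > 3:
        scores = [abs(dist(part[kept[i - 1]], part[kept[i]])
                      + dist(part[kept[i]], part[kept[i + 1]])
                      - dist(part[kept[i - 1]], part[kept[i + 1]]))
                  for i in range(1, len(kept) - 1)]
        kept.pop(int(np.argmin(scores)) + 1)      # delete the least relevant interior vertex
    return kept

def local_curveness(trajectory, part_len=25):
    """Return a list of (start_frame, frame_of_b, curveness) for successive parts T_i."""
    P = np.asarray(trajectory, dtype=float)
    results, start = [], 0
    while start + part_len <= len(P):
        part = P[start:start + part_len]
        ia, ib, ic = reduce_to_three(part)
        a, b, c = part[ia], part[ib], part[ic]
        curveness = abs(dist(a, b) + dist(b, c) - dist(a, c))
        results.append((start, start + ib, curveness))
        start += ib                               # b becomes the first vertex of T_{i+1}
    return results
```

Because b is always an interior vertex of its part, `start` advances by at least one frame per iteration, so the loop terminates. Parts with curveness near zero are close to a straight segment.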
security7
[Plot: security7Bins, graph of norm dist for a window of 11 frames versus frame number. x-axis: Frame Number (0 to 400); y-axis: Norm Dist for the window of 11 frames (0 to 4.5, x 10^-4).]
err for security7
2D projection by PCA of video trajectory for security7
Mov3
[Plot: Mov3Bins, graph of norm dist for a window of 11 frames versus frame number. x-axis: Frame Number (0 to 400); y-axis: Norm Dist for the window of 11 frames (0 to 4, x 10^-4).]
Mov3: Rustam waving his hand.
Bins Matrix
Key frames = 1 378 52 142 253 235 148 31 155 167
Distance Matrix
Key frames = 1 378 253 220 161 109 50 155 149 270
Hall_monitor
[Plot: HallMonitorBins, graph of norm dist for a window of 11 frames versus frame number. x-axis: Frame Number (0 to 300); y-axis: Norm Dist for the window of 11 frames (1.5 to 6.5, x 10^-5).]
err for hall_monitor
Hall Monitor: 2 persons entering-exiting in a hall.
Bins Matrix
Key frames = 1 300 35 240 221 215 265 241 278 280
Distance Matrix
Key frames = 1 300 37 265 241 240 235 278 280 282
CameraAtLightSignal.avi
Multimodal Histogram
Histogram of Lena
Segmented Image
Image after segmentation: we get an outline of her face, hat, etc.
Gray Scale Image - Multimodal
Original Image of Lena
Thank you