Online Detection of Unusual Events in Videos via Dynamic Sparse Coding
Outline
• Unusual Event Detection
• Video Representation
• Dynamic Sparse Coding
• Empirical Study
• Conclusions
Unusual events: incidents that occur very rarely in the entire video
Unusual Event Detection
• Easy-to-verify
– Given a frame, fairly easy to decide if unusual events occur
• Hard-to-describe
– Cannot enumerate all possible unusual events
– Cannot model unusual events directly
• Solution: Model usual events instead, and claim anything different as unusual
Easy to model usual events?
Challenges
• Unsupervised learning
– Only input is the video itself
• Online detection
– In most cases, cannot afford multiple runs through the video
• Concept drift
– Usual events change over time
• Truly unusual event vs. noisy usual event
Previous Works
• Clustering Based Method (CVPR 2004)
– Finding spatially isolated clusters
• Reconstruction (IJCV 2007)
• Space-time Markov Random Field (CVPR 2009)
Video Features
• Static features based on edges and object shapes
– Image-level information
• Dynamic features based on optical flow measurements
– Motion information
• Spatio-Temporal Interest Points
– Obtained from local video patches
– Shown to be useful in human action categorization
Spatio-Temporal Interest Point
• Detection
– Basic idea: generalize interest point detector from spatial domain to spatio-temporal domain
– Spatial (image): Laplacian, Hessian, Harris corner detector, etc.
– Spatio-temporal (video): spatio-temporal corners, Laplacian on spatial and temporal axes
– Output: small video patches extracted from each interest point
Spatio-Temporal Interest Point (Cont.)
• Description
– Similar to detection, generalization of spatial method to spatio-temporal domain
– Spatial: histogram of directional gradients (SIFT, HOG)
– Spatio-Temporal: gradients on x, y, and time directions
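The idea of describing a patch by gradients along x, y, and time can be illustrated with a deliberately crude sketch. It is not the descriptor used in this work: real spatio-temporal descriptors build orientation histograms over sub-blocks of the patch, whereas the hypothetical `gradient_descriptor` below only sums absolute finite differences per axis. The `patch[t][y][x]` layout is also an assumption.

```python
def gradient_descriptor(patch):
    """Return the normalized share of gradient energy along x, y, and time.

    patch is a nested list of pixel intensities indexed as patch[t][y][x]
    (layout assumed for illustration).
    """
    T, H, W = len(patch), len(patch[0]), len(patch[0][0])
    totals = [0.0, 0.0, 0.0]  # accumulated |dx|, |dy|, |dt|
    for t in range(T):
        for y in range(H):
            for x in range(W):
                v = patch[t][y][x]
                # Forward differences; skipped at the far border of each axis.
                if x + 1 < W:
                    totals[0] += abs(patch[t][y][x + 1] - v)
                if y + 1 < H:
                    totals[1] += abs(patch[t][y + 1][x] - v)
                if t + 1 < T:
                    totals[2] += abs(patch[t + 1][y][x] - v)
    norm = sum(totals)
    return [g / norm for g in totals] if norm > 0 else totals


# A static vertical edge has only x-gradient; a global brightness change has only t-gradient.
static_edge = [[[0, 1], [0, 1]], [[0, 1], [0, 1]]]
brightening = [[[0, 0], [0, 0]], [[1, 1], [1, 1]]]
```

Even this three-number summary separates a static edge from a purely temporal change, which is the kind of motion information the temporal axis is meant to capture.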
Motivation of the Approach
• Sparse Reconstruction
– Reconstruct an event with other events in the video
  • Usual events: appear many times, so a few events suffice to reconstruct them (SPARSE)
  • Unusual events: appear rarely, so a large number of events is needed for reconstruction (DENSE)
• Concisely represent the knowledge of usual events
The Proposed Approach
• Define events in the video
– Sliding window runs through the video
– Spatio-temporal interest points within the same window define an event
• Knowledge of what are usual events
– Stored in the learned dictionary D
• Abnormality of an event
• Update dictionary D
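The event definition above (interest points falling in the same sliding window) can be sketched as follows. The window size, the stride, the overlapping-window layout, and the function name `group_into_events` are assumptions for illustration; the slides do not state them.

```python
from collections import defaultdict


def group_into_events(points, window=(32, 32, 10), stride=(16, 16, 5)):
    """Group interest points into events, one event per sliding window.

    points: iterable of (x, y, t, descriptor) with non-negative integer coordinates.
    Returns {window_origin: [descriptors inside that window]}.
    Window origins are multiples of the stride along each axis (assumed layout).
    """
    events = defaultdict(list)
    for x, y, t, descriptor in points:
        ranges = []
        for coord, w, s in zip((x, y, t), window, stride):
            # A window with origin i*s covers coord when i*s <= coord < i*s + w.
            first = max(0, (coord - w) // s + 1)
            last = coord // s
            ranges.append(range(first, last + 1))
        for i in ranges[0]:
            for j in ranges[1]:
                for k in ranges[2]:
                    origin = (i * stride[0], j * stride[1], k * stride[2])
                    events[origin].append(descriptor)
    return dict(events)


events = group_into_events([(20, 20, 3, "a"), (40, 20, 3, "b")])
```

Because the assumed windows overlap, one interest point can contribute to several events; here both points fall in the window with origin (16, 0, 0).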
(Ab)Normality
• Reconstruction error
• Sparsity regularization
• Smoothness regularization
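The slide names three terms but the formula itself did not survive in this text. One plausible way to combine them, offered only as a sketch, is a squared reconstruction error, an L1 penalty on the coefficients, and a weighted penalty on coefficient differences between pairs of interest points. The function name, the pairwise `weights`, and the trade-off parameters `lam1` and `lam2` are hypothetical.

```python
def dynamic_sparse_objective(xs, alphas, D, weights, lam1=0.1, lam2=0.1):
    """Score an event; a larger value suggests a more unusual event (assumed form).

    xs:      descriptors of the interest points in the event (lists of floats)
    alphas:  one coefficient list per descriptor, len(alphas[j]) == len(D)
    D:       dictionary as a list of atoms (lists of floats)
    weights: {(j, k): w}, hypothetical pairwise weights, e.g. nearness of points j and k
    """
    def reconstruct(alpha):
        return [sum(a * atom[i] for a, atom in zip(alpha, D)) for i in range(len(D[0]))]

    # Reconstruction error: 0.5 * ||x_j - D alpha_j||^2 summed over points.
    error = sum(
        0.5 * sum((xi - ri) ** 2 for xi, ri in zip(x, reconstruct(alpha)))
        for x, alpha in zip(xs, alphas)
    )
    # Sparsity regularization: L1 norm of all coefficients.
    sparsity = sum(abs(a) for alpha in alphas for a in alpha)
    # Smoothness regularization: related points should get similar coefficients.
    smoothness = sum(
        w * sum((a - b) ** 2 for a, b in zip(alphas[j], alphas[k]))
        for (j, k), w in weights.items()
    )
    return error + lam1 * sparsity + lam2 * smoothness


D = [[1.0, 0.0], [0.0, 1.0]]
xs = [[1.0, 0.0], [1.0, 0.0]]
pair = {(0, 1): 1.0}
good = dynamic_sparse_objective(xs, [[1.0, 0.0], [1.0, 0.0]], D, pair, lam1=0.5, lam2=2.0)
bad = dynamic_sparse_objective(xs, [[1.0, 0.0], [0.0, 1.0]], D, pair, lam1=0.5, lam2=2.0)
```

In this toy case the first coding reconstructs both points exactly with matching coefficients, so only the sparsity term contributes; the second coding pays for both reconstruction error and non-smooth coefficients.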
(Ab)Normality (Cont.)
• Empirical demonstration
Work-flow
Optimization
• Learning 𝜶 with Fixed D
• Learning D with Fixed 𝜶
• Online Dictionary Update
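The alternating scheme listed above can be sketched with standard building blocks. The slides do not say which solvers are used, so this sketch swaps in ISTA (iterative soft-thresholding) for the coefficient step and a single projected gradient step for the dictionary step; it also omits the smoothness term for brevity. The step sizes and function names are assumptions.

```python
import math


def reconstruct(D, alpha):
    """Return sum_k alpha[k] * D[k], where D is a list of atoms."""
    return [sum(a * atom[i] for a, atom in zip(alpha, D)) for i in range(len(D[0]))]


def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of the L1 norm)."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def sparse_code(x, D, lam=0.1, step=0.1, iters=200):
    """ISTA for min_alpha 0.5*||x - D alpha||^2 + lam*||alpha||_1 with D fixed.

    step should not exceed 1 / (largest eigenvalue of D^T D).
    """
    alpha = [0.0] * len(D)
    for _ in range(iters):
        residual = [r - xi for r, xi in zip(reconstruct(D, alpha), x)]
        grad = [sum(ri * di for ri, di in zip(residual, atom)) for atom in D]
        alpha = [soft_threshold(a - step * g, step * lam) for a, g in zip(alpha, grad)]
    return alpha


def update_dictionary(x, alpha, D, step=0.1):
    """One gradient step on D with alpha fixed, then rescale atoms to norm <= 1."""
    residual = [r - xi for r, xi in zip(reconstruct(D, alpha), x)]
    new_D = []
    for a, atom in zip(alpha, D):
        moved = [d - step * a * ri for d, ri in zip(atom, residual)]
        norm = math.sqrt(sum(v * v for v in moved))
        new_D.append([v / max(norm, 1.0) for v in moved])
    return new_D


D = [[1.0, 0.0], [0.0, 1.0]]
alpha = sparse_code([1.0, 0.0], D)            # coefficient step, D fixed
new_D = update_dictionary([1.0, 0.0], [0.5, 0.0], D)  # dictionary step, alpha fixed
```

Calling `update_dictionary` once per newly observed event, rather than re-solving over all past events, is the sense in which such an update is online.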
Video Data
• Subway Surveillance Videos
– Subway exit: 43 minutes, 65K frames
  • Usual events: people exiting subway
  • Unusual events: entering subway, loitering, etc.
– Subway entrance: 1 hour 36 minutes, 144K frames
  • Usual events: people entering subway
  • Unusual events: exiting subway, no payment, loitering
• YouTube Videos: 8 short videos
– Different camera motion (rotation, zoom in/out, fast tracking, slow motion, etc.)
– Different categories of targets (human, vehicles, animals, etc.)
– Wide variety of activities and environmental conditions (indoor, outdoor)
Learned Dictionary
• Subway exit surveillance video
• Subway entrance surveillance video
Quantitative Comparison
• Subway exit surveillance video
• Subway entrance surveillance video
Analysis Experiment
• Online Update of the Learned Dictionary
– Our approach: update learned dictionary after observing new event
– Comparing method: fixed dictionary
Detected Unusual Events
• Subway Exit
Detected Unusual Events (Cont.)
• Subway entrance
Detected Unusual Events (Cont.)
• YouTube Videos
– For each video, approximately the first 1/5 of video data is used to learn initial dictionary
– Unusual event detection is carried out in the remaining video
– Red boxes represent sliding windows that result in unusual event detection
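The protocol above (learn an initial dictionary on roughly the first fifth, then detect on the rest while updating after each event) can be sketched as a generic loop. The callables `learn_initial`, `score`, and `update`, and the use of a fixed `threshold`, are assumptions for illustration; the toy model below is a running mean, not a dictionary.

```python
def detect_online(events, learn_initial, score, update, threshold, warmup_fraction=0.2):
    """Return indices of events flagged as unusual.

    The first warmup_fraction of events builds the initial model; each later
    event is scored against the current model, then used to update it.
    """
    n_warmup = max(1, int(len(events) * warmup_fraction))
    model = learn_initial(events[:n_warmup])
    flagged = []
    for index in range(n_warmup, len(events)):
        event = events[index]
        if score(event, model) > threshold:
            flagged.append(index)
        model = update(model, event)  # update after observing each new event
    return flagged


# Toy stand-in: events are numbers and the model is a running (sum, count).
stream = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0]
flagged = detect_online(
    stream,
    learn_initial=lambda warmup: (sum(warmup), len(warmup)),
    score=lambda e, m: abs(e - m[0] / m[1]),
    update=lambda m, e: (m[0] + e, m[1] + 1),
    threshold=3.0,
)
```

Replacing `update` with a function that returns the model unchanged gives the fixed-model baseline against which an online update can be compared.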
Conclusions
• Fully unsupervised dynamic sparse coding approach for detecting unusual events in videos
• The dictionary of bases is updated online as the algorithm observes more data, which lets it adapt to concept drift