people counting in low density video sequences2
TRANSCRIPT
People Counting in Low Density Video Sequences31
Application
Motivation
Diagrams show the sequence the algorithm
Detection and Segmentation of Foreground Objects
Tracking Objects
Head Detection and Feature Extraction
Classifier
Conclusion
Application 32• SecurityAutomatically detect if a person enters a restricted area.In a surveillance system , helps to filter the information coming
from the cameras.
• Advertising companies. Know if person tend to stop near billboard. Helps choosing the best place to advertise. Lets advertisers know if a certain advertisement is having the
expected impact on the public• Shop OwnersKnowing the average of client in the store(hourly , daily , Monthly….).Evaluate if the display window catches the eye of people passing by
Motivation 33• Difficulties
Extensive human monitoring of the incoming video channels is
impractical, expensive and ineffective
Storing a large capacity of data is problematic
• Solution
Automatic systems that trigger recording or video transmission and
attract the attention of a human observer to a particular video
channel.
Static scenes - Motion detection
Dynamic scenes - Abnormal motion detection
Detection and Segmentation of Foreground Objects 35
• the main limitation of such techniques refers to the presence of noise mainly due to the variations in the scene I
illumination
Shadows
spurious generated by video compression algorithms.
Detection and Segmentation of Foreground Objects CONT( ) 36 Basic idea for segmentation is
Background subtraction technique is one of such segmentation approaches that is very sensitive to illumination
The resulting image is a set of pixels from the object in motion. Morphological operations such as dilation and erosion are employed to assure that these pixels make up a single object. In such a way, now a single object called blob which has all its pixels connected.
Tracking Objects 37• The tracking of the objects in the scene and the prediction of its position in the scene is done by cost function below:
Cost = (wP*dP) + (wS*dS) + (wD*dD) + dT
• Where wP, wS, and wD are weights that sum to one.
• dP is the Euclidean distance in pixels between the object centers.
• dS is the size difference between the bounding boxes.
• dD is the difference between the center of the region of movement and the center of the object,
• dT is the difference of the time to live (TTL) of the object.
Tracking Objects 38• an object may be occluded and the Lucas-Kanade algorithm will fail to predict the movement. In this case, the movement of such an object is predicted as:
Os = S*Os + (1-S) * (Rc -Oc )
• where S is a fixed value of the speed.
• The region of movement is Rc .
• should be the closest region to the object, respecting the proximity threshold.
Then, the new position of the object and its bounding box are computed as:
• Oc = Oc + Os
• Or = Or + Os
Head Detection and Feature Extraction 310• Two strategies were proposed to tackle the problem of estimating the
number of persons enclosed by the blobs.
• The head region is computed by splitting the blob height into four zones.
The first strategy employs two thresholds that are learned from the width and area of the blobs.
The second strategy, a zoning scheme divides the head region into ten vertical zones of equal size. From each vertical zone is computed the number of foreground pixels divided by the sub region total area.
Classifier 314• Our problems are if many object in a counting zone.
• The counting algorithm starts to analyze the segmented objects (blobs) when they enter in a counting zone.
The first strategy, a person is added to the counting if the analyzed blob does not have a width greater than the width threshold estimated.
Otherwise, additional information based on the blob head region area is used to estimate the number of persons enclosed by the blob.
Classifier COUNT… 315 The second strategy has two steps: training and classification.
The training, a database with reference feature vectors is build from the analysis of blobs in motion in the frames of videos.
Each reference feature vector receives a label to indicate if it encloses (one, two, or three) persons.
The labels are assigned by human operators at the calibration/training step.
The classification consists of, given a blob in motion.
a set V of feature vectors is extracted. The number of vectors in the V set varies according to the period of time the blob takes to pass through the counting zone,
• the Euclidean distance among each feature vector in V is defined as follows:
Conclusion 316• In this presentation we have presented an approach for counting
people that pass through a virtual counting zone and which are gathered by a general purpose CCTV camera.
• One of the proposed strategies is able to count with a relative accuracy the number of persons even when they are very close or when they are in groups.
• we have achieved correct counting rates up to 85% on video clips captured in a non- controlled environment.