Who is who at different cameras: people re-identification using depth cameras Belén Castillón Fernández de Pinedo


Post on 21-Nov-2015


DESCRIPTION

People re-identification using depth cameras.

TRANSCRIPT


Outline

1. Introduction
2. System description
   2.1 Kinect sensor
   2.2 Camera calibration
   2.3 Height maps
   2.4 Segmentation
   2.5 Tracking
3. Bodyprints
   3.1 Extraction
   3.2 Matching people
4. Results and discussion
5. Conclusions

Goal: obtain a feature vector per person, called a bodyprint; these vectors can be matched to solve the re-identification problem. Bodyprints are obtained using calibrated depth-colour cameras such as the Microsoft Kinect.

Main problem: in multi-camera systems it is difficult to re-identify people who leave one camera's view and enter another (or the same one again) after a period of time.

2. System description

2.1 Kinect sensor

The Kinect provides aligned RGB and depth images. It has a colour (RGB) camera and an infrared (IR) camera, plus an IR pattern projector that, together with the IR camera, determines depth.

It also segments people and can estimate their position.

The sensor measures depth from a minimum distance of around 1 m up to a maximum of about 10 m, and works indoors only.

2.2 Camera calibration

The camera coordinate system has its origin at the optical centre of the camera and is aligned with the camera axes. Calibration allows changing from camera coordinates to world coordinates.

The spatial camera coordinate equations for every pixel (shown on the slide) provide the 3D coordinates; the z_world coordinate represents the height.

Transformation between both systems (equation on the slide).
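The slide equations are not reproduced in this transcript. As a sketch, the standard pinhole back-projection followed by a rigid camera-to-world transform looks like this; the intrinsics (fx, fy, cx, cy) and the transform (R, t) are placeholders, not values from the paper:

```python
import numpy as np

def pixel_to_camera(u, v, z, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth z into 3D camera coordinates."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

def camera_to_world(p_cam, R, t):
    """Rigid transform from camera coordinates to world coordinates."""
    return R @ p_cam + t

# A pixel at the principal point lies on the optical axis:
p = pixel_to_camera(319.5, 239.5, 2.0, 525.0, 525.0, 319.5, 239.5)
```

With the identity transform, world coordinates equal camera coordinates, which is a quick sanity check before plugging in real calibration values.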

2.2 Camera calibration (ground)

To determine the ground plane, simply select a portion of the RGB-D image that corresponds to the ground. The z_world axis will be normal to this plane.
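One common way to recover the ground normal from the selected pixels is a least-squares plane fit; this is a sketch of that idea (not necessarily the authors' exact procedure), using the SVD of the centred points:

```python
import numpy as np

def fit_ground_plane(points):
    """Fit a plane to 3D points; return its unit normal and a point on it.

    The normal is the direction of least variance of the point cloud,
    i.e. the last right-singular vector of the centred points.
    """
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    if normal[2] < 0:          # orient the normal to point upwards
        normal = -normal
    return normal, centroid
```

The returned normal defines the z_world axis, and the centroid fixes the ground at height zero.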

2.3 Height maps

Height maps give a virtual aerial view of the scene: they are images where pixel values represent the height with respect to the ground. This makes segmentation easier.
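A minimal sketch of building such a map: project the world-coordinate points onto a top-down grid and keep the maximum height per cell. The grid extent and cell size here are assumptions, not values from the paper:

```python
import numpy as np

def height_map(points_world, x_range, y_range, cell=0.05):
    """Top-down grid where each cell stores the maximum z_world seen in it."""
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)
    hm = np.zeros((ny, nx))
    for x, y, z in points_world:
        i = int((y - y_range[0]) / cell)   # grid row
        j = int((x - x_range[0]) / cell)   # grid column
        if 0 <= i < ny and 0 <= j < nx:
            hm[i, j] = max(hm[i, j], z)
    return hm
```

People then show up as compact blobs of head-height values, which is what makes the subsequent segmentation step easier.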

2.4 Segmentation

Segmentation tells us which pixels in the original image belong to each particular person.

2.5 Tracking

Tracking is the process of linking the segmentation results from several frames; a track denotes a thread of linked objects corresponding to the same person.

Key idea: to match people, we extract a feature vector per track, which we call a bodyprint. Each bodyprint summarises the colour appearance at each height for a track.
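The transcript does not detail how detections are linked across frames. A minimal greedy linker, assuming each segmented person is represented by a ground-plane centroid in metres, could look like:

```python
import numpy as np

def link_tracks(prev_centroids, curr_centroids, max_dist=0.5):
    """Greedily link current detections to previous ones by nearest centroid.

    Returns a dict mapping current index -> previous index; detections
    left unmatched would start new tracks.
    """
    links = {}
    used = set()
    for ci, c in enumerate(curr_centroids):
        best, best_d = None, max_dist
        for pi, p in enumerate(prev_centroids):
            d = np.linalg.norm(np.asarray(c) - np.asarray(p))
            if pi not in used and d < best_d:
                best, best_d = pi, d
        if best is not None:
            links[ci] = best
            used.add(best)
    return links
```

The 0.5 m gating threshold is an assumed value; a real system would tune it to the frame rate and walking speed.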

Algorithm: height is discretised in steps of 2 cm. At each time t, the mean RGB value at each height is computed to obtain the temporal signatures.
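The 2 cm discretisation is stated on the slide; the maximum height and bin handling below are assumptions. A sketch of computing one frame's temporal signature from a person's pixels:

```python
import numpy as np

def temporal_signature(heights, colours, step=0.02, max_h=2.2):
    """Mean RGB per height bin for one frame's person pixels.

    heights: (N,) heights above ground in metres.
    colours: (N, 3) RGB values of the same pixels.
    Returns the per-bin mean colour and the per-bin pixel count.
    """
    nbins = int(max_h / step)
    sig = np.zeros((nbins, 3))
    count = np.zeros(nbins)
    bins = np.clip((np.asarray(heights) / step).astype(int), 0, nbins - 1)
    for b, c in zip(bins, colours):
        sig[b] += c
        count[b] += 1
    nz = count > 0
    sig[nz] /= count[nz, None]   # mean RGB where at least one pixel fell
    return sig, count
```

Bins with a zero count correspond to heights not observed in that frame (e.g. occluded), which the count vector records for later weighting.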

3. Bodyprints


3.1 Extraction

We obtain bodyprints by averaging the temporal signatures over time (equation on the slide). The bodyprint vector RGB_k(h) describes the appearance of the person; the count vector C_k(h) measures the reliability of the values of the bodyprint.
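Since the averaging equation itself is only on the slide, here is a sketch consistent with the description: a count-weighted average of the per-frame signatures, accumulating C_k(h) as the total pixel count per height:

```python
import numpy as np

def bodyprint(signatures, counts):
    """Average per-frame signatures over time, weighting bins by pixel count.

    signatures: (T, H, 3) per-frame mean colours.
    counts:     (T, H)    per-frame pixel counts per height bin.
    Returns RGB_k(h) and C_k(h) for the track.
    """
    total = counts.sum(axis=0)                       # C_k(h)
    weighted = (signatures * counts[..., None]).sum(axis=0)
    rgb = np.zeros_like(weighted)
    nz = total > 0
    rgb[nz] = weighted[nz] / total[nz, None]         # RGB_k(h)
    return rgb, total
```

Heights never observed across the whole track keep a zero count, so the matching step can ignore them.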

3.2 Matching people

To compare bodyprints we propose a normalised weighted correlation coefficient. To compare bodyprints j and k we use a weight W(h) that allows comparing bodyprints with missing values (e.g. due to occlusions). We also compute a weighted mean for each track, which is used to compensate for changes in brightness, and finally the correlation (equations on the slide).
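The exact definition of W(h) is only on the slide; the sketch below assumes it as the elementwise minimum of the two count vectors, which is zero wherever either bodyprint has missing data. The weighted means play the brightness-compensation role described above:

```python
import numpy as np

def match_score(rgb_j, c_j, rgb_k, c_k):
    """Normalised weighted correlation between two bodyprints (H, 3) each.

    W(h) is assumed to be min(C_j(h), C_k(h)), so heights missing from
    either bodyprint (occlusions) contribute nothing.
    """
    w = np.minimum(c_j, c_k).astype(float)
    if w.sum() == 0:
        return 0.0
    w /= w.sum()
    mu_j = (w[:, None] * rgb_j).sum(axis=0)   # weighted means compensate brightness
    mu_k = (w[:, None] * rgb_k).sum(axis=0)
    dj, dk = rgb_j - mu_j, rgb_k - mu_k
    num = (w[:, None] * dj * dk).sum()
    den = np.sqrt((w[:, None] * dj ** 2).sum() * (w[:, None] * dk ** 2).sum())
    return num / den if den > 0 else 0.0
```

Identical bodyprints score 1.0; at query time, each probe track is matched to the gallery track with the highest score.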


4. Results and discussion

Experiment 1: people recorded by camera 1 are searched for across videos recorded by camera 2.

One camera captures people entering a shop and the other captures people at the exit (front1-front2 and rear1-rear2 views).

The re-identification performance obtained is 93%.

Front-front and rear-rear matching examples (figures on the slides).

Example of a wrong match: the correct match had the second highest correlation coefficient and it was very similar to the highest (0.87345 and 0.87212).


Experiment 2: people are re-identified using the same camera.

The key difference compared with the previous experiment is that frontal and rear views are now compared.

The average correct re-identification obtained in this experiment drops to 55%.

Problems in re-identification: presence of logos on T-shirts, backpacks.

5. Conclusions

The method has proved to be robust against differences in illumination, point of view, and momentary partial occlusions.

Errors:
- Similar appearance of two different people.
- Different appearance of the same person from the point of view of each camera.

Solutions:
- More complex models can be used.
- Models that take into account the relative angular position with respect to the person's axis.