Particle Dynamics and Multi-Channel Feature Dictionaries for Robust Visual Tracking
Srikrishna Karanam, Yang Li, Rich Radke
Dept. of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute, Troy, NY
Compressive sensing tracking
Feature dictionary $A = [\,t_1\ \ t_2\ \ \cdots\ \ t_n\,]$
X. Mei and H. Ling, Robust visual tracking using $\ell_1$ minimization, ICCV 2009.
Compressive sensing tracking
Current state
$s_{t+1} = s_t + \mathcal{N}(0,1)\,u_0$, with $u_0 = \mathrm{diag}(\sigma_0)$, which defines $p(s_{t+1} \mid s_t)$
Hypotheses
X. Mei and H. Ling, Robust visual tracking using $\ell_1$ minimization, ICCV 2009.
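The hypothesis generation step on this slide can be sketched in a few lines. This is a minimal sketch, assuming the state is a 6-parameter affine vector and that the noise scale is a per-dimension standard deviation applied element-wise; the function name `sample_hypotheses` and all numeric values are illustrative and do not come from the slides.

```python
import numpy as np

def sample_hypotheses(s_t, sigma0, n_particles, rng):
    """Draw candidate states s_{t+1} = s_t + N(0, 1) * sigma0 (element-wise)."""
    s_t = np.asarray(s_t, dtype=float)
    sigma0 = np.asarray(sigma0, dtype=float)
    # One row of standard normal noise per particle, scaled per state dimension.
    noise = rng.standard_normal((n_particles, s_t.size))
    return s_t + noise * sigma0

rng = np.random.default_rng(0)
# Hypothetical affine state: four deformation terms followed by x/y translation.
s_t = np.array([1.0, 0.0, 0.0, 1.0, 120.0, 80.0])
sigma0 = np.array([0.01, 0.001, 0.001, 0.01, 4.0, 4.0])  # assumed noise scales
particles = sample_hypotheses(s_t, sigma0, 500, rng)
```

Each row of `particles` is one hypothesis to be scored in the testing stage.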
Compressive sensing tracking
Hypothesis testing
$y = Ax + e$, with sparse $x$ and $e$
X. Mei and H. Ling, Robust visual tracking using $\ell_1$ minimization, ICCV 2009.
Contributions
APPEARANCE MODEL
• Multi-channel feature dictionaries
  • Image intensity
  • Image gradient magnitude
  • Histograms of Oriented Gradients
HYPOTHESIS GENERATION
• Particle filter
• Adaptive variance Gaussian state transition model
HYPOTHESIS TESTING
• $\ell_1$ minimization
• Probabilistic reasoning
• Adaptive filtering
• Dictionary update
Appearance model
• Intensity
• Normalized gradient magnitude
• Histograms of Oriented Gradients (HOG)
[Figure: one feature dictionary per channel (intensity, normalized gradient, HOG), each with its own $\ell_1$ minimization]
J. Wright and Y. Ma, Dense error correction via $\ell_1$ minimization, IEEE Trans. on Info. Theory, 2009.
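One way to picture the three channels is to compute a feature vector per channel for every template and stack the vectors as dictionary columns. The sketch below makes several assumptions that the slides do not state: features are scaled to unit $\ell_2$ norm, the gradient uses `np.gradient`, and the HOG channel is replaced by a simplified per-cell orientation histogram rather than a full HOG implementation. All function names are hypothetical.

```python
import numpy as np

def _unit(v, eps=1e-12):
    """Scale a vector to (approximately) unit l2 norm."""
    return v / (np.linalg.norm(v) + eps)

def intensity_feature(patch):
    return _unit(np.asarray(patch, dtype=float).ravel())

def gradient_feature(patch):
    gy, gx = np.gradient(np.asarray(patch, dtype=float))
    return _unit(np.hypot(gx, gy).ravel())

def orientation_histogram_feature(patch, cell=8, bins=9):
    """Simplified stand-in for HOG: magnitude-weighted orientation histogram per cell."""
    patch = np.asarray(patch, dtype=float)
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientation
    h, w = patch.shape
    feats = []
    for r in range(0, h - cell + 1, cell):
        for c in range(0, w - cell + 1, cell):
            hist, _ = np.histogram(ang[r:r + cell, c:c + cell], bins=bins,
                                   range=(0.0, np.pi),
                                   weights=mag[r:r + cell, c:c + cell])
            feats.append(hist)
    return _unit(np.concatenate(feats))

def build_dictionary(templates, feature_fn):
    """Stack one feature vector per template as the columns of A."""
    return np.column_stack([feature_fn(t) for t in templates])

rng = np.random.default_rng(0)
templates = [rng.random((32, 32)) for _ in range(10)]  # synthetic stand-in templates
dictionaries = {
    "intensity": build_dictionary(templates, intensity_feature),
    "gradient": build_dictionary(templates, gradient_feature),
    "hog": build_dictionary(templates, orientation_histogram_feature),
}
```

Keeping one dictionary per channel lets each channel be solved and weighted separately later on.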
Hypothesis generation - Transition model
• Contribution - Dynamic state transition model
• $s_t$ - state with highest observation probability
• $\hat{s}_t$ - estimated using past states
[Figure: past state vectors $s_{t-n},\ s_{t-n+1},\ \dots,\ s_{t-1}$, the current state $s_t$, the estimate $\hat{s}_t$, and $\sigma_{t+1}$]
X. Mei and H. Ling, Robust visual tracking using $\ell_1$ minimization, ICCV 2009.
C. Bao et al., Real-time robust $\ell_1$ tracker using accelerated proximal gradient approach, CVPR 2012.
Z. Hong et al., Tracking via robust multi-task multi-view joint sparse representation, ICCV 2013.
Hypothesis generation - Transition model
• Contribution - Dynamic state transition model
• $s_t$ - state with highest observation probability
• $\hat{s}_t$ - estimated using dynamics of past states
• $a_1, \dots, a_n$ to be computed
  • Hankel matrix
  • Least squares minimization
$s_{t+1} = s_t + \mathcal{N}(0,1)\,\sigma_{t+1}$  (1)
$\sigma_{t+1} = \max\left(\min\left(\sigma_0 \sqrt{r_t},\ \sigma_{\max}\right),\ \sigma_{\min}\right)$  (2)
$r_t = \sum_{i=1}^{3} r_t^i$  (3)
$r_t^i = \lVert \tilde{y}_{s_t}^i - y_{s_t}^i \rVert_2$  (4)
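The adaptive variance rule can be illustrated with a short sketch. It assumes the update $\sigma_{t+1} = \max(\min(\sigma_0 \sqrt{r_t}, \sigma_{\max}), \sigma_{\min})$, with $r_t$ the sum of per-channel $\ell_2$ residuals; the function names and the numeric bounds below are hypothetical.

```python
import numpy as np

def channel_residual(y_tilde, y):
    """l2 distance between two feature vectors of one channel."""
    return float(np.linalg.norm(np.asarray(y_tilde) - np.asarray(y)))

def adaptive_sigma(sigma0, residuals, sigma_min, sigma_max):
    """Scale the base noise by sqrt(r_t) and clamp to [sigma_min, sigma_max]."""
    r_t = float(sum(residuals))  # sum over the three channels
    return np.clip(np.asarray(sigma0, dtype=float) * np.sqrt(r_t),
                   sigma_min, sigma_max)

sigma0 = np.array([1.0, 2.0])   # assumed base noise scales
residuals = [0.25, 0.25, 0.5]   # assumed per-channel residuals, r_t = 1
sigma_next = adaptive_sigma(sigma0, residuals, sigma_min=0.5, sigma_max=1.5)
```

A large residual widens the search around the current state, and the clamp keeps the spread within fixed limits.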
$s_t = a_1 s_{t-1} + a_2 s_{t-2} + \cdots + a_n s_{t-n}$  (5)
$H_s^{\,t-n-1,\,n}\, a = [\,s_{n+1}\ \ s_{n+2}\ \ \cdots\ \ s_{n+(t-n-1)}\,]^{T}$  (6)
M. Ayazoglu et al., Dynamic subspace-based coordinated multicamera tracking, ICCV 2011.
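The Hankel least-squares step can be sketched for a single state dimension. This is a minimal sketch, assuming the regressor $s_t = a_1 s_{t-1} + \cdots + a_n s_{t-n}$ and a Hankel matrix whose rows are sliding windows of past states; the function names are hypothetical, and a multi-dimensional state would apply the same steps per dimension.

```python
import numpy as np

def fit_ar_coefficients(s, n):
    """Least-squares fit of a_1..a_n from past states s_1..s_{t-1}."""
    s = np.asarray(s, dtype=float)
    rows = len(s) - n
    if rows < 1:
        raise ValueError("need more than n past states")
    # Row j holds [s_j, ..., s_{j+n-1}]; its target is s_{j+n}.
    H = np.array([s[j:j + n] for j in range(rows)])
    target = s[n:]
    c, *_ = np.linalg.lstsq(H, target, rcond=None)
    # c is ordered oldest-first; reverse so a[0] multiplies the most recent state.
    return c[::-1]

def predict_next(s, a):
    """Predicted state: a_1 * s_{t-1} + ... + a_n * s_{t-n}."""
    s = np.asarray(s, dtype=float)
    n = len(a)
    recent = s[-1:-n - 1:-1]  # s_{t-1}, s_{t-2}, ..., s_{t-n}
    return float(np.dot(a, recent))

past = np.arange(1.0, 9.0)  # synthetic constant-velocity track 1, 2, ..., 8
a = fit_ar_coefficients(past, n=2)
s_hat = predict_next(past, a)
```

For this constant-velocity track the fit gives $s_t = 2 s_{t-1} - s_{t-2}$, so the prediction continues the line.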
Hypothesis generation - Particle filtering
$n = \dfrac{\chi^2_{k-1,\,1-\delta}}{2\epsilon}$
• Related approaches - $n$ fixed (400-600)
• Dynamic model + adaptive candidate filtering
D. Fox, KLD-Sampling: Adaptive Particle Filters, NIPS 2001.
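The KLD-sampling bound can be evaluated without a statistics package. The sketch below approximates the chi-square quantile with the Wilson-Hilferty transformation, a standard approximation that avoids an exact quantile routine; here $k$ is the number of occupied histogram bins, $\epsilon$ the error bound, and $\delta$ the failure probability. The function name and the example values are hypothetical.

```python
import math
from statistics import NormalDist

def kld_particle_count(k, epsilon, delta):
    """Particles needed so the KL error stays below epsilon with prob. 1 - delta."""
    if k < 2:
        return 1  # a single occupied bin gives no degrees of freedom
    z = NormalDist().inv_cdf(1.0 - delta)  # upper standard normal quantile
    a = 2.0 / (9.0 * (k - 1))
    # Wilson-Hilferty approximation of the chi-square quantile with k - 1 dof.
    chi2_quantile = (k - 1) * (1.0 - a + math.sqrt(a) * z) ** 3
    return math.ceil(chi2_quantile / (2.0 * epsilon))

n_small = kld_particle_count(k=10, epsilon=0.05, delta=0.01)
n_large = kld_particle_count(k=50, epsilon=0.05, delta=0.01)
```

The count grows with the number of occupied bins, so a diffuse particle set receives more samples than a concentrated one.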
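Hypothesis testing
$\min_{x,e}\ \lVert x \rVert_1 + \lVert e \rVert_1 \quad \text{s.t.} \quad y = Ax + e$
$\min_{x,e}\ L(x, e, \lambda)$
$x_{k+1} = \arg\min_x L(x, e_k, \lambda_k)$  (FISTA)
$e_{k+1} = \arg\min_e L(x_{k+1}, e, \lambda_k)$  (analytic)
$\lambda_{k+1} = \lambda_k + \mu\,(y - A x_{k+1} - e_{k+1})$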
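The alternating scheme can be sketched with NumPy. The slide does not spell out $L$, so this sketch assumes the standard augmented Lagrangian $L(x, e, \lambda) = \lVert x \rVert_1 + \lVert e \rVert_1 + \lambda^T (y - Ax - e) + \tfrac{\mu}{2} \lVert y - Ax - e \rVert_2^2$ with a fixed $\mu$. The $x$-step runs a few FISTA iterations and the $e$-step is a closed-form soft threshold. Function names, iteration counts, and the synthetic data are hypothetical.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista_l1(A, b, mu, x0, n_iter=30):
    """Approximately minimise ||x||_1 + (mu / 2) * ||A x - b||_2^2."""
    lip = mu * np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the smooth part
    x, z, t = x0.copy(), x0.copy(), 1.0
    for _ in range(n_iter):
        grad = mu * A.T @ (A @ z - b)
        x_new = soft_threshold(z - grad / lip, 1.0 / lip)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum step
        x, t = x_new, t_new
    return x

def alm_l1(A, y, mu=10.0, n_outer=100):
    """Alternate x-step (FISTA), e-step (analytic), and multiplier update."""
    m, n = A.shape
    x, e, lam = np.zeros(n), np.zeros(m), np.zeros(m)
    for _ in range(n_outer):
        x = fista_l1(A, y - e + lam / mu, mu, x)             # x_{k+1}
        e = soft_threshold(y - A @ x + lam / mu, 1.0 / mu)   # e_{k+1}
        lam = lam + mu * (y - A @ x - e)                     # lambda_{k+1}
    return x, e

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 15))
A /= np.linalg.norm(A, axis=0)            # unit-norm dictionary columns
x_true = np.zeros(15)
x_true[[2, 7]] = [1.0, 0.5]               # sparse coefficients
e_true = np.zeros(30)
e_true[5] = 2.0                           # sparse corruption
y = A @ x_true + e_true
x_hat, e_hat = alm_l1(A, y)
```

Warm-starting FISTA from the previous `x` keeps each inner solve short; a production solver would also add a stopping test on the constraint residual.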
Hypothesis
• Intensity
• Norm. gradient
• HOG
$\ell_1$ min in each channel: $y^i \rightarrow x^i,\ e^i$
Highest observation probability
$p(y_t \mid s_t) = \exp\left(-\sum_{i=1}^{3} \alpha_i \lVert A^i x^i - y_t^i \rVert_2^2\right)$
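The scoring rule can be sketched as a weighted sum of per-channel reconstruction errors passed through an exponential, followed by an argmax over hypotheses. The weights, dimensions, and function names below are assumptions for illustration; the sparse codes would normally come from the per-channel $\ell_1$ solves.

```python
import numpy as np

def observation_probability(dictionaries, codes, features, alphas):
    """exp(-sum_i alpha_i * ||A_i x_i - y_i||_2^2) over the feature channels."""
    total = 0.0
    for A_i, x_i, y_i, alpha_i in zip(dictionaries, codes, features, alphas):
        total += alpha_i * float(np.sum((A_i @ x_i - y_i) ** 2))
    return float(np.exp(-total))

def best_hypothesis(probabilities):
    """Index of the hypothesis with the highest observation probability."""
    return int(np.argmax(probabilities))

rng = np.random.default_rng(0)
dims = (20, 30, 40)                               # assumed channel feature lengths
dictionaries = [rng.standard_normal((d, 5)) for d in dims]
codes = [rng.random(5) for _ in dims]
alphas = [1.0, 0.5, 0.5]                          # assumed channel weights
good = [A @ x for A, x in zip(dictionaries, codes)]  # perfectly reconstructed
bad = [y + 0.5 for y in good]                        # shifted features
p_good = observation_probability(dictionaries, codes, good, alphas)
p_bad = observation_probability(dictionaries, codes, bad, alphas)
winner = best_hypothesis([p_good, p_bad])
```

A hypothesis whose features are reconstructed well in all three channels scores near 1, and the tracker keeps the top-scoring state.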
Data
• Publicly available standard test sequences
Y. Wu et al., Online object tracking: a benchmark, CVPR 2013.
Evaluation metrics
• Success plot
  • Overlap precision vs. overlap threshold
• Robustness tests
  • Temporal robustness test
  • Spatial robustness test
Experimental Results - Overall Success Plot
• Ideally, close to 1
Experimental Results - Robustness tests
• Temporal robustness evaluation
• Spatial robustness evaluation
Experimental Results
• Validating key components
  • Choice of features
  • Choice of transition model
  • Adaptive candidate filtering
Speed
Method   Speed (fps)   Template size   Average distance precision   Average AUC
Ours*    2.5           64 x 64         0.92                         0.69
L1*      8.2           12 x 15         0.47                         0.36
MTT*     0.4           32 x 32         0.60                         0.42
ONDL*    0.5           32 x 32         0.79                         0.59
SCM*     0.05          32 x 32         0.72                         0.59
ASLA*    0.7           32 x 32         0.73                         0.59
LSH      7             -               0.70                         0.57
LOT      0.2           -               0.53                         0.31
SPT      0.1           -               0.49                         0.29
MIL      8.5           -               0.56                         0.45
IVT      6.5           -               0.61                         0.46
* - based on sparse visual representation.
This material is based upon work supported by the U.S. Department of Homeland Security, Science and Technology Directorate, Office of University Programs, under Award 2013-ST-061-ED0001. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the U.S. Department of Homeland Security.
Conclusions
• Multi-channel features
• Particle dynamical information
• Adaptive filtering
Thank you! Questions?