
Particle Dynamics and Multi-Channel Feature Dictionaries for Robust Visual Tracking

Srikrishna Karanam, Yang Li, Rich Radke
Dept. of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute, Troy NY


Compressive sensing tracking

Feature dictionary $A = [t_1\ t_2\ \cdots\ t_n]$

X. Mei and H. Ling, Robust visual tracking using ℓ1 minimization, ICCV 2009.



Current state

$s_{t+1} = s_t + \mathcal{N}(0,1)\,u_0$, $u_0 = \mathrm{diag}(\sigma_0)$

$p(s_{t+1} \mid s_t)$

Hypotheses
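A minimal sketch of how hypotheses could be drawn from the fixed-variance Gaussian transition model on this slide. It assumes NumPy and a state stored as a 1-D parameter vector; the function name `propagate` and the example values are hypothetical, not taken from the talk.

```python
import numpy as np

def propagate(s_t, sigma0, num_hypotheses, rng):
    """Draw hypotheses s_{t+1} = s_t + N(0, 1) u_0 with u_0 = diag(sigma_0)."""
    u0 = np.diag(sigma0)
    # One row of standard normal noise per hypothesis.
    noise = rng.standard_normal((num_hypotheses, len(s_t)))
    return s_t + noise @ u0

# Hypothetical 3-parameter state; the third parameter is held fixed (sigma = 0).
rng = np.random.default_rng(0)
hypotheses = propagate(np.array([10.0, 20.0, 1.0]),
                       np.array([2.0, 2.0, 0.0]), 500, rng)
```

Each row of `hypotheses` is one candidate state to be scored in the hypothesis testing step.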




Hypothesis Testing

$y = Ax + e$ (sparse $x$, $e$)


Contributions

APPEARANCE MODEL
• Multi-channel feature dictionaries
• Image intensity
• Image gradient magnitude
• Histograms of Oriented Gradients

HYPOTHESIS GENERATION
• Particle filter
• Adaptive variance Gaussian State Transition Model

HYPOTHESIS TESTING
• ℓ1 minimization
• Probabilistic reasoning
• Adaptive filtering
• Dictionary update


Appearance model
• Intensity
• Normalized gradient magnitude
• Histograms of Oriented Gradients

[Figure: Intensity, Norm. Gradient, and HOG channels]

J. Wright and Y. Ma, Dense error correction via ℓ1-minimization, IEEE Trans. on Info. Theory, 2009.
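As an illustration of the three channels, the sketch below turns one grayscale patch into an intensity vector, a normalized gradient magnitude vector, and a simplified cell-wise orientation histogram. The histogram is only a stand-in for a full HOG descriptor (no block normalization), and the cell size, bin count, and function names are assumptions made for the example.

```python
import numpy as np

def _unit(v):
    # Scale to unit l2 norm so channels are comparable across patches.
    return v / (np.linalg.norm(v) + 1e-12)

def multi_channel_features(patch, cell=8, bins=9):
    """Return one feature vector per channel for a 2-D grayscale patch."""
    patch = patch.astype(float)
    gy, gx = np.gradient(patch)            # derivatives along rows, columns
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientation
    bin_idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)

    # Simplified HOG: magnitude-weighted orientation histogram per cell.
    h, w = patch.shape
    hists = []
    for r in range(0, h - cell + 1, cell):
        for c in range(0, w - cell + 1, cell):
            hists.append(np.bincount(
                bin_idx[r:r + cell, c:c + cell].ravel(),
                weights=mag[r:r + cell, c:c + cell].ravel(),
                minlength=bins))

    return {
        "intensity": _unit(patch.ravel()),
        "gradient": _unit(mag.ravel()),
        "hog": _unit(np.concatenate(hists)),
    }
```

Stacking the vectors of one channel over all templates as columns would give that channel's dictionary, in the spirit of $A = [t_1\ t_2\ \cdots\ t_n]$.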


Hypothesis generation – Transition model

• Contribution – Dynamic state transition model

• State with highest observation probability

• State estimated using past states

[Figure: past state vectors $s(t-n)$, $s(t-n+1)$, …, $s(t-1)$; $\hat{s}(t)$; $e(t)$; $\sigma_{t+1}$]

X. Mei and H. Ling, Robust visual tracking using ℓ1 minimization, ICCV 2009.
C. Bao et al., Real-time robust tracker using accelerated proximal gradient approach, CVPR 2012.
Z. Hong et al., Tracking via robust multi-task multi-view joint sparse representation, ICCV 2013.


Hypothesis generation – Transition model
• Contribution – Dynamic state transition model
• State with highest observation probability
• State estimated using dynamics of past states
• $c$ to be computed
• Hankel matrix
• Least squares minimization

$s_{t+1} = s_t + \mathcal{N}(0,1)\,\sigma_{t+1}$ (1)

$\sigma_{t+1} = \max(\min(\sigma_0 \sqrt{e_t},\ \sigma_{max}),\ \sigma_{min})$ (2)

$e_t = \sum_{j=1}^{3} e_t^j$ (3)

$e_t^j = \lVert y^j_{\tilde{s}_t} - y^j_{s_t} \rVert_2$ (4)

$s_t = c_1 s_{t-1} + c_2 s_{t-2} + \cdots + c_n s_{t-n}$ (5)

$H_s^{t-n-1,\,n}\, c^T = [s_{n+1}\ s_{n+2}\ \cdots\ s_{n+(t-n-1)}]^T$ (6)

M. Ayazoglu et al., Dynamic subspace-based coordinated multicamera tracking, ICCV 2011.
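The pieces of this slide can be sketched as follows: fit the coefficients of the linear model $s_t = c_1 s_{t-1} + \cdots + c_n s_{t-n}$ by least squares on a Hankel matrix of past states, predict the current state, and turn the summed channel errors into a clamped standard deviation. This is a sketch under assumptions, not the authors' implementation: it treats one scalar state component at a time, and all function names are hypothetical.

```python
import numpy as np

def fit_dynamics(states, n):
    """Least-squares fit of c in s_t = c_1 s_{t-1} + ... + c_n s_{t-n}."""
    s = np.asarray(states, dtype=float)
    # Hankel matrix: row i is [s_i, ..., s_{i+n-1}] and predicts s_{i+n}.
    H = np.array([s[i:i + n] for i in range(len(s) - n)])
    w, *_ = np.linalg.lstsq(H, s[n:], rcond=None)
    return w[::-1]  # reverse so that c[0] multiplies s_{t-1}

def predict_state(states, c):
    """Apply the fitted model to the n most recent states."""
    s = np.asarray(states, dtype=float)
    recent = s[-1:-len(c) - 1:-1]  # s_{t-1}, s_{t-2}, ..., s_{t-n}
    return float(np.dot(c, recent))

def channel_error(y_predicted, y_tracked):
    """l2 distance between one channel's features at the two states."""
    diff = np.asarray(y_predicted, dtype=float) - np.asarray(y_tracked, dtype=float)
    return float(np.linalg.norm(diff))

def adaptive_sigma(channel_errors, sigma0, sigma_min, sigma_max):
    """sigma_{t+1} = max(min(sigma_0 * sqrt(e_t), sigma_max), sigma_min)."""
    e_t = float(sum(channel_errors))  # e_t sums the per-channel errors
    return np.clip(np.asarray(sigma0) * np.sqrt(e_t), sigma_min, sigma_max)

# A constant-velocity history satisfies s_t = 2 s_{t-1} - s_{t-2}.
history = [float(i) for i in range(10)]
c = fit_dynamics(history, n=2)
s_hat = predict_state(history, c)
```

When the predicted and tracked states agree, the channel errors are small and the variance shrinks toward `sigma_min`; a large disagreement widens the search up to `sigma_max`.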


Hypothesis generation – Particle filtering

$n = \dfrac{\chi^2_{k-1,\,1-\delta}}{2\epsilon}$

• Related approaches – fixed $n$ (400-600)
• Dynamic model + adaptive candidate filtering

D. Fox, KLD-Sampling: Adaptive Particle Filters, NIPS 2001.
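For intuition, the sample-size bound can be evaluated with the Python standard library alone. The sketch below approximates the chi-square quantile with the Wilson-Hilferty transformation, which is also the approximation used in the KLD-sampling paper; reading $k$ as the number of occupied histogram bins follows that paper, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def kld_sample_size(k, epsilon, delta):
    """Particles needed so the KL error stays below epsilon w.p. 1 - delta."""
    if k < 2:
        return 1  # with a single bin the chi-square term is undefined
    dof = k - 1
    z = NormalDist().inv_cdf(1.0 - delta)  # upper standard normal quantile
    a = 2.0 / (9.0 * dof)
    # Wilson-Hilferty approximation of the chi-square quantile.
    chi2_quantile = dof * (1.0 - a + math.sqrt(a) * z) ** 3
    return math.ceil(chi2_quantile / (2.0 * epsilon))

n_particles = kld_sample_size(k=11, epsilon=0.05, delta=0.05)
```

The bound grows with the number of occupied bins, so a concentrated particle set needs far fewer samples than a spread-out one.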


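Hypothesis testing

$\min_{x,e} \lVert x \rVert_1 + \lVert e \rVert_1 \quad \text{s.t.} \quad y = Ax + e$

$\min_{x,e} L(x, e, p)$

$x_{i+1} = \arg\min_x L(x, e_i, p_i)$ (FISTA)

$e_{i+1} = \arg\min_e L(x_{i+1}, e, p_i)$ (Analytic)

$p_{i+1} = p_i + k\,(y - A x_{i+1} - e_{i+1})$

Hypothesis
• Intensity
• Norm. gradient
• HOG

$y^j$: ℓ1 min in each channel gives $x^j, e^j$

Highest observation probability

$p(y_t \mid s_t) = \exp\left(-\sum_{j=1}^{3} \alpha^j \lVert A^j x^j - y_t^j \rVert_2^2\right)$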
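The alternating updates above can be sketched as below. The slide does not spell out $L$, so the code assumes the standard augmented Lagrangian $L(x,e,p) = \lVert x \rVert_1 + \lVert e \rVert_1 + p^T(y - Ax - e) + \tfrac{k}{2}\lVert y - Ax - e \rVert_2^2$; under that assumption the $x$ step is an ℓ1-regularized least-squares problem handled by a few FISTA iterations, and the $e$ step is a closed-form soft threshold. Function names, iteration counts, and the toy data are hypothetical.

```python
import math
import numpy as np

def soft(v, tau):
    """Soft-thresholding, the proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista_x_step(A, b, k, x0, iters):
    """Approximately solve min_x ||x||_1 + (k/2) ||A x - b||_2^2."""
    lip = k * np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient
    x, z, t = x0.copy(), x0.copy(), 1.0
    for _ in range(iters):
        grad = k * (A.T @ (A @ z - b))
        x_new = soft(z - grad / lip, 1.0 / lip)
        t_new = 0.5 * (1.0 + math.sqrt(1.0 + 4.0 * t * t))
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum step
        x, t = x_new, t_new
    return x

def l1_decompose(A, y, k=1.0, outer=100, inner=20):
    """Alternate the x, e, and p updates for y = A x + e."""
    m, n = A.shape
    x, e, p = np.zeros(n), np.zeros(m), np.zeros(m)
    for _ in range(outer):
        x = fista_x_step(A, y - e + p / k, k, x, inner)  # FISTA
        e = soft(y - A @ x + p / k, 1.0 / k)             # analytic
        p = p + k * (y - A @ x - e)                      # multiplier update
    return x, e

def observation_probability(channels, alphas):
    """exp(-sum_j alpha^j ||A^j x^j - y^j||_2^2) over (A, x, y) triples."""
    total = 0.0
    for (A, x, y), alpha in zip(channels, alphas):
        total += alpha * float(np.sum((A @ x - y) ** 2))
    return math.exp(-total)

# Toy channel: one flat template, observation with a single corrupted pixel.
A = np.ones((5, 1))
y = np.array([2.0, 2.0, 2.0, 2.0, 7.0])
x, e = l1_decompose(A, y)
```

In this toy case the corrupted pixel is absorbed by `e` while `x` recovers the template scale, which is the behaviour the sparse error term is meant to provide.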


Data

• Publicly available standard test sequences


Y. Wu et al., Online object tracking: a benchmark, CVPR 2013.


Evaluation metrics

• Success plot
  • Overlap precision vs. Overlap threshold

• Robustness tests
  • Temporal robustness test
  • Spatial robustness test
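A small sketch of how a success plot can be computed from tracked and ground-truth boxes. It assumes boxes given as `(x, y, width, height)`, bounding-box intersection over union as the overlap measure, and the area under the curve taken as the mean success rate over evenly spaced thresholds; these conventions and the function names are assumptions for the example.

```python
def overlap(box_a, box_b):
    """Intersection over union of two (x, y, width, height) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def success_curve(overlaps, thresholds):
    """Fraction of frames whose overlap exceeds each threshold."""
    return [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]

def area_under_curve(curve):
    """Mean success rate over evenly spaced thresholds."""
    return sum(curve) / len(curve)

# Three hypothetical frames: perfect, partial, and lost track.
tracked = [(0, 0, 2, 2), (1, 0, 2, 2), (5, 5, 2, 2)]
truth = [(0, 0, 2, 2), (0, 0, 2, 2), (0, 0, 2, 2)]
overlaps = [overlap(a, b) for a, b in zip(tracked, truth)]
thresholds = [i / 20 for i in range(21)]
curve = success_curve(overlaps, thresholds)
auc = area_under_curve(curve)
```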


Experimental Results – Overall Success Plot

• Overlap precision: ideally close to 1



Experimental Results – Robustness tests

• Temporal robustness evaluation

• Spatial robustness evaluation



Experimental Results

• Validating key components.

• Choice of features.

• Choice of transition model.

• Adaptive candidate filtering



Speed

Method   Speed (fps)   Template size   Average distance precision   Average AUC
Ours*    2.5           64 x 64         0.92                         0.69
L1*      8.2           12 x 15         0.47                         0.36
MTT*     0.4           32 x 32         0.60                         0.42
ONDL*    0.5           32 x 32         0.79                         0.59
SCM*     0.05          32 x 32         0.72                         0.59
ASLA*    0.7           32 x 32         0.73                         0.59
LSH      7             -               0.70                         0.57
LOT      0.2           -               0.53                         0.31
SPT      0.1           -               0.49                         0.29
MIL      8.5           -               0.56                         0.45
IVT      6.5           -               0.61                         0.46

* - based on sparse visual representation.

This material is based upon work supported by the U.S. Department of Homeland Security, Science and Technology Directorate, Office of University Programs, under Award 2013-ST-061-ED0001. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the U.S. Department of Homeland Security.

Conclusions
• Multi-Channel features
• Particle dynamical information
• Adaptive filtering

Thank you! Questions?
