Accurate Scale Estimation for Robust Visual Tracking

Martin Danelljan, Gustav Häger, Fahad Shahbaz Khan, Michael Felsberg

martin.danelljan@liu.se, hager.gustav@gmail.com, fahad.khan@liu.se, michael.felsberg@liu.se

Computer Vision Laboratory

Department of Electrical Engineering, Linköping University

Linköping, Sweden

Robust scale estimation is a challenging problem in visual object tracking. Most existing methods fail to handle large scale variations in complex image sequences. This paper presents a novel approach for robust scale estimation in a tracking-by-detection framework. The proposed approach works by learning discriminative correlation filters based on a scale pyramid representation. We learn separate filters for translation and scale estimation, and show that this improves the performance compared to an exhaustive scale search while operating in real time. Our scale estimation approach is generic, as it can be incorporated into any tracking method with no inherent scale estimation.

[Figure: sample frames from three sequences (frame numbers between #005 and #460), comparing Ours, ASLA, SCM, Struck and LSHT.]

Discriminative Correlation Filters. Our tracking approach is based on the discriminative correlation filters employed in the MOSSE tracker [1]. Similarly to [2], these filters are extended to multi-dimensional features for visual tracking. We use HOG features for the translation filter and concatenate them with image intensity features. In general, we consider a $d$-dimensional feature map representation of an image. Let $f$ be a rectangular patch of the target, extracted from this feature map. We denote feature dimension number $l \in \{1, \ldots, d\}$ of $f$ by $f^l$. The objective is to find an optimal correlation filter $h$, consisting of one filter $h^l$ per feature dimension. This is achieved by minimizing the cost function:

$$\varepsilon = \left\| \sum_{l=1}^{d} h^l \star f^l - g \right\|^2 + \lambda \sum_{l=1}^{d} \left\| h^l \right\|^2. \tag{1}$$

Here, $g$ is the desired correlation output associated with the training example $f$ and $\lambda \geq 0$ is a regularization parameter. The solution to (1) is:

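$$H^l = \frac{\overline{G} F^l}{\sum_{k=1}^{d} \overline{F^k} F^k + \lambda}. \tag{2}$$

Capital letters denote the discrete Fourier transforms (DFTs) of the corresponding functions, and the overline denotes complex conjugation. We update the numerator $A_t^l$ and denominator $B_t$ of the correlation filter $H_t^l$ in (2) separately using a learning rate $\eta$:

$$A_t^l = (1 - \eta) A_{t-1}^l + \eta \overline{G_t} F_t^l \quad \text{and} \quad B_t = (1 - \eta) B_{t-1} + \eta \sum_{k=1}^{d} \overline{F_t^k} F_t^k. \tag{3}$$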
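The running-average update in (3) can be sketched for a 1-dimensional signal with $d$ feature channels, which is the shape of the scale filter. This is only an illustrative sketch: the helper names (`dft`, `sample_terms`, `update`), the toy data and the learning rate value are assumptions, not taken from the paper, and a naive DFT is used to keep the example dependency-free where a real tracker would use an FFT library.

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform of a 1-D sequence."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def sample_terms(f, g):
    """Per-frame terms of (3): conj(G) * F^l for each channel l,
    and sum_k conj(F^k) * F^k (real and non-negative)."""
    n = len(g)
    G = dft(g)
    F = [dft(channel) for channel in f]
    num = [[G[i].conjugate() * Fl[i] for i in range(n)] for Fl in F]
    den = [sum(abs(Fk[i]) ** 2 for Fk in F) for i in range(n)]
    return num, den

def update(A, B, f, g, eta):
    """Blend a new training sample into numerator A and denominator B."""
    num, den = sample_terms(f, g)
    A = [[(1 - eta) * a + eta * v for a, v in zip(Al, nl)]
         for Al, nl in zip(A, num)]
    B = [(1 - eta) * b + eta * v for b, v in zip(B, den)]
    return A, B

# Toy data (assumed): d = 2 channels, 4 samples, desired output g peaked at 0.
f = [[1.0, 2.0, 0.0, -1.0], [0.5, 0.0, 1.0, 0.0]]
g = [1.0, 0.2, 0.0, 0.2]

A, B = sample_terms(f, g)              # first frame: initialise from one sample
A, B = update(A, B, f, g, eta=0.025)   # later frames; eta is an arbitrary value
```

Keeping the numerator and denominator separate means the regularization term $\lambda$ only enters when the filter is applied, so the update itself is a plain weighted average.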

The correlation scores y at a patch z in the next frame are computed using (4). The new target state is found by maximizing the score y.

$$y = \mathcal{F}^{-1} \left\{ \frac{\sum_{l=1}^{d} \overline{A_t^l} Z^l}{B_t + \lambda} \right\}. \tag{4}$$
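As a rough illustration of the detection step (4), the sketch below trains a filter on a 1-dimensional, two-channel toy signal and then applies it to a circularly shifted copy. The peak of the response should land at the shift. All names (`train`, `detect`), the toy data, the Gaussian-shaped $g$ and the value of $\lambda$ are assumptions made for the example, and the naive DFT stands in for an FFT.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * j * k / n) for k in range(n)) / n
            for j in range(n)]

def train(f, g):
    """Numerator A^l = conj(G) F^l and denominator B = sum_k |F^k|^2."""
    n = len(g)
    G = dft(g)
    F = [dft(channel) for channel in f]
    A = [[G[i].conjugate() * Fl[i] for i in range(n)] for Fl in F]
    B = [sum(abs(Fk[i]) ** 2 for Fk in F) for i in range(n)]
    return A, B

def detect(A, B, z, lam):
    """Correlation scores y of (4) for a test sample z."""
    n = len(B)
    Z = [dft(channel) for channel in z]
    Y = [sum(Al[i].conjugate() * Zl[i] for Al, Zl in zip(A, Z)) / (B[i] + lam)
         for i in range(n)]
    return [value.real for value in idft(Y)]

# Toy training sample (assumed): d = 2 channels, n = 8 positions.
f = [[0.0, 1.0, 3.0, 1.0, 0.0, -1.0, 0.5, 2.0],
     [1.0, 0.0, -2.0, 0.5, 1.5, 0.0, 0.0, 1.0]]
n = len(f[0])
# Desired output: circular Gaussian peaked at position 0 (assumed shape).
g = [math.exp(-0.5 * min(j, n - j) ** 2) for j in range(n)]

A, B = train(f, g)

# Test sample: the training sample shifted circularly by 3 positions.
shift = 3
z = [[channel[(j - shift) % n] for j in range(n)] for channel in f]

y = detect(A, B, z, lam=0.01)
peak = max(range(n), key=lambda i: y[i])   # new target state = argmax of y
```

Because the filter was trained to produce a peak at position 0 for the unshifted sample, the location of the maximum in `y` directly encodes the displacement of the target.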

Our Scale Estimation Approach. Ideally, an accurate scale estimation approach should be robust while computationally efficient. To achieve this, we propose a fast scale estimation approach by learning separate filters for translation and scale. This helps by restricting the search area to smaller parts of the scale space. In addition, we gain the freedom of selecting the feature representation for each filter independently.

[Figure 1, left panel: precision plot (distance precision vs. location error threshold, 0 to 50 pixels). Legend: Ours [0.745], Struck [0.659], ASLA [0.612], SCM [0.610], TLD [0.509], LSHT [0.508], EDFT [0.505], CSK [0.502], L1APG [0.472], LOT [0.467], DFT [0.441], CT [0.344].]

[Figure 1, right panel: success plot (overlap precision vs. overlap threshold, 0 to 1). Legend: Ours [0.549], ASLA [0.492], SCM [0.477], Struck [0.430], TLD [0.356], LSHT [0.354], EDFT [0.350], CSK [0.350], L1APG [0.350], LOT [0.339], DFT [0.329], CT [0.239].]

Figure 1: Precision and success plots illustrating the average distance and overlap precision respectively over all the 28 sequences. The average distance precision at 20 pixels for each method is reported in the legend of the precision plot. The legend of the success plot contains the area-under-the-curve (AUC) score for each tracker.

| Method | Median OP (%) | Median DP (%) | Median CLE (pixels) | Median FPS |
| --- | --- | --- | --- | --- |
| Baseline (no scale) | 37.8 | 74.5 | 15.9 | 44.1 |
| Exhaustive Scale Search (this paper) | 52.2 | 87.6 | 11.8 | 0.96 |
| Fast Scale Search (this paper) | 75.5 | 93.3 | 10.9 | 24.0 |

Table 1: Comparison of our fast scale estimation method with the baseline tracker and our exhaustive scale-space tracker.

We augment the baseline method by learning a separate 1-dimensional correlation filter to estimate the target scale in an image. The training example $f$ for updating the scale filter is computed by extracting features using variable patch sizes centred around the target. Let $P \times R$ denote the target size in the current frame and $S$ be the size of the scale filter. For each $n \in \left\{ \left\lfloor -\frac{S-1}{2} \right\rfloor, \ldots, \left\lfloor \frac{S-1}{2} \right\rfloor \right\}$, we extract an image patch $J_n$ of size $a^n P \times a^n R$ centred around the target. Here, $a$ denotes the scale factor between feature layers. The value $f(n)$ of the training example $f$ at scale level $n$ is set to a HOG-based $d$-dimensional feature descriptor of $J_n$. Eq. (3) is then used to update the scale filter $h_{\text{scale}}$ with the new sample $f$.
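The scale levels and patch sizes described above can be enumerated as in the following sketch. The function names and the numeric values of `P`, `R`, `S` and `a` are illustrative assumptions; the text here does not state which values the authors use.

```python
def scale_levels(S):
    """Levels n from floor(-(S-1)/2) to floor((S-1)/2), inclusive."""
    lowest = -(S - 1) // 2    # Python's // floors, also for negative numbers
    highest = (S - 1) // 2
    return list(range(lowest, highest + 1))

def patch_sizes(P, R, S, a):
    """Size a^n P x a^n R of the patch J_n for every scale level n."""
    return [(a ** n * P, a ** n * R) for n in scale_levels(S)]

# Assumed example values: a 60 x 40 target, 5 scale levels, 2 % step per level.
levels = scale_levels(5)
sizes = patch_sizes(60, 40, 5, 1.02)
for n, (width, height) in zip(levels, sizes):
    print(f"n = {n:+d}: {width:.1f} x {height:.1f}")
```

Each patch would then be resampled to a common size before the HOG descriptor is computed, so that the $S$ descriptors form the 1-dimensional training sample $f$. The resampling step is omitted here.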

In visual tracking scenarios, the scale difference between two frames is typically small compared to the translation. Therefore, we first apply the translation filter $h_{\text{trans}}$ given a new frame. Afterwards, the scale filter $h_{\text{scale}}$ is applied at the new target location. An example $z$ is extracted from this location using the same procedure as for $f$. By maximizing the correlation output (4) between $h_{\text{scale}}$ and $z$, we obtain the scale difference.
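The last step, turning the scale filter response into an updated target size, might look like the sketch below. It assumes the correlation scores over the scale levels have already been computed with (4). The function name, the score values and the scale factor are hypothetical and only serve to show the arithmetic.

```python
def update_target_size(P, R, levels, scores, a):
    """Pick the scale level with the highest score and rescale the target.

    levels: scale levels n, in the same order as scores
    scores: correlation output of the scale filter, one value per level
    a:      scale factor between neighbouring levels
    """
    best = max(range(len(scores)), key=lambda i: scores[i])
    factor = a ** levels[best]          # relative scale change between frames
    return P * factor, R * factor, levels[best]

# Hypothetical response peaking one level above the current scale.
levels = [-2, -1, 0, 1, 2]
scores = [0.1, 0.2, 0.5, 0.9, 0.3]
new_P, new_R, best_level = update_target_size(100.0, 50.0, levels, scores, a=1.02)
```

Since translation is estimated first, the scale search only has to cover this single 1-dimensional set of levels at the new location rather than a full joint search over position and scale.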

Evaluation. We employ all the 28 sequences annotated with the scale variation attribute in the recent evaluation of tracking methods [3]. The sequences also pose challenging problems such as illumination variation, motion blur, background clutter and occlusion. The baseline HOG-based tracker with no scale estimation capability is compared with our exhaustive scale-space tracker and the fast scale estimation method in Table 1.

We additionally compare our approach with 11 state-of-the-art trackers. Figure 1 contains the precision and success plots illustrating the mean distance and overlap precision over all the 28 sequences. In both plots, our approach outperforms the compared methods: it reaches a distance precision of 0.745 at 20 pixels, against 0.659 for the second-best tracker (Struck), and an AUC score of 0.549, against 0.492 for the second-best tracker (ASLA). In summary, the precision plot indicates that our approach is more robust than the existing trackers. Similarly, the success plot shows that our method estimates the target scale more accurately on the benchmark sequences.
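For readers unfamiliar with the reported measures, the sketch below shows one common way to compute centre location error (CLE), distance precision (DP) and overlap precision (OP) from predicted and ground-truth boxes. The `(x, y, width, height)` box format, the function names and the 0.5 overlap threshold are assumptions for illustration. Only the 20-pixel DP threshold is stated in the text.

```python
import math

def centre_error(box_a, box_b):
    """Euclidean distance in pixels between the centres of two boxes."""
    ax, ay = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx, by = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    return math.hypot(ax - bx, ay - by)

def overlap(box_a, box_b):
    """Intersection over union of two (x, y, width, height) boxes."""
    iw = min(box_a[0] + box_a[2], box_b[0] + box_b[2]) - max(box_a[0], box_b[0])
    ih = min(box_a[1] + box_a[3], box_b[1] + box_b[3]) - max(box_a[1], box_b[1])
    inter = max(iw, 0) * max(ih, 0)
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union

def distance_precision(pred, truth, threshold=20):
    """Fraction of frames whose centre error is within the threshold."""
    hits = sum(centre_error(p, t) <= threshold for p, t in zip(pred, truth))
    return hits / len(pred)

def overlap_precision(pred, truth, threshold=0.5):
    """Fraction of frames whose overlap exceeds the threshold (assumed 0.5)."""
    hits = sum(overlap(p, t) > threshold for p, t in zip(pred, truth))
    return hits / len(pred)

# Two toy frames: a perfect prediction and one offset by (3, 4) pixels.
pred = [(0, 0, 10, 10), (3, 4, 10, 10)]
truth = [(0, 0, 10, 10), (0, 0, 10, 10)]
errors = [centre_error(p, t) for p, t in zip(pred, truth)]
dp = distance_precision(pred, truth)
op = overlap_precision(pred, truth)
```

A precision plot sweeps the distance threshold and a success plot sweeps the overlap threshold, so each curve in Figure 1 corresponds to evaluating these functions over a range of `threshold` values.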

[1] D. S. Bolme, J. R. Beveridge, B. A. Draper, and Y. M. Lui. Visual object tracking using adaptive correlation filters. In CVPR, 2010.

[2] J. F. Henriques, R. Caseiro, P. Martins, and J. Batista. High-speed tracking with kernelized correlation filters. CoRR, abs/1404.7584, 2014.

[3] Y. Wu, J. Lim, and M.-H. Yang. Online object tracking: A benchmark. In CVPR, 2013.
