-
DAVE -- A Detect-and-Verify Paradigm for Low-Shot Counting
Authors:
Jer Pelhan,
Alan Lukežič,
Vitjan Zavrtanik,
Matej Kristan
Abstract:
Low-shot counters estimate the number of objects corresponding to a selected category, based on only a few or no exemplars annotated in the image. The current state-of-the-art estimates the total counts as the sum over the object location density map, but does not provide individual object locations and sizes, which are crucial for many applications. This is addressed by detection-based counters, which, however, fall behind in the total count accuracy. Furthermore, both approaches tend to overestimate the counts in the presence of other object classes due to many false positives. We propose DAVE, a low-shot counter based on a detect-and-verify paradigm, that avoids the aforementioned issues by first generating a high-recall detection set and then verifying the detections to identify and remove the outliers. This jointly increases the recall and precision, leading to accurate counts. DAVE outperforms the top density-based counters by ~20% in the total count MAE, outperforms the most recent detection-based counter by ~20% in detection quality, and sets a new state-of-the-art in zero-shot as well as text-prompt-based counting.
Submitted 25 April, 2024;
originally announced April 2024.
-
A New Dataset and a Distractor-Aware Architecture for Transparent Object Tracking
Authors:
Alan Lukezic,
Ziga Trojer,
Jiri Matas,
Matej Kristan
Abstract:
Performance of modern trackers degrades substantially on transparent objects compared to opaque objects. This is largely due to two distinct reasons. Transparent objects are unique in that their appearance is directly affected by the background. Furthermore, transparent object scenes often contain many visually similar objects (distractors), which often lead to tracking failure. However, development of modern tracking architectures requires large training sets, which do not exist in transparent object tracking. We present two contributions addressing the aforementioned issues. We propose the first transparent object tracking training dataset Trans2k that consists of over 2k sequences with 104,343 images overall, annotated by bounding boxes and segmentation masks. Standard trackers trained on this dataset consistently improve by up to 16%. Our second contribution is a new distractor-aware transparent object tracker (DiTra) that treats localization accuracy and target identification as separate tasks and implements them by a novel architecture. DiTra sets a new state-of-the-art in transparent object tracking and generalizes well to opaque objects.
Submitted 8 January, 2024;
originally announced January 2024.
-
Exact Algorithms and Lowerbounds for Multiagent Pathfinding: Power of Treelike Topology
Authors:
Foivos Fioravantes,
Dušan Knop,
Jan Matyáš Křišťan,
Nikolaos Melissinos,
Michal Opler
Abstract:
In the Multiagent Path Finding problem (MAPF for short), we focus on efficiently finding non-colliding paths for a set of $k$ agents on a given graph $G$, where each agent seeks a path from its source vertex to a target. An important measure of the quality of the solution is the length of the proposed schedule $\ell$, that is, the length of a longest path (including the waiting time). In this work, we propose a systematic study under the parameterized complexity framework. The hardness results we provide align with many heuristics used for this problem, whose running time could potentially be improved based on our fixed-parameter tractability results.
We show that MAPF is W[1]-hard with respect to $k$ (even if $k$ is combined with the maximum degree of the input graph). The problem remains NP-hard in planar graphs even if the maximum degree and the makespan $\ell$ are fixed constants. On the positive side, we show an FPT algorithm for $k+\ell$.
As we delve further, the structure of $G$ comes into play. We give an FPT algorithm for parameter $k$ plus the diameter of the graph $G$. The MAPF problem is W[1]-hard for cliquewidth of $G$ plus $\ell$ while it is FPT for treewidth of $G$ plus $\ell$.
Submitted 15 December, 2023;
originally announced December 2023.
-
The 2nd Workshop on Maritime Computer Vision (MaCVi) 2024
Authors:
Benjamin Kiefer,
Lojze Žust,
Matej Kristan,
Janez Perš,
Matija Teršek,
Arnold Wiliem,
Martin Messmer,
Cheng-Yen Yang,
Hsiang-Wei Huang,
Zhongyu Jiang,
Heng-Cheng Kuo,
Jie Mei,
Jenq-Neng Hwang,
Daniel Stadler,
Lars Sommer,
Kaer Huang,
Aiguo Zheng,
Weitu Chong,
Kanokphan Lertniphonphan,
Jun Xie,
Feng Chen,
Jian Li,
Zhepeng Wang,
Luca Zedda,
Andrea Loddo
, et al. (24 additional authors not shown)
Abstract:
The 2nd Workshop on Maritime Computer Vision (MaCVi) 2024 addresses maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV). Three challenge categories are considered: (i) UAV-based Maritime Object Tracking with Re-identification, (ii) USV-based Maritime Obstacle Segmentation and Detection, (iii) USV-based Maritime Boat Tracking. The USV-based Maritime Obstacle Segmentation and Detection features three sub-challenges, including a new embedded challenge addressing efficient inference on real-world embedded devices. This report offers a comprehensive overview of the findings from the challenges. We provide both statistical and qualitative analyses, evaluating trends from over 195 submissions. All datasets, evaluation code, and the leaderboard are available to the public at https://macvi.org/workshop/macvi24.
Submitted 23 November, 2023;
originally announced November 2023.
-
Cheating Depth: Enhancing 3D Surface Anomaly Detection via Depth Simulation
Authors:
Vitjan Zavrtanik,
Matej Kristan,
Danijel Skočaj
Abstract:
RGB-based surface anomaly detection methods have advanced significantly. However, certain surface anomalies remain practically invisible in RGB alone, necessitating the incorporation of 3D information. Existing approaches that employ point-cloud backbones suffer from suboptimal representations and reduced applicability due to slow processing. Re-training RGB backbones, designed for faster dense input processing, on industrial depth datasets is hindered by the limited availability of sufficiently large datasets. We make several contributions to address these challenges. (i) We propose a novel Depth-Aware Discrete Autoencoder (DADA) architecture, that enables learning a general discrete latent space that jointly models RGB and 3D data for 3D surface anomaly detection. (ii) We tackle the lack of diverse industrial depth datasets by introducing a simulation process for learning informative depth features in the depth encoder. (iii) We propose a new surface anomaly detection method 3DSR, which outperforms all existing state-of-the-art on the challenging MVTec3D anomaly detection benchmark, both in terms of accuracy and processing speed. The experimental results validate the effectiveness and efficiency of our approach, highlighting the potential of utilizing depth information for improved surface anomaly detection.
Submitted 2 November, 2023;
originally announced November 2023.
-
LaRS: A Diverse Panoptic Maritime Obstacle Detection Dataset and Benchmark
Authors:
Lojze Žust,
Janez Perš,
Matej Kristan
Abstract:
The progress in maritime obstacle detection is hindered by the lack of a diverse dataset that adequately captures the complexity of general maritime environments. We present the first maritime panoptic obstacle detection benchmark LaRS, featuring scenes from Lakes, Rivers and Seas. Our major contribution is the new dataset, which boasts the largest diversity in recording locations, scene types, obstacle classes, and acquisition conditions among the related datasets. LaRS is composed of over 4000 per-pixel labeled key frames with nine preceding frames to allow utilization of the temporal texture, amounting to over 40k frames. Each key frame is annotated with 8 thing, 3 stuff classes and 19 global scene attributes. We report the results of 27 semantic and panoptic segmentation methods, along with several performance insights and future research directions. To enable objective evaluation, we have implemented an online evaluation server. The LaRS dataset, evaluation toolkit and benchmark are publicly available at: https://lojzezust.github.io/lars-dataset
Submitted 18 August, 2023;
originally announced August 2023.
-
Shortest Dominating Set Reconfiguration under Token Sliding
Authors:
Jan Matyáš Křišťan,
Jakub Svoboda
Abstract:
In this paper, we present novel algorithms that efficiently compute a shortest reconfiguration sequence between two given dominating sets in trees and interval graphs under the Token Sliding model. In this problem, a graph is provided along with its two dominating sets, which can be imagined as tokens placed on vertices. The objective is to find a shortest sequence of dominating sets that transforms one set into the other, with each set in the sequence resulting from sliding a single token in the previous set. While identifying any sequence has been well studied, our work presents the first polynomial algorithms for this optimization variant in the context of dominating sets.
Submitted 20 July, 2023;
originally announced July 2023.
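The reconfiguration problem described in this abstract can be illustrated with a brute-force baseline. The sketch below is not the authors' polynomial algorithm for trees and interval graphs; it is a plain breadth-first search over all token placements, exponential in general, and it assumes that a token may only slide to an unoccupied neighbouring vertex. The function names `is_dominating` and `shortest_reconfiguration` and the adjacency-dict graph format are choices made here for illustration.

```python
from collections import deque


def is_dominating(adj, tokens):
    """Every vertex holds a token or has a neighbour that holds one."""
    return all(v in tokens or any(u in tokens for u in adj[v]) for v in adj)


def shortest_reconfiguration(adj, start, goal):
    """Brute-force BFS over dominating sets (illustrative, exponential time).

    Assumption: one move slides a single token along an edge to an
    unoccupied vertex, and every intermediate set must stay dominating.
    Returns the list of sets from start to goal, or None if unreachable.
    """
    start, goal = frozenset(start), frozenset(goal)
    if not (is_dominating(adj, start) and is_dominating(adj, goal)):
        raise ValueError("start and goal must both be dominating sets")
    parent = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            # Walk the parent links back to the start configuration.
            sequence = []
            while current is not None:
                sequence.append(set(current))
                current = parent[current]
            return sequence[::-1]
        for u in current:
            for v in adj[u]:
                if v in current:
                    continue  # target vertex already occupied
                nxt = (current - {u}) | {v}
                if nxt not in parent and is_dominating(adj, nxt):
                    parent[nxt] = current
                    queue.append(nxt)
    return None


# Path graph 0 - 1 - 2 - 3 - 4 with two tokens.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
sequence = shortest_reconfiguration(path_graph, {0, 3}, {1, 4})
```

On this five-vertex path the search needs two slides; sliding the token on vertex 3 first is rejected because vertex 2 would be left undominated.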
-
eWaSR -- an embedded-compute-ready maritime obstacle detection network
Authors:
Matija Teršek,
Lojze Žust,
Matej Kristan
Abstract:
Maritime obstacle detection is critical for safe navigation of autonomous surface vehicles (ASVs). While the accuracy of image-based detection methods has advanced substantially, their computational and memory requirements prohibit deployment on embedded devices. In this paper we analyze the currently best-performing maritime obstacle detection network WaSR. Based on the analysis we then propose replacements for the most computationally intensive stages and propose its embedded-compute-ready variant eWaSR. In particular, the new design follows the most recent advancements of transformer-based lightweight networks. eWaSR achieves comparable detection results to state-of-the-art WaSR with only 0.52% F1 score performance drop and outperforms other state-of-the-art embedded-ready architectures by over 9.74% in F1 score. On a standard GPU, eWaSR runs 10x faster than the original WaSR (115 FPS vs 11 FPS). Tests on a real embedded device OAK-D show that, while WaSR cannot run due to memory restrictions, eWaSR runs comfortably at 5.5 FPS. This makes eWaSR the first practical embedded-compute-ready maritime obstacle detection network. The source code and trained eWaSR models are publicly available here: https://github.com/tersekmatija/eWaSR.
Submitted 21 April, 2023;
originally announced April 2023.
-
Computing m-Eternal Domination Number of Cactus Graphs in Linear Time
Authors:
Václav Blažej,
Jan Matyáš Křišťan,
Tomáš Valla
Abstract:
In m-eternal domination, an attacker and a defender play on a graph. Initially, the defender places guards on vertices. In each round, the attacker chooses a vertex to attack. Then, the defender can move each guard to a neighboring vertex and must move a guard to the attacked vertex. The m-eternal domination number is the minimum number of guards such that the graph can be defended indefinitely. In this paper, we study the m-eternal domination number of cactus graphs. We consider two variants of the m-eternal domination number: one allows multiple guards to occupy a single vertex, the second variant requires the guards to occupy distinct vertices. We develop several tools for obtaining lower and upper bounds on these problems and we use them to obtain an algorithm which computes the minimum number of required guards of cactus graphs for both variants of the problem.
Submitted 12 January, 2023;
originally announced January 2023.
-
1st Workshop on Maritime Computer Vision (MaCVi) 2023: Challenge Results
Authors:
Benjamin Kiefer,
Matej Kristan,
Janez Perš,
Lojze Žust,
Fabio Poiesi,
Fabio Augusto de Alcantara Andrade,
Alexandre Bernardino,
Matthew Dawkins,
Jenni Raitoharju,
Yitong Quan,
Adem Atmaca,
Timon Höfer,
Qiming Zhang,
Yufei Xu,
Jing Zhang,
Dacheng Tao,
Lars Sommer,
Raphael Spraul,
Hangyue Zhao,
Hongpu Zhang,
Yanyun Zhao,
Jan Lukas Augustin,
Eui-ik Jeon,
Impyeong Lee,
Luca Zedda
, et al. (48 additional authors not shown)
Abstract:
The 1st Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection. The subchallenges were based on the SeaDronesSee and MODS benchmarks. This report summarizes the main findings of the individual subchallenges and introduces a new benchmark, called SeaDronesSee Object Detection v2, which extends the previous benchmark by including more classes and footage. We provide statistical and qualitative analyses, and assess trends in the best-performing methodologies of over 130 submissions. The methods are summarized in the appendix. The datasets, evaluation code and the leaderboard are publicly available at https://seadronessee.cs.uni-tuebingen.de/macvi.
Submitted 28 November, 2022; v1 submitted 24 November, 2022;
originally announced November 2022.
-
A Low-Shot Object Counting Network With Iterative Prototype Adaptation
Authors:
Nikola Djukic,
Alan Lukezic,
Vitjan Zavrtanik,
Matej Kristan
Abstract:
We consider low-shot counting of arbitrary semantic categories in the image using only a few annotated exemplars (few-shot) or no exemplars (no-shot). The standard few-shot pipeline follows extraction of appearance queries from exemplars and matching them with image features to infer the object counts. Existing methods extract queries by feature pooling, which neglects the shape information (e.g., size and aspect) and reduces both object localization accuracy and the quality of count estimates. We propose a Low-shot Object Counting network with iterative prototype Adaptation (LOCA). Our main contribution is the new object prototype extraction module, which iteratively fuses the exemplar shape and appearance information with image features. The module is easily adapted to zero-shot scenarios, enabling LOCA to cover the entire spectrum of low-shot counting problems. LOCA outperforms all recent state-of-the-art methods on the FSC147 benchmark by 20-30% in RMSE on one-shot and few-shot and achieves state-of-the-art on zero-shot scenarios, while demonstrating better generalization capabilities.
Submitted 28 September, 2023; v1 submitted 15 November, 2022;
originally announced November 2022.
-
Trans2k: Unlocking the Power of Deep Models for Transparent Object Tracking
Authors:
Alan Lukezic,
Ziga Trojer,
Jiri Matas,
Matej Kristan
Abstract:
Visual object tracking has focused predominantly on opaque objects, while transparent object tracking received very little attention. Motivated by the uniqueness of transparent objects in that their appearance is directly affected by the background, the first dedicated evaluation dataset has emerged recently. We contribute to this effort by proposing the first transparent object tracking training dataset Trans2k that consists of over 2k sequences with 104,343 images overall, annotated by bounding boxes and segmentation masks. Noting that transparent objects can be realistically rendered by modern renderers, we quantify domain-specific attributes and render the dataset containing visual attributes and tracking situations not covered in the existing object training datasets. We observe a consistent performance boost (up to 16%) across a diverse set of modern tracking architectures when trained using Trans2k, and show insights not previously possible due to the lack of appropriate training sets. The dataset and the rendering engine will be publicly released to unlock the power of modern learning-based trackers and foster new designs in transparent object tracking.
Submitted 7 October, 2022;
originally announced October 2022.
-
DSR -- A dual subspace re-projection network for surface anomaly detection
Authors:
Vitjan Zavrtanik,
Matej Kristan,
Danijel Skočaj
Abstract:
The state-of-the-art in discriminative unsupervised surface anomaly detection relies on external datasets for synthesizing anomaly-augmented training images. Such approaches are prone to failure on near-in-distribution anomalies since these are difficult to synthesize realistically due to their similarity to anomaly-free regions. We propose an architecture based on quantized feature space representation with dual decoders, DSR, that avoids the image-level anomaly synthesis requirement. Without making any assumptions about the visual properties of anomalies, DSR generates the anomalies at the feature level by sampling the learned quantized feature space, which allows a controlled generation of near-in-distribution anomalies. DSR achieves state-of-the-art results on the KSDD2 and MVTec anomaly detection datasets. The experiments on the challenging real-world KSDD2 dataset show that DSR significantly outperforms other unsupervised surface anomaly detection methods, improving the previous top-performing methods by 10% AP in anomaly detection and 35% AP in anomaly localization.
Submitted 23 November, 2022; v1 submitted 2 August, 2022;
originally announced August 2022.
-
Learning with Weak Annotations for Robust Maritime Obstacle Detection
Authors:
Lojze Žust,
Matej Kristan
Abstract:
Robust maritime obstacle detection is critical for safe navigation of autonomous boats and timely collision avoidance. The current state-of-the-art is based on deep segmentation networks trained on large datasets. However, per-pixel ground truth labeling of such datasets is labor-intensive and expensive. We propose a new scaffolding learning regime (SLR) that leverages weak annotations consisting of water edges, the horizon location, and obstacle bounding boxes to train segmentation-based obstacle detection networks, thereby reducing the required ground truth labeling effort by a factor of twenty. SLR trains an initial model from weak annotations and then alternates between re-estimating the segmentation pseudo-labels and improving the network parameters. Experiments show that maritime obstacle segmentation networks trained using SLR on weak annotations not only match but outperform the same networks trained with dense ground truth labels, which is a remarkable result. In addition to the increased accuracy, SLR also increases domain generalization and can be used for domain adaptation with a low manual annotation load. The SLR code and pre-trained models are available at https://github.com/lojzezust/SLR.
Submitted 25 November, 2022; v1 submitted 27 June, 2022;
originally announced June 2022.
-
Efficient attack sequences in m-eternal domination
Authors:
Václav Blažej,
Jan Matyáš Křišťan,
Tomáš Valla
Abstract:
We study the m-eternal domination problem from the perspective of the attacker. For many graph classes, the minimum required number of guards to defend eternally is known. By definition, if the defender has fewer than the required number of guards, then there exists a sequence of attacks that ensures the attacker's victory. Little is known about such sequences of attacks; in particular, no bound on their length is known.
We show that if the game is played on a tree $T$ on $n$ vertices and the defender has fewer than the necessary number of guards, then the attacker can win in at most $n$ turns. Furthermore, we present an efficient procedure that produces such an attacking strategy.
Submitted 6 April, 2022;
originally announced April 2022.
-
Temporal Context for Robust Maritime Obstacle Detection
Authors:
Lojze Žust,
Matej Kristan
Abstract:
Robust maritime obstacle detection is essential for fully autonomous unmanned surface vehicles (USVs). The currently widely adopted segmentation-based obstacle detection methods are prone to misclassification of object reflections and sun glitter as obstacles, producing many false positive detections, effectively rendering the methods impractical for USV navigation. However, water-turbulence-induced temporal appearance changes on object reflections are very distinct from the appearance dynamics of true objects. We harness this property to design WaSR-T, a novel maritime obstacle detection network that extracts the temporal context from a sequence of recent frames to reduce ambiguity. By learning the local temporal characteristics of object reflection on the water surface, WaSR-T substantially improves obstacle detection accuracy in the presence of reflections and glitter. Compared with existing single-frame methods, WaSR-T reduces the number of false positive detections by 41% overall and by over 53% within the danger zone of the boat, while preserving a high recall, and achieving new state-of-the-art performance on the challenging MODS maritime obstacle detection benchmark. The code, pretrained models and extended datasets are available at https://github.com/lojzezust/WaSR-T
Submitted 3 August, 2022; v1 submitted 10 March, 2022;
originally announced March 2022.
-
Polynomial Kernels for Tracking Shortest Paths
Authors:
Václav Blažej,
Pratibha Choudhary,
Dušan Knop,
Jan Matyáš Křišťan,
Ondřej Suchý,
Tomáš Valla
Abstract:
Given an undirected graph $G=(V,E)$, vertices $s,t\in V$, and an integer $k$, Tracking Shortest Paths requires deciding whether there exists a set of $k$ vertices $T\subseteq V$ such that for any two distinct shortest paths between $s$ and $t$, say $P_1$ and $P_2$, we have $T\cap V(P_1)\neq T\cap V(P_2)$. In this paper, we give the first polynomial size kernel for the problem. Specifically, we show the existence of a kernel with $\mathcal{O}(k^2)$ vertices and edges in general graphs and a kernel with $\mathcal{O}(k)$ vertices and edges in planar graphs for the Tracking Paths in DAG problem. This problem admits a polynomial parameter transformation to Tracking Shortest Paths, and this implies a kernel with $\mathcal{O}(k^4)$ vertices and edges for Tracking Shortest Paths in general graphs and a kernel with $\mathcal{O}(k^2)$ vertices and edges in planar graphs. Based on the above, we also give a single exponential algorithm for Tracking Shortest Paths in planar graphs.
Submitted 24 February, 2022;
originally announced February 2022.
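The tracking condition $T\cap V(P_1)\neq T\cap V(P_2)$ stated in this abstract can be checked directly on small graphs. The following is a brute-force sketch for illustration only; it is unrelated to the kernelization results of the paper and runs in exponential time. The helper names `shortest_paths`, `is_tracking_set` and `min_tracking_set`, as well as the adjacency-dict graph format, are assumptions introduced here.

```python
from collections import deque
from itertools import combinations


def shortest_paths(adj, s, t):
    """Enumerate all shortest s-t paths of an undirected graph."""
    dist = {s: 0}
    queue = deque([s])
    while queue:  # BFS distances from s
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    if t not in dist:
        return []
    paths = []

    def extend(path):
        u = path[-1]
        if u == t:
            paths.append(path)
            return
        for v in adj[u]:
            # Only step to the next BFS layer, never beyond the layer of t.
            if dist[v] == dist[u] + 1 and dist[v] <= dist[t]:
                extend(path + [v])

    extend([s])
    return paths


def is_tracking_set(adj, s, t, trackers):
    """True if distinct shortest s-t paths meet `trackers` in distinct subsets."""
    trackers = frozenset(trackers)
    seen = set()
    for path in shortest_paths(adj, s, t):
        signature = trackers & frozenset(path)
        if signature in seen:
            return False
        seen.add(signature)
    return True


def min_tracking_set(adj, s, t):
    """Smallest tracking set by exhaustive search over vertex subsets."""
    vertices = sorted(adj)
    for k in range(len(vertices) + 1):
        for candidate in combinations(vertices, k):
            if is_tracking_set(adj, s, t, candidate):
                return set(candidate)


# 4-cycle: two shortest paths from 0 to 3, via vertex 1 and via vertex 2.
cycle = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
smallest = min_tracking_set(cycle, 0, 3)
```

In the 4-cycle the endpoints lie on both shortest paths and therefore cannot distinguish them, while a single tracker on either middle vertex suffices.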
-
A Discriminative Single-Shot Segmentation Network for Visual Object Tracking
Authors:
Alan Lukežič,
Jiří Matas,
Matej Kristan
Abstract:
Template-based discriminative trackers are currently the dominant tracking paradigm due to their robustness, but are restricted to bounding box tracking and a limited range of transformation models, which reduces their localization accuracy. We propose a discriminative single-shot segmentation tracker -- D3S2, which narrows the gap between visual object tracking and video object segmentation. A single-shot network applies two target models with complementary geometric properties, one invariant to a broad range of transformations, including non-rigid deformations, the other assuming a rigid object to simultaneously achieve robust online target segmentation. The overall tracking reliability is further increased by decoupling the object and feature scale estimation. Without per-dataset finetuning, and trained only for segmentation as the primary output, D3S2 outperforms all published trackers on the recent short-term tracking benchmark VOT2020 and performs very close to the state-of-the-art trackers on the GOT-10k, TrackingNet, OTB100 and LaSoT. D3S2 outperforms the leading segmentation tracker SiamMask on video object segmentation benchmarks and performs on par with top video object segmentation algorithms.
Submitted 27 December, 2021; v1 submitted 22 December, 2021;
originally announced December 2021.
-
DRAEM -- A discriminatively trained reconstruction embedding for surface anomaly detection
Authors:
Vitjan Zavrtanik,
Matej Kristan,
Danijel Skočaj
Abstract:
Visual surface anomaly detection aims to detect local image regions that significantly deviate from normal appearance. Recent surface anomaly detection methods rely on generative models to accurately reconstruct the normal areas and to fail on anomalies. These methods are trained only on anomaly-free images, and often require hand-crafted post-processing steps to localize the anomalies, which prohibits optimizing the feature extraction for maximal detection capability. In addition to the reconstructive approach, we cast surface anomaly detection primarily as a discriminative problem and propose a discriminatively trained reconstruction anomaly embedding model (DRAEM). The proposed method learns a joint representation of an anomalous image and its anomaly-free reconstruction, while simultaneously learning a decision boundary between normal and anomalous examples. The method enables direct anomaly localization without the need for additional complicated post-processing of the network output and can be trained using simple and general anomaly simulations. On the challenging MVTec anomaly detection dataset, DRAEM outperforms the current state-of-the-art unsupervised methods by a large margin and even delivers detection performance close to the fully-supervised methods on the widely used DAGM surface-defect detection dataset, while substantially outperforming them in localization accuracy.
Submitted 27 September, 2021; v1 submitted 17 August, 2021;
originally announced August 2021.
-
Constant Factor Approximation for Tracking Paths and Fault Tolerant Feedback Vertex Set
Authors:
Václav Blažej,
Pratibha Choudhary,
Dušan Knop,
Jan Matyáš Křišťan,
Ondřej Suchý,
Tomáš Valla
Abstract:
Consider a vertex-weighted graph $G$ with a source $s$ and a target $t$. Tracking Paths requires finding a minimum weight set of vertices (trackers) such that the sequence of trackers in each path from $s$ to $t$ is unique. In this work, we derive a factor $6$-approximation algorithm for Tracking Paths in weighted graphs and a factor $4$-approximation algorithm if the input is unweighted. This is the first constant factor approximation for this problem. While doing so, we also study approximation of the closely related $r$-Fault Tolerant Feedback Vertex Set problem. There, for a fixed integer $r$ and a given vertex-weighted graph $G$, the task is to find a minimum weight set of vertices intersecting every cycle of $G$ in at least $r+1$ vertices. We give a factor $\mathcal{O}(r)$ approximation algorithm for $r$-Fault Tolerant Feedback Vertex Set if $r$ is a constant.
Submitted 24 February, 2022; v1 submitted 3 August, 2021;
originally announced August 2021.
-
Learning Maritime Obstacle Detection from Weak Annotations by Scaffolding
Authors:
Lojze Žust,
Matej Kristan
Abstract:
Coastal water autonomous boats rely on robust perception methods for obstacle detection and timely collision avoidance. The current state-of-the-art is based on deep segmentation networks trained on large datasets. Per-pixel ground truth labeling of such datasets, however, is labor-intensive and expensive. We observe that far less information is required for practical obstacle avoidance - the location of water edge on static obstacles like shore and approximate location and bounds of dynamic obstacles in the water is sufficient to plan a reaction. We propose a new scaffolding learning regime (SLR) that allows training obstacle detection segmentation networks only from such weak annotations, thus significantly reducing the cost of ground-truth labeling. Experiments show that maritime obstacle segmentation networks trained using SLR substantially outperform the same networks trained with dense ground truth labels. Thus accuracy is not sacrificed for labelling simplicity but is in fact improved, which is a remarkable result.
Submitted 1 August, 2021;
originally announced August 2021.
-
MODS -- A USV-oriented object detection and obstacle segmentation benchmark
Authors:
Borja Bovcon,
Jon Muhovič,
Duško Vranac,
Dean Mozetič,
Janez Perš,
Matej Kristan
Abstract:
Small-sized unmanned surface vehicles (USV) are coastal water devices with a broad range of applications such as environmental control and surveillance. A crucial capability for autonomous operation is obstacle detection for timely reaction and collision avoidance, which has been recently explored in the context of camera-based visual scene interpretation. Owing to curated datasets, substantial advances in scene interpretation have been made in a related field of unmanned ground vehicles. However, the current maritime datasets do not adequately capture the complexity of real-world USV scenes and the evaluation protocols are not standardised, which makes cross-paper comparison of different methods difficult and hinders the progress. To address these issues, we introduce a new obstacle detection benchmark MODS, which considers two major perception tasks: maritime object detection and the more general maritime obstacle segmentation. We present a new diverse maritime evaluation dataset containing approximately 81k stereo images synchronized with an on-board IMU, with over 60k objects annotated. We propose a new obstacle segmentation performance evaluation protocol that reflects the detection accuracy in a way meaningful for practical USV navigation. Nineteen recent state-of-the-art object detection and obstacle segmentation methods are evaluated using the proposed protocol, creating a benchmark to facilitate development of the field. The proposed dataset, as well as evaluation routines, are made publicly available at vicos.si/resources.
Submitted 9 February, 2022; v1 submitted 5 May, 2021;
originally announced May 2021.
-
A water-obstacle separation and refinement network for unmanned surface vehicles
Authors:
Borja Bovcon,
Matej Kristan
Abstract:
Obstacle detection by semantic segmentation shows great promise for autonomous navigation in unmanned surface vehicles (USV). However, existing methods suffer from poor estimation of the water edge in the presence of visual ambiguities, poor detection of small obstacles and a high false-positive rate on water reflections and wakes. We propose a new deep encoder-decoder architecture, a water-obstacle separation and refinement network (WaSR), to address these issues. Detection and water edge accuracy are improved by a novel decoder that gradually fuses inertial information from IMU with the visual features from the encoder. In addition, a novel loss function is designed to increase the separation between water and obstacle features early on in the network. Subsequently, the capacity of the remaining layers in the decoder is better utilised, leading to a significant reduction in false positives and increased true positives. Experimental results show that WaSR outperforms the current state-of-the-art by a large margin, yielding a 14% increase in F-measure over the second-best method.
Submitted 7 January, 2020;
originally announced January 2020.
-
DAL -- A Deep Depth-aware Long-term Tracker
Authors:
Yanlin Qian,
Alan Lukežič,
Matej Kristan,
Joni-Kristian Kämäräinen,
Jiri Matas
Abstract:
The best RGBD trackers provide high accuracy but are slow to run. On the other hand, the best RGB trackers are fast but clearly inferior on the RGBD datasets. In this work, we propose a deep depth-aware long-term tracker that achieves state-of-the-art RGBD tracking performance and is fast to run. We reformulate the deep discriminative correlation filter (DCF) to embed the depth information into deep features. Moreover, the same depth-aware correlation filter is used for target re-detection. Comprehensive evaluations show that the proposed tracker achieves state-of-the-art performance on the Princeton RGBD, STC, and the newly-released CDTB benchmarks and runs at 20 fps.
Submitted 2 December, 2019;
originally announced December 2019.
-
D3S -- A Discriminative Single Shot Segmentation Tracker
Authors:
Alan Lukežič,
Jiří Matas,
Matej Kristan
Abstract:
Template-based discriminative trackers are currently the dominant tracking paradigm due to their robustness, but are restricted to bounding box tracking and a limited range of transformation models, which reduces their localization accuracy. We propose a discriminative single-shot segmentation tracker - D3S, which narrows the gap between visual object tracking and video object segmentation. A single-shot network applies two target models with complementary geometric properties, one invariant to a broad range of transformations, including non-rigid deformations, the other assuming a rigid object to simultaneously achieve high robustness and online target segmentation. Without per-dataset finetuning and trained only for segmentation as the primary output, D3S outperforms all trackers on VOT2016, VOT2018 and GOT-10k benchmarks and performs close to the state-of-the-art trackers on the TrackingNet. D3S outperforms the leading segmentation tracker SiamMask on video object segmentation benchmark and performs on par with top video object segmentation algorithms, while running an order of magnitude faster, close to real-time.
Submitted 14 April, 2020; v1 submitted 20 November, 2019;
originally announced November 2019.
-
On the m-eternal Domination Number of Cactus Graphs
Authors:
Václav Blažej,
Jan Matyáš Křišťan,
Tomáš Valla
Abstract:
Given a graph $G$, guards are placed on vertices of $G$. Then vertices are subject to an infinite sequence of attacks so that each attack must be defended by a guard moving from a neighboring vertex. The m-eternal domination number is the minimum number of guards such that the graph can be defended indefinitely. In this paper we study the m-eternal domination number of cactus graphs, that is, connected graphs where each edge lies in at most one cycle, and we consider three variants of the m-eternal domination number: the first variant allows multiple guards to occupy a single vertex, the second variant does not allow it, and in the third variant additional "eviction" attacks must be defended. We provide a new upper bound for the m-eternal domination number of cactus graphs, and for a subclass of cactus graphs called Christmas cactus graphs, where each vertex lies in at most two cycles, we prove that these three numbers are equal. Moreover, we present a linear-time algorithm for computing them.
Submitted 18 July, 2019;
originally announced July 2019.
-
CDTB: A Color and Depth Visual Object Tracking Dataset and Benchmark
Authors:
Alan Lukežič,
Ugur Kart,
Jani Käpylä,
Ahmed Durmush,
Joni-Kristian Kämäräinen,
Jiří Matas,
Matej Kristan
Abstract:
A long-term visual object tracking performance evaluation methodology and a benchmark are proposed. Performance measures are designed by following a long-term tracking definition to maximize the analysis probing strength. The new measures outperform existing ones in interpretation potential and in better distinguishing between different tracking behaviors. We show that these measures generalize the short-term performance measures, thus linking the two tracking problems. Furthermore, the new measures are highly robust to temporal annotation sparsity and allow annotation of sequences hundreds of times longer than in the current datasets without increasing manual annotation labor. A new challenging dataset of carefully selected sequences with many target disappearances is proposed. A new tracking taxonomy is proposed to position trackers on the short-term/long-term spectrum. The benchmark contains an extensive evaluation of the largest number of long-term trackers and a comparison to state-of-the-art short-term trackers. We analyze the influence of tracking architecture implementations on long-term performance and explore various re-detection strategies as well as the influence of visual model update strategies on long-term tracking drift. The methodology is integrated in the VOT toolkit to automate experimental analysis and benchmarking and to facilitate future development of long-term trackers.
Submitted 1 July, 2019;
originally announced July 2019.
-
Performance Evaluation Methodology for Long-Term Visual Object Tracking
Authors:
Alan Lukežič,
Luka Čehovin Zajc,
Tomáš Vojíř,
Jiří Matas,
Matej Kristan
Abstract:
A long-term visual object tracking performance evaluation methodology and a benchmark are proposed. Performance measures are designed by following a long-term tracking definition to maximize the analysis probing strength. The new measures outperform existing ones in interpretation potential and in better distinguishing between different tracking behaviors. We show that these measures generalize the short-term performance measures, thus linking the two tracking problems. Furthermore, the new measures are highly robust to temporal annotation sparsity and allow annotation of sequences hundreds of times longer than in the current datasets without increasing manual annotation labor. A new challenging dataset of carefully selected sequences with many target disappearances is proposed. A new tracking taxonomy is proposed to position trackers on the short-term/long-term spectrum. The benchmark contains an extensive evaluation of the largest number of long-term trackers and a comparison to state-of-the-art short-term trackers. We analyze the influence of tracking architecture implementations on long-term performance and explore various re-detection strategies as well as the influence of visual model update strategies on long-term tracking drift. The methodology is integrated in the VOT toolkit to automate experimental analysis and benchmarking and to facilitate future development of long-term trackers.
Submitted 19 June, 2019;
originally announced June 2019.
-
Spatially-Adaptive Filter Units for Compact and Efficient Deep Neural Networks
Authors:
Domen Tabernik,
Matej Kristan,
Aleš Leonardis
Abstract:
Convolutional neural networks excel in a number of computer vision tasks. One of their most crucial architectural elements is the effective receptive field size, which has to be manually set to accommodate a specific task. Standard solutions involve large kernels, down/up-sampling and dilated convolutions. These require testing a variety of dilation and down/up-sampling factors and result in non-compact representations and an excessive number of parameters. We address this issue by proposing a new convolution filter composed of displaced aggregation units (DAU). DAUs learn spatial displacements and adapt the receptive field sizes of individual convolution filters to a given problem, thus eliminating the need for hand-crafted modifications. DAUs provide a seamless substitution of convolutional filters in existing state-of-the-art architectures, which we demonstrate on AlexNet, ResNet50, ResNet101, DeepLab and SRN-DeblurNet. The benefits of this design are demonstrated on a variety of computer vision tasks and datasets, such as image classification (ILSVRC 2012), semantic segmentation (PASCAL VOC 2011, Cityscape) and blind image de-blurring (GOPRO). Results show that DAUs efficiently allocate parameters, resulting in up to four times more compact networks at similar or better performance.
Submitted 6 February, 2020; v1 submitted 20 February, 2019;
originally announced February 2019.
-
Object Tracking by Reconstruction with View-Specific Discriminative Correlation Filters
Authors:
Ugur Kart,
Alan Lukezic,
Matej Kristan,
Joni-Kristian Kamarainen,
Jiri Matas
Abstract:
Standard RGB-D trackers treat the target as an inherently 2D structure, which makes modelling appearance changes related even to simple out-of-plane rotation highly challenging. We address this limitation by proposing a novel long-term RGB-D tracker - Object Tracking by Reconstruction (OTR). The tracker performs online 3D target reconstruction to facilitate robust learning of a set of view-specific discriminative correlation filters (DCFs). The 3D reconstruction supports two performance-enhancing features: (i) generation of accurate spatial support for constrained DCF learning from its 2D projection and (ii) point cloud based estimation of 3D pose change for selection and storage of view-specific DCFs which are used to robustly localize the target after out-of-view rotation or heavy occlusion. Extensive evaluation of OTR on the challenging Princeton RGB-D tracking and STC Benchmarks shows it outperforms the state-of-the-art by a large margin.
Submitted 27 November, 2018;
originally announced November 2018.
-
Now you see me: evaluating performance in long-term visual tracking
Authors:
Alan Lukežič,
Luka Čehovin Zajc,
Tomáš Vojíř,
Jiří Matas,
Matej Kristan
Abstract:
We propose a new long-term tracking performance evaluation methodology and present a new challenging dataset of carefully selected sequences with many target disappearances. We perform an extensive evaluation of six long-term and nine short-term state-of-the-art trackers, using new performance measures, suitable for evaluating long-term tracking - tracking precision, recall and F-score. The evaluation shows that a good model update strategy and the capability of image-wide re-detection are critical for long-term tracking performance. We integrated the methodology in the VOT toolkit to automate experimental analysis and benchmarking and to facilitate the development of long-term trackers.
Submitted 19 April, 2018;
originally announced April 2018.
-
Stereo obstacle detection for unmanned surface vehicles by IMU-assisted semantic segmentation
Authors:
Borja Bovcon,
Rok Mandeljc,
Janez Perš,
Matej Kristan
Abstract:
A new obstacle detection algorithm for unmanned surface vehicles (USVs) is presented. A state-of-the-art graphical model for semantic segmentation is extended to incorporate boat pitch and roll measurements from the on-board inertial measurement unit (IMU), and a stereo verification algorithm that consolidates tentative detections obtained from the segmentation is proposed. The IMU readings are used to estimate the location of the horizon line in the image, which automatically adjusts the priors in the probabilistic semantic segmentation model. We derive the equations for projecting the horizon into images, propose an efficient optimization algorithm for the extended graphical model, and offer a practical IMU-camera-USV calibration procedure. Using a USV equipped with multiple synchronized sensors, we captured a new challenging multi-modal dataset, and annotated its images with water edge and obstacles. Experimental results show that the proposed algorithm significantly outperforms the state of the art, with nearly 30% improvement in water-edge detection accuracy, an over 21% reduction of false positive rate, an almost 60% reduction of false negative rate, and an over 65% increase of true positive rate, while its Matlab implementation runs in real-time.
Submitted 22 February, 2018;
originally announced February 2018.
-
Spatially-Adaptive Filter Units for Deep Neural Networks
Authors:
Domen Tabernik,
Matej Kristan,
Aleš Leonardis
Abstract:
Classical deep convolutional networks increase receptive field size by either gradual resolution reduction or application of hand-crafted dilated convolutions to prevent an increase in the number of parameters. In this paper we propose a novel displaced aggregation unit (DAU) that does not require hand-crafting. In contrast to classical filters with units (pixels) placed on a fixed regular grid, the displacements of the DAUs are learned, which enables filters to spatially adapt their receptive field to a given problem. We extensively demonstrate the strength of DAUs on classification and semantic segmentation tasks. Compared to ConvNets with regular filters, ConvNets with DAUs achieve comparable performance at faster convergence and up to a 3-times reduction in parameters. Furthermore, DAUs allow us to study deep networks from novel perspectives. We study spatial distributions of DAU filters and analyze the number of parameters allocated for spatial coverage in a filter.
Submitted 15 March, 2018; v1 submitted 30 November, 2017;
originally announced November 2017.
-
FuCoLoT -- A Fully-Correlational Long-Term Tracker
Authors:
Alan Lukežič,
Luka Čehovin Zajc,
Tomáš Vojíř,
Jiří Matas,
Matej Kristan
Abstract:
We propose FuCoLoT -- a Fully Correlational Long-term Tracker. It exploits the novel DCF constrained filter learning method to design a detector that is able to re-detect the target in the whole image efficiently. FuCoLoT maintains several correlation filters trained on different time scales that act as the detector components. A novel mechanism based on the correlation response is used for tracking failure estimation. FuCoLoT achieves state-of-the-art results on standard short-term benchmarks and it outperforms the current best-performing tracker on the long-term UAV20L benchmark by over 19%. It has an order of magnitude smaller memory footprint than its best-performing competitors and runs at 15fps in a single CPU thread.
Submitted 14 January, 2019; v1 submitted 27 November, 2017;
originally announced November 2017.
-
Beyond standard benchmarks: Parameterizing performance evaluation in visual object tracking
Authors:
Luka Čehovin Zajc,
Alan Lukežič,
Aleš Leonardis,
Matej Kristan
Abstract:
Object-to-camera motion produces a variety of apparent motion patterns that significantly affect performance of short-term visual trackers. Despite being crucial for designing robust trackers, their influence is poorly explored in standard benchmarks due to weakly defined, biased and overlapping attribute annotations. In this paper we propose to go beyond pre-recorded benchmarks with post-hoc annotations by presenting an approach that utilizes omnidirectional videos to generate realistic, consistently annotated, short-term tracking scenarios with exactly parameterized motion patterns. We have created an evaluation system, constructed a fully annotated dataset of omnidirectional videos and the generators for typical motion patterns. We provide an in-depth analysis of major tracking paradigms which is complementary to the standard benchmarks and confirms the expressiveness of our evaluation approach.
Submitted 25 March, 2017; v1 submitted 30 November, 2016;
originally announced December 2016.
-
Discriminative Correlation Filter with Channel and Spatial Reliability
Authors:
Alan Lukežič,
Tomáš Vojíř,
Luka Čehovin,
Jiří Matas,
Matej Kristan
Abstract:
Short-term tracking is an open and challenging problem for which discriminative correlation filters (DCF) have shown excellent performance. We introduce the channel and spatial reliability concepts to DCF tracking and provide a novel learning algorithm for their efficient and seamless integration in the filter update and the tracking process. The spatial reliability map adjusts the filter support to the part of the object suitable for tracking. This both allows the search region to be enlarged and improves tracking of non-rectangular objects. Reliability scores reflect channel-wise quality of the learned filters and are used as feature weighting coefficients in localization. Experimentally, with only two simple standard features, HoGs and Colornames, the novel CSR-DCF method -- DCF with Channel and Spatial Reliability -- achieves state-of-the-art results on VOT 2016, VOT 2015 and OTB100. The CSR-DCF runs in real-time on a CPU.
Submitted 14 January, 2019; v1 submitted 25 November, 2016;
originally announced November 2016.
-
Towards Deep Compositional Networks
Authors:
Domen Tabernik,
Matej Kristan,
Jeremy L. Wyatt,
Aleš Leonardis
Abstract:
Hierarchical feature learning based on convolutional neural networks (CNN) has recently shown significant potential in various computer vision tasks. While allowing high-quality discriminative feature learning, the downside of CNNs is the lack of explicit structure in features, which often leads to overfitting, absence of reconstruction from partial observations and limited generative abilities. Explicit structure is inherent in hierarchical compositional models; however, these lack the ability to optimize a well-defined cost function. We propose a novel analytic model of a basic unit in a layered hierarchical model with both explicit compositional structure and a well-defined discriminative cost function. Our experiments on two datasets show that the proposed compositional model performs on a par with standard CNNs on discriminative tasks, while, due to explicit modeling of the structure in the feature units, affording a straightforward visualization of parts and faster inference due to separability of the units.
Submitted 13 September, 2016;
originally announced September 2016.
-
Deformable Parts Correlation Filters for Robust Visual Tracking
Authors:
Alan Lukežič,
Luka Čehovin,
Matej Kristan
Abstract:
Deformable parts models show great potential in tracking by principally addressing non-rigid object deformations and self occlusions, but according to recent benchmarks, they often lag behind the holistic approaches. The reason is that a potentially large number of degrees of freedom have to be estimated for object localization and simplifications of the constellation topology are often assumed to make the inference tractable. We present a new formulation of the constellation model with correlation filters that treats the geometric and visual constraints within a single convex cost function and derive a highly efficient optimization for MAP inference of a fully-connected constellation. We propose a tracker that models the object at two levels of detail. The coarse level corresponds to a root correlation filter and a novel color model for approximate object localization, while the mid-level representation is composed of the new deformable constellation of correlation filters that refine the object location. The resulting tracker is rigorously analyzed on the highly challenging OTB, VOT2014 and VOT2015 benchmarks, exhibits state-of-the-art performance and runs in real-time.
Submitted 12 May, 2016;
originally announced May 2016.
-
A regularization-based approach for unsupervised image segmentation
Authors:
Aleksandar Dimitriev,
Matej Kristan
Abstract:
We propose a novel unsupervised image segmentation algorithm, which aims to segment an image into several coherent parts. It requires no user input, no supervised learning phase and assumes an unknown number of segments. It achieves this by first over-segmenting the image into several hundred superpixels. These are iteratively joined on the basis of a discriminative classifier trained on color and texture information obtained from each superpixel. The output of the classifier is regularized by a Markov random field that lends more influence to neighbouring superpixels that are more similar. In each iteration, similar superpixels fall under the same label, until only a few coherent regions remain in the image. The algorithm was tested on a standard evaluation data set, where it performs on par with state-of-the-art algorithms in terms of precision and greatly outperforms the state of the art by reducing the oversegmentation of the object of interest.
Submitted 8 March, 2016;
originally announced March 2016.
-
Fast image-based obstacle detection from unmanned surface vehicles
Authors:
Matej Kristan,
Vildana Sulic,
Stanislav Kovacic,
Janez Pers
Abstract:
Obstacle detection plays an important role in unmanned surface vehicles (USVs). USVs operate in highly diverse environments in which an obstacle may be a floating piece of wood, a scuba diver, a pier, or a part of a shoreline, which presents a significant challenge to continuous detection from images taken onboard. This paper addresses the problem of online detection by constrained unsupervised segmentation. To this end, a new graphical model is proposed that affords a fast and continuous obstacle image-map estimation from a single video stream captured onboard a USV. The model accounts for the semantic structure of the marine environment as observed from a USV by imposing weak structural constraints. A Markov random field framework is adopted and a highly efficient algorithm for simultaneous optimization of model parameters and segmentation mask estimation is derived. Our approach does not require computationally intensive extraction of texture features and comfortably runs in real time. The algorithm is tested on a new, challenging dataset for segmentation and obstacle detection in marine environments, which is the largest annotated dataset of its kind. Results on this dataset show that our model outperforms the related approaches, while requiring a fraction of the computational effort.
Submitted 6 March, 2015;
originally announced March 2015.
-
A Novel Performance Evaluation Methodology for Single-Target Trackers
Authors:
Matej Kristan,
Jiri Matas,
Ales Leonardis,
Tomas Vojir,
Roman Pflugfelder,
Gustavo Fernandez,
Georg Nebehay,
Fatih Porikli,
Luka Cehovin
Abstract:
This paper addresses the problem of single-target tracker performance evaluation. We consider the performance measures, the dataset and the evaluation system to be the most important components of tracker evaluation and propose requirements for each of them. The requirements are the basis of a new evaluation methodology that aims at a simple and easily interpretable tracker comparison. The ranking-based methodology addresses tracker equivalence in terms of statistical significance and practical differences. A fully-annotated dataset with per-frame annotations with several visual attributes is introduced. The diversity of its visual properties is maximized in a novel way by clustering a large number of videos according to their visual attributes. This makes it the most sophistically constructed and annotated dataset to date. A multi-platform evaluation system allowing easy integration of third-party trackers is presented as well. The proposed evaluation methodology was tested on the VOT2014 challenge on the new dataset and 38 trackers, making it the largest benchmark to date. Most of the tested trackers are indeed state-of-the-art since they outperform the standard baselines, resulting in a highly challenging benchmark. An exhaustive analysis of the dataset from the perspective of tracking difficulty is carried out. To facilitate tracker comparison, a new performance visualization technique is proposed.
Submitted 8 January, 2016; v1 submitted 4 March, 2015;
originally announced March 2015.
-
Visual object tracking performance measures revisited
Authors:
Luka Čehovin,
Aleš Leonardis,
Matej Kristan
Abstract:
The problem of visual tracking evaluation is sporting a large variety of performance measures, and largely suffers from a lack of consensus about which measures should be used in experiments. This makes cross-paper tracker comparison difficult. Furthermore, as some measures may be less effective than others, the tracking results may be skewed or biased towards particular tracking aspects. In this paper, we revisit the popular performance measures and tracker performance visualizations and analyze them theoretically and experimentally. We show that several measures are equivalent from the point of view of the information they provide for tracker comparison and, crucially, that some are more brittle than others. Based on our analysis, we narrow down the set of potential measures to only two complementary ones, describing accuracy and robustness, thus pushing towards homogenization of the tracker evaluation methodology. These two measures can be intuitively interpreted and visualized and have been employed by the recent Visual Object Tracking (VOT) challenges as the foundation for the evaluation methodology.
Submitted 7 March, 2016; v1 submitted 20 February, 2015;
originally announced February 2015.