-
Stable Yaw Estimation of Boats from the Viewpoint of UAVs and USVs
Authors:
Benjamin Kiefer,
Timon Höfer,
Andreas Zell
Abstract:
Yaw estimation of boats from the viewpoint of unmanned aerial vehicles (UAVs) and unmanned surface vehicles (USVs) or boats is a crucial task in various applications such as 3D scene rendering, trajectory prediction, and navigation. However, the lack of literature on yaw estimation of objects from the viewpoint of UAVs has motivated us to address this domain. In this paper, we propose a method bas…
▽ More
Yaw estimation of boats from the viewpoint of unmanned aerial vehicles (UAVs) and unmanned surface vehicles (USVs) or boats is a crucial task in various applications such as 3D scene rendering, trajectory prediction, and navigation. However, the lack of literature on yaw estimation of objects from the viewpoint of UAVs has motivated us to address this domain. In this paper, we propose a method based on HyperPosePDF for predicting the orientation of boats in the 6D space. For that, we use existing datasets, such as PASCAL3D+ and our own datasets, SeaDronesSee-3D and BOArienT, which we annotated manually. We extend HyperPosePDF to work in video-based scenarios, such that it yields robust orientation predictions across time. Naively applying HyperPosePDF on video data yields single-point predictions, resulting in far-off predictions and often incorrect symmetric orientations due to unseen or visually different data. To alleviate this issue, we propose aggregating the probability distributions of pose predictions, resulting in significantly improved performance, as shown in our experimental evaluation. Our proposed method could significantly benefit downstream tasks in marine robotics.
△ Less
Submitted 24 June, 2023;
originally announced June 2023.
-
1st Workshop on Maritime Computer Vision (MaCVi) 2023: Challenge Results
Authors:
Benjamin Kiefer,
Matej Kristan,
Janez Perš,
Lojze Žust,
Fabio Poiesi,
Fabio Augusto de Alcantara Andrade,
Alexandre Bernardino,
Matthew Dawkins,
Jenni Raitoharju,
Yitong Quan,
Adem Atmaca,
Timon Höfer,
Qiming Zhang,
Yufei Xu,
Jing Zhang,
Dacheng Tao,
Lars Sommer,
Raphael Spraul,
Hangyue Zhao,
Hongpu Zhang,
Yanyun Zhao,
Jan Lukas Augustin,
Eui-ik Jeon,
Impyeong Lee,
Luca Zedda
, et al. (48 additional authors not shown)
Abstract:
The 1$^{\text{st}}$ Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicle (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detec…
▽ More
The 1$^{\text{st}}$ Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicle (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection. The subchallenges were based on the SeaDronesSee and MODS benchmarks. This report summarizes the main findings of the individual subchallenges and introduces a new benchmark, called SeaDronesSee Object Detection v2, which extends the previous benchmark by including more classes and footage. We provide statistical and qualitative analyses, and assess trends in the best-performing methodologies of over 130 submissions. The methods are summarized in the appendix. The datasets, evaluation code and the leaderboard are publicly available at https://seadronessee.cs.uni-tuebingen.de/macvi.
△ Less
Submitted 28 November, 2022; v1 submitted 24 November, 2022;
originally announced November 2022.
-
FourierMask: Instance Segmentation using Fourier Mapping in Implicit Neural Networks
Authors:
Hamd ul Moqeet Riaz,
Nuri Benbarka,
Timon Hoefer,
Andreas Zell
Abstract:
We present FourierMask, which employs Fourier series combined with implicit neural representations to generate instance segmentation masks. We apply a Fourier mapping (FM) to the coordinate locations and utilize the mapped features as inputs to an implicit representation (coordinate-based multi-layer perceptron (MLP)). FourierMask learns to predict the coefficients of the FM for a particular insta…
▽ More
We present FourierMask, which employs Fourier series combined with implicit neural representations to generate instance segmentation masks. We apply a Fourier mapping (FM) to the coordinate locations and utilize the mapped features as inputs to an implicit representation (coordinate-based multi-layer perceptron (MLP)). FourierMask learns to predict the coefficients of the FM for a particular instance, and therefore adapts the FM to a specific object. This allows FourierMask to be generalized to predict instance segmentation masks from natural images. Since implicit functions are continuous in the domain of input coordinates, we illustrate that by sub-sampling the input pixel coordinates, we can generate higher resolution masks during inference. Furthermore, we train a renderer MLP (FourierRend) on the uncertain predictions of FourierMask and illustrate that it significantly improves the quality of the masks. FourierMask shows competitive results on the MS COCO dataset compared to the baseline Mask R-CNN at the same output resolution and surpasses it on higher resolution.
△ Less
Submitted 17 March, 2022; v1 submitted 23 December, 2021;
originally announced December 2021.
-
Seeing Implicit Neural Representations as Fourier Series
Authors:
Nuri Benbarka,
Timon Höfer,
Hamd ul-moqeet Riaz,
Andreas Zell
Abstract:
Implicit Neural Representations (INR) use multilayer perceptrons to represent high-frequency functions in low-dimensional problem domains. Recently these representations achieved state-of-the-art results on tasks related to complex 3D objects and scenes. A core problem is the representation of highly detailed signals, which is tackled using networks with periodic activation functions (SIRENs) or a…
▽ More
Implicit Neural Representations (INR) use multilayer perceptrons to represent high-frequency functions in low-dimensional problem domains. Recently these representations achieved state-of-the-art results on tasks related to complex 3D objects and scenes. A core problem is the representation of highly detailed signals, which is tackled using networks with periodic activation functions (SIRENs) or applying Fourier mappings to the input. This work analyzes the connection between the two methods and shows that a Fourier mapped perceptron is structurally like one hidden layer SIREN. Furthermore, we identify the relationship between the previously proposed Fourier mapping and the general d-dimensional Fourier series, leading to an integer lattice mapping. Moreover, we modify a progressive training strategy to work on arbitrary Fourier mappings and show that it improves the generalization of the interpolation task. Lastly, we compare the different mappings on the image regression and novel view synthesis tasks. We confirm the previous finding that the main contributor to the mapping performance is the size of the embedding and standard deviation of its elements.
△ Less
Submitted 1 September, 2021;
originally announced September 2021.
-
Object detection and Autoencoder-based 6D pose estimation for highly cluttered Bin Picking
Authors:
Timon Höfer,
Faranak Shamsafar,
Nuri Benbarka,
Andreas Zell
Abstract:
Bin picking is a core problem in industrial environments and robotics, with its main module as 6D pose estimation. However, industrial depth sensors have a lack of accuracy when it comes to small objects. Therefore, we propose a framework for pose estimation in highly cluttered scenes with small objects, which mainly relies on RGB data and makes use of depth information only for pose refinement. I…
▽ More
Bin picking is a core problem in industrial environments and robotics, with its main module as 6D pose estimation. However, industrial depth sensors have a lack of accuracy when it comes to small objects. Therefore, we propose a framework for pose estimation in highly cluttered scenes with small objects, which mainly relies on RGB data and makes use of depth information only for pose refinement. In this work, we compare synthetic data generation approaches for object detection and pose estimation and introduce a pose filtering algorithm that determines the most accurate estimated poses. We will make our
△ Less
Submitted 15 June, 2021;
originally announced June 2021.