-
SemanticSpray++: A Multimodal Dataset for Autonomous Driving in Wet Surface Conditions
Authors:
Aldi Piroli,
Vinzenz Dallabetta,
Johannes Kopp,
Marc Walessa,
Daniel Meissner,
Klaus Dietmayer
Abstract:
Autonomous vehicles rely on camera, LiDAR, and radar sensors to navigate the environment. Adverse weather conditions like snow, rain, and fog are known to be problematic for both camera and LiDAR-based perception systems. Currently, it is difficult to evaluate the performance of these methods due to the lack of publicly available datasets containing multimodal labeled data. To address this limitat…
▽ More
Autonomous vehicles rely on camera, LiDAR, and radar sensors to navigate the environment. Adverse weather conditions like snow, rain, and fog are known to be problematic for both camera and LiDAR-based perception systems. Currently, it is difficult to evaluate the performance of these methods due to the lack of publicly available datasets containing multimodal labeled data. To address this limitation, we propose the SemanticSpray++ dataset, which provides labels for camera, LiDAR, and radar data of highway-like scenarios in wet surface conditions. In particular, we provide 2D bounding boxes for the camera image, 3D bounding boxes for the LiDAR point cloud, and semantic labels for the radar targets. By labeling all three sensor modalities, the SemanticSpray++ dataset offers a comprehensive test bed for analyzing the performance of different perception methods when vehicles travel on wet surface conditions. Together with comprehensive label statistics, we also evaluate multiple baseline methods across different tasks and analyze their performances. The dataset will be available at https://semantic-spray-dataset.github.io .
△ Less
Submitted 14 June, 2024;
originally announced June 2024.
-
Label-Efficient Semantic Segmentation of LiDAR Point Clouds in Adverse Weather Conditions
Authors:
Aldi Piroli,
Vinzenz Dallabetta,
Johannes Kopp,
Marc Walessa,
Daniel Meissner,
Klaus Dietmayer
Abstract:
Adverse weather conditions can severely affect the performance of LiDAR sensors by introducing unwanted noise in the measurements. Therefore, differentiating between noise and valid points is crucial for the reliable use of these sensors. Current approaches for detecting adverse weather points require large amounts of labeled data, which can be difficult and expensive to obtain. This paper propose…
▽ More
Adverse weather conditions can severely affect the performance of LiDAR sensors by introducing unwanted noise in the measurements. Therefore, differentiating between noise and valid points is crucial for the reliable use of these sensors. Current approaches for detecting adverse weather points require large amounts of labeled data, which can be difficult and expensive to obtain. This paper proposes a label-efficient approach to segment LiDAR point clouds in adverse weather. We develop a framework that uses few-shot semantic segmentation to learn to segment adverse weather points from only a few labeled examples. Then, we use a semi-supervised learning approach to generate pseudo-labels for unlabelled point clouds, significantly increasing the amount of training data without requiring any additional labeling. We also integrate good weather data in our training pipeline, allowing for high performance in both good and adverse weather conditions. Results on real and synthetic datasets show that our method performs well in detecting snow, fog, and spray. Furthermore, we achieve competitive performance against fully supervised methods while using only a fraction of labeled data.
△ Less
Submitted 14 June, 2024;
originally announced June 2024.
-
Revisiting Out-of-Distribution Detection in LiDAR-based 3D Object Detection
Authors:
Michael Kösel,
Marcel Schreiber,
Michael Ulrich,
Claudius Gläser,
Klaus Dietmayer
Abstract:
LiDAR-based 3D object detection has become an essential part of automated driving due to its ability to localize and classify objects precisely in 3D. However, object detectors face a critical challenge when dealing with unknown foreground objects, particularly those that were not present in their original training data. These out-of-distribution (OOD) objects can lead to misclassifications, posin…
▽ More
LiDAR-based 3D object detection has become an essential part of automated driving due to its ability to localize and classify objects precisely in 3D. However, object detectors face a critical challenge when dealing with unknown foreground objects, particularly those that were not present in their original training data. These out-of-distribution (OOD) objects can lead to misclassifications, posing a significant risk to the safety and reliability of automated vehicles. Currently, LiDAR-based OOD object detection has not been well studied. We address this problem by generating synthetic training data for OOD objects by perturbing known object categories. Our idea is that these synthetic OOD objects produce different responses in the feature map of an object detector compared to in-distribution (ID) objects. We then extract features using a pre-trained and fixed object detector and train a simple multilayer perceptron (MLP) to classify each detection as either ID or OOD. In addition, we propose a new evaluation protocol that allows the use of existing datasets without modifying the point cloud, ensuring a more authentic evaluation of real-world scenarios. The effectiveness of our method is validated through experiments on the newly proposed nuScenes OOD benchmark. The source code is available at https://github.com/uulm-mrm/mmood3d.
△ Less
Submitted 24 April, 2024;
originally announced April 2024.
-
The Radar Ghost Dataset -- An Evaluation of Ghost Objects in Automotive Radar Data
Authors:
Florian Kraus,
Nicolas Scheiner,
Werner Ritter,
Klaus Dietmayer
Abstract:
Radar sensors have a long tradition in advanced driver assistance systems (ADAS) and also play a major role in current concepts for autonomous vehicles. Their importance is reasoned by their high robustness against meteorological effects, such as rain, snow, or fog, and the radar's ability to measure relative radial velocity differences via the Doppler effect. The cause for these advantages, namel…
▽ More
Radar sensors have a long tradition in advanced driver assistance systems (ADAS) and also play a major role in current concepts for autonomous vehicles. Their importance is reasoned by their high robustness against meteorological effects, such as rain, snow, or fog, and the radar's ability to measure relative radial velocity differences via the Doppler effect. The cause for these advantages, namely the large wavelength, is also one of the drawbacks of radar sensors. Compared to camera or lidar sensor, a lot more surfaces in a typical traffic scenario appear flat relative to the radar's emitted signal. This results in multi-path reflections or so called ghost detections in the radar signal. Ghost objects pose a major source for potential false positive detections in a vehicle's perception pipeline. Therefore, it is important to be able to segregate multi-path reflections from direct ones. In this article, we present a dataset with detailed manual annotations for different kinds of ghost detections. Moreover, two different approaches for identifying these kinds of objects are evaluated. We hope that our dataset encourages more researchers to engage in the fields of multi-path object suppression or exploitation.
△ Less
Submitted 1 April, 2024;
originally announced April 2024.
-
Simultaneous Clutter Detection and Semantic Segmentation of Moving Objects for Automotive Radar Data
Authors:
Johannes Kopp,
Dominik Kellner,
Aldi Piroli,
Vinzenz Dallabetta,
Klaus Dietmayer
Abstract:
The unique properties of radar sensors, such as their robustness to adverse weather conditions, make them an important part of the environment perception system of autonomous vehicles. One of the first steps during the processing of radar point clouds is often the detection of clutter, i.e. erroneous points that do not correspond to real objects. Another common objective is the semantic segmentati…
▽ More
The unique properties of radar sensors, such as their robustness to adverse weather conditions, make them an important part of the environment perception system of autonomous vehicles. One of the first steps during the processing of radar point clouds is often the detection of clutter, i.e. erroneous points that do not correspond to real objects. Another common objective is the semantic segmentation of moving road users. These two problems are handled strictly separate from each other in literature. The employed neural networks are always focused entirely on only one of the tasks. In contrast to this, we examine ways to solve both tasks at the same time with a single jointly used model. In addition to a new augmented multi-head architecture, we also devise a method to represent a network's predictions for the two tasks with only one output value. This novel approach allows us to solve the tasks simultaneously with the same inference time as a conventional task-specific model. In an extensive evaluation, we show that our setup is highly effective and outperforms every existing network for semantic segmentation on the RadarScenes dataset.
△ Less
Submitted 14 November, 2023; v1 submitted 13 November, 2023;
originally announced November 2023.
-
Multimodal Object Query Initialization for 3D Object Detection
Authors:
Mathijs R. van Geerenstein,
Felicia Ruppel,
Klaus Dietmayer,
Dariu M. Gavrila
Abstract:
3D object detection models that exploit both LiDAR and camera sensor features are top performers in large-scale autonomous driving benchmarks. A transformer is a popular network architecture used for this task, in which so-called object queries act as candidate objects. Initializing these object queries based on current sensor inputs is a common practice. For this, existing methods strongly rely o…
▽ More
3D object detection models that exploit both LiDAR and camera sensor features are top performers in large-scale autonomous driving benchmarks. A transformer is a popular network architecture used for this task, in which so-called object queries act as candidate objects. Initializing these object queries based on current sensor inputs is a common practice. For this, existing methods strongly rely on LiDAR data however, and do not fully exploit image features. Besides, they introduce significant latency. To overcome these limitations we propose EfficientQ3M, an efficient, modular, and multimodal solution for object query initialization for transformer-based 3D object detection models. The proposed initialization method is combined with a "modality-balanced" transformer decoder where the queries can access all sensor modalities throughout the decoder. In experiments, we outperform the state of the art in transformer-based LiDAR object detection on the competitive nuScenes benchmark and showcase the benefits of input-dependent multimodal query initialization, while being more efficient than the available alternatives for LiDAR-camera initialization. The proposed method can be applied with any combination of sensor modalities as input, demonstrating its modularity.
△ Less
Submitted 16 October, 2023;
originally announced October 2023.
-
Efficient Path Planning in Large Unknown Environments with Switchable System Models for Automated Vehicles
Authors:
Oliver Schumann,
Michael Buchholz,
Klaus Dietmayer
Abstract:
Large environments are challenging for path planning algorithms as the size of the configuration space increases. Furthermore, if the environment is mainly unexplored, large amounts of the path are planned through unknown areas. Hence, a complete replanning of the entire path occurs whenever the path collides with newly discovered obstacles. We propose a novel method that stops the path planning a…
▽ More
Large environments are challenging for path planning algorithms as the size of the configuration space increases. Furthermore, if the environment is mainly unexplored, large amounts of the path are planned through unknown areas. Hence, a complete replanning of the entire path occurs whenever the path collides with newly discovered obstacles. We propose a novel method that stops the path planning algorithm after a certain distance. It is used to navigate the algorithm in large environments and is not prone to problems of existing navigation approaches. Furthermore, we developed a method to detect significant environment changes to allow a more efficient replanning. At last, we extend the path planner to be used in the U-Shift concept vehicle. It can switch to another system model and rotate around the center of its rear axis. The results show that the proposed methods generate nearly identical paths compared to the standard Hybrid A* while drastically reducing the execution time. Furthermore, we show that the extended path planning algorithm enables the efficient use of the maneuvering capabilities of the concept vehicle to plan concise paths in narrow environments.
△ Less
Submitted 10 October, 2023;
originally announced October 2023.
-
LS-VOS: Identifying Outliers in 3D Object Detections Using Latent Space Virtual Outlier Synthesis
Authors:
Aldi Piroli,
Vinzenz Dallabetta,
Johannes Kopp,
Marc Walessa,
Daniel Meissner,
Klaus Dietmayer
Abstract:
LiDAR-based 3D object detectors have achieved unprecedented speed and accuracy in autonomous driving applications. However, similar to other neural networks, they are often biased toward high-confidence predictions or return detections where no real object is present. These types of detections can lead to a less reliable environment perception, severely affecting the functionality and safety of au…
▽ More
LiDAR-based 3D object detectors have achieved unprecedented speed and accuracy in autonomous driving applications. However, similar to other neural networks, they are often biased toward high-confidence predictions or return detections where no real object is present. These types of detections can lead to a less reliable environment perception, severely affecting the functionality and safety of autonomous vehicles. We address this problem by proposing LS-VOS, a framework for identifying outliers in 3D object detections. Our approach builds on the idea of Virtual Outlier Synthesis (VOS), which incorporates outlier knowledge during training, enabling the model to learn more compact decision boundaries. In particular, we propose a new synthesis approach that relies on the latent space of an auto-encoder network to generate outlier features with a parametrizable degree of similarity to in-distribution features. In extensive experiments, we show that our approach improves the outlier detection capabilities of a state-of-the-art object detector while maintaining high 3D object detection performance.
△ Less
Submitted 2 October, 2023;
originally announced October 2023.
-
Towards Robust 3D Object Detection In Rainy Conditions
Authors:
Aldi Piroli,
Vinzenz Dallabetta,
Johannes Kopp,
Marc Walessa,
Daniel Meissner,
Klaus Dietmayer
Abstract:
LiDAR sensors are used in autonomous driving applications to accurately perceive the environment. However, they are affected by adverse weather conditions such as snow, fog, and rain. These everyday phenomena introduce unwanted noise into the measurements, severely degrading the performance of LiDAR-based perception systems. In this work, we propose a framework for improving the robustness of LiDA…
▽ More
LiDAR sensors are used in autonomous driving applications to accurately perceive the environment. However, they are affected by adverse weather conditions such as snow, fog, and rain. These everyday phenomena introduce unwanted noise into the measurements, severely degrading the performance of LiDAR-based perception systems. In this work, we propose a framework for improving the robustness of LiDAR-based 3D object detectors against road spray. Our approach uses a state-of-the-art adverse weather detection network to filter out spray from the LiDAR point cloud, which is then used as input for the object detector. In this way, the detected objects are less affected by the adverse weather in the scene, resulting in a more accurate perception of the environment. In addition to adverse weather filtering, we explore the use of radar targets to further filter false positive detections. Tests on real-world data show that our approach improves the robustness to road spray of several popular 3D object detectors.
△ Less
Submitted 5 October, 2023; v1 submitted 2 October, 2023;
originally announced October 2023.
-
Group Regression for Query Based Object Detection and Tracking
Authors:
Felicia Ruppel,
Florian Faion,
Claudius Gläser,
Klaus Dietmayer
Abstract:
Group regression is commonly used in 3D object detection to predict box parameters of similar classes in a joint head, aiming to benefit from similarities while separating highly dissimilar classes. For query-based perception methods, this has, so far, not been feasible. We close this gap and present a method to incorporate multi-class group regression, especially designed for the 3D domain in the…
▽ More
Group regression is commonly used in 3D object detection to predict box parameters of similar classes in a joint head, aiming to benefit from similarities while separating highly dissimilar classes. For query-based perception methods, this has, so far, not been feasible. We close this gap and present a method to incorporate multi-class group regression, especially designed for the 3D domain in the context of autonomous driving, into existing attention and query-based perception approaches. We enhance a transformer based joint object detection and tracking model with this approach, and thoroughly evaluate its behavior and performance. For group regression, the classes of the nuScenes dataset are divided into six groups of similar shape and prevalence, each being regressed by a dedicated head. We show that the proposed method is applicable to many existing transformer based perception approaches and can bring potential benefits. The behavior of query group regression is thoroughly analyzed in comparison to a unified regression head, e.g. in terms of class-switching behavior and distribution of the output parameters. The proposed method offers many possibilities for further research, such as in the direction of deep multi-hypotheses tracking.
△ Less
Submitted 28 August, 2023;
originally announced August 2023.
-
User Feedback and Sample Weighting for Ill-Conditioned Hand-Eye Calibration
Authors:
Markus Horn,
Thomas Wodtko,
Michael Buchholz,
Klaus Dietmayer
Abstract:
Hand-eye calibration is an important and extensively researched method for calibrating rigidly coupled sensors, solely based on estimates of their motion. Due to the geometric structure of this problem, at least two motion estimates with non-parallel rotation axes are required for a unique solution. If the majority of rotation axes are almost parallel, the resulting optimization problem is ill-con…
▽ More
Hand-eye calibration is an important and extensively researched method for calibrating rigidly coupled sensors, solely based on estimates of their motion. Due to the geometric structure of this problem, at least two motion estimates with non-parallel rotation axes are required for a unique solution. If the majority of rotation axes are almost parallel, the resulting optimization problem is ill-conditioned. In this paper, we propose an approach to automatically weight the motion samples of such an ill-conditioned optimization problem for improving the conditioning. The sample weights are chosen in relation to the local density of all available rotation axes. Furthermore, we present an approach for estimating the sensitivity and conditioning of the cost function, separated into the translation and the rotation part. This information can be employed as user feedback when recording the calibration data to prevent ill-conditioning in advance. We evaluate and compare our approach on artificially augmented data from the KITTI odometry dataset.
△ Less
Submitted 11 August, 2023;
originally announced August 2023.
-
Joint Out-of-Distribution Detection and Uncertainty Estimation for Trajectory Prediction
Authors:
Julian Wiederer,
Julian Schmidt,
Ulrich Kressel,
Klaus Dietmayer,
Vasileios Belagiannis
Abstract:
Despite the significant research efforts on trajectory prediction for automated driving, limited work exists on assessing the prediction reliability. To address this limitation we propose an approach that covers two sources of error, namely novel situations with out-of-distribution (OOD) detection and the complexity in in-distribution (ID) situations with uncertainty estimation. We introduce two m…
▽ More
Despite the significant research efforts on trajectory prediction for automated driving, limited work exists on assessing the prediction reliability. To address this limitation we propose an approach that covers two sources of error, namely novel situations with out-of-distribution (OOD) detection and the complexity in in-distribution (ID) situations with uncertainty estimation. We introduce two modules next to an encoder-decoder network for trajectory prediction. Firstly, a Gaussian mixture model learns the probability density function of the ID encoder features during training, and then it is used to detect the OOD samples in regions of the feature space with low likelihood. Secondly, an error regression network is applied to the encoder, which learns to estimate the trajectory prediction error in supervised training. During inference, the estimated prediction error is used as the uncertainty. In our experiments, the combination of both modules outperforms the prior work in OOD detection and uncertainty estimation, on the Shifts robust trajectory prediction dataset by $2.8 \%$ and $10.1 \%$, respectively. The code is publicly available.
△ Less
Submitted 4 August, 2023; v1 submitted 3 August, 2023;
originally announced August 2023.
-
Advancing Frame-Dropping in Multi-Object Tracking-by-Detection Systems Through Event-Based Detection Triggering
Authors:
Matti Henning,
Michael Buchholz,
Klaus Dietmayer
Abstract:
With rising computational requirements modern automated vehicles (AVs) often consider trade-offs between energy consumption and perception performance, potentially jeopardizing their safe operation. Frame-dropping in tracking-by-detection perception systems presents a promising approach, although late traffic participant detection might be induced.
In this paper, we extend our previous work on f…
▽ More
With rising computational requirements modern automated vehicles (AVs) often consider trade-offs between energy consumption and perception performance, potentially jeopardizing their safe operation. Frame-dropping in tracking-by-detection perception systems presents a promising approach, although late traffic participant detection might be induced.
In this paper, we extend our previous work on frame-dropping in tracking-by-detection perception systems. We introduce an additional event-based triggering mechanism using camera object detections to increase both the system's efficiency, as well as its safety. Evaluating both single and multi-modal tracking methods we show that late object detections are mitigated while the potential for reduced energy consumption is significantly increased, reaching nearly 60 Watt per reduced point in HOTA score.
△ Less
Submitted 1 August, 2023;
originally announced August 2023.
-
Data-Free Backbone Fine-Tuning for Pruned Neural Networks
Authors:
Adrian Holzbock,
Achyut Hegde,
Klaus Dietmayer,
Vasileios Belagiannis
Abstract:
Model compression techniques reduce the computational load and memory consumption of deep neural networks. After the compression operation, e.g. parameter pruning, the model is normally fine-tuned on the original training dataset to recover from the performance drop caused by compression. However, the training data is not always available due to privacy issues or other factors. In this work, we pr…
▽ More
Model compression techniques reduce the computational load and memory consumption of deep neural networks. After the compression operation, e.g. parameter pruning, the model is normally fine-tuned on the original training dataset to recover from the performance drop caused by compression. However, the training data is not always available due to privacy issues or other factors. In this work, we present a data-free fine-tuning approach for pruning the backbone of deep neural networks. In particular, the pruned network backbone is trained with synthetically generated images, and our proposed intermediate supervision to mimic the unpruned backbone's output feature map. Afterwards, the pruned backbone can be combined with the original network head to make predictions. We generate synthetic images by back-propagating gradients to noise images while relying on L1-pruning for the backbone pruning. In our experiments, we show that our approach is task-independent due to pruning only the backbone. By evaluating our approach on 2D human pose estimation, object detection, and image classification, we demonstrate promising performance compared to the unpruned model. Our code is available at https://github.com/holzbock/dfbf.
△ Less
Submitted 22 June, 2023;
originally announced June 2023.
-
Energy-based Detection of Adverse Weather Effects in LiDAR Data
Authors:
Aldi Piroli,
Vinzenz Dallabetta,
Johannes Kopp,
Marc Walessa,
Daniel Meissner,
Klaus Dietmayer
Abstract:
Autonomous vehicles rely on LiDAR sensors to perceive the environment. Adverse weather conditions like rain, snow, and fog negatively affect these sensors, reducing their reliability by introducing unwanted noise in the measurements. In this work, we tackle this problem by proposing a novel approach for detecting adverse weather effects in LiDAR data. We reformulate this problem as an outlier dete…
▽ More
Autonomous vehicles rely on LiDAR sensors to perceive the environment. Adverse weather conditions like rain, snow, and fog negatively affect these sensors, reducing their reliability by introducing unwanted noise in the measurements. In this work, we tackle this problem by proposing a novel approach for detecting adverse weather effects in LiDAR data. We reformulate this problem as an outlier detection task and use an energy-based framework to detect outliers in point clouds. More specifically, our method learns to associate low energy scores with inlier points and high energy scores with outliers allowing for robust detection of adverse weather effects. In extensive experiments, we show that our method performs better in adverse weather detection and has higher robustness to unseen weather effects than previous state-of-the-art methods. Furthermore, we show how our method can be used to perform simultaneous outlier detection and semantic segmentation. Finally, to help expand the research field of LiDAR perception in adverse weather, we release the SemanticSpray dataset, which contains labeled vehicle spray data in highway-like scenarios. The dataset is available at https://semantic-spray-dataset.github.io .
△ Less
Submitted 29 June, 2023; v1 submitted 25 May, 2023;
originally announced May 2023.
-
Real-Time Spatial Trajectory Planning for Urban Environments Using Dynamic Optimization
Authors:
Jona Ruof,
Max Bastian Mertens,
Michael Buchholz,
Klaus Dietmayer
Abstract:
Planning trajectories for automated vehicles in urban environments requires methods with high generality, long planning horizons, and fast update rates. Using a path-velocity decomposition, we contribute a novel planning framework, which generates foresighted trajectories and can handle a wide variety of state and control constraints effectively. In contrast to related work, the proposed optimal c…
▽ More
Planning trajectories for automated vehicles in urban environments requires methods with high generality, long planning horizons, and fast update rates. Using a path-velocity decomposition, we contribute a novel planning framework, which generates foresighted trajectories and can handle a wide variety of state and control constraints effectively. In contrast to related work, the proposed optimal control problems are formulated over space rather than time. This spatial formulation decouples environmental constraints from the optimization variables, which allows the application of simple, yet efficient shooting methods. To this end, we present a tailored solution strategy based on ILQR, in the Augmented Lagrangian framework, to rapidly minimize the trajectory objective costs, even under infeasible initial solutions. Evaluations in simulation and on a full-sized automated vehicle in real-world urban traffic show the real-time capability and versatility of the proposed approach.
△ Less
Submitted 9 August, 2023; v1 submitted 4 May, 2023;
originally announced May 2023.
-
Extrinsic Infrastructure Calibration Using the Hand-Eye Robot-World Formulation
Authors:
Markus Horn,
Thomas Wodtko,
Michael Buchholz,
Klaus Dietmayer
Abstract:
We propose a certifiably globally optimal approach for solving the hand-eye robot-world problem supporting multiple sensors and targets at once. Further, we leverage this formulation for estimating a geo-referenced calibration of infrastructure sensors. Since vehicle motion recorded by infrastructure sensors is mostly planar, obtaining a unique solution for the respective hand-eye robot-world prob…
▽ More
We propose a certifiably globally optimal approach for solving the hand-eye robot-world problem supporting multiple sensors and targets at once. Further, we leverage this formulation for estimating a geo-referenced calibration of infrastructure sensors. Since vehicle motion recorded by infrastructure sensors is mostly planar, obtaining a unique solution for the respective hand-eye robot-world problem is unfeasible without incorporating additional knowledge. Hence, we extend our proposed method to include a-priori knowledge, i.e., the translation norm of calibration targets, to yield a unique solution. Our approach achieves state-of-the-art results on simulated and real-world data. Especially on real-world intersection data, our approach utilizing the translation norm is the only method providing accurate results.
△ Less
Submitted 2 August, 2023; v1 submitted 2 May, 2023;
originally announced May 2023.
-
RT-K-Net: Revisiting K-Net for Real-Time Panoptic Segmentation
Authors:
Markus Schön,
Michael Buchholz,
Klaus Dietmayer
Abstract:
Panoptic segmentation is one of the most challenging scene parsing tasks, combining the tasks of semantic segmentation and instance segmentation. While much progress has been made, few works focus on the real-time application of panoptic segmentation methods. In this paper, we revisit the recently introduced K-Net architecture. We propose vital changes to the architecture, training, and inference…
▽ More
Panoptic segmentation is one of the most challenging scene parsing tasks, combining the tasks of semantic segmentation and instance segmentation. While much progress has been made, few works focus on the real-time application of panoptic segmentation methods. In this paper, we revisit the recently introduced K-Net architecture. We propose vital changes to the architecture, training, and inference procedure, which massively decrease latency and improve performance. Our resulting RT-K-Net sets a new state-of-the-art performance for real-time panoptic segmentation methods on the Cityscapes dataset and shows promising results on the challenging Mapillary Vistas dataset. On Cityscapes, RT-K-Net reaches 60.2 % PQ with an average inference time of 32 ms for full resolution 1024x2048 pixel images on a single Titan RTX GPU. On Mapillary Vistas, RT-K-Net reaches 33.2 % PQ with an average inference time of 69 ms. Source code is available at https://github.com/markusschoen/RT-K-Net.
△ Less
Submitted 4 August, 2023; v1 submitted 2 May, 2023;
originally announced May 2023.
-
The Impact of Frame-Dropping on Performance and Energy Consumption for Multi-Object Tracking
Authors:
Matti Henning,
Michael Buchholz,
Klaus Dietmayer
Abstract:
The safety of automated vehicles (AVs) relies on the representation of their environment. Consequently, state-of-the-art AVs employ potent sensor systems to achieve the best possible environment representation at all times. Although these high-performing systems achieve impressive results, they induce significant requirements for the processing capabilities of an AV's computational hardware compon…
▽ More
The safety of automated vehicles (AVs) relies on the representation of their environment. Consequently, state-of-the-art AVs employ potent sensor systems to achieve the best possible environment representation at all times. Although these high-performing systems achieve impressive results, they induce significant requirements for the processing capabilities of an AV's computational hardware components and their energy consumption.
To enable a dynamic adaptation of such perception systems based on the situational perception requirements, we introduce a model-agnostic method for the scalable employment of single-frame object detection models using frame-dropping in tracking-by-detection systems. We evaluate our approach on the KITTI 3D Tracking Benchmark, showing that significant energy savings can be achieved at acceptable performance degradation, reaching up to 28% reduction of energy consumption at a performance decline of 6.6% in HOTA score.
△ Less
Submitted 1 August, 2023; v1 submitted 17 April, 2023;
originally announced April 2023.
-
LMR: Lane Distance-Based Metric for Trajectory Prediction
Authors:
Julian Schmidt,
Thomas Monninger,
Julian Jordan,
Klaus Dietmayer
Abstract:
The development of approaches for trajectory prediction requires metrics to validate and compare their performance. Currently established metrics are based on Euclidean distance, which means that errors are weighted equally in all directions. Euclidean metrics are insufficient for structured environments like roads, since they do not properly capture the agent's intent relative to the underlying l…
▽ More
The development of approaches for trajectory prediction requires metrics to validate and compare their performance. Currently established metrics are based on Euclidean distance, which means that errors are weighted equally in all directions. Euclidean metrics are insufficient for structured environments like roads, since they do not properly capture the agent's intent relative to the underlying lane. In order to provide a reasonable assessment of trajectory prediction approaches with regard to the downstream planning task, we propose a new metric that is lane distance-based: Lane Miss Rate (LMR). For the calculation of LMR, the ground-truth and predicted endpoints are assigned to lane segments, more precisely their centerlines. Measured by the distance along the lane segments, predictions that are within a certain threshold distance to the ground-truth count as hits, otherwise they count as misses. LMR is then defined as the ratio of sequences that yield a miss. Our results on three state-of-the-art trajectory prediction models show that LMR preserves the order of Euclidean distance-based metrics. In contrast to the Euclidean Miss Rate, qualitative results show that LMR yields misses for sequences where predictions are located on wrong lanes. Hits on the other hand result for sequences where predictions are located on the correct lane. This means that LMR implicitly weights Euclidean error relative to the lane and goes into the direction of capturing intents of traffic agents. The source code of LMR for Argoverse 2 is publicly available.
△ Less
Submitted 13 April, 2023; v1 submitted 12 April, 2023;
originally announced April 2023.
-
RESET: Revisiting Trajectory Sets for Conditional Behavior Prediction
Authors:
Julian Schmidt,
Pascal Huissel,
Julian Wiederer,
Julian Jordan,
Vasileios Belagiannis,
Klaus Dietmayer
Abstract:
It is desirable to predict the behavior of traffic participants conditioned on different planned trajectories of the autonomous vehicle. This allows the downstream planner to estimate the impact of its decisions. Recent approaches for conditional behavior prediction rely on a regression decoder, meaning that coordinates or polynomial coefficients are regressed. In this work we revisit set-based tr…
▽ More
It is desirable to predict the behavior of traffic participants conditioned on different planned trajectories of the autonomous vehicle. This allows the downstream planner to estimate the impact of its decisions. Recent approaches for conditional behavior prediction rely on a regression decoder, meaning that coordinates or polynomial coefficients are regressed. In this work we revisit set-based trajectory prediction, where the probability of each trajectory in a predefined trajectory set is determined by a classification model, and first-time employ it to the task of conditional behavior prediction. We propose RESET, which combines a new metric-driven algorithm for trajectory set generation with a graph-based encoder. For unconditional prediction, RESET achieves comparable performance to a regression-based approach. Due to the nature of set-based approaches, it has the advantageous property of being able to predict a flexible number of trajectories without influencing runtime or complexity. For conditional prediction, RESET achieves reasonable results with late fusion of the planned trajectory, which was not observed for regression-based approaches before. This means that RESET is computationally lightweight to combine with a planner that proposes multiple future plans of the autonomous vehicle, as large parts of the forward pass can be reused.
△ Less
Submitted 12 April, 2023;
originally announced April 2023.
-
Tackling Clutter in Radar Data -- Label Generation and Detection Using PointNet++
Authors:
Johannes Kopp,
Dominik Kellner,
Aldi Piroli,
Klaus Dietmayer
Abstract:
Radar sensors employed for environment perception, e.g. in autonomous vehicles, output a lot of unwanted clutter. These points, for which no corresponding real objects exist, are a major source of errors in following processing steps like object detection or tracking. We therefore present two novel neural network setups for identifying clutter. The input data, network architectures and training co…
▽ More
Radar sensors employed for environment perception, e.g. in autonomous vehicles, output a lot of unwanted clutter. These points, for which no corresponding real objects exist, are a major source of errors in following processing steps like object detection or tracking. We therefore present two novel neural network setups for identifying clutter. The input data, network architectures and training configuration are adjusted specifically for this task. Special attention is paid to the downsampling of point clouds composed of multiple sensor scans. In an extensive evaluation, the new setups display substantially better performance than existing approaches. Because there is no suitable public data set in which clutter is annotated, we design a method to automatically generate the respective labels. By applying it to existing data with object annotations and releasing its code, we effectively create the first freely available radar clutter data set representing real-world driving scenarios. Code and instructions are accessible at www.github.com/kopp-j/clutter-ds.
△ Less
Submitted 16 March, 2023;
originally announced March 2023.
-
Gesture Recognition with Keypoint and Radar Stream Fusion for Automated Vehicles
Authors:
Adrian Holzbock,
Nicolai Kern,
Christian Waldschmidt,
Klaus Dietmayer,
Vasileios Belagiannis
Abstract:
We present a joint camera and radar approach to enable autonomous vehicles to understand and react to human gestures in everyday traffic. Initially, we process the radar data with a PointNet followed by a spatio-temporal multilayer perceptron (stMLP). Independently, the human body pose is extracted from the camera frame and processed with a separate stMLP network. We propose a fusion neural networ…
▽ More
We present a joint camera and radar approach to enable autonomous vehicles to understand and react to human gestures in everyday traffic. Initially, we process the radar data with a PointNet followed by a spatio-temporal multilayer perceptron (stMLP). Independently, the human body pose is extracted from the camera frame and processed with a separate stMLP network. We propose a fusion neural network for both modalities, including an auxiliary loss for each modality. In our experiments with a collected dataset, we show the advantages of gesture recognition with two modalities. Motivated by adverse weather conditions, we also demonstrate promising performance when one of the sensors lacks functionality.
△ Less
Submitted 20 February, 2023;
originally announced February 2023.
-
Exploring Navigation Maps for Learning-Based Motion Prediction
Authors:
Julian Schmidt,
Julian Jordan,
Franz Gritschneder,
Thomas Monninger,
Klaus Dietmayer
Abstract:
The prediction of surrounding agents' motion is a key for safe autonomous driving. In this paper, we explore navigation maps as an alternative to the predominant High Definition (HD) maps for learning-based motion prediction. Navigation maps provide topological and geometrical information on road-level, HD maps additionally have centimeter-accurate lane-level information. As a result, HD maps are…
▽ More
The prediction of surrounding agents' motion is a key for safe autonomous driving. In this paper, we explore navigation maps as an alternative to the predominant High Definition (HD) maps for learning-based motion prediction. Navigation maps provide topological and geometrical information on road-level, HD maps additionally have centimeter-accurate lane-level information. As a result, HD maps are costly and time-consuming to obtain, while navigation maps with near-global coverage are freely available. We describe an approach to integrate navigation maps into learning-based motion prediction models. To exploit locally available HD maps during training, we additionally propose a model-agnostic method for knowledge distillation. In experiments on the publicly available Argoverse dataset with navigation maps obtained from OpenStreetMap, our approach shows a significant improvement over not using a map at all. Combined with our method for knowledge distillation, we achieve results that are close to the original HD map-reliant models. Our publicly available navigation map API for Argoverse enables researchers to develop and evaluate their own approaches using navigation maps.
△ Less
Submitted 13 February, 2023;
originally announced February 2023.
-
SCENE: Reasoning about Traffic Scenes using Heterogeneous Graph Neural Networks
Authors:
Thomas Monninger,
Julian Schmidt,
Jan Rupprecht,
David Raba,
Julian Jordan,
Daniel Frank,
Steffen Staab,
Klaus Dietmayer
Abstract:
Understanding traffic scenes requires considering heterogeneous information about dynamic agents and the static infrastructure. In this work we propose SCENE, a methodology to encode diverse traffic scenes in heterogeneous graphs and to reason about these graphs using a heterogeneous Graph Neural Network encoder and task-specific decoders. The heterogeneous graphs, whose structures are defined by…
▽ More
Understanding traffic scenes requires considering heterogeneous information about dynamic agents and the static infrastructure. In this work we propose SCENE, a methodology to encode diverse traffic scenes in heterogeneous graphs and to reason about these graphs using a heterogeneous Graph Neural Network encoder and task-specific decoders. The heterogeneous graphs, whose structures are defined by an ontology, consist of different nodes with type-specific node features and different relations with type-specific edge features. In order to exploit all the information given by these graphs, we propose to use cascaded layers of graph convolution. The result is an encoding of the scene. Task-specific decoders can be applied to predict desired attributes of the scene. Extensive evaluation on two diverse binary node classification tasks show the main strength of this methodology: despite being generic, it even manages to outperform task-specific baselines. The further application of our methodology to the task of node classification in various knowledge graphs shows its transferability to other domains.
△ Less
Submitted 9 January, 2023;
originally announced January 2023.
-
Can Transformer Attention Spread Give Insights Into Uncertainty of Detected and Tracked Objects?
Authors:
Felicia Ruppel,
Florian Faion,
Claudius Gläser,
Klaus Dietmayer
Abstract:
Transformers have recently been utilized to perform object detection and tracking in the context of autonomous driving. One unique characteristic of these models is that attention weights are computed in each forward pass, giving insights into the model's interior, in particular, which part of the input data it deemed interesting for the given task. Such an attention matrix with the input grid is…
▽ More
Transformers have recently been utilized to perform object detection and tracking in the context of autonomous driving. One unique characteristic of these models is that attention weights are computed in each forward pass, giving insights into the model's interior, in particular, which part of the input data it deemed interesting for the given task. Such an attention matrix with the input grid is available for each detected (or tracked) object in every transformer decoder layer. In this work, we investigate the distribution of these attention weights: How do they change through the decoder layers and through the lifetime of a track? Can they be used to infer additional information about an object, such as a detection uncertainty? Especially in unstructured environments, or environments that were not common during training, a reliable measure of detection uncertainty is crucial to decide whether the system can still be trusted or not.
△ Less
Submitted 25 October, 2022;
originally announced October 2022.
-
Transformers for Object Detection in Large Point Clouds
Authors:
Felicia Ruppel,
Florian Faion,
Claudius Gläser,
Klaus Dietmayer
Abstract:
We present TransLPC, a novel detection model for large point clouds that is based on a transformer architecture. While object detection with transformers has been an active field of research, it has proved difficult to apply such models to point clouds that span a large area, e.g. those that are common in autonomous driving, with lidar or radar data. TransLPC is able to remedy these issues: The st…
▽ More
We present TransLPC, a novel detection model for large point clouds that is based on a transformer architecture. While object detection with transformers has been an active field of research, it has proved difficult to apply such models to point clouds that span a large area, e.g. those that are common in autonomous driving, with lidar or radar data. TransLPC is able to remedy these issues: The structure of the transformer model is modified to allow for larger input sequence lengths, which are sufficient for large point clouds. Besides this, we propose a novel query refinement technique to improve detection accuracy, while retaining a memory-friendly number of transformer decoder queries. The queries are repositioned between layers, moving them closer to the bounding box they are estimating, in an efficient manner. This simple technique has a significant effect on detection accuracy, which is evaluated on the challenging nuScenes dataset on real-world lidar data. Besides this, the proposed method is compatible with existing transformer-based solutions that require object detection, e.g. for joint multi-object tracking and detection, and enables them to be used in conjunction with large point clouds.
△ Less
Submitted 30 September, 2022;
originally announced September 2022.
-
A Benchmark for Unsupervised Anomaly Detection in Multi-Agent Trajectories
Authors:
Julian Wiederer,
Julian Schmidt,
Ulrich Kressel,
Klaus Dietmayer,
Vasileios Belagiannis
Abstract:
Human intuition allows to detect abnormal driving scenarios in situations they never experienced before. Like humans detect those abnormal situations and take countermeasures to prevent collisions, self-driving cars need anomaly detection mechanisms. However, the literature lacks a standard benchmark for the comparison of anomaly detection algorithms. We fill the gap and propose the R-U-MAAD bench…
▽ More
Human intuition allows to detect abnormal driving scenarios in situations they never experienced before. Like humans detect those abnormal situations and take countermeasures to prevent collisions, self-driving cars need anomaly detection mechanisms. However, the literature lacks a standard benchmark for the comparison of anomaly detection algorithms. We fill the gap and propose the R-U-MAAD benchmark for unsupervised anomaly detection in multi-agent trajectories. The goal is to learn a representation of the normal driving from the training sequences without labels, and afterwards detect anomalies. We use the Argoverse Motion Forecasting dataset for the training and propose a test dataset of 160 sequences with human-annotated anomalies in urban environments. To this end we combine a replay of real-world trajectories and scene-dependent abnormal driving in the simulation. In our experiments we compare 11 baselines including linear models, deep auto-encoders and one-class classification models using standard anomaly detection metrics. The deep reconstruction and end-to-end one-class methods show promising results. The benchmark and the baseline models will be publicly available.
△ Less
Submitted 5 September, 2022;
originally announced September 2022.
-
Detection of Condensed Vehicle Gas Exhaust in LiDAR Point Clouds
Authors:
Aldi Piroli,
Vinzenz Dallabetta,
Marc Walessa,
Daniel Meissner,
Johannes Kopp,
Klaus Dietmayer
Abstract:
LiDAR sensors used in autonomous driving applications are negatively affected by adverse weather conditions. One common, but understudied effect, is the condensation of vehicle gas exhaust in cold weather. This everyday phenomenon can severely impact the quality of LiDAR measurements, resulting in a less accurate environment perception by creating artifacts like ghost object detections. In the lit…
▽ More
LiDAR sensors used in autonomous driving applications are negatively affected by adverse weather conditions. One common, but understudied effect, is the condensation of vehicle gas exhaust in cold weather. This everyday phenomenon can severely impact the quality of LiDAR measurements, resulting in a less accurate environment perception by creating artifacts like ghost object detections. In the literature, the semantic segmentation of adverse weather effects like rain and fog is achieved using learning-based approaches. However, such methods require large sets of labeled data, which can be extremely expensive and laborious to get. We address this problem by presenting a two-step approach for the detection of condensed vehicle gas exhaust. First, we identify for each vehicle in a scene its emission area and detect gas exhaust if present. Then, isolated clouds are detected by modeling through time the regions of space where gas exhaust is likely to be present. We test our method on real urban data, showing that our approach can reliably detect gas exhaust in different scenarios, making it appealing for offline pre-labeling and online applications such as ghost object detection.
△ Less
Submitted 11 July, 2022;
originally announced July 2022.
-
Identification of Threat Regions From a Dynamic Occupancy Grid Map for Situation-Aware Environment Perception
Authors:
Matti Henning,
Jan Strohbeck,
Michael Buchholz,
Klaus Dietmayer
Abstract:
The advance towards higher levels of automation within the field of automated driving is accompanied by increasing requirements for the operational safety of vehicles. Induced by the limitation of computational resources, trade-offs between the computational complexity of algorithms and their potential to ensure safe operation of automated vehicles are often encountered. Situation-aware environmen…
▽ More
The advance towards higher levels of automation within the field of automated driving is accompanied by increasing requirements for the operational safety of vehicles. Induced by the limitation of computational resources, trade-offs between the computational complexity of algorithms and their potential to ensure safe operation of automated vehicles are often encountered. Situation-aware environment perception presents one promising example, where computational resources are distributed to regions within the perception area that are relevant for the task of the automated vehicle. While prior map knowledge is often leveraged to identify relevant regions, in this work, we present a lightweight identification of safety-relevant regions that relies solely on online information. We show that our approach enables safe vehicle operation in critical scenarios, while retaining the benefits of non-uniformly distributed resources within the environment perception.
△ Less
Submitted 14 February, 2023; v1 submitted 5 July, 2022;
originally announced July 2022.
-
Situation-Aware Environment Perception for Decentralized Automation Architectures
Authors:
Matti Henning,
Michael Buchholz,
Klaus Dietmayer
Abstract:
Advances in the field of environment perception for automated agents have resulted in an ongoing increase in generated sensor data. The available computational resources to process these data are bound to become insufficient for real-time applications. Reducing the amount of data to be processed by identifying the most relevant data based on the agents' situation, often referred to as situation-aw…
▽ More
Advances in the field of environment perception for automated agents have resulted in an ongoing increase in generated sensor data. The available computational resources to process these data are bound to become insufficient for real-time applications. Reducing the amount of data to be processed by identifying the most relevant data based on the agents' situation, often referred to as situation-awareness, has gained increasing research interest, and the importance of complementary approaches is expected to increase further in the near future. In this work, we extend the applicability range of our recently introduced concept for situation-aware environment perception to the decentralized automation architecture of the UNICARagil project. Considering the specific driving capabilities of the vehicle and using real-world data on target hardware in a post-processing manner, we provide an estimate for the daily reduction in power consumption that accumulates to 36.2%. While achieving these promising results, we additionally show the need to consider scalability in data processing in the design of software modules as well as in the design of functional systems if the benefits of situation-awareness shall be leveraged optimally.
△ Less
Submitted 26 July, 2022; v1 submitted 5 July, 2022;
originally announced July 2022.
-
MotionMixer: MLP-based 3D Human Body Pose Forecasting
Authors:
Arij Bouazizi,
Adrian Holzbock,
Ulrich Kressel,
Klaus Dietmayer,
Vasileios Belagiannis
Abstract:
In this work, we present MotionMixer, an efficient 3D human body pose forecasting model based solely on multi-layer perceptrons (MLPs). MotionMixer learns the spatial-temporal 3D body pose dependencies by sequentially mixing both modalities. Given a stacked sequence of 3D body poses, a spatial-MLP extracts fine grained spatial dependencies of the body joints. The interaction of the body joints ove…
▽ More
In this work, we present MotionMixer, an efficient 3D human body pose forecasting model based solely on multi-layer perceptrons (MLPs). MotionMixer learns the spatial-temporal 3D body pose dependencies by sequentially mixing both modalities. Given a stacked sequence of 3D body poses, a spatial-MLP extracts fine grained spatial dependencies of the body joints. The interaction of the body joints over time is then modelled by a temporal MLP. The spatial-temporal mixed features are finally aggregated and decoded to obtain the future motion. To calibrate the influence of each time step in the pose sequence, we make use of squeeze-and-excitation (SE) blocks. We evaluate our approach on Human3.6M, AMASS, and 3DPW datasets using the standard evaluation protocols. For all evaluations, we demonstrate state-of-the-art performance, while having a model with a smaller number of parameters. Our code is available at: https://github.com/MotionMLP/MotionMixer
△ Less
Submitted 1 July, 2022;
originally announced July 2022.
-
MGNet: Monocular Geometric Scene Understanding for Autonomous Driving
Authors:
Markus Schön,
Michael Buchholz,
Klaus Dietmayer
Abstract:
We introduce MGNet, a multi-task framework for monocular geometric scene understanding. We define monocular geometric scene understanding as the combination of two known tasks: Panoptic segmentation and self-supervised monocular depth estimation. Panoptic segmentation captures the full scene not only semantically, but also on an instance basis. Self-supervised monocular depth estimation uses geome…
▽ More
We introduce MGNet, a multi-task framework for monocular geometric scene understanding. We define monocular geometric scene understanding as the combination of two known tasks: Panoptic segmentation and self-supervised monocular depth estimation. Panoptic segmentation captures the full scene not only semantically, but also on an instance basis. Self-supervised monocular depth estimation uses geometric constraints derived from the camera measurement model in order to measure depth from monocular video sequences only. To the best of our knowledge, we are the first to propose the combination of these two tasks in one single model. Our model is designed with focus on low latency to provide fast inference in real-time on a single consumer-grade GPU. During deployment, our model produces dense 3D point clouds with instance aware semantic labels from single high-resolution camera images. We evaluate our model on two popular autonomous driving benchmarks, i.e., Cityscapes and KITTI, and show competitive performance among other real-time capable methods. Source code is available at https://github.com/markusschoen/MGNet.
△ Less
Submitted 27 June, 2022;
originally announced June 2022.
-
Self-Assessment for Single-Object Tracking in Clutter Using Subjective Logic
Authors:
Thomas Griebel,
Johannes Müller,
Paul Geisler,
Charlotte Hermann,
Martin Herrmann,
Michael Buchholz,
Klaus Dietmayer
Abstract:
Reliable tracking algorithms are essential for automated driving. However, the existing consistency measures are not sufficient to meet the increasing safety demands in the automotive sector. Therefore, this work presents a novel method for self-assessment of single-object tracking in clutter based on Kalman filtering and subjective logic. A key feature of the approach is that it additionally prov…
▽ More
Reliable tracking algorithms are essential for automated driving. However, the existing consistency measures are not sufficient to meet the increasing safety demands in the automotive sector. Therefore, this work presents a novel method for self-assessment of single-object tracking in clutter based on Kalman filtering and subjective logic. A key feature of the approach is that it additionally provides a measure of the collected statistical evidence in its online reliability scores. In this way, various aspects of reliability, such as the correctness of the assumed measurement noise, detection probability, and clutter rate, can be monitored in addition to the overall assessment based on the available evidence. Here, we present a mathematical derivation of the reference distribution used in our self-assessment module for our studied problem. Moreover, we introduce a formula that describes how a threshold should be chosen for the degree of conflict, the subjective logic comparison measure used for the reliability decision making. Our approach is evaluated in a challenging simulation scenario designed to model adverse weather conditions. The simulations show that our method can significantly improve the reliability checking of single-object tracking in clutter in several aspects.
△ Less
Submitted 15 June, 2022;
originally announced June 2022.
-
MEAT: Maneuver Extraction from Agent Trajectories
Authors:
Julian Schmidt,
Julian Jordan,
David Raba,
Tobias Welz,
Klaus Dietmayer
Abstract:
Advances in learning-based trajectory prediction are enabled by large-scale datasets. However, in-depth analysis of such datasets is limited. Moreover, the evaluation of prediction models is limited to metrics averaged over all samples in the dataset. We propose an automated methodology that allows to extract maneuvers (e.g., left turn, lane change) from agent trajectories in such datasets. The me…
▽ More
Advances in learning-based trajectory prediction are enabled by large-scale datasets. However, in-depth analysis of such datasets is limited. Moreover, the evaluation of prediction models is limited to metrics averaged over all samples in the dataset. We propose an automated methodology that allows to extract maneuvers (e.g., left turn, lane change) from agent trajectories in such datasets. The methodology considers information about the agent dynamics and information about the lane segments the agent traveled along. Although it is possible to use the resulting maneuvers for training classification networks, we exemplary use them for extensive trajectory dataset analysis and maneuver-specific evaluation of multiple state-of-the-art trajectory prediction models. Additionally, an analysis of the datasets and an evaluation of the prediction models based on the agent dynamics is provided.
△ Less
Submitted 10 June, 2022;
originally announced June 2022.
-
Transformers for Multi-Object Tracking on Point Clouds
Authors:
Felicia Ruppel,
Florian Faion,
Claudius Gläser,
Klaus Dietmayer
Abstract:
We present TransMOT, a novel transformer-based end-to-end trainable online tracker and detector for point cloud data. The model utilizes a cross- and a self-attention mechanism and is applicable to lidar data in an automotive context, as well as other data types, such as radar. Both track management and the detection of new tracks are performed by the same transformer decoder module and the tracke…
▽ More
We present TransMOT, a novel transformer-based end-to-end trainable online tracker and detector for point cloud data. The model utilizes a cross- and a self-attention mechanism and is applicable to lidar data in an automotive context, as well as other data types, such as radar. Both track management and the detection of new tracks are performed by the same transformer decoder module and the tracker state is encoded in feature space. With this approach, we make use of the rich latent space of the detector for tracking rather than relying on low-dimensional bounding boxes. Still, we are able to retain some of the desirable properties of traditional Kalman-filter based approaches, such as an ability to handle sensor input at arbitrary timesteps or to compensate frame skips. This is possible due to a novel module that transforms the track information from one frame to the next on feature-level and thereby fulfills a similar task as the prediction step of a Kalman filter. Results are presented on the challenging real-world dataset nuScenes, where the proposed model outperforms its Kalman filter-based tracking baseline.
△ Less
Submitted 31 May, 2022;
originally announced May 2022.
-
Robust 3D Object Detection in Cold Weather Conditions
Authors:
Aldi Piroli,
Vinzenz Dallabetta,
Marc Walessa,
Daniel Meissner,
Johannes Kopp,
Klaus Dietmayer
Abstract:
Adverse weather conditions can negatively affect LiDAR-based object detectors. In this work, we focus on the phenomenon of vehicle gas exhaust condensation in cold weather conditions. This everyday effect can influence the estimation of object sizes, orientations and introduce ghost object detections, compromising the reliability of the state of the art object detectors. We propose to solve this p…
▽ More
Adverse weather conditions can negatively affect LiDAR-based object detectors. In this work, we focus on the phenomenon of vehicle gas exhaust condensation in cold weather conditions. This everyday effect can influence the estimation of object sizes, orientations and introduce ghost object detections, compromising the reliability of the state of the art object detectors. We propose to solve this problem by using data augmentation and a novel training loss term. To effectively train deep neural networks, a large set of labeled data is needed. In case of adverse weather conditions, this process can be extremely laborious and expensive. We address this issue in two steps: First, we present a gas exhaust data generation method based on 3D surface reconstruction and sampling which allows us to generate large sets of gas exhaust clouds from a small pool of labeled data. Second, we introduce a point cloud augmentation process that can be used to add gas exhaust to datasets recorded in good weather conditions. Finally, we formulate a new training loss term that leverages the augmented point cloud to increase object detection robustness by penalizing predictions that include noise. In contrast to other works, our method can be used with both grid-based and point-based detectors. Moreover, since our approach does not require any network architecture changes, inference times remain unchanged. Experimental results on real data show that our proposed method greatly increases robustness to gas exhaust and noisy data.
△ Less
Submitted 25 July, 2022; v1 submitted 24 May, 2022;
originally announced May 2022.
-
A Spatio-Temporal Multilayer Perceptron for Gesture Recognition
Authors:
Adrian Holzbock,
Alexander Tsaregorodtsev,
Youssef Dawoud,
Klaus Dietmayer,
Vasileios Belagiannis
Abstract:
Gesture recognition is essential for the interaction of autonomous vehicles with humans. While the current approaches focus on combining several modalities like image features, keypoints and bone vectors, we present neural network architecture that delivers state-of-the-art results only with body skeleton input data. We propose the spatio-temporal multilayer perceptron for gesture recognition in t…
▽ More
Gesture recognition is essential for the interaction of autonomous vehicles with humans. While the current approaches focus on combining several modalities like image features, keypoints and bone vectors, we present neural network architecture that delivers state-of-the-art results only with body skeleton input data. We propose the spatio-temporal multilayer perceptron for gesture recognition in the context of autonomous vehicles. Given 3D body poses over time, we define temporal and spatial mixing operations to extract features in both domains. Additionally, the importance of each time step is re-weighted with Squeeze-and-Excitation layers. An extensive evaluation of the TCG and Drive&Act datasets is provided to showcase the promising performance of our approach. Furthermore, we deploy our model to our autonomous vehicle to show its real-time capability and stable execution.
△ Less
Submitted 18 August, 2022; v1 submitted 25 April, 2022;
originally announced April 2022.
-
CRAT-Pred: Vehicle Trajectory Prediction with Crystal Graph Convolutional Neural Networks and Multi-Head Self-Attention
Authors:
Julian Schmidt,
Julian Jordan,
Franz Gritschneder,
Klaus Dietmayer
Abstract:
Predicting the motion of surrounding vehicles is essential for autonomous vehicles, as it governs their own motion plan. Current state-of-the-art vehicle prediction models heavily rely on map information. In reality, however, this information is not always available. We therefore propose CRAT-Pred, a multi-modal and non-rasterization-based trajectory prediction model, specifically designed to effe…
▽ More
Predicting the motion of surrounding vehicles is essential for autonomous vehicles, as it governs their own motion plan. Current state-of-the-art vehicle prediction models heavily rely on map information. In reality, however, this information is not always available. We therefore propose CRAT-Pred, a multi-modal and non-rasterization-based trajectory prediction model, specifically designed to effectively model social interactions between vehicles, without relying on map information. CRAT-Pred applies a graph convolution method originating from the field of material science to vehicle prediction, allowing to efficiently leverage edge features, and combines it with multi-head self-attention. Compared to other map-free approaches, the model achieves state-of-the-art performance with a significantly lower number of model parameters. In addition to that, we quantitatively show that the self-attention mechanism is able to learn social interactions between vehicles, with the weights representing a measurable interaction score. The source code is publicly available.
△ Less
Submitted 10 February, 2022; v1 submitted 9 February, 2022;
originally announced February 2022.
-
A Multi-Task Recurrent Neural Network for End-to-End Dynamic Occupancy Grid Mapping
Authors:
Marcel Schreiber,
Vasileios Belagiannis,
Claudius Gläser,
Klaus Dietmayer
Abstract:
A common approach for modeling the environment of an autonomous vehicle are dynamic occupancy grid maps, in which the surrounding is divided into cells, each containing the occupancy and velocity state of its location. Despite the advantage of modeling arbitrary shaped objects, the used algorithms rely on hand-designed inverse sensor models and semantic information is missing. Therefore, we introd…
▽ More
A common approach for modeling the environment of an autonomous vehicle are dynamic occupancy grid maps, in which the surrounding is divided into cells, each containing the occupancy and velocity state of its location. Despite the advantage of modeling arbitrary shaped objects, the used algorithms rely on hand-designed inverse sensor models and semantic information is missing. Therefore, we introduce a multi-task recurrent neural network to predict grid maps providing occupancies, velocity estimates, semantic information and the driveable area. During training, our network architecture, which is a combination of convolutional and recurrent layers, processes sequences of raw lidar data, that is represented as bird's eye view images with several height channels. The multi-task network is trained in an end-to-end fashion to predict occupancy grid maps without the usual preprocessing steps consisting of removing ground points and applying an inverse sensor model. In our evaluations, we show that our learned inverse sensor model is able to overcome some limitations of a geometric inverse sensor model in terms of representing object shapes and modeling freespace. Moreover, we report a better runtime performance and more accurate semantic predictions for our end-to-end approach, compared to our network relying on measurement grid maps as input data.
△ Less
Submitted 5 May, 2022; v1 submitted 9 February, 2022;
originally announced February 2022.
-
Globally Optimal Multi-Scale Monocular Hand-Eye Calibration Using Dual Quaternions
Authors:
Thomas Wodtko,
Markus Horn,
Michael Buchholz,
Klaus Dietmayer
Abstract:
In this work, we present an approach for monocular hand-eye calibration from per-sensor ego-motion based on dual quaternions. Due to non-metrically scaled translations of monocular odometry, a scaling factor has to be estimated in addition to the rotation and translation calibration. For this, we derive a quadratically constrained quadratic program that allows a combined estimation of all extrinsi…
▽ More
In this work, we present an approach for monocular hand-eye calibration from per-sensor ego-motion based on dual quaternions. Due to non-metrically scaled translations of monocular odometry, a scaling factor has to be estimated in addition to the rotation and translation calibration. For this, we derive a quadratically constrained quadratic program that allows a combined estimation of all extrinsic calibration parameters. Using dual quaternions leads to low run-times due to their compact representation. Our problem formulation further allows to estimate multiple scalings simultaneously for different sequences of the same sensor setup. Based on our problem formulation, we derive both, a fast local and a globally optimal solving approach. Finally, our algorithms are evaluated and compared to state-of-the-art approaches on simulated and real-world data, e.g., the EuRoC MAV dataset.
△ Less
Submitted 12 January, 2022;
originally announced January 2022.
-
Situation-Aware Environment Perception Using a Multi-Layer Attention Map
Authors:
Matti Henning,
Johannes Müller,
Fabian Gies,
Michael Buchholz,
Klaus Dietmayer
Abstract:
Within the field of automated driving, a clear trend in environment perception tends towards more sensors, higher redundancy, and overall increase in computational power. This is mainly driven by the paradigm to perceive the entire environment as best as possible at all times. However, due to the ongoing rise in functional complexity, compromises have to be considered to ensure real-time capabilit…
▽ More
Within the field of automated driving, a clear trend in environment perception tends towards more sensors, higher redundancy, and overall increase in computational power. This is mainly driven by the paradigm to perceive the entire environment as best as possible at all times. However, due to the ongoing rise in functional complexity, compromises have to be considered to ensure real-time capabilities of the perception system.
In this work, we introduce a concept for situation-aware environment perception to control the resource allocation towards processing relevant areas within the data as well as towards employing only a subset of functional modules for environment perception, if sufficient for the current driving task. Specifically, we propose to evaluate the context of an automated vehicle to derive a multi-layer attention map (MLAM) that defines relevant areas. Using this MLAM, the optimum of active functional modules is dynamically configured and intra-module processing of only relevant data is enforced.
We outline the feasibility of application of our concept using real-world data in a straight-forward implementation for our system at hand. While retaining overall functionality, we achieve a reduction of accumulated processing time of 59%.
△ Less
Submitted 4 April, 2022; v1 submitted 2 December, 2021;
originally announced December 2021.
-
Anomaly Detection in Radar Data Using PointNets
Authors:
Thomas Griebel,
Dominik Authaler,
Markus Horn,
Matti Henning,
Michael Buchholz,
Klaus Dietmayer
Abstract:
For autonomous driving, radar is an important sensor type. On the one hand, radar offers a direct measurement of the radial velocity of targets in the environment. On the other hand, in literature, radar sensors are known for their robustness against several kinds of adverse weather conditions. However, on the downside, radar is susceptible to ghost targets or clutter which can be caused by severa…
▽ More
For autonomous driving, radar is an important sensor type. On the one hand, radar offers a direct measurement of the radial velocity of targets in the environment. On the other hand, in literature, radar sensors are known for their robustness against several kinds of adverse weather conditions. However, on the downside, radar is susceptible to ghost targets or clutter which can be caused by several different causes, e.g., reflective surfaces in the environment. Ghost targets, for instance, can result in erroneous object detections. To this end, it is desirable to identify anomalous targets as early as possible in radar data. In this work, we present an approach based on PointNets to detect anomalous radar targets. Modifying the PointNet-architecture driven by our task, we developed a novel grouping variant which contributes to a multi-form grouping module. Our method is evaluated on a real-world dataset in urban scenarios and shows promising results for the detection of anomalous radar targets.
△ Less
Submitted 20 September, 2021;
originally announced September 2021.
-
Fast Rule-Based Clutter Detection in Automotive Radar Data
Authors:
Johannes Kopp,
Dominik Kellner,
Aldi Piroli,
Klaus Dietmayer
Abstract:
Automotive radar sensors output a lot of unwanted clutter or ghost detections, whose position and velocity do not correspond to any real object in the sensor's field of view. This poses a substantial challenge for environment perception methods like object detection or tracking. Especially problematic are clutter detections that occur in groups or at similar locations in multiple consecutive measu…
▽ More
Automotive radar sensors output a lot of unwanted clutter or ghost detections, whose position and velocity do not correspond to any real object in the sensor's field of view. This poses a substantial challenge for environment perception methods like object detection or tracking. Especially problematic are clutter detections that occur in groups or at similar locations in multiple consecutive measurements. In this paper, a new algorithm for identifying such erroneous detections is presented. It is mainly based on the modeling of specific commonly occurring wave propagation paths that lead to clutter. In particular, the three effects explicitly covered are reflections at the underbody of a car or truck, signals traveling back and forth between the vehicle on which the sensor is mounted and another object, and multipath propagation via specular reflection. The latter often occurs near guardrails, concrete walls or similar reflective surfaces. Each of these effects is described both theoretically and regarding a method for identifying the corresponding clutter detections. Identification is done by analyzing detections generated from a single sensor measurement only. The final algorithm is evaluated on recordings of real extra-urban traffic. For labeling, a semi-automatic process is employed. The results are promising, both in terms of performance and regarding the very low execution time. Typically, a large part of clutter is found, while only a small ratio of detections corresponding to real objects are falsely classified by the algorithm.
△ Less
Submitted 27 August, 2021;
originally announced August 2021.
-
GenRadar: Self-supervised Probabilistic Camera Synthesis based on Radar Frequencies
Authors:
Carsten Ditzel,
Klaus Dietmayer
Abstract:
Autonomous systems require a continuous and dependable environment perception for navigation and decision-making, which is best achieved by combining different sensor types. Radar continues to function robustly in compromised circumstances in which cameras become impaired, guaranteeing a steady inflow of information. Yet, camera images provide a more intuitive and readily applicable impression of…
▽ More
Autonomous systems require a continuous and dependable environment perception for navigation and decision-making, which is best achieved by combining different sensor types. Radar continues to function robustly in compromised circumstances in which cameras become impaired, guaranteeing a steady inflow of information. Yet, camera images provide a more intuitive and readily applicable impression of the world. This work combines the complementary strengths of both sensor types in a unique self-learning fusion approach for a probabilistic scene reconstruction in adverse surrounding conditions. After reducing the memory requirements of both high-dimensional measurements through a decoupled stochastic self-supervised compression technique, the proposed algorithm exploits similarities and establishes correspondences between both domains at different feature levels during training. Then, at inference time, relying exclusively on radio frequencies, the model successively predicts camera constituents in an autoregressive and self-contained process. These discrete tokens are finally transformed back into an instructive view of the respective surrounding, allowing to visually perceive potential dangers for important tasks downstream.
△ Less
Submitted 19 July, 2021;
originally announced July 2021.
-
Attention-based Vehicle Self-Localization with HD Feature Maps
Authors:
Nico Engel,
Vasileios Belagiannis,
Klaus Dietmayer
Abstract:
We present a vehicle self-localization method using point-based deep neural networks. Our approach processes measurements and point features, i.e. landmarks, from a high-definition digital map to infer the vehicle's pose. To learn the best association and incorporate local information between the point sets, we propose an attention mechanism that matches the measurements to the corresponding landm…
▽ More
We present a vehicle self-localization method using point-based deep neural networks. Our approach processes measurements and point features, i.e. landmarks, from a high-definition digital map to infer the vehicle's pose. To learn the best association and incorporate local information between the point sets, we propose an attention mechanism that matches the measurements to the corresponding landmarks. Finally, we use this representation for the point-cloud registration and the subsequent pose regression task. Furthermore, we introduce a training simulation framework that artificially generates measurements and landmarks to facilitate the deployment process and reduce the cost of creating extensive datasets from real-world data. We evaluate our method on our dataset, as well as an adapted version of the Kitti odometry dataset, where we achieve superior performance compared to related approaches; and additionally show dominant generalization capabilities.
△ Less
Submitted 16 July, 2021;
originally announced July 2021.
-
Motion Classification and Height Estimation of Pedestrians Using Sparse Radar Data
Authors:
Markus Horn,
Ole Schumann,
Markus Hahn,
Jürgen Dickmann,
Klaus Dietmayer
Abstract:
A complete overview of the surrounding vehicle environment is important for driver assistance systems and highly autonomous driving. Fusing results of multiple sensor types like camera, radar and lidar is crucial for increasing the robustness. The detection and classification of objects like cars, bicycles or pedestrians has been analyzed in the past for many sensor types. Beyond that, it is also…
▽ More
A complete overview of the surrounding vehicle environment is important for driver assistance systems and highly autonomous driving. Fusing results of multiple sensor types like camera, radar and lidar is crucial for increasing the robustness. The detection and classification of objects like cars, bicycles or pedestrians has been analyzed in the past for many sensor types. Beyond that, it is also helpful to refine these classes and distinguish for example between different pedestrian types or activities. This task is usually performed on camera data, though recent developments are based on radar spectrograms. However, for most automotive radar systems, it is only possible to obtain radar targets instead of the original spectrograms. This work demonstrates that it is possible to estimate the body height of walking pedestrians using 2D radar targets. Furthermore, different pedestrian motion types are classified.
△ Less
Submitted 3 March, 2021;
originally announced March 2021.
-
Graph-based Motion Planning for Automated Vehicles using Multi-model Branching and Admissible Heuristics
Authors:
Oliver Speidel,
Jona Ruof,
Klaus Dietmayer
Abstract:
Automated driving in urban scenarios requires efficient planning algorithms able to handle complex situations in real-time. A popular approach is to use graph-based planning methods in order to obtain a rough trajectory which is subsequently optimized. A key aspect is the generation of trajectories implementing comfortable and safe behavior already during graph-search while keeping computation tim…
▽ More
Automated driving in urban scenarios requires efficient planning algorithms able to handle complex situations in real-time. A popular approach is to use graph-based planning methods in order to obtain a rough trajectory which is subsequently optimized. A key aspect is the generation of trajectories implementing comfortable and safe behavior already during graph-search while keeping computation times low. To capture this aspect, on the one hand, a branching strategy is presented in this work that leads to better performance in terms of quality of resulting trajectories and runtime. On the other hand, admissible heuristics are shown which guide the graph-search efficiently, where the solution remains optimal.
△ Less
Submitted 15 February, 2021;
originally announced February 2021.
-
Online Extrinsic Calibration based on Per-Sensor Ego-Motion Using Dual Quaternions
Authors:
Markus Horn,
Thomas Wodtko,
Michael Buchholz,
Klaus Dietmayer
Abstract:
In this work, we propose an approach for extrinsic sensor calibration from per-sensor ego-motion estimates. Our problem formulation is based on dual quaternions, enabling two different online capable solving approaches. We provide a certifiable globally optimal and a fast local approach along with a method to verify the globality of the local approach. Additionally, means for integrating previous…
▽ More
In this work, we propose an approach for extrinsic sensor calibration from per-sensor ego-motion estimates. Our problem formulation is based on dual quaternions, enabling two different online capable solving approaches. We provide a certifiable globally optimal and a fast local approach along with a method to verify the globality of the local approach. Additionally, means for integrating previous knowledge, for example, a common ground plane for planar sensor motion, are described. Our algorithms are evaluated on simulated data and on a publicly available dataset containing RGB-D camera images. Further, our online calibration approach is tested on the KITTI odometry dataset, which provides data of a lidar and two stereo camera systems mounted on a vehicle. Our evaluation confirms the short run time, state-of-the-art accuracy, as well as online capability of our approach while retaining the global optimality of the solution at any time.
△ Less
Submitted 1 March, 2021; v1 submitted 27 January, 2021;
originally announced January 2021.
-
Labels Are Not Perfect: Inferring Spatial Uncertainty in Object Detection
Authors:
Di Feng,
Zining Wang,
Yiyang Zhou,
Lars Rosenbaum,
Fabian Timm,
Klaus Dietmayer,
Masayoshi Tomizuka,
Wei Zhan
Abstract:
The availability of many real-world driving datasets is a key reason behind the recent progress of object detection algorithms in autonomous driving. However, there exist ambiguity or even failures in object labels due to error-prone annotation process or sensor observation noise. Current public object detection datasets only provide deterministic object labels without considering their inherent u…
▽ More
The availability of many real-world driving datasets is a key reason behind the recent progress of object detection algorithms in autonomous driving. However, there exist ambiguity or even failures in object labels due to error-prone annotation process or sensor observation noise. Current public object detection datasets only provide deterministic object labels without considering their inherent uncertainty, as does the common training process or evaluation metrics for object detectors. As a result, an in-depth evaluation among different object detection methods remains challenging, and the training process of object detectors is sub-optimal, especially in probabilistic object detection. In this work, we infer the uncertainty in bounding box labels from LiDAR point clouds based on a generative model, and define a new representation of the probabilistic bounding box through a spatial uncertainty distribution. Comprehensive experiments show that the proposed model reflects complex environmental noises in LiDAR perception and the label quality. Furthermore, we propose Jaccard IoU (JIoU) as a new evaluation metric that extends IoU by incorporating label uncertainty. We conduct an in-depth comparison among several LiDAR-based object detectors using the JIoU metric. Finally, we incorporate the proposed label uncertainty in a loss function to train a probabilistic object detector and to improve its detection accuracy. We verify our proposed methods on two public datasets (KITTI, Waymo), as well as on simulation data. Code is released at https://bit.ly/2W534yo.
△ Less
Submitted 18 December, 2020;
originally announced December 2020.