Showing 1–50 of 53 results for author: Waslander, S L

  1. arXiv:2406.12095  [pdf, other]

    cs.CV cs.AI cs.RO

    DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features

    Authors: Letian Wang, Seung Wook Kim, Jiawei Yang, Cunjun Yu, Boris Ivanovic, Steven L. Waslander, Yue Wang, Sanja Fidler, Marco Pavone, Peter Karkus

    Abstract: We propose DistillNeRF, a self-supervised learning framework addressing the challenge of understanding 3D environments from limited 2D observations in autonomous driving. Our method is a generalizable feedforward model that predicts a rich neural scene representation from sparse, single-frame multi-view camera inputs, and is trained self-supervised with differentiable rendering to reconstruct RGB,…

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2403.11492  [pdf, other]

    cs.CV cs.AI cs.RO

    SmartRefine: A Scenario-Adaptive Refinement Framework for Efficient Motion Prediction

    Authors: Yang Zhou, Hao Shao, Letian Wang, Steven L. Waslander, Hongsheng Li, Yu Liu

    Abstract: Predicting the future motion of surrounding agents is essential for autonomous vehicles (AVs) to operate safely in dynamic, human-robot-mixed environments. Context information, such as road maps and surrounding agents' states, provides crucial geometric and semantic information for motion behavior prediction. To this end, recent works explore two-stage prediction frameworks where coarse trajectori…

    Submitted 19 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: Camera-ready version for CVPR 2024

  3. arXiv:2402.12303  [pdf, other]

    cs.CV cs.RO

    UncertaintyTrack: Exploiting Detection and Localization Uncertainty in Multi-Object Tracking

    Authors: Chang Won Lee, Steven L. Waslander

    Abstract: Multi-object tracking (MOT) methods have seen a significant boost in performance recently, due to strong interest from the research community and steadily improving object detection methods. The majority of tracking methods follow the tracking-by-detection (TBD) paradigm and blindly trust the incoming detections with no sense of their associated localization uncertainty. This lack of uncertainty awar…

    Submitted 29 April, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Accepted to ICRA 2024

  4. arXiv:2402.06537  [pdf, other]

    cs.CV

    Feature Density Estimation for Out-of-Distribution Detection via Normalizing Flows

    Authors: Evan D. Cook, Marc-Antoine Lavoie, Steven L. Waslander

    Abstract: Out-of-distribution (OOD) detection is a critical task for safe deployment of learning systems in the open world setting. In this work, we investigate the use of feature density estimation via normalizing flows for OOD detection and present a fully unsupervised approach which requires no exposure to OOD data, avoiding researcher bias in OOD sample selection. This is a post-hoc method which can be…

    Submitted 29 April, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

    Comments: Accepted to CRV 2024

  5. arXiv:2312.07488  [pdf, other]

    cs.CV cs.AI cs.RO

    LMDrive: Closed-Loop End-to-End Driving with Large Language Models

    Authors: Hao Shao, Yuxuan Hu, Letian Wang, Steven L. Waslander, Yu Liu, Hongsheng Li

    Abstract: Despite significant recent progress in the field of autonomous driving, modern methods still struggle and can incur serious accidents when encountering long-tail unforeseen events and challenging urban scenarios. On the one hand, large language models (LLM) have shown impressive reasoning capabilities that approach "Artificial General Intelligence". On the other hand, previous autonomous driving m…

    Submitted 21 December, 2023; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: project page: https://hao-shao.com/projects/lmdrive.html

  6. arXiv:2311.10983  [pdf, other]

    cs.CV cs.AI cs.LG

    Multiple View Geometry Transformers for 3D Human Pose Estimation

    Authors: Ziwei Liao, Jialiang Zhu, Chunyu Wang, Han Hu, Steven L. Waslander

    Abstract: In this work, we aim to improve the 3D reasoning ability of Transformers in multi-view 3D human pose estimation. Recent works have focused on end-to-end learning-based transformer designs, which struggle to resolve geometric information accurately, particularly during occlusion. Instead, we propose a novel hybrid model, MVGFormer, which has a series of geometric and appearance modules organized in…

    Submitted 18 November, 2023; originally announced November 2023.

    Comments: 14 pages, 8 figures

  7. arXiv:2309.09118  [pdf, other]

    cs.CV cs.AI cs.RO

    Uncertainty-aware 3D Object-Level Mapping with Deep Shape Priors

    Authors: Ziwei Liao, Jun Yang, Jingxing Qian, Angela P. Schoellig, Steven L. Waslander

    Abstract: 3D object-level mapping is a fundamental problem in robotics, which is especially challenging when object CAD models are unavailable during inference. In this work, we propose a framework that can reconstruct high-quality object-level maps for unknown objects. Our approach takes multiple RGB-D images as input and outputs dense 3D shapes and 9-DoF poses (including 3 scale parameters) for detected o…

    Submitted 16 September, 2023; originally announced September 2023.

    Comments: Manuscript submitted to ICRA 2024

  8. arXiv:2308.14665  [pdf, other]

    cs.RO

    Active Pose Refinement for Textureless Shiny Objects using the Structured Light Camera

    Authors: Jun Yang, Jian Yao, Steven L. Waslander

    Abstract: 6D pose estimation of textureless shiny objects has become an essential problem in many robotic applications. Many pose estimators require high-quality depth data, often measured by structured light cameras. However, when objects have shiny surfaces (e.g., metal parts), these cameras fail to sense complete depths from a single viewpoint due to the specular reflection, resulting in a significant dr…

    Submitted 28 August, 2023; originally announced August 2023.

  9. arXiv:2307.00488  [pdf, other]

    cs.RO

    POV-SLAM: Probabilistic Object-Aware Variational SLAM in Semi-Static Environments

    Authors: Jingxing Qian, Veronica Chatrath, James Servos, Aaron Mavrinac, Wolfram Burgard, Steven L. Waslander, Angela P. Schoellig

    Abstract: Simultaneous localization and mapping (SLAM) in slowly varying scenes is important for long-term robot task completion. Failing to detect scene changes may lead to inaccurate maps and, ultimately, lost robots. Classical SLAM algorithms assume static scenes, and recent works take dynamics into account, but require scene changes to be observed in consecutive frames. Semi-static scenes, wherein objec…

    Submitted 2 July, 2023; originally announced July 2023.

    Comments: Published in Robotics: Science and Systems (RSS) 2023

  10. arXiv:2306.11739  [pdf, other]

    cs.CV cs.AI cs.RO

    Multi-view 3D Object Reconstruction and Uncertainty Modelling with Neural Shape Prior

    Authors: Ziwei Liao, Steven L. Waslander

    Abstract: 3D object reconstruction is important for semantic scene understanding. It is challenging to reconstruct detailed 3D shapes from monocular images directly due to a lack of depth information, occlusion and noise. Most current methods generate deterministic object models without any awareness of the uncertainty of the reconstruction. We tackle this problem by leveraging a neural object representatio…

    Submitted 6 November, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: Manuscript accepted by WACV 2024

  11. arXiv:2305.10507  [pdf, other]

    cs.CV cs.AI

    ReasonNet: End-to-End Driving with Temporal and Global Reasoning

    Authors: Hao Shao, Letian Wang, Ruobing Chen, Steven L. Waslander, Hongsheng Li, Yu Liu

    Abstract: The large-scale deployment of autonomous vehicles is yet to come, and one of the major remaining challenges lies in urban dense traffic scenarios. In such cases, it remains challenging to predict the future evolution of the scene and future behaviors of objects, and to deal with rare adverse events such as the sudden appearance of occluded objects. In this paper, we present ReasonNet, a novel end-…

    Submitted 17 May, 2023; originally announced May 2023.

    Comments: CVPR 2023

  12. arXiv:2305.04412  [pdf, other]

    cs.RO cs.AI cs.LG

    Efficient Reinforcement Learning for Autonomous Driving with Parameterized Skills and Priors

    Authors: Letian Wang, Jie Liu, Hao Shao, Wenshuo Wang, Ruobing Chen, Yu Liu, Steven L. Waslander

    Abstract: When autonomous vehicles are deployed on public roads, they will encounter countless and diverse driving situations. Many manually designed driving policies are difficult to scale to the real world. Fortunately, reinforcement learning has shown great success in many tasks by automatic trial and error. However, when it comes to autonomous driving in interactive dense traffic, RL agents either fail…

    Submitted 7 May, 2023; originally announced May 2023.

    Comments: Robotics: Science and Systems (RSS 2023)

  13. arXiv:2304.14460  [pdf, other]

    cs.CV cs.LG

    Gradient-based Maximally Interfered Retrieval for Domain Incremental 3D Object Detection

    Authors: Barza Nisar, Hruday Vishal Kanna Anand, Steven L. Waslander

    Abstract: Accurate 3D object detection in all weather conditions remains a key challenge to enable the widespread deployment of autonomous vehicles, as most work to date has been performed on clear weather data. In order to generalize to adverse weather conditions, supervised methods perform best if trained from scratch on all weather data instead of finetuning a model pretrained on clear weather data. Trai…

    Submitted 3 May, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

  14. arXiv:2304.14446  [pdf, other]

    cs.CV

    HyperMODEST: Self-Supervised 3D Object Detection with Confidence Score Filtering

    Authors: Jenny Xu, Steven L. Waslander

    Abstract: Current LiDAR-based 3D object detectors for autonomous driving are almost entirely trained on human-annotated data collected in specific geographical domains with specific sensor setups, making it difficult to adapt to a different domain. MODEST is the first work to train 3D object detectors without any labels. Our work, HyperMODEST, proposes a universal method implemented on top of MODEST that ca…

    Submitted 1 June, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: Accepted in CRV (Conference on Robots and Vision) 2023

  15. arXiv:2303.10729  [pdf, other]

    cs.RO

    A Target-Based Extrinsic Calibration Framework for Non-Overlapping Camera-Lidar Systems Using a Motion Capture System

    Authors: Nicholas Charron, Steven L. Waslander, Sriram Narasimhan

    Abstract: In this work, we present a novel target-based lidar-camera extrinsic calibration methodology that can be used for non-overlapping field of view (FOV) sensors. Contrary to previous work, our methodology overcomes the non-overlapping FOV challenge using a motion capture system (MCS) instead of traditional simultaneous localization and mapping approaches. Due to the high relative precision of the MCS…

    Submitted 14 June, 2023; v1 submitted 19 March, 2023; originally announced March 2023.

    Comments: 8 pages, 15 figures

  16. arXiv:2303.06766  [pdf, other]

    cs.RO

    Next-Best-View Selection for Robot Eye-in-Hand Calibration

    Authors: Jun Yang, Jason Rebello, Steven L. Waslander

    Abstract: Robotic eye-in-hand calibration is the task of determining the rigid 6-DoF pose of the camera with respect to the robot end-effector frame. In this paper, we formulate this task as a non-linear optimization problem and introduce an active vision approach to strategically select the robot pose for maximizing calibration accuracy. Specifically, given an initial collection of measurement sets, our sy…

    Submitted 12 March, 2023; originally announced March 2023.

  17. arXiv:2301.05709  [pdf, other]

    cs.CV

    Self-Supervised Image-to-Point Distillation via Semantically Tolerant Contrastive Loss

    Authors: Anas Mahmoud, Jordan S. K. Hu, Tianshu Kuai, Ali Harakeh, Liam Paull, Steven L. Waslander

    Abstract: An effective framework for learning 3D representations for perception tasks is distilling rich self-supervised image features via contrastive learning. However, image-to-point representation learning for autonomous driving datasets faces two main challenges: 1) the abundance of self-similarity, which results in the contrastive losses pushing away semantically similar point and image regions and th…

    Submitted 24 March, 2023; v1 submitted 12 January, 2023; originally announced January 2023.

    Comments: Accepted in CVPR 2023

  18. arXiv:2211.13724  [pdf, other]

    cs.LG cs.CV

    Estimating Regression Predictive Distributions with Sample Networks

    Authors: Ali Harakeh, Jordan Hu, Naiqing Guan, Steven L. Waslander, Liam Paull

    Abstract: Estimating the uncertainty in deep neural network predictions is crucial for many real-world applications. A common approach to model uncertainty is to choose a parametric distribution and fit the data to it using maximum likelihood estimation. The chosen parametric form can be a poor fit to the data-generating distribution, resulting in unreliable uncertainty estimates. In this work, we propose S…

    Submitted 24 November, 2022; originally announced November 2022.

    Comments: Accepted for publication in AAAI 2023. Example code at: https://samplenet.github.io/

  19. arXiv:2210.11554  [pdf, other]

    cs.RO

    6D Pose Estimation for Textureless Objects on RGB Frames using Multi-View Optimization

    Authors: Jun Yang, Wenjie Xue, Sahar Ghavidel, Steven L. Waslander

    Abstract: 6D pose estimation of textureless objects is a valuable but challenging task for many robotic applications. In this work, we propose a framework to address this challenge using only RGB images acquired from multiple viewpoints. The core idea of our approach is to decouple 6D pose estimation into a sequential two-step process, first estimating the 3D translation and then the 3D rotation of each obj…

    Submitted 21 February, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

  20. arXiv:2208.08041  [pdf, other]

    cs.CV

    InterTrack: Interaction Transformer for 3D Multi-Object Tracking

    Authors: John Willes, Cody Reading, Steven L. Waslander

    Abstract: 3D multi-object tracking (MOT) is a key problem for autonomous vehicles, required to perform well-informed motion planning in dynamic environments. Particularly for densely occupied scenes, associating existing tracks to new detections remains challenging as existing systems tend to omit critical contextual information. Our proposed solution, InterTrack, introduces the Interaction Transformer for…

    Submitted 6 May, 2023; v1 submitted 16 August, 2022; originally announced August 2022.

    Comments: Accepted to CRV 2023

  21. arXiv:2205.01202  [pdf, other]

    cs.RO

    POCD: Probabilistic Object-Level Change Detection and Volumetric Mapping in Semi-Static Scenes

    Authors: Jingxing Qian, Veronica Chatrath, Jun Yang, James Servos, Angela P. Schoellig, Steven L. Waslander

    Abstract: Maintaining an up-to-date map to reflect recent changes in the scene is very important, particularly in situations involving repeated traversals by a robot operating in an environment over an extended period. Undetected changes may cause a deterioration in map quality, leading to poor localization, inefficient operations, and lost robots. Volumetric methods, such as truncated signed distance funct…

    Submitted 15 July, 2022; v1 submitted 2 May, 2022; originally announced May 2022.

    Comments: Published in Robotics: Science and Systems (RSS) 2022

  22. arXiv:2203.05662  [pdf, other]

    cs.CV

    Point Density-Aware Voxels for LiDAR 3D Object Detection

    Authors: Jordan S. K. Hu, Tianshu Kuai, Steven L. Waslander

    Abstract: LiDAR has become one of the primary 3D object detection sensors in autonomous driving. However, LiDAR's diverging point pattern with increasing distance results in a non-uniform sampled point cloud ill-suited to discretized volumetric feature extraction. Current methods either rely on voxelized point clouds or use inefficient farthest point sampling to mitigate detrimental effects caused by densit…

    Submitted 21 March, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

    Comments: Accepted in CVPR 2022

  23. arXiv:2203.00871  [pdf, other]

    cs.CV

    Dense Voxel Fusion for 3D Object Detection

    Authors: Anas Mahmoud, Jordan S. K. Hu, Steven L. Waslander

    Abstract: Camera and LiDAR sensor modalities provide complementary appearance and geometric information useful for detecting 3D objects for autonomous vehicle applications. However, current end-to-end fusion methods are challenging to train and underperform state-of-the-art LiDAR-only detectors. Sequential fusion methods suffer from a limited number of pixel and point correspondences due to point cloud spar…

    Submitted 27 October, 2022; v1 submitted 1 March, 2022; originally announced March 2022.

    Comments: Accepted in WACV 2023

  24. arXiv:2202.13263  [pdf, other]

    cs.CV cs.RO

    Next-Best-View Prediction for Active Stereo Cameras and Highly Reflective Objects

    Authors: Jun Yang, Steven L. Waslander

    Abstract: Depth acquisition with the active stereo camera is a challenging task for highly reflective objects. When setup permits, multi-view fusion can provide increased levels of depth completion. However, due to the slow acquisition speed of high-end active stereo cameras, collecting a large number of viewpoints for a single scene is generally not practical. In this work, we propose a next-best-view fram…

    Submitted 26 February, 2022; originally announced February 2022.

  25. Pattern-Aware Data Augmentation for LiDAR 3D Object Detection

    Authors: Jordan S. K. Hu, Steven L. Waslander

    Abstract: Autonomous driving datasets are often skewed and in particular, lack training data for objects at farther distances from the ego vehicle. The imbalance of data causes a performance degradation as the distance of the detected objects increases. In this paper, we propose pattern-aware ground truth sampling, a data augmentation technique that downsamples an object's point cloud based on the LiDAR's c…

    Submitted 30 November, 2021; originally announced December 2021.

    Comments: Published paper in the IEEE Intelligent Transportation Systems Conference - ITSC 2021

    Journal ref: 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), 2021, pp. 2703-2710

  26. arXiv:2110.04182  [pdf, other]

    cs.RO cs.LG

    Temporal Convolutions for Multi-Step Quadrotor Motion Prediction

    Authors: Samuel Looper, Steven L. Waslander

    Abstract: Model-based control methods for robotic systems such as quadrotors, autonomous driving vehicles and flexible manipulators require motion models that generate accurate predictions of complex nonlinear system dynamics over long periods of time. Temporal Convolutional Networks (TCNs) can be adapted to this challenge by formulating multi-step prediction as a sequence-to-sequence modeling problem. We p…

    Submitted 8 October, 2021; originally announced October 2021.

  27. arXiv:2105.04112  [pdf, other]

    cs.RO

    ROBI: A Multi-View Dataset for Reflective Objects in Robotic Bin-Picking

    Authors: Jun Yang, Yizhou Gao, Dong Li, Steven L. Waslander

    Abstract: In robotic bin-picking applications, the perception of texture-less, highly reflective parts is a valuable but challenging task. The high glossiness can introduce fake edges in RGB images and inaccurate depth measurements, especially in heavily cluttered bin scenarios. In this paper, we present the ROBI (Reflective Objects in BIns) dataset, a public dataset for 6D object pose estimation and multi-vi…

    Submitted 6 October, 2021; v1 submitted 10 May, 2021; originally announced May 2021.

  28. arXiv:2103.10968  [pdf, other]

    cs.RO

    Probabilistic Multi-View Fusion of Active Stereo Depth Maps for Robotic Bin-Picking

    Authors: Jun Yang, Dong Li, Steven L. Waslander

    Abstract: The reliable fusion of depth maps from multiple viewpoints has become an important problem in many 3D reconstruction pipelines. In this work, we investigate its impact on robotic bin-picking tasks such as 6D object pose estimation. The performance of object pose estimation relies heavily on the quality of depth data. However, due to the prevalence of shiny surfaces and cluttered scenes, industrial…

    Submitted 19 March, 2021; originally announced March 2021.

  29. arXiv:2103.01100  [pdf, other]

    cs.CV

    Categorical Depth Distribution Network for Monocular 3D Object Detection

    Authors: Cody Reading, Ali Harakeh, Julia Chae, Steven L. Waslander

    Abstract: Monocular 3D object detection is a key problem for autonomous vehicles, as it provides a solution with simple configuration compared to typical multi-sensor systems. The main challenge in monocular 3D detection lies in accurately predicting object depth, which must be inferred from object and scene cues due to the lack of direct range measurement. Many methods attempt to directly estimate depth to…

    Submitted 23 March, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

    Comments: Accepted in CVPR 2021

  30. Learned Camera Gain and Exposure Control for Improved Visual Feature Detection and Matching

    Authors: Justin Tomasi, Brandon Wagstaff, Steven L. Waslander, Jonathan Kelly

    Abstract: Successful visual navigation depends upon capturing images that contain sufficient useful information. In this letter, we explore a data-driven approach to account for environmental lighting changes, improving the quality of images for use in visual odometry (VO) or visual simultaneous localization and mapping (SLAM). We train a deep convolutional neural network model to predictively adjust camera…

    Submitted 11 July, 2022; v1 submitted 8 February, 2021; originally announced February 2021.

    Comments: In IEEE Robotics and Automation Letters (RA-L) and presented at the IEEE International Conference on Robotics and Automation (ICRA'21), Xi'an, China, May 30-Jun. 5, 2021

    Journal ref: IEEE Robotics and Automation Letters (RA-L), Vol. 6, No. 2, pp. 2028-2035, Apr. 2021

  31. arXiv:2101.05036  [pdf, other]

    cs.CV stat.ML

    Estimating and Evaluating Regression Predictive Uncertainty in Deep Object Detectors

    Authors: Ali Harakeh, Steven L. Waslander

    Abstract: Predictive uncertainty estimation is an essential next step for the reliable deployment of deep object detectors in safety-critical tasks. In this work, we focus on estimating predictive distributions for bounding box regression output with variance networks. We show that in the context of object detection, training variance networks with negative log likelihood (NLL) can lead to high entropy pred…

    Submitted 12 March, 2021; v1 submitted 13 January, 2021; originally announced January 2021.

    Comments: Published as a conference paper at ICLR 2021. Link: https://openreview.net/forum?id=YLewtnvKgR7. This is the final camera-ready version

  32. arXiv:2012.00218  [pdf, ps, other]

    cs.RO

    Uncertainty-Constrained Differential Dynamic Programming in Belief Space for Vision Based Robots

    Authors: Shatil Rahman, Steven L. Waslander

    Abstract: Most mobile robots follow a modular sense-plan-act system architecture that can lead to poor performance or even catastrophic failure for visual inertial navigation systems due to trajectories devoid of feature matches. Planning in belief space provides a unified approach to tightly couple the perception, planning and control modules, leading to trajectories that are robust to noisy measurements an…

    Submitted 30 November, 2020; originally announced December 2020.

    Comments: This work has been submitted to the 2021 IEEE International Conference on Robotics and Automation (ICRA) with the Robotics and Automation Letters (RA-L) option for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  33. arXiv:2009.08577  [pdf]

    cs.CY cs.RO

    Making Sense of the Robotized Pandemic Response: A Comparison of Global and Canadian Robot Deployments and Success Factors

    Authors: T. Barfoot, J. Burgner-Kahrs, E. Diller, A. Garg, A. Goldenberg, J. Kelly, X. Liu, H. E. Naguib, G. Nejat, A. P. Schoellig, F. Shkurti, H. Siegel, Y. Sun, S. L. Waslander

    Abstract: From disinfection and remote triage, to logistics and delivery, countries around the world are making use of robots to address the unique challenges presented by the COVID-19 pandemic. Robots are being used to manage the pandemic in Canada too, but relative to other regions, we have been more cautious in our adoption -- this despite the important role that robots of Canadian origin are now playing…

    Submitted 21 September, 2020; v1 submitted 17 September, 2020; originally announced September 2020.

    Comments: 104 pages, 18 figures, 13 tables. Corresponding Author: H Siegel

  34. arXiv:2003.05505  [pdf, other]

    cs.CV

    Confidence Guided Stereo 3D Object Detection with Split Depth Estimation

    Authors: Chengyao Li, Jason Ku, Steven L. Waslander

    Abstract: Accurate and reliable 3D object detection is vital to safe autonomous driving. Despite recent developments, the performance gap between stereo-based methods and LiDAR-based methods is still considerable. Accurate depth estimation is crucial to the performance of stereo-based 3D object detection methods, particularly for those pixels associated with objects in the foreground. Moreover, stereo-based…

    Submitted 11 March, 2020; originally announced March 2020.

    Comments: 8 pages, 6 figures

  35. arXiv:2001.09297  [pdf, other]

    cs.RO cs.MA

    Vehicle Scheduling Problem

    Authors: Mirmojtaba Gharibi, Steven L. Waslander, Raouf Boutaba

    Abstract: We define a new problem called the Vehicle Scheduling Problem (VSP). The goal is to minimize an objective function, such as the number of tardy vehicles over a transportation network subject to maintaining safety distances, meeting hard deadlines, and maintaining speeds on each link between the allowed minimums and maximums. We prove VSP is an NP-hard problem for multiple objective functions that…

    Submitted 25 January, 2020; originally announced January 2020.

  36. arXiv:1909.08537  [pdf, other]

    cs.RO cs.CV eess.IV

    Visual Measurement Integrity Monitoring for UAV Localization

    Authors: Chengyao Li, Steven L. Waslander

    Abstract: Unmanned aerial vehicles (UAVs) have increasingly been adopted for safety, security, and rescue missions, for which they need precise and reliable pose estimates relative to their environment. To ensure mission safety when relying on visual perception, it is essential to have an approach to assess the integrity of the visual localization solution. However, to the best of our knowledge, such an app…

    Submitted 18 September, 2019; originally announced September 2019.

    Comments: Published in Safety, Security, and Rescue Robotics 2019

  37. arXiv:1909.07566  [pdf, other]

    cs.CV

    Object-Centric Stereo Matching for 3D Object Detection

    Authors: Alex D. Pon, Jason Ku, Chengyao Li, Steven L. Waslander

    Abstract: Safe autonomous driving requires reliable 3D object detection, i.e., determining the 6 DoF pose and dimensions of objects of interest. Using stereo cameras to solve this task is a cost-effective alternative to the widely used LiDAR sensor. The current state-of-the-art for stereo 3D object detection takes the existing PSMNet stereo matching network, with no modifications, and converts the estimated dispar…

    Submitted 10 March, 2020; v1 submitted 16 September, 2019; originally announced September 2019.

    Comments: Accepted in ICRA 2020

  38. arXiv:1909.04838  [pdf, other]

    cs.RO cs.MA

    3D traffic flow model for UAVs

    Authors: Mirmojtaba Gharibi, Raouf Boutaba, Steven L. Waslander

    Abstract: In this work, we introduce a microscopic traffic flow model called Scalar Capacity Model (SCM) which can be used to study the formation of traffic on an airway link for autonomous Unmanned Aerial Vehicles (UAV) as well as for the ground vehicles on the road. Given the 3D nature of UAV flights, the main novelty in our model is to eliminate the commonly used notion of lanes and replace it with a not…

    Submitted 10 September, 2019; originally announced September 2019.

    Comments: 1 Table, 6 Figures

  39. arXiv:1907.06777  [pdf, other]

    cs.CV

    Improving 3D Object Detection for Pedestrians with Virtual Multi-View Synthesis Orientation Estimation

    Authors: Jason Ku, Alex D. Pon, Sean Walsh, Steven L. Waslander

    Abstract: Accurately estimating the orientation of pedestrians is an important and challenging task for autonomous driving because this information is essential for tracking and predicting pedestrian behavior. This paper presents a flexible Virtual Multi-View Synthesis module that can be adopted into 3D object detection methods to improve orientation estimation. The module uses a multi-step process to acqui…

    Submitted 15 July, 2019; originally announced July 2019.

    Comments: Accepted in IROS 2019

  40. arXiv:1905.08758  [pdf, other]

    cs.RO

    aUToTrack: A Lightweight Object Detection and Tracking System for the SAE AutoDrive Challenge

    Authors: Keenan Burnett, Sepehr Samavi, Steven L. Waslander, Timothy D. Barfoot, Angela P. Schoellig

    Abstract: The University of Toronto is one of eight teams competing in the SAE AutoDrive Challenge -- a competition to develop a self-driving car by 2020. After placing first at the Year 1 challenge, we are headed to MCity in June 2019 for the second challenge. There, we will interact with pedestrians, cyclists, and cars. For safe operation, it is critical to have an accurate estimate of the position of all…

    Submitted 21 May, 2019; originally announced May 2019.

    Comments: Accepted to CRV (Computer and Robot Vision) 2019

  41. arXiv:1904.01690  [pdf, other]

    cs.CV

    Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction

    Authors: Jason Ku, Alex D. Pon, Steven L. Waslander

    Abstract: We present MonoPSR, a monocular 3D object detection method that leverages proposals and shape reconstruction. First, using the fundamental relations of a pinhole camera model, detections from a mature 2D object detector are used to generate a 3D proposal per object in a scene. The 3D locations of these proposals prove to be quite accurate, which greatly reduces the difficulty of regressing the fina…

    Submitted 2 April, 2019; originally announced April 2019.

    Comments: Accepted in CVPR 2019

  42. arXiv:1903.03838  [pdf, other]

    cs.CV

    BayesOD: A Bayesian Approach for Uncertainty Estimation in Deep Object Detectors

    Authors: Ali Harakeh, Michael Smart, Steven L. Waslander

    Abstract: When incorporating deep neural networks into robotic systems, a major challenge is the lack of uncertainty measures associated with their output predictions. Methods for uncertainty estimation in the output of deep object detectors (DNNs) have been proposed in recent works, but have had limited success due to 1) information loss at the detector's non-maximum suppression (NMS) stage, and 2) failure…

    Submitted 16 September, 2019; v1 submitted 9 March, 2019; originally announced March 2019.

  43. Network Uncertainty Informed Semantic Feature Selection for Visual SLAM

    Authors: Pranav Ganti, Steven L. Waslander

    Abstract: In order to facilitate long-term localization using a visual simultaneous localization and mapping (SLAM) algorithm, careful feature selection can help ensure that reference points persist over long durations and the runtime and storage complexity of the algorithm remain consistent. We present SIVO (Semantically Informed Visual Odometry and Mapping), a novel information-theoretic feature selection…

    Submitted 26 August, 2019; v1 submitted 28 November, 2018; originally announced November 2018.

    Comments: Published in: 2019 16th Conference on Computer and Robot Vision (CRV)

  44. Aerial Imagery for Roof Segmentation: A Large-Scale Dataset towards Automatic Mapping of Buildings

    Authors: Qi Chen, Lei Wang, Yifan Wu, Guangming Wu, Zhiling Guo, Steven L. Waslander

    Abstract: arXiv admin note: This version has been removed as the user did not have the right to agree to the license at the time of submission

    Submitted 27 July, 2018; v1 submitted 25 July, 2018; originally announced July 2018.

  45. arXiv:1807.09304  [pdf, other]

    cs.CV

    Encoderless Gimbal Calibration of Dynamic Multi-Camera Clusters

    Authors: Christopher L. Choi, Jason Rebello, Leonid Koppel, Pranav Ganti, Arun Das, Steven L. Waslander

    Abstract: Dynamic Camera Clusters (DCCs) are multi-camera systems where one or more cameras are mounted on actuated mechanisms such as a gimbal. Existing methods for DCC calibration rely on joint angle measurements to resolve the time-varying transformation between the dynamic and static camera. This information is usually provided by motor encoders; however, joint angle measurements are not always readily…

    Submitted 24 July, 2018; originally announced July 2018.

    Comments: ICRA 2018

  46. arXiv:1807.06072  [pdf, other]

    cs.LG cs.AI stat.ML

    Leveraging Pre-Trained 3D Object Detection Models For Fast Ground Truth Generation

    Authors: Jungwook Lee, Sean Walsh, Ali Harakeh, Steven L. Waslander

    Abstract: Training 3D object detectors for autonomous driving has been limited to small datasets due to the effort required to generate annotations. Reducing both task complexity and the amount of task switching done by annotators is key to reducing the effort and time required to generate 3D bounding box annotations. This paper introduces a novel ground truth generation method that combines human supervisi…

    Submitted 16 July, 2018; originally announced July 2018.

  47. arXiv:1806.07987  [pdf, other]

    cs.CV

    A Hierarchical Deep Architecture and Mini-Batch Selection Method For Joint Traffic Sign and Light Detection

    Authors: Alex D. Pon, Oles Andrienko, Ali Harakeh, Steven L. Waslander

    Abstract: Traffic light and sign detectors on autonomous cars are integral for road scene perception. The literature is abundant with deep learning networks that detect either lights or signs, not both, which makes them unsuitable for real-life deployment due to the limited graphics processing unit (GPU) memory and power available on embedded systems. The root cause of this issue is that no public dataset c…

    Submitted 13 September, 2018; v1 submitted 20 June, 2018; originally announced June 2018.

    Comments: Accepted in the IEEE 15th Conference on Computer and Robot Vision

  48. arXiv:1806.00526  [pdf, ps, other]

    cs.NE cs.RO

    Multi-Step Prediction of Dynamic Systems with Recurrent Neural Networks

    Authors: Nima Mohajerin, Steven L. Waslander

    Abstract: Recurrent Neural Networks (RNNs) can encode rich dynamics which makes them suitable for modeling dynamic systems. To train an RNN for multi-step prediction of dynamic systems, it is crucial to efficiently address the state initialization problem, which seeks proper values for the RNN initial states at the beginning of each prediction interval. In this work, the state initialization problem is addr…

    Submitted 19 May, 2018; originally announced June 2018.

  49. arXiv:1805.01810  [pdf, other]

    cs.RO

    Manifold Geometry with Fast Automatic Derivatives and Coordinate Frame Semantics Checking in C++

    Authors: Leonid Koppel, Steven L. Waslander

    Abstract: Computer vision and robotics problems often require representation and estimation of poses on the SE(3) manifold. Developers of algorithms that must run in real time face several time-consuming programming tasks, including deriving and computing analytic derivatives and avoiding mathematical errors when handling poses in multiple coordinate frames. To support rapid and error-free development, we p…

    Submitted 4 May, 2018; originally announced May 2018.

    Comments: 8 pages, Conference on Computer and Robot Vision (CRV) 2018

  50. arXiv:1802.00036  [pdf, other]

    cs.CV

    In Defense of Classical Image Processing: Fast Depth Completion on the CPU

    Authors: Jason Ku, Ali Harakeh, Steven L. Waslander

    Abstract: With the rise of data-driven deep neural networks as a realization of universal function approximators, most research on computer vision problems has moved away from hand-crafted classical image processing algorithms. This paper shows that with a well-designed algorithm, we are capable of outperforming neural-network-based methods on the task of depth completion. The proposed algorithm is simple a…

    Submitted 31 January, 2018; originally announced February 2018.