
Lowering Barriers to Entry for Fully-Integrated Custom Payloads on a DJI Matrice

Authors: Joshua Springer, Gylfi Þór Guðmundsson, Marcel Kyas

Abstract: Consumer-grade drones have become effective multimedia collection tools, spring-boarded by rapid development in embedded CPUs, GPUs, and cameras. They are best known for their ability to cheaply collect high-quality aerial video, 3D terrain scans, infrared imagery, etc., with respect to manned aircraft. However, users can also create and attach custom sensors, actuators, or computers, so the drone can collect different data, generate composite data, or interact intelligently with its environment, e.g., autonomously changing behavior to land in a safe way, or choosing further data collection sites. Unfortunately, developing custom payloads is prohibitively difficult for many researchers outside of engineering. We provide guidelines for how to create a sophisticated computational payload that integrates a Raspberry Pi 5 into a DJI Matrice 350. The payload fits into the Matrice's case like a typical DJI payload (but is much cheaper), is easy to build and expand (3D-printed), uses the drone's power and telemetry, can control the drone and its other payloads, can access the drone's sensors and camera feeds, and can process video and stream it to the operator via the controller in real time. We describe the difficulties and proprietary quirks we encountered, how we worked through them, and provide setup scripts and a known-working configuration for others to use.

Submitted 9 May, 2024; originally announced May 2024.

Comments: 5 pages, 3 figures, 1 table, workshop paper, version 1 (preprint)

arXiv:2403.03806 [pdf, other]

A Precision Drone Landing System using Visual and IR Fiducial Markers and a Multi-Payload Camera

Authors: Joshua Springer, Gylfi Þór Guðmundsson, Marcel Kyas

Abstract: We propose a method for autonomous precision drone landing with fiducial markers and a gimbal-mounted, multi-payload camera with wide-angle, zoom, and IR sensors. The method has minimal data requirements; it depends primarily on the direction from the drone to the landing pad, enabling it to switch dynamically between the camera's different sensors and zoom factors, and minimizing auxiliary sensor requirements. It eliminates the need for data such as altitude above ground level, straight-line distance to the landing pad, fiducial marker size, and 6 DoF marker pose (of which the orientation is problematic). We leverage the zoom and wide-angle cameras, as well as visual April Tag fiducial markers to conduct successful precision landings from much longer distances than in previous work (168m horizontal distance, 102m altitude). We use two types of April Tags in the IR spectrum - active and passive - for precision landing both at daytime and nighttime, instead of simple IR beacons used in most previous work. The active IR landing pad is heated; the novel, passive one is unpowered, at ambient temperature, and depends on its high reflectivity and an IR differential between the ground and the sky. Finally, we propose a high-level control policy to manage initial search for the landing pad and subsequent searches if it is lost - not addressed in previous work. The method demonstrates successful landings with the landing skids at least touching the landing pad, achieving an average error of 0.19m. It also demonstrates successful recovery and landing when the landing pad is temporarily obscured.

Submitted 6 March, 2024; originally announced March 2024.

Comments: 7 pages, 6 figures, 2 tables

arXiv:1904.08689 [pdf, other]

Exquisitor: Interactive Learning at Large

Authors: Björn Þór Jónsson, Omar Shahbaz Khan, Hanna Ragnarsdóttir, Þórhildur Þorleiksdóttir, Jan Zahálka, Stevan Rudinac, Gylfi Þór Guðmundsson, Laurent Amsaleg, Marcel Worring

Abstract: Increasing scale is a dominant trend in today's multimedia collections, which especially impacts interactive applications. To facilitate interactive exploration of large multimedia collections, new approaches are needed that are capable of learning on the fly new analytic categories based on the visual and textual content. To facilitate general use on standard desktops, laptops, and mobile devices, they must furthermore work with limited computing resources. We present Exquisitor, a highly scalable interactive learning approach, capable of intelligent exploration of the large-scale YFCC100M image collection with extremely efficient responses from the interactive classifier. Based on relevance feedback from the user on previously suggested items, Exquisitor uses semantic features, extracted from both visual and text attributes, to suggest relevant media items to the user. Exquisitor builds upon the state of the art in large-scale data representation, compression and indexing, introducing a cluster-based retrieval mechanism that facilitates the efficient suggestions. With Exquisitor, each interaction round over the full YFCC100M collection is completed in less than 0.3 seconds using a single CPU core. That is 4x less time using 16x smaller computational resources than the most efficient state-of-the-art method, with a positive impact on result quality. These results open up many interesting research avenues, both for exploration of industry-scale media collections and for media exploration on mobile devices.

Submitted 17 July, 2019; v1 submitted 18 April, 2019; originally announced April 2019.

Showing 1–3 of 3 results for author: Guðmundsson, G Þ