subscribe to arXiv mailings

A Machine Learning Approach for Identifying Anatomical Biomarkers of Early Mild Cognitive Impairment

Authors: Alwani Liyana Ahmad, Jose Sanchez-Bornot, Roberto C. Sotero, Damien Coyle, Zamzuri Idris, Ibrahima Faye

Abstract: Alzheimer's Disease (AD) is a progressive neurodegenerative disorder that primarily affects the aging population by impairing cognitive and motor functions. Early detection of AD through accessible methodologies like magnetic resonance imaging (MRI) is vital for developing effective interventions to halt or slow the disease's progression. This study aims to perform a comprehensive analysis of mach… ▽ More Alzheimer's Disease (AD) is a progressive neurodegenerative disorder that primarily affects the aging population by impairing cognitive and motor functions. Early detection of AD through accessible methodologies like magnetic resonance imaging (MRI) is vital for developing effective interventions to halt or slow the disease's progression. This study aims to perform a comprehensive analysis of machine learning techniques for selecting MRI-based biomarkers and classifying individuals into healthy controls (HC) and unstable controls (uHC) who later show mild cognitive impairment within five years. The research utilizes MRI data from the Alzheimer's Disease Neuroinformatics Initiative (ADNI) and the Open Access Series of Imaging Studies 3 (OASIS-3), focusing on both HC and uHC participants. The study addresses the challenges of imbalanced data by testing classification methods on balanced and unbalanced datasets, and harmonizes data using polynomial regression to mitigate nuisance variables like age, gender, and intracranial volume. Results indicate that Gaussian Naive Bayes and RusBoost classifiers shows an optimal performance, achieving accuracies of up to 76.46% and 72.48% respectively on the ADNI dataset. For the OASIS-3 dataset, Kernel Naive Bayes and RusBoost yield accuracies ranging from 64.66% to 75.71%, improving further in age-matched datasets. Brain regions like the entorhinal cortex, hippocampus, lateral ventricle, and lateral orbitofrontal cortex are identified as significantly impacted during early cognitive decline. Despite limitations such as small sample sizes, the study's harmonization approach enhances the robustness of biomarker selection, suggesting the potential of this semi-automatic machine learning pipeline for early AD detection using MRI. △ Less

Submitted 29 May, 2024; originally announced July 2024.

Comments: 27 pages, 5 figures

arXiv:2406.03548 [pdf, other]

Robust Communication and Computation using Deep Learning via Joint Uncertainty Injection

Authors: Robert-Jeron Reifert, Hayssam Dahrouj, Alaa Alameer Ahmad, Haris Gacanin, Aydin Sezgin

Abstract: The convergence of communication and computation, along with the integration of machine learning and artificial intelligence, stand as key empowering pillars for the sixth-generation of communication systems (6G). This paper considers a network of one base station serving a number of devices simultaneously using spatial multiplexing. The paper then presents an innovative deep learning-based approa… ▽ More The convergence of communication and computation, along with the integration of machine learning and artificial intelligence, stand as key empowering pillars for the sixth-generation of communication systems (6G). This paper considers a network of one base station serving a number of devices simultaneously using spatial multiplexing. The paper then presents an innovative deep learning-based approach to simultaneously manage the transmit and computing powers, alongside computation allocation, amidst uncertainties in both channel and computing states information. More specifically, the paper aims at proposing a robust solution that minimizes the worst-case delay across the served devices subject to computation and power constraints. The paper uses a deep neural network (DNN)-based solution that maps estimated channels and computation requirements to optimized resource allocations. During training, uncertainty samples are injected after the DNN output to jointly account for both communication and computation estimation errors. The DNN is then trained via backpropagation using the robust utility, thus implicitly learning the uncertainty distributions. Our results validate the enhanced robust delay performance of the joint uncertainty injection versus the classical DNN approach, especially in high channel and computational uncertainty regimes. △ Less

Submitted 5 June, 2024; originally announced June 2024.

Comments: 7 pages, 6 figures, one table. Accepted for presentation at the 19th International Symposium on Wireless Communication Systems 2024 (ISWCS 2024)

arXiv:2406.02088 [pdf, other]

Fast and Practical Strassen's Matrix Multiplication using FPGAs

Authors: Afzal Ahmad, Linfeng Du, Wei Zhang

Abstract: Matrix multiplication is a cornerstone operation in a wide array of scientific fields, including machine learning and computer graphics. The standard algorithm for matrix multiplication has a complexity of $\mathcal{O}(n^3)$ for $n\times n$ matrices. Strassen's algorithm improves this to $\mathcal{O}(n^{2.807})$, but its practicality is limited for small to medium matrix sizes due to the large num… ▽ More Matrix multiplication is a cornerstone operation in a wide array of scientific fields, including machine learning and computer graphics. The standard algorithm for matrix multiplication has a complexity of $\mathcal{O}(n^3)$ for $n\times n$ matrices. Strassen's algorithm improves this to $\mathcal{O}(n^{2.807})$, but its practicality is limited for small to medium matrix sizes due to the large number of additions it introduces. This paper presents a novel FPGA-based implementation of Strassen's algorithm that achieves superior speed over an optimized General Matrix Multiply (GeMM) implementation for matrices as small as $n=256$. Our design, tested extensively on two high-performance FPGA accelerators (Alveo U50 and U280) across various data types, matches or surpasses the performance of a highly optimized baseline across a range of matrix sizes. △ Less

Submitted 4 June, 2024; originally announced June 2024.

Comments: Accepted at 34th International Conference on Field-Programmable Logic and Applications (FPL 2024), 7 pages

ACM Class: C.1.3

arXiv:2405.20058 [pdf, other]

Enhancing Plant Disease Detection: A Novel CNN-Based Approach with Tensor Subspace Learning and HOWSVD-MD

Authors: Abdelmalik Ouamane, Ammar Chouchane, Yassine Himeur, Abderrazak Debilou, Abbes Amira, Shadi Atalla, Wathiq Mansoor, Hussain Al Ahmad

Abstract: Machine learning has revolutionized the field of agricultural science, particularly in the early detection and management of plant diseases, which are crucial for maintaining crop health and productivity. Leveraging advanced algorithms and imaging technologies, researchers are now able to identify and classify plant diseases with unprecedented accuracy and speed. Effective management of tomato dis… ▽ More Machine learning has revolutionized the field of agricultural science, particularly in the early detection and management of plant diseases, which are crucial for maintaining crop health and productivity. Leveraging advanced algorithms and imaging technologies, researchers are now able to identify and classify plant diseases with unprecedented accuracy and speed. Effective management of tomato diseases is crucial for enhancing agricultural productivity. The development and application of tomato disease classification methods are central to this objective. This paper introduces a cutting-edge technique for the detection and classification of tomato leaf diseases, utilizing insights from the latest pre-trained Convolutional Neural Network (CNN) models. We propose a sophisticated approach within the domain of tensor subspace learning, known as Higher-Order Whitened Singular Value Decomposition (HOWSVD), designed to boost the discriminatory power of the system. Our approach to Tensor Subspace Learning is methodically executed in two phases, beginning with HOWSVD and culminating in Multilinear Discriminant Analysis (MDA). The efficacy of this innovative method was rigorously tested through comprehensive experiments on two distinct datasets, namely PlantVillage and the Taiwan dataset. The findings reveal that HOWSVD-MDA outperforms existing methods, underscoring its capability to markedly enhance the precision and dependability of diagnosing tomato leaf diseases. For instance, up to 98.36\% and 89.39\% accuracy scores have been achieved under PlantVillage and the Taiwan datasets, respectively. △ Less

Submitted 30 May, 2024; originally announced May 2024.

Comments: 17 pages, 9 figures and 8 tables

arXiv:2405.16729 [pdf, other]

Free-Space Optical Channel Turbulence Prediction: A Machine Learning Approach

Authors: Md Zobaer Islam, Ethan Abele, Fahim Ferdous Hossain, Arsalan Ahmad, Sabit Ekin, John F. O'Hara

Abstract: Channel turbulence presents a formidable obstacle for free-space optical (FSO) communication. Anticipation of turbulence levels is highly important for mitigating disruptions. We study the application of machine learning (ML) to FSO data streams to rapidly predict channel turbulence levels with no additional sensing hardware. An optical bit stream was transmitted through a controlled channel in th… ▽ More Channel turbulence presents a formidable obstacle for free-space optical (FSO) communication. Anticipation of turbulence levels is highly important for mitigating disruptions. We study the application of machine learning (ML) to FSO data streams to rapidly predict channel turbulence levels with no additional sensing hardware. An optical bit stream was transmitted through a controlled channel in the lab under six distinct turbulence levels, and the efficacy of using ML to classify turbulence levels was examined. ML-based turbulence level classification was found to be >98% accurate with multiple ML training parameters, but highly dependent upon the timescale of changes between turbulence levels. △ Less

Submitted 26 May, 2024; originally announced May 2024.

Comments: 5 pages, 4 figures, 1 table. Submitted to IEEE letters

arXiv:2404.18713 [pdf, other]

Adaptive Reinforcement Learning for Robot Control

Authors: Yu Tang Liu, Nilaksh Singh, Aamir Ahmad

Abstract: Deep reinforcement learning (DRL) has shown remarkable success in simulation domains, yet its application in designing robot controllers remains limited, due to its single-task orientation and insufficient adaptability to environmental changes. To overcome these limitations, we present a novel adaptive agent that leverages transfer learning techniques to dynamically adapt policy in response to dif… ▽ More Deep reinforcement learning (DRL) has shown remarkable success in simulation domains, yet its application in designing robot controllers remains limited, due to its single-task orientation and insufficient adaptability to environmental changes. To overcome these limitations, we present a novel adaptive agent that leverages transfer learning techniques to dynamically adapt policy in response to different tasks and environmental conditions. The approach is validated through the blimp control challenge, where multitasking capabilities and environmental adaptability are essential. The agent is trained using a custom, highly parallelized simulator built on IsaacGym. We perform zero-shot transfer to fly the blimp in the real world to solve various tasks. We share our code at \url{https://github.com/robot-perception-group/adaptive\_agent/}. △ Less

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.13440 [pdf, ps, other]

PACNav: Enhancing Collective Navigation for UAV Swarms in Communication-Challenged Environments

Authors: Afzal Ahmad, Daniel Bonilla Licea, Giuseppe Silano, Tomas Baca, Martin Saska

Abstract: This article presents Persistence Administered Collective Navigation (PACNav) as an approach for achieving decentralized collective navigation of Unmanned Aerial Vehicle (UAV) swarms. The technique is inspired by the flocking and collective navigation behavior observed in natural swarms, such as cattle herds, bird flocks, and even large groups of humans. PACNav relies solely on local observations… ▽ More This article presents Persistence Administered Collective Navigation (PACNav) as an approach for achieving decentralized collective navigation of Unmanned Aerial Vehicle (UAV) swarms. The technique is inspired by the flocking and collective navigation behavior observed in natural swarms, such as cattle herds, bird flocks, and even large groups of humans. PACNav relies solely on local observations of relative positions of UAVs, making it suitable for large swarms deprived of communication capabilities and external localization systems. We introduce the novel concepts of path persistence and path similarity, which allow each swarm member to analyze the motion of others. PACNav is grounded on two main principles: (1) UAVs with little variation in motion direction exhibit high path persistence and are considered reliable leaders by other UAVs; (2) groups of UAVs that move in a similar direction demonstrate high path similarity, and such groups are assumed to contain a reliable leader. The proposed approach also incorporates a reactive collision avoidance mechanism to prevent collisions with swarm members and environmental obstacles. The method is validated through simulated and real-world experiments conducted in a natural forest. △ Less

Submitted 20 April, 2024; originally announced April 2024.

Comments: 2 pages, Accepted for discussion at the workshop session "Breaking Swarm Stereotypes" at ICRA'24 in Yokohama, Japan

arXiv:2404.11720 [pdf, other]

GEOBIND: Binding Text, Image, and Audio through Satellite Images

Authors: Aayush Dhakal, Subash Khanal, Srikumar Sastry, Adeel Ahmad, Nathan Jacobs

Abstract: In remote sensing, we are interested in modeling various modalities for some geographic location. Several works have focused on learning the relationship between a location and type of landscape, habitability, audio, textual descriptions, etc. Recently, a common way to approach these problems is to train a deep-learning model that uses satellite images to infer some unique characteristics of the l… ▽ More In remote sensing, we are interested in modeling various modalities for some geographic location. Several works have focused on learning the relationship between a location and type of landscape, habitability, audio, textual descriptions, etc. Recently, a common way to approach these problems is to train a deep-learning model that uses satellite images to infer some unique characteristics of the location. In this work, we present a deep-learning model, GeoBind, that can infer about multiple modalities, specifically text, image, and audio, from satellite imagery of a location. To do this, we use satellite images as the binding element and contrastively align all other modalities to the satellite image data. Our training results in a joint embedding space with multiple types of data: satellite image, ground-level image, audio, and text. Furthermore, our approach does not require a single complex dataset that contains all the modalities mentioned above. Rather it only requires multiple satellite-image paired data. While we only align three modalities in this paper, we present a general framework that can be used to create an embedding space with any number of modalities by using satellite images as the binding element. Our results show that, unlike traditional unimodal models, GeoBind is versatile and can reason about multiple modalities for a given satellite image input. △ Less

Submitted 17 April, 2024; originally announced April 2024.

Comments: 2024 IEEE International Geoscience and Remote Sensing Symposium

arXiv:2404.08986 [pdf, other]

Airship Formations for Animal Motion Capture and Behavior Analysis

Authors: Eric Price, Aamir Ahmad

Abstract: Using UAVs for wildlife observation and motion capture offers manifold advantages for studying animals in the wild, especially grazing herds in open terrain. The aerial perspective allows observation at a scale and depth that is not possible on the ground, offering new insights into group behavior. However, the very nature of wildlife field-studies puts traditional fixed wing and multi-copter syst… ▽ More Using UAVs for wildlife observation and motion capture offers manifold advantages for studying animals in the wild, especially grazing herds in open terrain. The aerial perspective allows observation at a scale and depth that is not possible on the ground, offering new insights into group behavior. However, the very nature of wildlife field-studies puts traditional fixed wing and multi-copter systems to their limits: limited flight time, noise and safety aspects affect their efficacy, where lighter than air systems can remain on station for many hours. Nevertheless, airships are challenging from a ground handling perspective as well as from a control point of view, being voluminous and highly affected by wind. In this work, we showcase a system designed to use airship formations to track, follow, and visually record wild horses from multiple angles, including airship design, simulation, control, on board computer vision, autonomous operation and practical aspects of field experiments. △ Less

Submitted 24 May, 2024; v1 submitted 13 April, 2024; originally announced April 2024.

Comments: Accepted for presentation at the 2nd International Conference on Design and Engineering of Lighter-Than-Air systems (DELTAS2024)

arXiv:2404.08443 [pdf, other]

Toward FAIR Semantic Publishing of Research Dataset Metadata in the Open Research Knowledge Graph

Authors: Raia Abu Ahmad, Jennifer D'Souza, Matthäus Zloch, Wolfgang Otto, Georg Rehm, Allard Oelen, Stefan Dietze, Sören Auer

Abstract: Search engines these days can serve datasets as search results. Datasets get picked up by search technologies based on structured descriptions on their official web pages, informed by metadata ontologies such as the Dataset content type of schema.org. Despite this promotion of the content type dataset as a first-class citizen of search results, a vast proportion of datasets, particularly research… ▽ More Search engines these days can serve datasets as search results. Datasets get picked up by search technologies based on structured descriptions on their official web pages, informed by metadata ontologies such as the Dataset content type of schema.org. Despite this promotion of the content type dataset as a first-class citizen of search results, a vast proportion of datasets, particularly research datasets, still need to be made discoverable and, therefore, largely remain unused. This is due to the sheer volume of datasets released every day and the inability of metadata to reflect a dataset's content and context accurately. This work seeks to improve this situation for a specific class of datasets, namely research datasets, which are the result of research endeavors and are accompanied by a scholarly publication. We propose the ORKG-Dataset content type, a specialized branch of the Open Research Knowledge Graoh (ORKG) platform, which provides descriptive information and a semantic model for research datasets, integrating them with their accompanying scholarly publications. This work aims to establish a standardized framework for recording and reporting research datasets within the ORKG-Dataset content type. This, in turn, increases research dataset transparency on the web for their improved discoverability and applied use. In this paper, we present a proposal -- the minimum FAIR, comparable, semantic description of research datasets in terms of salient properties of their supporting publication. We design a specific application of the ORKG-Dataset semantic model based on 40 diverse research datasets on scientific information extraction. △ Less

Submitted 12 April, 2024; originally announced April 2024.

Comments: 8 pages, 1 figure, published in the Joint Proceedings of the Onto4FAIR 2023 Workshops

Journal ref: In Joint Proceedings of the Onto4FAIR 2023 Workshops: Collocated with FOIS 2023 and SEMANTICS 2023. pp.23-31. https://hal.science/hal-04312604

arXiv:2404.08005 [pdf, other]

doi 10.1145/3649329.3657391

Accel-NASBench: Sustainable Benchmarking for Accelerator-Aware NAS

Authors: Afzal Ahmad, Linfeng Du, Zhiyao Xie, Wei Zhang

Abstract: One of the primary challenges impeding the progress of Neural Architecture Search (NAS) is its extensive reliance on exorbitant computational resources. NAS benchmarks aim to simulate runs of NAS experiments at zero cost, remediating the need for extensive compute. However, existing NAS benchmarks use synthetic datasets and model proxies that make simplified assumptions about the characteristics o… ▽ More One of the primary challenges impeding the progress of Neural Architecture Search (NAS) is its extensive reliance on exorbitant computational resources. NAS benchmarks aim to simulate runs of NAS experiments at zero cost, remediating the need for extensive compute. However, existing NAS benchmarks use synthetic datasets and model proxies that make simplified assumptions about the characteristics of these datasets and models, leading to unrealistic evaluations. We present a technique that allows searching for training proxies that reduce the cost of benchmark construction by significant margins, making it possible to construct realistic NAS benchmarks for large-scale datasets. Using this technique, we construct an open-source bi-objective NAS benchmark for the ImageNet2012 dataset combined with the on-device performance of accelerators, including GPUs, TPUs, and FPGAs. Through extensive experimentation with various NAS optimizers and hardware platforms, we show that the benchmark is accurate and allows searching for state-of-the-art hardware-aware models at zero cost. △ Less

Submitted 18 June, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

Comments: Accepted at Design Automation Conference DAC'24

arXiv:2403.20147 [pdf, other]

IndiBias: A Benchmark Dataset to Measure Social Biases in Language Models for Indian Context

Authors: Nihar Ranjan Sahoo, Pranamya Prashant Kulkarni, Narjis Asad, Arif Ahmad, Tanu Goyal, Aparna Garimella, Pushpak Bhattacharyya

Abstract: The pervasive influence of social biases in language data has sparked the need for benchmark datasets that capture and evaluate these biases in Large Language Models (LLMs). Existing efforts predominantly focus on English language and the Western context, leaving a void for a reliable dataset that encapsulates India's unique socio-cultural nuances. To bridge this gap, we introduce IndiBias, a comp… ▽ More The pervasive influence of social biases in language data has sparked the need for benchmark datasets that capture and evaluate these biases in Large Language Models (LLMs). Existing efforts predominantly focus on English language and the Western context, leaving a void for a reliable dataset that encapsulates India's unique socio-cultural nuances. To bridge this gap, we introduce IndiBias, a comprehensive benchmarking dataset designed specifically for evaluating social biases in the Indian context. We filter and translate the existing CrowS-Pairs dataset to create a benchmark dataset suited to the Indian context in Hindi language. Additionally, we leverage LLMs including ChatGPT and InstructGPT to augment our dataset with diverse societal biases and stereotypes prevalent in India. The included bias dimensions encompass gender, religion, caste, age, region, physical appearance, and occupation. We also build a resource to address intersectional biases along three intersectional dimensions. Our dataset contains 800 sentence pairs and 300 tuples for bias measurement across different demographics. The dataset is available in English and Hindi, providing a size comparable to existing benchmark datasets. Furthermore, using IndiBias we compare ten different language models on multiple bias measurement metrics. We observed that the language models exhibit more bias across a majority of the intersectional groups. △ Less

Submitted 3 April, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

arXiv:2403.14058 [pdf, other]

Hypothesis-Driven Deep Learning for Out of Distribution Detection

Authors: Yasith Jayawardana, Azeem Ahmad, Balpreet S. Ahluwalia, Rafi Ahmad, Sampath Jayarathna, Dushan N. Wadduwage

Abstract: Predictions of opaque black-box systems are frequently deployed in high-stakes applications such as healthcare. For such applications, it is crucial to assess how models handle samples beyond the domain of training data. While several metrics and tests exist to detect out-of-distribution (OoD) data from in-distribution (InD) data to a deep neural network (DNN), their performance varies significant… ▽ More Predictions of opaque black-box systems are frequently deployed in high-stakes applications such as healthcare. For such applications, it is crucial to assess how models handle samples beyond the domain of training data. While several metrics and tests exist to detect out-of-distribution (OoD) data from in-distribution (InD) data to a deep neural network (DNN), their performance varies significantly across datasets, models, and tasks, which limits their practical use. In this paper, we propose a hypothesis-driven approach to quantify whether a new sample is InD or OoD. Given a trained DNN and some input, we first feed the input through the DNN and compute an ensemble of OoD metrics, which we term latent responses. We then formulate the OoD detection problem as a hypothesis test between latent responses of different groups, and use permutation-based resampling to infer the significance of the observed latent responses under a null hypothesis. We adapt our method to detect an unseen sample of bacteria to a trained deep learning model, and show that it reveals interpretable differences between InD and OoD latent responses. Our work has implications for systematic novelty detection and informed decision-making from classifiers trained on a subset of labels. △ Less

Submitted 20 March, 2024; originally announced March 2024.

arXiv:2403.13809 [pdf]

Predicting Confinement Effect of Carbon Fiber Reinforced Polymers on Strength of Concrete using Metaheuristics-based Artificial Neural Networks

Authors: Sarmed Wahab, Mohamed Suleiman, Faisal Shabbir, Nasim Shakouri Mahmoudabadi, Sarmad Waqas, Nouman Herl, Afaq Ahmad

Abstract: This article deals with the study of predicting the confinement effect of carbon fiber reinforced polymers (CFRPs) on concrete cylinder strength using metaheuristics-based artificial neural networks. A detailed database of 708 CFRP confined concrete cylinders is developed from previously published research with information on 8 parameters including geometrical parameters like the diameter (d) and… ▽ More This article deals with the study of predicting the confinement effect of carbon fiber reinforced polymers (CFRPs) on concrete cylinder strength using metaheuristics-based artificial neural networks. A detailed database of 708 CFRP confined concrete cylinders is developed from previously published research with information on 8 parameters including geometrical parameters like the diameter (d) and height (h) of a cylinder, unconfined compressive strength of concrete (fco'), thickness (nt), the elastic modulus of CFRP (Ef), unconfined concrete strain confined concrete strain and the ultimate compressive strength of confined concrete fcc'. Three metaheuristic models are implemented including particle swarm optimization (PSO), grey wolf optimizer (GWO), and bat algorithm (BA). These algorithms are trained on the data using an objective function of mean square error and their predicted results are validated against the experimental studies and finite element analysis. The study shows that the hybrid model of PSO predicted the strength of CFRP-confined concrete cylinders with maximum accuracy of 99.13% and GWO predicted the results with an accuracy of 98.17%. The high accuracy of axial compressive strength predictions demonstrated that these prediction models are a reliable solution to the empirical methods. The prediction models are especially suitable for avoiding full-scale time-consuming experimental tests that make the process quick and economical. △ Less

Submitted 22 December, 2023; originally announced March 2024.

Comments: 28 Pages, 19 Figures

arXiv:2403.12980 [pdf, other]

Containerization in Multi-Cloud Environment: Roles, Strategies, Challenges, and Solutions for Effective Implementation

Authors: Muhammad Waseem, Aakash Ahmad, Peng Liang, Muhammad Azeem Akbar, Arif Ali Khan, Iftikhar Ahmad, Manu Setälä, Tommi Mikkonen

Abstract: Containerization in a multi-cloud environment facilitates workload portability and optimized resource utilization. Containerization in multi-cloud environments has received significant attention in recent years both from academic research and industrial development perspectives. However, there exists no effort to systematically investigate the state of research on this topic. The aim of this resea… ▽ More Containerization in a multi-cloud environment facilitates workload portability and optimized resource utilization. Containerization in multi-cloud environments has received significant attention in recent years both from academic research and industrial development perspectives. However, there exists no effort to systematically investigate the state of research on this topic. The aim of this research is to systematically identify and categorize the multiple aspects of container utilization in multi-cloud environment. We conduct the Systematic Mapping Study (SMS) on the literature published between January 2013 and March 2023. Eighty-six studies were finally selected and the key results are: (1) Four leading themes on cloud computing and network systems research were identified: 'Scalability and High Availability', 'Performance and Optimization', 'Security and Privacy', and 'Multi-Cloud Container Monitoring and Adaptation'. (2) Seventy-four patterns and strategies for containerization in multi-cloud environment were classified across 10 subcategories and 4 categories. (3) Ten quality attributes considered were identified with 47 associated tactics. (4) Four distinct frameworks were introduced based on the analysis of identified challenges and solutions: a security challenge-solution framework, an automation challenge-solution framework, a deployment challenge-solution framework, and a monitoring challenge-solution framework. The results of this SMS will assist researchers and practitioners in pursuing further studies on containerization in multi-cloud environment and developing specialized solutions for challenges related to containerization applications in multi-cloud environment. △ Less

Submitted 1 March, 2024; originally announced March 2024.

Comments: 59 pages, 4 images, 16 tables, Manuscript submitted to a Journal (2024)

arXiv:2403.10494 [pdf, other]

Lifelong LERF: Local 3D Semantic Inventory Monitoring Using FogROS2

Authors: Adam Rashid, Chung Min Kim, Justin Kerr, Letian Fu, Kush Hari, Ayah Ahmad, Kaiyuan Chen, Huang Huang, Marcus Gualtieri, Michael Wang, Christian Juette, Nan Tian, Liu Ren, Ken Goldberg

Abstract: Inventory monitoring in homes, factories, and retail stores relies on maintaining data despite objects being swapped, added, removed, or moved. We introduce Lifelong LERF, a method that allows a mobile robot with minimal compute to jointly optimize a dense language and geometric representation of its surroundings. Lifelong LERF maintains this representation over time by detecting semantic changes… ▽ More Inventory monitoring in homes, factories, and retail stores relies on maintaining data despite objects being swapped, added, removed, or moved. We introduce Lifelong LERF, a method that allows a mobile robot with minimal compute to jointly optimize a dense language and geometric representation of its surroundings. Lifelong LERF maintains this representation over time by detecting semantic changes and selectively updating these regions of the environment, avoiding the need to exhaustively remap. Human users can query inventory by providing natural language queries and receiving a 3D heatmap of potential object locations. To manage the computational load, we use Fog-ROS2, a cloud robotics platform, to offload resource-intensive tasks. Lifelong LERF obtains poses from a monocular RGBD SLAM backend, and uses these poses to progressively optimize a Language Embedded Radiance Field (LERF) for semantic monitoring. Experiments with 3-5 objects arranged on a tabletop and a Turtlebot with a RealSense camera suggest that Lifelong LERF can persistently adapt to changes in objects with up to 91% accuracy. △ Less

Submitted 15 March, 2024; originally announced March 2024.

Comments: See project webpage at: https://sites.google.com/berkeley.edu/lifelonglerf/home

arXiv:2403.05576 [pdf]

Understanding Subjectivity through the Lens of Motivational Context in Model-Generated Image Satisfaction

Authors: Senjuti Dutta, Sherol Chen, Sunny Mak, Amnah Ahmad, Katherine Collins, Alena Butryna, Deepak Ramachandran, Krishnamurthy Dvijotham, Ellie Pavlick, Ravi Rajakumar

Abstract: Image generation models are poised to become ubiquitous in a range of applications. These models are often fine-tuned and evaluated using human quality judgments that assume a universal standard, failing to consider the subjectivity of such tasks. To investigate how to quantify subjectivity, and the scale of its impact, we measure how assessments differ among human annotators across different use… ▽ More Image generation models are poised to become ubiquitous in a range of applications. These models are often fine-tuned and evaluated using human quality judgments that assume a universal standard, failing to consider the subjectivity of such tasks. To investigate how to quantify subjectivity, and the scale of its impact, we measure how assessments differ among human annotators across different use cases. Simulating the effects of ordinarily latent elements of annotators subjectivity, we contrive a set of motivations (t-shirt graphics, presentation visuals, and phone background images) to contextualize a set of crowdsourcing tasks. Our results show that human evaluations of images vary within individual contexts and across combinations of contexts. Three key factors affecting this subjectivity are image appearance, image alignment with text, and representation of objects mentioned in the text. Our study highlights the importance of taking individual users and contexts into account, both when building and evaluating generative models △ Less

Submitted 26 February, 2024; originally announced March 2024.

arXiv:2402.01386 [pdf, other]

Can Large Language Models Serve as Data Analysts? A Multi-Agent Assisted Approach for Qualitative Data Analysis

Authors: Zeeshan Rasheed, Muhammad Waseem, Aakash Ahmad, Kai-Kristian Kemell, Wang Xiaofeng, Anh Nguyen Duc, Pekka Abrahamsson

Abstract: Recent advancements in Large Language Models (LLMs) have enabled collaborative human-bot interactions in Software Engineering (SE), similar to many other professions. However, the potential benefits and implications of incorporating LLMs into qualitative data analysis in SE have not been completely explored. For instance, conducting qualitative data analysis manually can be a time-consuming, effor… ▽ More Recent advancements in Large Language Models (LLMs) have enabled collaborative human-bot interactions in Software Engineering (SE), similar to many other professions. However, the potential benefits and implications of incorporating LLMs into qualitative data analysis in SE have not been completely explored. For instance, conducting qualitative data analysis manually can be a time-consuming, effort-intensive, and error-prone task for researchers. LLM-based solutions, such as generative AI models trained on massive datasets, can be utilized to automate tasks in software development as well as in qualitative data analysis. To this end, we utilized LLMs to automate and expedite the qualitative data analysis processes. We employed a multi-agent model, where each agent was tasked with executing distinct, individual research related activities. Our proposed model interpreted large quantities of textual documents and interview transcripts to perform several common tasks used in qualitative analysis. The results show that this technical assistant speeds up significantly the data analysis process, enabling researchers to manage larger datasets much more effectively. Furthermore, this approach introduces a new dimension of scalability and accuracy in qualitative research, potentially transforming data interpretation methodologies in SE. △ Less

Submitted 2 February, 2024; originally announced February 2024.

Comments: 9 pages and 2 figures

arXiv:2401.09185 [pdf, other]

Behavior Trees with Dataflow: Coordinating Reactive Tasks in Lingua Franca

Authors: Alexander Schulz-Rosengarten, Akash Ahmad, Malte Clement, Reinhard von Hanxleden, Benjamin Asch, Marten Lohstroh, Edward A. Lee, Gustavo Quiros Araya, Ankit Shukla

Abstract: Behavior Trees (BTs) provide a lean set of control flow elements that are easily composable in a modular tree structure. They are well established for modeling the high-level behavior of non-player characters in computer games and recently gained popularity in other areas such as industrial automation. While BTs nicely express control, data handling aspects so far must be provided separately, e. g… ▽ More Behavior Trees (BTs) provide a lean set of control flow elements that are easily composable in a modular tree structure. They are well established for modeling the high-level behavior of non-player characters in computer games and recently gained popularity in other areas such as industrial automation. While BTs nicely express control, data handling aspects so far must be provided separately, e. g. in the form of blackboards. This may hamper reusability and can be a source of nondeterminism. We here present a dataflow extension to BTs that explicitly models data relations and communication. We provide a combined textual/graphical approach in line with modern, productivity-enhancing pragmatics-aware modeling techniques. We realized and validated that approach in the recently introduced polyglot coordination language Lingua Franca (LF). △ Less

Submitted 17 January, 2024; originally announced January 2024.

arXiv:2401.01636 [pdf, other]

doi 10.1109/ACCESS.2024.3350138

Efficient UAVs Deployment and Resource Allocation in UAV-Relay Assisted Public Safety Networks for Video Transmission

Authors: Naveed Khan, Ayaz Ahmad, Abdul Wakeel, Zeeshan Kaleem, Bushra Rashid, Waqas Khalid

Abstract: Wireless communication highly depends on the cellular ground base station (GBS). A failure of the cellular GBS, fully or partially, during natural or man-made disasters creates a communication gap in the disaster-affected areas. In such situations, public safety communication (PSC) can significantly save the national infrastructure, property, and lives. Throughout emergencies, the PSC can provide… ▽ More Wireless communication highly depends on the cellular ground base station (GBS). A failure of the cellular GBS, fully or partially, during natural or man-made disasters creates a communication gap in the disaster-affected areas. In such situations, public safety communication (PSC) can significantly save the national infrastructure, property, and lives. Throughout emergencies, the PSC can provide mission-critical communication and video transmission services in the affected area. Unmanned aerial vehicles (UAVs) as flying base stations (UAV-BSs) are particularly suitable for PSC services as they are flexible, mobile, and easily deployable. This manuscript considers a multi-UAV-assisted PSC network with an observational UAV receiving videos from the affected area's ground users (AGUs) and transmitting them to the nearby GBS via a relay UAV. The objective of the proposed study is to maximize the average utility of the video streams generated by the AGUs upon reaching the GBS. This is achieved by optimizing the positions of the observational and relay UAVs, as well as the distribution of communication resources, such as bandwidth, and transmit power, while satisfying the system-designed constraints, such as transmission rate, rate outage probability, transmit power budget, and available bandwidth. To this end, a joint UAVs placement and resource allocation problem is mathematically formulated. The proposed problem poses a significant challenge for a solution. Considering the block coordinate descent and successive convex approximation techniques, an efficient iterative algorithm is proposed. Finally, simulation results are provided which show that our proposed approach outperforms the existing methods. △ Less

Submitted 3 January, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

Comments: Accepted for IEEE Access. Corresponding author: Waqas Khalid

arXiv:2312.08334 [pdf, other]

LD-SDM: Language-Driven Hierarchical Species Distribution Modeling

Authors: Srikumar Sastry, Xin Xing, Aayush Dhakal, Subash Khanal, Adeel Ahmad, Nathan Jacobs

Abstract: We focus on the problem of species distribution modeling using global-scale presence-only data. Most previous studies have mapped the range of a given species using geographical and environmental features alone. To capture a stronger implicit relationship between species, we encode the taxonomic hierarchy of species using a large language model. This enables range mapping for any taxonomic rank an… ▽ More We focus on the problem of species distribution modeling using global-scale presence-only data. Most previous studies have mapped the range of a given species using geographical and environmental features alone. To capture a stronger implicit relationship between species, we encode the taxonomic hierarchy of species using a large language model. This enables range mapping for any taxonomic rank and unseen species without additional supervision. Further, we propose a novel proximity-aware evaluation metric that enables evaluating species distribution models using any pixel-level representation of ground-truth species range map. The proposed metric penalizes the predictions of a model based on its proximity to the ground truth. We describe the effectiveness of our model by systematically evaluating on the task of species range prediction, zero-shot prediction and geo-feature regression against the state-of-the-art. Results show our model outperforms the strong baselines when trained with a variety of multi-label learning losses. △ Less

Submitted 13 December, 2023; originally announced December 2023.

Comments: 17 pages, 9 figures

arXiv:2312.05421 [pdf, other]

Architecture Decisions in Quantum Software Systems: An Empirical Study on Stack Exchange and GitHub

Authors: Mst Shamima Aktar, Peng Liang, Muhammad Waseem, Amjed Tahir, Aakash Ahmad, Beiqi Zhang, Zengyang Li

Abstract: Quantum computing provides a new dimension in computation, utilizing the principles of quantum mechanics to potentially solve complex problems that are currently intractable for classical computers. However, little research has been conducted about the architecture decisions made in quantum software development, which have a significant influence on the functionality, performance, scalability, and… ▽ More Quantum computing provides a new dimension in computation, utilizing the principles of quantum mechanics to potentially solve complex problems that are currently intractable for classical computers. However, little research has been conducted about the architecture decisions made in quantum software development, which have a significant influence on the functionality, performance, scalability, and reliability of these systems. The study aims to empirically investigate and analyze architecture decisions made during the development of quantum software systems, identifying prevalent challenges and limitations by using the posts and issues from Stack Exchange and GitHub. We used a qualitative approach to analyze the obtained data from Stack Exchange Sites and GitHub projects. Specifically, we collected data from 151 issues (from 47 GitHub projects) and 43 posts (from three Stack Exchange sites) related to architecture decisions in quantum software development. The results show that in quantum software development (1) architecture decisions are articulated in six linguistic patterns, the most common of which are Solution Proposal and Information Giving, (2) the two major categories of architectural decisions are Implementation Decision and Technology Decision, (3) Quantum Programming Framework is the most common application domain among the sixteen application domains identified, (4) Maintainability is the most frequently considered quality attribute, and (5) Design Issue and Performance Issue are the major limitations and challenges that practitioners face when making architecture decisions in quantum software development. Our results show that the limitations and challenges encountered in architecture decision-making during the development of quantum software systems are strongly linked to the particular features (e.g., quantum entanglement, superposition, and decoherence) of those systems. △ Less

Submitted 8 December, 2023; originally announced December 2023.

Comments: 33 pages, 3 images, 10 tables, Manuscript submitted to a Journal (2023)

arXiv:2311.12824 [pdf]

Comparative Analysis of Shear Strength Prediction Models for Reinforced Concrete Slab-Column Connections

Authors: Sarmed Wahab, Nasim Shakouri Mahmoudabadi, Sarmad Waqas, Nouman Herl, Muhammad Iqbal, Khurshid Alam, Afaq Ahmad

Abstract: This research aims at comparative analysis of shear strength prediction at slab-column connection, unifying machine learning, design codes and Finite Element Analysis. Current design codes (CDCs) of ACI 318-19 (ACI), Eurocode 2 (EC2), Compressive Force Path (CFP) method, Feed Forward Neural Network (FNN) based Artificial Neural Network (ANN), PSO-based FNN (PSOFNN), and BAT algorithm-based BATFNN… ▽ More This research aims at comparative analysis of shear strength prediction at slab-column connection, unifying machine learning, design codes and Finite Element Analysis. Current design codes (CDCs) of ACI 318-19 (ACI), Eurocode 2 (EC2), Compressive Force Path (CFP) method, Feed Forward Neural Network (FNN) based Artificial Neural Network (ANN), PSO-based FNN (PSOFNN), and BAT algorithm-based BATFNN are used. The study is complemented with FEA of slab for validating the experimental results and machine learning predictions.In the case of hybrid models of PSOFNN and BATFNN, mean square error is used as an objective function to obtain the optimized values of the weights, that are used by Feed Forward Neural Network to perform predictions on the slab data. Seven different models of PSOFNN, BATFNN, and FNN are trained on this data and the results exhibited that PSOFNN is the best model overall. PSOFNN has the best results for SCS=1 with highest value of R as 99.37% and lowest of MSE, and MAE values of 0.0275%, and 1.214% respectively which are better than the best FNN model for SCS=4 having the values of R, MSE, and MAE as 97.464%, 0.0492%, and 1.43%, respectively. △ Less

Submitted 28 November, 2023; v1 submitted 29 September, 2023; originally announced November 2023.

Comments: 34 Pages,25 Figures

arXiv:2311.08545 [pdf, other]

Efficient Continual Pre-training for Building Domain Specific Large Language Models

Authors: Yong Xie, Karan Aggarwal, Aitzaz Ahmad

Abstract: Large language models (LLMs) have demonstrated remarkable open-domain capabilities. Traditionally, LLMs tailored for a domain are trained from scratch to excel at handling domain-specific tasks. In this work, we explore an alternative strategy of continual pre-training as a means to develop domain-specific LLMs. We introduce FinPythia-6.9B, developed through domain-adaptive continual pre-training… ▽ More Large language models (LLMs) have demonstrated remarkable open-domain capabilities. Traditionally, LLMs tailored for a domain are trained from scratch to excel at handling domain-specific tasks. In this work, we explore an alternative strategy of continual pre-training as a means to develop domain-specific LLMs. We introduce FinPythia-6.9B, developed through domain-adaptive continual pre-training on the financial domain. Continual pre-trained FinPythia showcases consistent improvements on financial tasks over the original foundational model. We further explore simple but effective data selection strategies for continual pre-training. Our data selection strategies outperforms vanilla continual pre-training's performance with just 10% of corpus size and cost, without any degradation on open-domain standard tasks. Our work proposes an alternative solution to building domain-specific LLMs from scratch in a cost-effective manner. △ Less

Submitted 14 November, 2023; originally announced November 2023.

arXiv:2311.01624 [pdf, other]

Attention based Dual-Branch Complex Feature Fusion Network for Hyperspectral Image Classification

Authors: Mohammed Q. Alkhatib, Mina Al-Saad, Nour Aburaed, M. Sami Zitouni, Hussain Al Ahmad

Abstract: This research work presents a novel dual-branch model for hyperspectral image classification that combines two streams: one for processing standard hyperspectral patches using Real-Valued Neural Network (RVNN) and the other for processing their corresponding Fourier transforms using Complex-Valued Neural Network (CVNN). The proposed model is evaluated on the Pavia University and Salinas datasets.… ▽ More This research work presents a novel dual-branch model for hyperspectral image classification that combines two streams: one for processing standard hyperspectral patches using Real-Valued Neural Network (RVNN) and the other for processing their corresponding Fourier transforms using Complex-Valued Neural Network (CVNN). The proposed model is evaluated on the Pavia University and Salinas datasets. Results show that the proposed model outperforms state-of-the-art methods in terms of overall accuracy, average accuracy, and Kappa. Through the incorporation of Fourier transforms in the second stream, the model is able to extract frequency information, which complements the spatial information extracted by the first stream. The combination of these two streams improves the overall performance of the model. Furthermore, to enhance the model performance, the Squeeze and Excitation (SE) mechanism has been utilized. Experimental evidence show that SE block improves the models overall accuracy by almost 1\%. △ Less

Submitted 2 November, 2023; originally announced November 2023.

arXiv:2311.01463 [pdf, other]

Creating Trustworthy LLMs: Dealing with Hallucinations in Healthcare AI

Authors: Muhammad Aurangzeb Ahmad, Ilker Yaramis, Taposh Dutta Roy

Abstract: Large language models have proliferated across multiple domains in as short period of time. There is however hesitation in the medical and healthcare domain towards their adoption because of issues like factuality, coherence, and hallucinations. Give the high stakes nature of healthcare, many researchers have even cautioned against its usage until these issues are resolved. The key to the implemen… ▽ More Large language models have proliferated across multiple domains in as short period of time. There is however hesitation in the medical and healthcare domain towards their adoption because of issues like factuality, coherence, and hallucinations. Give the high stakes nature of healthcare, many researchers have even cautioned against its usage until these issues are resolved. The key to the implementation and deployment of LLMs in healthcare is to make these models trustworthy, transparent (as much possible) and explainable. In this paper we describe the key elements in creating reliable, trustworthy, and unbiased models as a necessary condition for their adoption in healthcare. Specifically we focus on the quantification, validation, and mitigation of hallucinations in the context in healthcare. Lastly, we discuss how the future of LLMs in healthcare may look like. △ Less

Submitted 26 September, 2023; originally announced November 2023.

arXiv:2311.01020 [pdf, other]

Exploring the Problems, their Causes and Solutions of AI Pair Programming: A Study with Practitioners of GitHub Copilot

Authors: Xiyu Zhou, Peng Liang, Beiqi Zhang, Zengyang Li, Aakash Ahmad, Mojtaba Shahin, Muhammad Waseem

Abstract: With the recent advancement of Artificial Intelligence (AI) and Large Language Models (LLMs), AI-based code generation tools become a practical solution for software development. GitHub Copilot, the AI pair programmer, utilizes machine learning models trained on a large corpus of code snippets to generate code suggestions using natural language processing. Despite its popularity in software develo… ▽ More With the recent advancement of Artificial Intelligence (AI) and Large Language Models (LLMs), AI-based code generation tools become a practical solution for software development. GitHub Copilot, the AI pair programmer, utilizes machine learning models trained on a large corpus of code snippets to generate code suggestions using natural language processing. Despite its popularity in software development, there is limited empirical evidence on the actual experiences of practitioners who work with Copilot. To this end, we conducted an empirical study to understand the problems that practitioners face when using Copilot, as well as their underlying causes and potential solutions. We collected data from 476 GitHub issues, 706 GitHub discussions, and 142 Stack Overflow posts. Our results reveal that (1) Operation Issue and Compatibility Issue are the most common problems faced by Copilot users, (2) Copilot Internal Error, Network Connection Error, and Editor/IDE Compatibility Issue are identified as the most frequent causes, and (3) Bug Fixed by Copilot, Modify Configuration/Setting, and Use Suitable Version are the predominant solutions. Based on the results, we discuss the potential areas of Copilot for enhancement, and provide the implications for the Copilot users, the Copilot team, and researchers. △ Less

Submitted 28 April, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

arXiv:2311.00646 [pdf, other]

Issues and Their Causes in WebAssembly Applications: An Empirical Study

Authors: Muhammad Waseem, Teerath Das, Aakash Ahmad, Peng Liang, Tommi Mikkonen

Abstract: WebAssembly (Wasm) is a binary instruction format designed for secure and efficient execution within sandboxed environments -- predominantly web apps and browsers -- to facilitate performance, security, and flexibility of web programming languages. In recent years, Wasm has gained significant attention from the academic research community and industrial development projects to engineer high-perfor… ▽ More WebAssembly (Wasm) is a binary instruction format designed for secure and efficient execution within sandboxed environments -- predominantly web apps and browsers -- to facilitate performance, security, and flexibility of web programming languages. In recent years, Wasm has gained significant attention from the academic research community and industrial development projects to engineer high-performance web applications. Despite the offered benefits, developers encounter a multitude of issues rooted in Wasm (e.g., faults, errors, failures) and are often unaware of their root causes that impact the development of web applications. To this end, we conducted an empirical study that mines and documents practitioners' knowledge expressed as 385 issues from 12 open-source Wasm projects deployed on GitHub and 354 question-answer posts via Stack Overflow. Overall, we identified 120 types of issues, which were categorized into 19 subcategories and 9 categories to create a taxonomical classification of issues encountered in Wasm-based applications. Furthermore, root cause analysis of the issues helped us identify 278 types of causes, which have been categorized into 29 subcategories and 10 categories as a taxonomy of causes. Our study led to first-of-its-kind taxonomies of the issues faced by developers and their underlying causes in Wasm-based applications. The issue-cause taxonomies -- identified from GitHub and SO, offering empirically derived guidelines -- can guide researchers and practitioners to design, develop, and refactor Wasm-based applications. △ Less

Submitted 9 April, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

Comments: The 28th International Conference on Evaluation and Assessment in Software Engineering (EASE)

arXiv:2310.16956 [pdf, ps, other]

Datastore Design for Analysis of Police Broadcast Audio at Scale

Authors: Ayah Ahmad, Christopher Graziul, Margaret Beale Spencer

Abstract: With policing coming under greater scrutiny in recent years, researchers have begun to more thoroughly study the effects of contact between police and minority communities. Despite data archives of hundreds of thousands of recorded Broadcast Police Communications (BPC) being openly available to the public, a closer look at a large-scale analysis of the language of policing has remained largely une… ▽ More With policing coming under greater scrutiny in recent years, researchers have begun to more thoroughly study the effects of contact between police and minority communities. Despite data archives of hundreds of thousands of recorded Broadcast Police Communications (BPC) being openly available to the public, a closer look at a large-scale analysis of the language of policing has remained largely unexplored. While this research is critical in understanding a "pre-reflective" notion of policing, the large quantity of data presents numerous challenges in its organization and analysis. In this paper, we describe preliminary work towards enabling Speech Emotion Recognition (SER) in an analysis of the Chicago Police Department's (CPD) BPC by demonstrating the pipelined creation of a datastore to enable a multimodal analysis of composed raw audio files. △ Less

Submitted 25 October, 2023; originally announced October 2023.

arXiv:2310.16951 [pdf, other]

The Teenager's Problem: Efficient Garment Decluttering With Grasp Optimization

Authors: Aviv Adler, Ayah Ahmad, Shengyin Wang, Wisdom C. Agboh, Edith Llontop, Tianshuang Qiu, Jeffrey Ichnowski, Mehmet Dogar, Thomas Kollar, Richard Cheng, Ken Goldberg

Abstract: This paper addresses the ''Teenager's Problem'': efficiently removing scattered garments from a planar surface. As grasping and transporting individual garments is highly inefficient, we propose analytical policies to select grasp locations for multiple garments using an overhead camera. Two classes of methods are considered: depth-based, which use overhead depth data to find efficient grasps, and… ▽ More This paper addresses the ''Teenager's Problem'': efficiently removing scattered garments from a planar surface. As grasping and transporting individual garments is highly inefficient, we propose analytical policies to select grasp locations for multiple garments using an overhead camera. Two classes of methods are considered: depth-based, which use overhead depth data to find efficient grasps, and segment-based, which use segmentation on the RGB overhead image (without requiring any depth data); grasp efficiency is measured by Objects per Transport, which denotes the average number of objects removed per trip to the laundry basket. Experiments suggest that both depth- and segment-based methods easily reduce Objects per Transport (OpT) by $20\%$; furthermore, these approaches complement each other, with combined hybrid methods yielding improvements of $34\%$. Finally, a method employing consolidation (with segmentation) is considered, which manipulates the garments on the work surface to increase OpT; this yields an improvement of $67\%$ over the baseline, though at a cost of additional physical actions. △ Less

Submitted 25 October, 2023; originally announced October 2023.

arXiv:2310.13648 [pdf, other]

ChatGPT as a Software Development Bot: A Project-based Study

Authors: Muhammad Waseem, Teerath Das, Aakash Ahmad, Peng Liang, Mahdi Fehmideh, Tommi Mikkonen

Abstract: Artificial Intelligence has demonstrated its significance in software engineering through notable improvements in productivity, accuracy, collaboration, and learning outcomes. This study examines the impact of generative AI tools, specifically ChatGPT, on the software development experiences of undergraduate students. Over a three-month project with seven students, ChatGPT was used as a support to… ▽ More Artificial Intelligence has demonstrated its significance in software engineering through notable improvements in productivity, accuracy, collaboration, and learning outcomes. This study examines the impact of generative AI tools, specifically ChatGPT, on the software development experiences of undergraduate students. Over a three-month project with seven students, ChatGPT was used as a support tool. The research focused on assessing ChatGPT's effectiveness, benefits, limitations, and its influence on learning. Results showed that ChatGPT significantly addresses skill gaps in software development education, enhancing efficiency, accuracy, and collaboration. It also improved participants' fundamental understanding and soft skills. The study highlights the importance of incorporating AI tools like ChatGPT in education to bridge skill gaps and increase productivity, but stresses the need for a balanced approach to technology use. Future research should focus on optimizing ChatGPT's application in various development contexts to maximize learning and address specific challenges. △ Less

Submitted 22 February, 2024; v1 submitted 20 October, 2023; originally announced October 2023.

Comments: The 19th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE)

arXiv:2309.17426 [pdf]

Classification of Potholes Based on Surface Area Using Pre-Trained Models of Convolutional Neural Network

Authors: Chauhdary Fazeel Ahmad, Abdullah Cheema, Waqas Qayyum, Rana Ehtisham, Muhammad Haroon Yousaf, Junaid Mir, Nasim Shakouri Mahmoudabadi, Afaq Ahmad

Abstract: Potholes are fatal and can cause severe damage to vehicles as well as can cause deadly accidents. In South Asian countries, pavement distresses are the primary cause due to poor subgrade conditions, lack of subsurface drainage, and excessive rainfalls. The present research compares the performance of three pre-trained Convolutional Neural Network (CNN) models, i.e., ResNet 50, ResNet 18, and Mobil… ▽ More Potholes are fatal and can cause severe damage to vehicles as well as can cause deadly accidents. In South Asian countries, pavement distresses are the primary cause due to poor subgrade conditions, lack of subsurface drainage, and excessive rainfalls. The present research compares the performance of three pre-trained Convolutional Neural Network (CNN) models, i.e., ResNet 50, ResNet 18, and MobileNet. At first, pavement images are classified to find whether images contain potholes, i.e., Potholes or Normal. Secondly, pavements images are classi-fied into three categories, i.e., Small Pothole, Large Pothole, and Normal. Pavement images are taken from 3.5 feet (waist height) and 2 feet. MobileNet v2 has an accuracy of 98% for detecting a pothole. The classification of images taken at the height of 2 feet has an accuracy value of 87.33%, 88.67%, and 92% for classifying the large, small, and normal pavement, respectively. Similarly, the classification of the images taken from full of waist (FFW) height has an accuracy value of 98.67%, 98.67%, and 100%. △ Less

Submitted 29 September, 2023; originally announced September 2023.

Comments: 24 Pages, 26 Figures

arXiv:2309.09879 [pdf, other]

DynaPix SLAM: A Pixel-Based Dynamic SLAM Approach

Authors: Chenghao Xu, Elia Bonetto, Aamir Ahmad

Abstract: In static environments, visual simultaneous localization and mapping (V-SLAM) methods achieve remarkable performance. However, moving objects severely affect core modules of such systems like state estimation and loop closure detection. To address this, dynamic SLAM approaches often use semantic information, geometric constraints, or optical flow to mask features associated with dynamic entities.… ▽ More In static environments, visual simultaneous localization and mapping (V-SLAM) methods achieve remarkable performance. However, moving objects severely affect core modules of such systems like state estimation and loop closure detection. To address this, dynamic SLAM approaches often use semantic information, geometric constraints, or optical flow to mask features associated with dynamic entities. These are limited by various factors such as a dependency on the quality of the underlying method, poor generalization to unknown or unexpected moving objects, and often produce noisy results, e.g. by masking static but movable objects or making use of predefined thresholds. In this paper, to address these trade-offs, we introduce a novel visual SLAM system, DynaPix, based on per-pixel motion probability values. Our approach consists of a new semantic-free probabilistic pixel-wise motion estimation module and an improved pose optimization process. Our per-pixel motion probability estimation combines a novel static background differencing method on both images and optical flows from splatted frames. DynaPix fully integrates those motion probabilities into both map point selection and weighted bundle adjustment within the tracking and optimization modules of ORB-SLAM2. We evaluate DynaPix against ORB-SLAM2 and DynaSLAM on both GRADE and TUM-RGBD datasets, obtaining lower errors and longer trajectory tracking times. We will release both source code and data upon acceptance of this work. △ Less

Submitted 18 September, 2023; originally announced September 2023.

Comments: 19 pages

arXiv:2309.05687 [pdf, other]

Demystifying Practices, Challenges and Expected Features of Using GitHub Copilot

Authors: Beiqi Zhang, Peng Liang, Xiyu Zhou, Aakash Ahmad, Muhammad Waseem

Abstract: With the advances in machine learning, there is a growing interest in AI-enabled tools for autocompleting source code. GitHub Copilot has been trained on billions of lines of open source GitHub code, and is one of such tools that has been increasingly used since its launch in June 2021. However, little effort has been devoted to understanding the practices, challenges, and expected features of usi… ▽ More With the advances in machine learning, there is a growing interest in AI-enabled tools for autocompleting source code. GitHub Copilot has been trained on billions of lines of open source GitHub code, and is one of such tools that has been increasingly used since its launch in June 2021. However, little effort has been devoted to understanding the practices, challenges, and expected features of using Copilot in programming for auto-completed source code from the point of view of practitioners. To this end, we conducted an empirical study by collecting and analyzing the data from Stack Overflow (SO) and GitHub Discussions. We searched and manually collected 303 SO posts and 927 GitHub discussions related to the usage of Copilot. We identified the programming languages, Integrated Development Environments (IDEs), technologies used with Copilot, functions implemented, benefits, limitations, and challenges when using Copilot. The results show that when practitioners use Copilot: (1) The major programming languages used with Copilot are JavaScript and Python, (2) the main IDE used with Copilot is Visual Studio Code, (3) the most common used technology with Copilot is Node.js, (4) the leading function implemented by Copilot is data processing, (5) the main purpose of users using Copilot is to help generate code, (6) the significant benefit of using Copilot is useful code generation, (7) the main limitation encountered by practitioners when using Copilot is difficulty of integration, and (8) the most common expected feature is that Copilot can be integrated with more IDEs. Our results suggest that using Copilot is like a double-edged sword, which requires developers to carefully consider various aspects when deciding whether or not to use it. Our study provides empirically grounded foundations that could inform developers and practitioners, as well as provide a basis for future investigations. △ Less

Submitted 11 September, 2023; originally announced September 2023.

Comments: Preprint accepted for publication in International Journal of Software Engineering and Knowledge Engineering, 2023. arXiv admin note: substantial text overlap with arXiv:2303.08733

arXiv:2308.10920 [pdf]

Addressing Knowledge Leakage Risk caused by the use of mobile devices in Australian Organizations

Authors: Carlos Andres Agudelo Serna, Rachelle Bosua, Sean B. Maynard, Atif Ahmad

Abstract: Information and knowledge leakage has become a significant security risk to Australian organizations. Each security incident in Australian business cost an average US$\$$2.8 million. Furthermore, Australian organisations spend the second most worldwide (US$\$$1.2 million each on average) on investigating and assessing information breaches. The leakage of sensitive organizational information occurs… ▽ More Information and knowledge leakage has become a significant security risk to Australian organizations. Each security incident in Australian business cost an average US$\$$2.8 million. Furthermore, Australian organisations spend the second most worldwide (US$\$$1.2 million each on average) on investigating and assessing information breaches. The leakage of sensitive organizational information occurs through different avenues, such as social media, cloud computing and mobile devices. In this study, we (1) analyze the knowledge leakage risk (KLR) caused by the use of mobile devices in knowledge-intensive Australian organizations, (2) present a conceptual research model to explain the determinants that influence KLR through the use of mobile devices grounded in the literature, (3) conduct interviews with security and knowledge managers to understand what strategies they use to mitigate KLR caused by the use of mobile devices and (4) use content analysis and the conceptual model to frame the preliminary findings from the interviews. Keywords: Knowledge leakage, mobile devices, mobile contexts, knowledge leakage risk △ Less

Submitted 21 August, 2023; originally announced August 2023.

Comments: Pages 14. arXiv admin note: text overlap with arXiv:1606.01450

Journal ref: PACIS 2017 Proceedings. 224 http://aisel.aisnet.org/pacis2017/224

arXiv:2308.10689 [pdf]

Towards a knowledge leakage Mitigation framework for mobile Devices in knowledge-intensive Organizations

Authors: Carlos Andres Agudelo Serna, Rachelle Bosua, Atif Ahmad, Sean B. Maynard

Abstract: The use of mobile devices in knowledge-intensive organizations while effective and cost-efficient also pose a challenging management problem. Often employees whether deliberately or inadvertently are the cause of knowledge leakage in organizations and the use of mobile devices further exacerbates it. This problem is the result of overly focusing on technical controls while neglecting human factors… ▽ More The use of mobile devices in knowledge-intensive organizations while effective and cost-efficient also pose a challenging management problem. Often employees whether deliberately or inadvertently are the cause of knowledge leakage in organizations and the use of mobile devices further exacerbates it. This problem is the result of overly focusing on technical controls while neglecting human factors. Knowledge leakage is a multidimensional problem, and in this paper, we highlight the different dimensions that constitute it. In this study, our contributions are threefold. First, we study knowledge leakage risk (KLR) within the context of mobile devices in knowledge-intensive organizations in Australia. Second, we present a conceptual framework to explain and categorize the mitigation strategies to combat KLR through the use of mobile devices grounded in the literature. And third, we apply the framework to the findings from interviews with security and knowledge managers. Keywords: Knowledge Leakage, Knowledge Risk, Knowledge intensive, Mobile device. △ Less

Submitted 21 August, 2023; originally announced August 2023.

Comments: 22 pages, ECIS full paper 2018

Journal ref: Research Papers. 72 (2018) https://aisel.aisnet.org/ecis2018_rp/72

arXiv:2307.15904 [pdf, other]

Sat2Cap: Mapping Fine-Grained Textual Descriptions from Satellite Images

Authors: Aayush Dhakal, Adeel Ahmad, Subash Khanal, Srikumar Sastry, Hannah Kerner, Nathan Jacobs

Abstract: We propose a weakly supervised approach for creating maps using free-form textual descriptions. We refer to this work of creating textual maps as zero-shot mapping. Prior works have approached mapping tasks by developing models that predict a fixed set of attributes using overhead imagery. However, these models are very restrictive as they can only solve highly specific tasks for which they were t… ▽ More We propose a weakly supervised approach for creating maps using free-form textual descriptions. We refer to this work of creating textual maps as zero-shot mapping. Prior works have approached mapping tasks by developing models that predict a fixed set of attributes using overhead imagery. However, these models are very restrictive as they can only solve highly specific tasks for which they were trained. Mapping text, on the other hand, allows us to solve a large variety of mapping problems with minimal restrictions. To achieve this, we train a contrastive learning framework called Sat2Cap on a new large-scale dataset with 6.1M pairs of overhead and ground-level images. For a given location and overhead image, our model predicts the expected CLIP embeddings of the ground-level scenery. The predicted CLIP embeddings are then used to learn about the textual space associated with that location. Sat2Cap is also conditioned on date-time information, allowing it to model temporally varying concepts over a location. Our experimental results demonstrate that our models successfully capture ground-level concepts and allow large-scale mapping of fine-grained textual queries. Our approach does not require any text-labeled data, making the training easily scalable. The code, dataset, and models will be made publicly available. △ Less

Submitted 11 April, 2024; v1 submitted 29 July, 2023; originally announced July 2023.

Comments: 16 pages

arXiv:2307.05790 [pdf]

doi 10.1109/IPDPSW.2017.22

Design of an energy aware petaflops class high performance cluster based on power architecture

Authors: W. A. Ahmad, A. Bartolini, F. Beneventi, L. Benini, A. Borghesi, M. Cicala, P. Forestieri, C. Gianfreda, D. Gregori, A. Libri, F. Spiga, S. Tinti

Abstract: In this paper we present D.A.V.I.D.E. (Development for an Added Value Infrastructure Designed in Europe), an innovative and energy efficient High Performance Computing cluster designed by E4 Computer Engineering for PRACE (Partnership for Advanced Computing in Europe). D.A.V.I.D.E. is built using best-in-class components (IBM's POWER8-NVLink CPUs, NVIDIA TESLA P100 GPUs, Mellanox InfiniBand EDR 10… ▽ More In this paper we present D.A.V.I.D.E. (Development for an Added Value Infrastructure Designed in Europe), an innovative and energy efficient High Performance Computing cluster designed by E4 Computer Engineering for PRACE (Partnership for Advanced Computing in Europe). D.A.V.I.D.E. is built using best-in-class components (IBM's POWER8-NVLink CPUs, NVIDIA TESLA P100 GPUs, Mellanox InfiniBand EDR 100 Gb/s networking) plus custom hardware and an innovative system middleware software. D.A.V.I.D.E. features (i) a dedicated power monitor interface, built around the BeagleBone Black Board that allows high frequency sampling directly from the power backplane and scalable integration with the internal node telemetry and system level power management software; (ii) a custom-built chassis, based on OpenRack form factor, and liquid cooling that allows the system to be used in modern, energy efficient, datacenter; (iii) software components designed for enabling fine grain power monitoring, power management (i.e. power capping and energy aware job scheduling) and application power profiling, based on dedicated machine learning components. Software APIs are offered to developers and users to tune the computing node performance and power consumption around on the application requirements. The first pilot system that we will deploy at the beginning of 2017, will demonstrate key HPC applications from different fields ported and optimized for this innovative platform. △ Less

Submitted 11 July, 2023; originally announced July 2023.

arXiv:2307.03906 [pdf, other]

ScriptWorld: Text Based Environment For Learning Procedural Knowledge

Authors: Abhinav Joshi, Areeb Ahmad, Umang Pandey, Ashutosh Modi

Abstract: Text-based games provide a framework for developing natural language understanding and commonsense knowledge about the world in reinforcement learning based agents. Existing text-based environments often rely on fictional situations and characters to create a gaming framework and are far from real-world scenarios. In this paper, we introduce ScriptWorld: a text-based environment for teaching agent… ▽ More Text-based games provide a framework for developing natural language understanding and commonsense knowledge about the world in reinforcement learning based agents. Existing text-based environments often rely on fictional situations and characters to create a gaming framework and are far from real-world scenarios. In this paper, we introduce ScriptWorld: a text-based environment for teaching agents about real-world daily chores and hence imparting commonsense knowledge. To the best of our knowledge, it is the first interactive text-based gaming framework that consists of daily real-world human activities designed using scripts dataset. We provide gaming environments for 10 daily activities and perform a detailed analysis of the proposed environment. We develop RL-based baseline models/agents to play the games in Scriptworld. To understand the role of language models in such environments, we leverage features obtained from pre-trained language models in the RL agents. Our experiments show that prior knowledge obtained from a pre-trained language model helps to solve real-world text-based gaming environments. We release the environment via Github: https://github.com/Exploration-Lab/ScriptWorld △ Less

Submitted 8 July, 2023; originally announced July 2023.

Comments: Accepted at IJCAI 2023, 26 Pages (7 main + 19 for appendix)

arXiv:2307.03882 [pdf, other]

The Busboy Problem: Efficient Tableware Decluttering Using Consolidation and Multi-Object Grasps

Authors: Kishore Srinivas, Shreya Ganti, Rishi Parikh, Ayah Ahmad, Wisdom Agboh, Mehmet Dogar, Ken Goldberg

Abstract: We present the "Busboy Problem": automating an efficient decluttering of cups, bowls, and silverware from a planar surface. As grasping and transporting individual items is highly inefficient, we propose policies to generate grasps for multiple items. We introduce the metric of Objects per Trip (OpT) carried by the robot to the collection bin to analyze the improvement seen as a result of our poli… ▽ More We present the "Busboy Problem": automating an efficient decluttering of cups, bowls, and silverware from a planar surface. As grasping and transporting individual items is highly inefficient, we propose policies to generate grasps for multiple items. We introduce the metric of Objects per Trip (OpT) carried by the robot to the collection bin to analyze the improvement seen as a result of our policies. In physical experiments with singulated items, we find that consolidation and multi-object grasps resulted in an 1.8x improvement in OpT, compared to methods without multi-object grasps. See https://sites.google.com/berkeley.edu/busboyproblem for code and supplemental materials. △ Less

Submitted 7 July, 2023; originally announced July 2023.

arXiv:2306.07229 [pdf, other]

doi 10.1007/s10846-023-01879-2

MRS Drone: A Modular Platform for Real-World Deployment of Aerial Multi-Robot Systems

Authors: Daniel Hert, Tomas Baca, Pavel Petracek, Vit Kratky, Robert Penicka, Vojtech Spurny, Matej Petrlik, Matous Vrba, David Zaitlik, Pavel Stoudek, Viktor Walter, Petr Stepan, Jiri Horyna, Vaclav Pritzl, Martin Sramek, Afzal Ahmad, Giuseppe Silano, Daniel Bonilla Licea, Petr Stibinger, Tiago Nascimento, Martin Saska

Abstract: This paper presents a modular autonomous Unmanned Aerial Vehicle (UAV) platform called the Multi-robot Systems (MRS) Drone that can be used in a large range of indoor and outdoor applications. The MRS Drone features unique modularity with respect to changes in actuators, frames, and sensory configuration. As the name suggests, the platform is specially tailored for deployment within a MRS group. T… ▽ More This paper presents a modular autonomous Unmanned Aerial Vehicle (UAV) platform called the Multi-robot Systems (MRS) Drone that can be used in a large range of indoor and outdoor applications. The MRS Drone features unique modularity with respect to changes in actuators, frames, and sensory configuration. As the name suggests, the platform is specially tailored for deployment within a MRS group. The MRS Drone contributes to the state-of-the-art of UAV platforms by allowing smooth real-world deployment of multiple aerial robots, as well as by outperforming other platforms with its modularity. For real-world multi-robot deployment in various applications, the platform is easy to both assemble and modify. Moreover, it is accompanied by a realistic simulator to enable safe pre-flight testing and a smooth transition to complex real-world experiments. In this manuscript, we present mechanical and electrical designs, software architecture, and technical specifications to build a fully autonomous multi UAV system. Finally, we demonstrate the full capabilities and the unique modularity of the MRS Drone in various real-world applications that required a diverse range of platform configurations. △ Less

Submitted 12 June, 2023; originally announced June 2023.

Comments: 49 pages, 39 figures, accepted for publication to the Journal of Intelligent & Robotic Systems

Journal ref: Journal of Intelligent & Robotic Systems, 2023, vol. 108, issue 64

arXiv:2306.04578 [pdf, other]

A Reference Architecture for Quantum Computing as a Service

Authors: Aakash Ahmad, Ahmed B. Altamimi, Jamal Aqib

Abstract: Quantum computers (QCs) aim to disrupt the status-quo of computing -- replacing traditional systems and platforms that are driven by digital circuits and modular software -- with hardware and software that operates on the principle of quantum mechanics. QCs that rely on quantum mechanics can exploit quantum circuits (i.e., quantum bits for manipulating quantum gates) to achieve "quantum computatio… ▽ More Quantum computers (QCs) aim to disrupt the status-quo of computing -- replacing traditional systems and platforms that are driven by digital circuits and modular software -- with hardware and software that operates on the principle of quantum mechanics. QCs that rely on quantum mechanics can exploit quantum circuits (i.e., quantum bits for manipulating quantum gates) to achieve "quantum computational supremacy" over traditional, i.e., digital computing systems. Currently, the issues that impede mass-scale adoption of quantum systems are rooted in the fact that building, maintaining, and/or programming QCs is a complex and radically distinct engineering paradigm when compared to challenges of classical computing and software engineering. Quantum service orientation is seen as a solution that synergises the research on service computing and quantum software engineering (QSE) to allow developers and users to build and utilise quantum software services based on pay-per-shot utility computing model. The pay-per-shot model represents a single execution of instruction on quantum processing unit and it allows vendors (e.g., Amazon Braket) to offer their QC platforms, simulators, software services etc. to enterprises and individuals who do not need to own or maintain quantum systems. This research contributes by 1) developing a reference architecture for enabling quantum computing as a service, 2) implementing microservices with the quantum-classic split pattern as an architectural use-case, and 3) evaluating the reference architecture based on feedback by 22 practitioners. In the QSE context, the research focuses on unifying architectural methods and service-orientation patterns to promote reuse knowledge and best practices to tackle emerging and futuristic challenges of architecting and implementing Quantum Computing as a Service (QCaaS). △ Less

Submitted 3 June, 2023; originally announced June 2023.

Comments: 25 pages

MSC Class: 14J60 (Primary) 14F05 ACM Class: F.2.2; I.2.7

arXiv:2305.04286 [pdf, ps, other]

Simulation of Dynamic Environments for SLAM

Authors: Elia Bonetto, Chenghao Xu, Aamir Ahmad

Abstract: Simulation engines are widely adopted in robotics. However, they lack either full simulation control, ROS integration, realistic physics, or photorealism. Recently, synthetic data generation and realistic rendering has advanced tasks like target tracking and human pose estimation. However, when focusing on vision applications, there is usually a lack of information like sensor measurements or time… ▽ More Simulation engines are widely adopted in robotics. However, they lack either full simulation control, ROS integration, realistic physics, or photorealism. Recently, synthetic data generation and realistic rendering has advanced tasks like target tracking and human pose estimation. However, when focusing on vision applications, there is usually a lack of information like sensor measurements or time continuity. On the other hand, simulations for most robotics tasks are performed in (semi)static environments, with specific sensors and low visual fidelity. To solve this, we introduced in our previous work a fully customizable framework for generating realistic animated dynamic environments (GRADE) [1]. We use GRADE to generate an indoor dynamic environment dataset and then compare multiple SLAM algorithms on different sequences. By doing that, we show how current research over-relies on known benchmarks, failing to generalize. Our tests with refined YOLO and Mask R-CNN models provide further evidence that additional research in dynamic SLAM is necessary. The code, results, and generated data are provided as open-source at https://eliabntt.github.io/grade-rrSimulation of Dynamic Environments for SLAM △ Less

Submitted 26 May, 2023; v1 submitted 7 May, 2023; originally announced May 2023.

Comments: CITE AS: @inproceedings{ bonetto2023dynamicSLAM, title={{S}imulation of {D}ynamic {E}nvironments for {SLAM}}, author={Elia Bonetto and Chenghao Xu and Aamir Ahmad}, booktitle={ICRA2023 Workshop on Active Methods in Autonomous Navigation}, year={2023}, url={https://robotics.pme.duth.gr/workshop_active2/wp-content/uploads/2023/05/01.-Simulation-of-Dynamic-Environments-for-SLAM.pdf} }. arXiv admin note: substantial text overlap with arXiv:2303.04466

arXiv:2305.04282 [pdf, other]

Learning from synthetic data generated with GRADE

Authors: Elia Bonetto, Chenghao Xu, Aamir Ahmad

Abstract: Recently, synthetic data generation and realistic rendering has advanced tasks like target tracking and human pose estimation. Simulations for most robotics applications are obtained in (semi)static environments, with specific sensors and low visual fidelity. To solve this, we present a fully customizable framework for generating realistic animated dynamic environments (GRADE) for robotics researc… ▽ More Recently, synthetic data generation and realistic rendering has advanced tasks like target tracking and human pose estimation. Simulations for most robotics applications are obtained in (semi)static environments, with specific sensors and low visual fidelity. To solve this, we present a fully customizable framework for generating realistic animated dynamic environments (GRADE) for robotics research, first introduced in [1]. GRADE supports full simulation control, ROS integration, realistic physics, while being in an engine that produces high visual fidelity images and ground truth data. We use GRADE to generate a dataset focused on indoor dynamic scenes with people and flying objects. Using this, we evaluate the performance of YOLO and Mask R-CNN on the tasks of segmenting and detecting people. Our results provide evidence that using data generated with GRADE can improve the model performance when used for a pre-training step. We also show that, even training using only synthetic data, can generalize well to real-world images in the same application domain such as the ones from the TUM-RGBD dataset. The code, results, trained models, and the generated data are provided as open-source at https://eliabntt.github.io/grade-rr. △ Less

Submitted 26 May, 2023; v1 submitted 7 May, 2023; originally announced May 2023.

Comments: ICRA2023 Workshop on Pretraining for Robotics (PT4R) 2023, https://openreview.net/forum?id=SUIOuV2y-Ce . arXiv admin note: substantial text overlap with arXiv:2303.04466

arXiv:2305.00432 [pdf, other]

doi 10.1109/ECMR59166.2023.10256293

Synthetic Data-based Detection of Zebras in Drone Imagery

Authors: Elia Bonetto, Aamir Ahmad

Abstract: Nowadays, there is a wide availability of datasets that enable the training of common object detectors or human detectors. These come in the form of labelled real-world images and require either a significant amount of human effort, with a high probability of errors such as missing labels, or very constrained scenarios, e.g. VICON systems. On the other hand, uncommon scenarios, like aerial views,… ▽ More Nowadays, there is a wide availability of datasets that enable the training of common object detectors or human detectors. These come in the form of labelled real-world images and require either a significant amount of human effort, with a high probability of errors such as missing labels, or very constrained scenarios, e.g. VICON systems. On the other hand, uncommon scenarios, like aerial views, animals, like wild zebras, or difficult-to-obtain information, such as human shapes, are hardly available. To overcome this, synthetic data generation with realistic rendering technologies has recently gained traction and advanced research areas such as target tracking and human pose estimation. However, subjects such as wild animals are still usually not well represented in such datasets. In this work, we first show that a pre-trained YOLO detector can not identify zebras in real images recorded from aerial viewpoints. To solve this, we present an approach for training an animal detector using only synthetic data. We start by generating a novel synthetic zebra dataset using GRADE, a state-of-the-art framework for data generation. The dataset includes RGB, depth, skeletal joint locations, pose, shape and instance segmentations for each subject. We use this to train a YOLO detector from scratch. Through extensive evaluations of our model with real-world data from i) limited datasets available on the internet and ii) a new one collected and manually labelled by us, we show that we can detect zebras by using only synthetic data during training. The code, results, trained models, and both the generated and training data are provided as open-source at https://eliabntt.github.io/grade-rr. △ Less

Submitted 4 July, 2023; v1 submitted 30 April, 2023; originally announced May 2023.

Comments: 8 pages, 7 figures, 3 tables. Published in IEEE ECMR 2023

arXiv:2304.06645 [pdf, other]

Robustness Measures and Monitors for Time Window Temporal Logic

Authors: Ahmad Ahmad, Cristian-Ioan Vasile, Roberto Tron, Calin Belta

Abstract: Temporal logics (TLs) have been widely used to formalize interpretable tasks for cyber-physical systems. Time Window Temporal Logic (TWTL) has been recently proposed as a specification language for dynamical systems. In particular, it can easily express robotic tasks, and it allows for efficient, automata-based verification and synthesis of control policies for such systems. In this paper, we defi… ▽ More Temporal logics (TLs) have been widely used to formalize interpretable tasks for cyber-physical systems. Time Window Temporal Logic (TWTL) has been recently proposed as a specification language for dynamical systems. In particular, it can easily express robotic tasks, and it allows for efficient, automata-based verification and synthesis of control policies for such systems. In this paper, we define two quantitative semantics for this logic, and two corresponding monitoring algorithms, which allow for real-time quantification of satisfaction of formulas by trajectories of discrete-time systems. We demonstrate the new semantics and their runtime monitors on numerical examples. △ Less

Submitted 13 April, 2023; originally announced April 2023.

Comments: Submitted to the 62nd IEEE Conference on Decision and Control (CDC2023)

arXiv:2304.00790 [pdf, other]

LQR-CBF-RRT*: Safe and Optimal Motion Planning

Authors: Guang Yang, Mingyu Cai, Ahmad Ahmad, Amanda Prorok, Roberto Tron, Calin Belta

Abstract: We present LQR-CBF-RRT*, an incremental sampling-based algorithm for offline motion planning. Our framework leverages the strength of Control Barrier Functions (CBFs) and Linear Quadratic Regulators (LQR) to generate safety-critical and optimal trajectories for a robot with dynamics described by an affine control system. CBFs are used for safety guarantees, while LQRs are employed for optimal cont… ▽ More We present LQR-CBF-RRT*, an incremental sampling-based algorithm for offline motion planning. Our framework leverages the strength of Control Barrier Functions (CBFs) and Linear Quadratic Regulators (LQR) to generate safety-critical and optimal trajectories for a robot with dynamics described by an affine control system. CBFs are used for safety guarantees, while LQRs are employed for optimal control synthesis during edge extensions. Popular CBF-based formulations for safety critical control require solving Quadratic Programs (QPs), which can be computationally expensive. Moreover, LQR-based controllers require repetitive applications of first-order Taylor approximations for nonlinear systems, which can also create an additional computational burden. To improve the motion planning efficiency, we verify the satisfaction of the CBF constraints directly in edge extension to avoid the burden of solving the QPs. We store computed optimal LQR gain matrices in a hash table to avoid re-computation during the local linearization of the rewiring procedure. Lastly, we utilize the Cross-Entropy Method for importance sampling to improve sampling efficiency. Our results show that the proposed planner surpasses its counterparts in computational efficiency and performs well in an experimental setup. △ Less

Submitted 27 September, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

arXiv:2303.16898 [pdf, other]

Bagging by Learning to Singulate Layers Using Interactive Perception

Authors: Lawrence Yunliang Chen, Baiyu Shi, Roy Lin, Daniel Seita, Ayah Ahmad, Richard Cheng, Thomas Kollar, David Held, Ken Goldberg

Abstract: Many fabric handling and 2D deformable material tasks in homes and industry require singulating layers of material such as opening a bag or arranging garments for sewing. In contrast to methods requiring specialized sensing or end effectors, we use only visual observations with ordinary parallel jaw grippers. We propose SLIP: Singulating Layers using Interactive Perception, and apply SLIP to the t… ▽ More Many fabric handling and 2D deformable material tasks in homes and industry require singulating layers of material such as opening a bag or arranging garments for sewing. In contrast to methods requiring specialized sensing or end effectors, we use only visual observations with ordinary parallel jaw grippers. We propose SLIP: Singulating Layers using Interactive Perception, and apply SLIP to the task of autonomous bagging. We develop SLIP-Bagging, a bagging algorithm that manipulates a plastic or fabric bag from an unstructured state, and uses SLIP to grasp the top layer of the bag to open it for object insertion. In physical experiments, a YuMi robot achieves a success rate of 67% to 81% across bags of a variety of materials, shapes, and sizes, significantly improving in success rate and generality over prior work. Experiments also suggest that SLIP can be applied to tasks such as singulating layers of folded cloth and garments. Supplementary material is available at https://sites.google.com/view/slip-bagging/. △ Less

Submitted 1 September, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

Comments: IROS 2023

arXiv:2303.15597 [pdf, other]

The Gap between Higher Education and the Software Industry -- A Case Study on Technology Differences

Authors: Felix Dobslaw, Kristian Angelin, Lena-Maria Öberg, Awais Ahmad

Abstract: We see an explosive global labour demand in the Software Industry, and higher education institutions play a crucial role in supplying the industry with professionals with relevant education. Existing literature identifies a gap between what software engineering education teaches students and what the software industry demands. Using our open-sourced Job Market AnalyseR (JMAR) text-analysis tool, w… ▽ More We see an explosive global labour demand in the Software Industry, and higher education institutions play a crucial role in supplying the industry with professionals with relevant education. Existing literature identifies a gap between what software engineering education teaches students and what the software industry demands. Using our open-sourced Job Market AnalyseR (JMAR) text-analysis tool, we compared keywords from higher education course syllabi and job posts to investigate the knowledge gap from a technology-focused departure point. We present a trend analysis of technology in job posts over the past six years in Sweden. We found that demand for cloud and automation technology such as Kubernetes and Docker is rising in job ads but not that much in higher education syllabi. The language used in higher education syllabi and job ads differs where the former emphasizes concepts and the latter technologies more heavily. We discuss possible remedies to bridge this mismatch to draw further conclusions in future work, including calibrating JMAR to other industry-relevant aspects, including soft skills, software concepts, or new demographics. △ Less

Submitted 27 March, 2023; originally announced March 2023.

Comments: 16 pages

arXiv:2303.14713 [pdf, other]

Engineering Software Systems for Quantum Computing as a Service: A Mapping Study

Authors: Aakash Ahmad, Muhammad Waseem, Peng Liang, Mahdi Fehmideh, Arif Ali Khan, David Georg Reichelt, Tommi Mikkonen

Abstract: Quantum systems have started to emerge as a disruptive technology and enabling platforms - exploiting the principles of quantum mechanics - to achieve quantum supremacy in computing. Academic research, industrial projects (e.g., Amazon Braket), and consortiums like 'Quantum Flagship' are striving to develop practically capable and commercially viable quantum computing (QC) systems and technologies… ▽ More Quantum systems have started to emerge as a disruptive technology and enabling platforms - exploiting the principles of quantum mechanics - to achieve quantum supremacy in computing. Academic research, industrial projects (e.g., Amazon Braket), and consortiums like 'Quantum Flagship' are striving to develop practically capable and commercially viable quantum computing (QC) systems and technologies. Quantum Computing as a Service (QCaaS) is viewed as a solution attuned to the philosophy of service-orientation that can offer QC resources and platforms, as utility computing, to individuals and organisations who do not own quantum computers. To understand the quantum service development life cycle and pinpoint emerging trends, we used evidence-based software engineering approach to conduct a systematic mapping study (SMS) of research that enables or enhances QCaaS. The SMS process retrieved a total of 55 studies, and based on their qualitative assessment we selected 9 of them to investigate (i) the functional aspects, design models, patterns, programming languages, deployment platforms, and (ii) trends of emerging research on QCaaS. The results indicate three modelling notations and a catalogue of five design patterns to architect QCaaS, whereas Python (native code or frameworks) and Amazon Braket are the predominant solutions to implement and deploy QCaaS solutions. From the quantum software engineering (QSE) perspective, this SMS provides empirically grounded findings that could help derive processes, patterns, and reference architectures to engineer software services for QC. △ Less

Submitted 26 March, 2023; originally announced March 2023.

Showing 1–50 of 198 results for author: Ahmad, A