subscribe to arXiv mailings

iNeMo: Incremental Neural Mesh Models for Robust Class-Incremental Learning

Authors: Tom Fischer, Yaoyao Liu, Artur Jesslen, Noor Ahmed, Prakhar Kaushik, Angtian Wang, Alan Yuille, Adam Kortylewski, Eddy Ilg

Abstract: Different from human nature, it is still common practice today for vision tasks to train deep learning models only initially and on fixed datasets. A variety of approaches have recently addressed handling continual data streams. However, extending these methods to manage out-of-distribution (OOD) scenarios has not effectively been investigated. On the other hand, it has recently been shown that no… ▽ More Different from human nature, it is still common practice today for vision tasks to train deep learning models only initially and on fixed datasets. A variety of approaches have recently addressed handling continual data streams. However, extending these methods to manage out-of-distribution (OOD) scenarios has not effectively been investigated. On the other hand, it has recently been shown that non-continual neural mesh models exhibit strong performance in generalizing to such OOD scenarios. To leverage this decisive property in a continual learning setting, we propose incremental neural mesh models that can be extended with new meshes over time. In addition, we present a latent space initialization strategy that enables us to allocate feature space for future unseen classes in advance and a positional regularization term that forces the features of the different classes to consistently stay in respective latent space regions. We demonstrate the effectiveness of our method through extensive experiments on the Pascal3D and ObjectNet3D datasets and show that our approach outperforms the baselines for classification by $2-6\%$ in the in-domain and by $6-50\%$ in the OOD setting. Our work also presents the first incremental learning approach for pose estimation. Our code and model can be found at https://github.com/Fischer-Tom/iNeMo. △ Less

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2406.11984 [pdf, other]

Online Pareto-Optimal Decision-Making for Complex Tasks using Active Inference

Authors: Peter Amorese, Shohei Wakayama, Nisar Ahmed, Morteza Lahijanian

Abstract: When a robot autonomously performs a complex task, it frequently must balance competing objectives while maintaining safety. This becomes more difficult in uncertain environments with stochastic outcomes. Enhancing transparency in the robot's behavior and aligning with user preferences are also crucial. This paper introduces a novel framework for multi-objective reinforcement learning that ensures… ▽ More When a robot autonomously performs a complex task, it frequently must balance competing objectives while maintaining safety. This becomes more difficult in uncertain environments with stochastic outcomes. Enhancing transparency in the robot's behavior and aligning with user preferences are also crucial. This paper introduces a novel framework for multi-objective reinforcement learning that ensures safe task execution, optimizes trade-offs between objectives, and adheres to user preferences. The framework has two main layers: a multi-objective task planner and a high-level selector. The planning layer generates a set of optimal trade-off plans that guarantee satisfaction of a temporal logic task. The selector uses active inference to decide which generated plan best complies with user preferences and aids learning. Operating iteratively, the framework updates a parameterized learning model based on collected data. Case studies and benchmarks on both manipulation and mobile robots show that our framework outperforms other methods and (i) learns multiple optimal trade-offs, (ii) adheres to a user preference, and (iii) allows the user to adjust the balance between (i) and (ii). △ Less

Submitted 17 June, 2024; originally announced June 2024.

Comments: 17 pages, 10 figures, submitted to IEEE Transactions on Robotics journal

arXiv:2406.05109 [pdf, other]

Large Generative Graph Models

Authors: Yu Wang, Ryan A. Rossi, Namyong Park, Huiyuan Chen, Nesreen K. Ahmed, Puja Trivedi, Franck Dernoncourt, Danai Koutra, Tyler Derr

Abstract: Large Generative Models (LGMs) such as GPT, Stable Diffusion, Sora, and Suno are trained on a huge amount of language corpus, images, videos, and audio that are extremely diverse from numerous domains. This training paradigm over diverse well-curated data lies at the heart of generating creative and sensible content. However, all previous graph generative models (e.g., GraphRNN, MDVAE, MoFlow, GDS… ▽ More Large Generative Models (LGMs) such as GPT, Stable Diffusion, Sora, and Suno are trained on a huge amount of language corpus, images, videos, and audio that are extremely diverse from numerous domains. This training paradigm over diverse well-curated data lies at the heart of generating creative and sensible content. However, all previous graph generative models (e.g., GraphRNN, MDVAE, MoFlow, GDSS, and DiGress) have been trained only on one dataset each time, which cannot replicate the revolutionary success achieved by LGMs in other fields. To remedy this crucial gap, we propose a new class of graph generative model called Large Graph Generative Model (LGGM) that is trained on a large corpus of graphs (over 5000 graphs) from 13 different domains. We empirically demonstrate that the pre-trained LGGM has superior zero-shot generative capability to existing graph generative models. Furthermore, our pre-trained LGGM can be easily fine-tuned with graphs from target domains and demonstrate even better performance than those directly trained from scratch, behaving as a solid starting point for real-world customization. Inspired by Stable Diffusion, we further equip LGGM with the capability to generate graphs given text prompts (Text-to-Graph), such as the description of the network name and domain (i.e., "The power-1138-bus graph represents a network of buses in a power distribution system."), and network statistics (i.e., "The graph has a low average degree, suitable for modeling social media interactions."). This Text-to-Graph capability integrates the extensive world knowledge in the underlying language model, offering users fine-grained control of the generated graphs. We release the code, the model checkpoint, and the datasets at https://lggm-lg.github.io/. △ Less

Submitted 7 June, 2024; originally announced June 2024.

arXiv:2405.20513 [pdf, other]

Deep Modeling of Non-Gaussian Aleatoric Uncertainty

Authors: Aastha Acharya, Caleb Lee, Marissa D'Alonzo, Jared Shamwell, Nisar R. Ahmed, Rebecca Russell

Abstract: Deep learning offers promising new ways to accurately model aleatoric uncertainty in robotic estimation systems, particularly when the uncertainty distributions do not conform to traditional assumptions of being fixed and Gaussian. In this study, we formulate and evaluate three fundamental deep learning approaches for conditional probability density modeling to quantify non-Gaussian aleatoric unce… ▽ More Deep learning offers promising new ways to accurately model aleatoric uncertainty in robotic estimation systems, particularly when the uncertainty distributions do not conform to traditional assumptions of being fixed and Gaussian. In this study, we formulate and evaluate three fundamental deep learning approaches for conditional probability density modeling to quantify non-Gaussian aleatoric uncertainty: parametric, discretized, and generative modeling. We systematically compare the respective strengths and weaknesses of these three methods on simulated non-Gaussian densities as well as on real-world terrain-relative navigation data. Our results show that these deep learning methods can accurately capture complex uncertainty patterns, highlighting their potential for improving the reliability and robustness of estimation systems. △ Less

Submitted 30 May, 2024; originally announced May 2024.

Comments: 8 pages, 7 figures

arXiv:2405.14185 [pdf, other]

A structure-aware framework for learning device placements on computation graphs

Authors: Shukai Duan, Heng Ping, Nikos Kanakaris, Xiongye Xiao, Peiyu Zhang, Panagiotis Kyriakis, Nesreen K. Ahmed, Guixiang Ma, Mihai Capota, Shahin Nazarian, Theodore L. Willke, Paul Bogdan

Abstract: Existing approaches for device placement ignore the topological features of computation graphs and rely mostly on heuristic methods for graph partitioning. At the same time, they either follow a grouper-placer or an encoder-placer architecture, which requires understanding the interaction structure between code operations. To bridge the gap between encoder-placer and grouper-placer techniques, we… ▽ More Existing approaches for device placement ignore the topological features of computation graphs and rely mostly on heuristic methods for graph partitioning. At the same time, they either follow a grouper-placer or an encoder-placer architecture, which requires understanding the interaction structure between code operations. To bridge the gap between encoder-placer and grouper-placer techniques, we propose a novel framework for the task of device placement, relying on smaller computation graphs extracted from the OpenVINO toolkit using reinforcement learning. The framework consists of five steps, including graph coarsening, node representation learning and policy optimization. It facilitates end-to-end training and takes into consideration the directed and acyclic nature of the computation graphs. We also propose a model variant, inspired by graph parsing networks and complex network analysis, enabling graph representation learning and personalized graph partitioning jointly, using an unspecified number of groups. To train the entire framework, we utilize reinforcement learning techniques by employing the execution time of the suggested device placements to formulate the reward. We demonstrate the flexibility and effectiveness of our approach through multiple experiments with three benchmark models, namely Inception-V3, ResNet, and BERT. The robustness of the proposed framework is also highlighted through an ablation study. The suggested placements improve the inference speed for the benchmark models by up to $58.2\%$ over CPU execution and by up to $60.24\%$ compared to other commonly used baselines. △ Less

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.12193 [pdf, other]

The Narrow Depth and Breadth of Corporate Responsible AI Research

Authors: Nur Ahmed, Amit Das, Kirsten Martin, Kawshik Banerjee

Abstract: The transformative potential of AI presents remarkable opportunities, but also significant risks, underscoring the importance of responsible AI development and deployment. Despite a growing emphasis on this area, there is limited understanding of industry's engagement in responsible AI research, i.e., the critical examination of AI's ethical, social, and legal dimensions. To address this gap, we a… ▽ More The transformative potential of AI presents remarkable opportunities, but also significant risks, underscoring the importance of responsible AI development and deployment. Despite a growing emphasis on this area, there is limited understanding of industry's engagement in responsible AI research, i.e., the critical examination of AI's ethical, social, and legal dimensions. To address this gap, we analyzed over 6 million peer-reviewed articles and 32 million patent citations using multiple methods across five distinct datasets to quantify industry's engagement. Our findings reveal that the majority of AI firms show limited or no engagement in this critical subfield of AI. We show a stark disparity between industry's dominant presence in conventional AI research and its limited engagement in responsible AI. Leading AI firms exhibit significantly lower output in responsible AI research compared to their conventional AI research and the contributions of leading academic institutions. Our linguistic analysis documents a narrower scope of responsible AI research within industry, with a lack of diversity in key topics addressed. Our large-scale patent citation analysis uncovers a pronounced disconnect between responsible AI research and the commercialization of AI technologies, suggesting that industry patents rarely build upon insights generated by the responsible AI literature. This gap highlights the potential for AI development to diverge from a socially optimal path, risking unintended consequences due to insufficient consideration of ethical and societal implications. Our results highlight the urgent need for industry to publicly engage in responsible AI research to absorb academic knowledge, cultivate public trust, and proactively mitigate AI-induced societal harms. △ Less

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2404.06094 [pdf, ps, other]

S-box Security Analysis of NIST Lightweight Cryptography Candidates: A Critical Empirical Study

Authors: Mahnoor Naseer, Sundas Tariq, Naveed Riaz, Naveed Ahmed, Mureed Hussain

Abstract: In the resource-constrained world of the digital landscape, lightweight cryptography plays a critical role in safeguarding information and ensuring the security of various systems, devices, and communication channels. Its efficient and resource-friendly nature makes it the ideal solution for applications where computational power is limited. In response to the growing need for platform-specific im… ▽ More In the resource-constrained world of the digital landscape, lightweight cryptography plays a critical role in safeguarding information and ensuring the security of various systems, devices, and communication channels. Its efficient and resource-friendly nature makes it the ideal solution for applications where computational power is limited. In response to the growing need for platform-specific implementations, NIST issued a call for standardization of Lightweight cryptography algorithms in 2018. Ascon emerged as the winner of this competition. NIST initially established general evaluation criteria for a standard lightweight scheme including security strength, mitigation against side-channel and fault-injection attacks, and implementation efficiency. To verify the security claims, evaluating the individual components used in any cryptographic algorithm is a crucial step. The quality of a substitution box (S-box) significantly impacts the overall security of a cryptographic primitive. This paper analyzes the S-boxes of six finalists in the NIST Lightweight Cryptography (LWC) standardization process. We evaluate them based on well-established cryptographic properties. Our analysis explores how these properties influence the S-boxes' resistance against known cryptanalytic attacks and potential implementation-specific vulnerabilities, thus reflecting on their compliance with NIST's security requirements. △ Less

Submitted 9 April, 2024; originally announced April 2024.

arXiv:2404.01578 [pdf, other]

GLEMOS: Benchmark for Instantaneous Graph Learning Model Selection

Authors: Namyong Park, Ryan Rossi, Xing Wang, Antoine Simoulin, Nesreen Ahmed, Christos Faloutsos

Abstract: The choice of a graph learning (GL) model (i.e., a GL algorithm and its hyperparameter settings) has a significant impact on the performance of downstream tasks. However, selecting the right GL model becomes increasingly difficult and time consuming as more and more GL models are developed. Accordingly, it is of great significance and practical value to equip users of GL with the ability to perfor… ▽ More The choice of a graph learning (GL) model (i.e., a GL algorithm and its hyperparameter settings) has a significant impact on the performance of downstream tasks. However, selecting the right GL model becomes increasingly difficult and time consuming as more and more GL models are developed. Accordingly, it is of great significance and practical value to equip users of GL with the ability to perform a near-instantaneous selection of an effective GL model without manual intervention. Despite the recent attempts to tackle this important problem, there has been no comprehensive benchmark environment to evaluate the performance of GL model selection methods. To bridge this gap, we present GLEMOS in this work, a comprehensive benchmark for instantaneous GL model selection that makes the following contributions. (i) GLEMOS provides extensive benchmark data for fundamental GL tasks, i.e., link prediction and node classification, including the performances of 366 models on 457 graphs on these tasks. (ii) GLEMOS designs multiple evaluation settings, and assesses how effectively representative model selection techniques perform in these different settings. (iii) GLEMOS is designed to be easily extended with new models, new graphs, and new performance records. (iv) Based on the experimental results, we discuss the limitations of existing approaches and highlight future research directions. To promote research on this significant problem, we make the benchmark data and code publicly available at https://github.com/facebookresearch/glemos. △ Less

Submitted 1 April, 2024; originally announced April 2024.

Comments: NeurIPS 2023

arXiv:2403.18550 [pdf, other]

OrCo: Towards Better Generalization via Orthogonality and Contrast for Few-Shot Class-Incremental Learning

Authors: Noor Ahmed, Anna Kukleva, Bernt Schiele

Abstract: Few-Shot Class-Incremental Learning (FSCIL) introduces a paradigm in which the problem space expands with limited data. FSCIL methods inherently face the challenge of catastrophic forgetting as data arrives incrementally, making models susceptible to overwriting previously acquired knowledge. Moreover, given the scarcity of labeled samples available at any given time, models may be prone to overfi… ▽ More Few-Shot Class-Incremental Learning (FSCIL) introduces a paradigm in which the problem space expands with limited data. FSCIL methods inherently face the challenge of catastrophic forgetting as data arrives incrementally, making models susceptible to overwriting previously acquired knowledge. Moreover, given the scarcity of labeled samples available at any given time, models may be prone to overfitting and find it challenging to strike a balance between extensive pretraining and the limited incremental data. To address these challenges, we propose the OrCo framework built on two core principles: features' orthogonality in the representation space, and contrastive learning. In particular, we improve the generalization of the embedding space by employing a combination of supervised and self-supervised contrastive losses during the pretraining phase. Additionally, we introduce OrCo loss to address challenges arising from data limitations during incremental sessions. Through feature space perturbations and orthogonality between classes, the OrCo loss maximizes margins and reserves space for the following incremental data. This, in turn, ensures the accommodation of incoming classes in the feature space without compromising previously acquired knowledge. Our experimental results showcase state-of-the-art performance across three benchmark datasets, including mini-ImageNet, CIFAR100, and CUB datasets. Code is available at https://github.com/noorahmedds/OrCo △ Less

Submitted 27 March, 2024; originally announced March 2024.

arXiv:2403.11004 [pdf, other]

Forward Learning of Graph Neural Networks

Authors: Namyong Park, Xing Wang, Antoine Simoulin, Shuai Yang, Grey Yang, Ryan Rossi, Puja Trivedi, Nesreen Ahmed

Abstract: Graph neural networks (GNNs) have achieved remarkable success across a wide range of applications, such as recommendation, drug discovery, and question answering. Behind the success of GNNs lies the backpropagation (BP) algorithm, which is the de facto standard for training deep neural networks (NNs). However, despite its effectiveness, BP imposes several constraints, which are not only biological… ▽ More Graph neural networks (GNNs) have achieved remarkable success across a wide range of applications, such as recommendation, drug discovery, and question answering. Behind the success of GNNs lies the backpropagation (BP) algorithm, which is the de facto standard for training deep neural networks (NNs). However, despite its effectiveness, BP imposes several constraints, which are not only biologically implausible, but also limit the scalability, parallelism, and flexibility in learning NNs. Examples of such constraints include storage of neural activities computed in the forward pass for use in the subsequent backward pass, and the dependence of parameter updates on non-local signals. To address these limitations, the forward-forward algorithm (FF) was recently proposed as an alternative to BP in the image classification domain, which trains NNs by performing two forward passes over positive and negative data. Inspired by this advance, we propose ForwardGNN in this work, a new forward learning procedure for GNNs, which avoids the constraints imposed by BP via an effective layer-wise local forward training. ForwardGNN extends the original FF to deal with graph data and GNNs, and makes it possible to operate without generating negative inputs (hence no longer forward-forward). Further, ForwardGNN enables each layer to learn from both the bottom-up and top-down signals without relying on the backpropagation of errors. Extensive experiments on real-world datasets show the effectiveness and generality of the proposed forward graph learning framework. We release our code at https://github.com/facebookresearch/forwardgnn. △ Less

Submitted 12 April, 2024; v1 submitted 16 March, 2024; originally announced March 2024.

Comments: ICLR 2024

arXiv:2403.09735 [pdf]

A Sophisticated Framework for the Accurate Detection of Phishing Websites

Authors: Asif Newaz, Farhan Shahriyar Haq, Nadim Ahmed

Abstract: Phishing is an increasingly sophisticated form of cyberattack that is inflicting huge financial damage to corporations throughout the globe while also jeopardizing individuals' privacy. Attackers are constantly devising new methods of launching such assaults and detecting them has become a daunting task. Many different techniques have been suggested, each with its own pros and cons. While machine… ▽ More Phishing is an increasingly sophisticated form of cyberattack that is inflicting huge financial damage to corporations throughout the globe while also jeopardizing individuals' privacy. Attackers are constantly devising new methods of launching such assaults and detecting them has become a daunting task. Many different techniques have been suggested, each with its own pros and cons. While machine learning-based techniques have been most successful in identifying such attacks, they continue to fall short in terms of performance and generalizability. This paper proposes a comprehensive methodology for detecting phishing websites. The goal is to design a system that is capable of accurately distinguishing phishing websites from legitimate ones and provides generalized performance over a broad variety of datasets. A combination of feature selection, greedy algorithm, cross-validation, and deep learning methods have been utilized to construct a sophisticated stacking ensemble classifier. Extensive experimentation on four different phishing datasets was conducted to evaluate the performance of the proposed technique. The proposed algorithm outperformed the other existing phishing detection models obtaining accuracy of 97.49%, 98.23%, 97.48%, and 98.20% on dataset-1 (UCI Phishing Websites Dataset), dataset-2 (Phishing Dataset for Machine Learning: Feature Evaluation), dataset-3 (Phishing Websites Dataset), and dataset-4 (Web page phishing detection), respectively. The high accuracy values obtained across all datasets imply the models' generalizability and effectiveness in the accurate identification of phishing websites. △ Less

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2402.13415 [pdf, other]

Structure Guided Prompt: Instructing Large Language Model in Multi-Step Reasoning by Exploring Graph Structure of the Text

Authors: Kewei Cheng, Nesreen K. Ahmed, Theodore Willke, Yizhou Sun

Abstract: Although Large Language Models (LLMs) excel at addressing straightforward reasoning tasks, they frequently struggle with difficulties when confronted by more complex multi-step reasoning due to a range of factors. Firstly, natural language often encompasses complex relationships among entities, making it challenging to maintain a clear reasoning chain over longer spans. Secondly, the abundance of… ▽ More Although Large Language Models (LLMs) excel at addressing straightforward reasoning tasks, they frequently struggle with difficulties when confronted by more complex multi-step reasoning due to a range of factors. Firstly, natural language often encompasses complex relationships among entities, making it challenging to maintain a clear reasoning chain over longer spans. Secondly, the abundance of linguistic diversity means that the same entities and relationships can be expressed using different terminologies and structures, complicating the task of identifying and establishing connections between multiple pieces of information. Graphs provide an effective solution to represent data rich in relational information and capture long-term dependencies among entities. To harness the potential of graphs, our paper introduces Structure Guided Prompt, an innovative three-stage task-agnostic prompting framework designed to improve the multi-step reasoning capabilities of LLMs in a zero-shot setting. This framework explicitly converts unstructured text into a graph via LLMs and instructs them to navigate this graph using task-specific strategies to formulate responses. By effectively organizing information and guiding navigation, it enables LLMs to provide more accurate and context-aware responses. Our experiments show that this framework significantly enhances the reasoning capabilities of LLMs, enabling them to excel in a broader spectrum of natural language scenarios. △ Less

Submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.09126 [pdf, other]

MPIrigen: MPI Code Generation through Domain-Specific Language Models

Authors: Nadav Schneider, Niranjan Hasabnis, Vy A. Vo, Tal Kadosh, Neva Krien, Mihai Capotă, Guy Tamir, Ted Willke, Nesreen Ahmed, Yuval Pinter, Timothy Mattson, Gal Oren

Abstract: The imperative need to scale computation across numerous nodes highlights the significance of efficient parallel computing, particularly in the realm of Message Passing Interface (MPI) integration. The challenging parallel programming task of generating MPI-based parallel programs has remained unexplored. This study first investigates the performance of state-of-the-art language models in generati… ▽ More The imperative need to scale computation across numerous nodes highlights the significance of efficient parallel computing, particularly in the realm of Message Passing Interface (MPI) integration. The challenging parallel programming task of generating MPI-based parallel programs has remained unexplored. This study first investigates the performance of state-of-the-art language models in generating MPI-based parallel programs. Findings reveal that widely used models such as GPT-3.5 and PolyCoder (specialized multi-lingual code models) exhibit notable performance degradation, when generating MPI-based programs compared to general-purpose programs. In contrast, domain-specific models such as MonoCoder, which are pretrained on MPI-related programming languages of C and C++, outperform larger models. Subsequently, we introduce a dedicated downstream task of MPI-based program generation by fine-tuning MonoCoder on HPCorpusMPI. We call the resulting model as MPIrigen. We propose an innovative preprocessing for completion only after observing the whole code, thus enabling better completion with a wider context. Comparative analysis against GPT-3.5 zero-shot performance, using a novel HPC-oriented evaluation method, demonstrates that MPIrigen excels in generating accurate MPI functions up to 0.8 accuracy in location and function predictions, and with more than 0.9 accuracy for argument predictions. The success of this tailored solution underscores the importance of domain-specific fine-tuning in optimizing language models for parallel computing code generation, paving the way for a new generation of automatic parallelization tools. The sources of this work are available at our GitHub MPIrigen repository: https://github.com/Scientific-Computing-Lab-NRCN/MPI-rigen △ Less

Submitted 23 April, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

arXiv:2402.04750 [pdf, other]

AINS: Affordable Indoor Navigation Solution via Line Color Identification Using Mono-Camera for Autonomous Vehicles

Authors: Nizamuddin Maitlo, Nooruddin Noonari, Kaleem Arshid, Naveed Ahmed, Sathishkumar Duraisamy

Abstract: Recently, researchers have been exploring various ways to improve the effectiveness and efficiency of autonomous vehicles by researching new methods, especially for indoor scenarios. Autonomous Vehicles in indoor navigation systems possess many challenges especially the limited accuracy of GPS in indoor scenarios. Several, robust methods have been explored for autonomous vehicles in indoor scenari… ▽ More Recently, researchers have been exploring various ways to improve the effectiveness and efficiency of autonomous vehicles by researching new methods, especially for indoor scenarios. Autonomous Vehicles in indoor navigation systems possess many challenges especially the limited accuracy of GPS in indoor scenarios. Several, robust methods have been explored for autonomous vehicles in indoor scenarios to solve this problem, but the ineffectiveness of the proposed methods is the high deployment cost. To address the above-mentioned problems we have presented A low-cost indoor navigation method for autonomous vehicles called Affordable Indoor Navigation Solution (AINS) which is based on based on Monocular Camera. Our proposed solution is mainly based on a mono camera without relying on various huge or power-inefficient sensors to find the path, such as range finders and other navigation sensors. Our proposed method shows that we can deploy autonomous vehicles indoor navigation systems while taking into consideration the cost. We can observe that the results shown by our solution are better than existing solutions and we can reduce the estimated error and time consumption. △ Less

Submitted 7 February, 2024; originally announced February 2024.

arXiv:2402.02018 [pdf, other]

The Landscape and Challenges of HPC Research and LLMs

Authors: Le Chen, Nesreen K. Ahmed, Akash Dutta, Arijit Bhattacharjee, Sixing Yu, Quazi Ishtiaque Mahmud, Waqwoya Abebe, Hung Phan, Aishwarya Sarkar, Branden Butler, Niranjan Hasabnis, Gal Oren, Vy A. Vo, Juan Pablo Munoz, Theodore L. Willke, Tim Mattson, Ali Jannesari

Abstract: Recently, language models (LMs), especially large language models (LLMs), have revolutionized the field of deep learning. Both encoder-decoder models and prompt-based techniques have shown immense potential for natural language processing and code-based tasks. Over the past several years, many research labs and institutions have invested heavily in high-performance computing, approaching or breach… ▽ More Recently, language models (LMs), especially large language models (LLMs), have revolutionized the field of deep learning. Both encoder-decoder models and prompt-based techniques have shown immense potential for natural language processing and code-based tasks. Over the past several years, many research labs and institutions have invested heavily in high-performance computing, approaching or breaching exascale performance levels. In this paper, we posit that adapting and utilizing such language model-based techniques for tasks in high-performance computing (HPC) would be very beneficial. This study presents our reasoning behind the aforementioned position and highlights how existing ideas can be improved and adapted for HPC tasks. △ Less

Submitted 6 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

arXiv:2401.16445 [pdf, other]

OMPGPT: A Generative Pre-trained Transformer Model for OpenMP

Authors: Le Chen, Arijit Bhattacharjee, Nesreen Ahmed, Niranjan Hasabnis, Gal Oren, Vy Vo, Ali Jannesari

Abstract: Large language models (LLMs)such as ChatGPT have significantly advanced the field of Natural Language Processing (NLP). This trend led to the development of code-based large language models such as StarCoder, WizardCoder, and CodeLlama, which are trained extensively on vast repositories of code and programming languages. While the generic abilities of these code LLMs are useful for many programmer… ▽ More Large language models (LLMs)such as ChatGPT have significantly advanced the field of Natural Language Processing (NLP). This trend led to the development of code-based large language models such as StarCoder, WizardCoder, and CodeLlama, which are trained extensively on vast repositories of code and programming languages. While the generic abilities of these code LLMs are useful for many programmers in tasks like code generation, the area of high-performance computing (HPC) has a narrower set of requirements that make a smaller and more domain-specific model a smarter choice. This paper presents OMPGPT, a novel domain-specific model meticulously designed to harness the inherent strengths of language models for OpenMP pragma generation. Furthermore, we leverage prompt engineering techniques from the NLP domain to create Chain-of-OMP, an innovative strategy designed to enhance OMPGPT's effectiveness. Our extensive evaluations demonstrate that OMPGPT outperforms existing large language models specialized in OpenMP tasks and maintains a notably smaller size, aligning it more closely with the typical hardware constraints of HPC environments. We consider our contribution as a pivotal bridge, connecting the advantage of language models with the specific demands of HPC tasks. △ Less

Submitted 21 June, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

arXiv:2401.16301 [pdf, other]

Scalable Factor Graph-Based Heterogeneous Bayesian DDF for Dynamic Systems

Authors: Ofer Dagan, Tycho L. Cinquini, Nisar R. Ahmed

Abstract: Heterogeneous Bayesian decentralized data fusion captures the set of problems in which two robots must combine two probability density functions over non-equal, but overlapping sets of random variables. In the context of multi-robot dynamic systems, this enables robots to take a "divide and conquer" approach to reason and share data over complementary tasks instead of over the full joint state spa… ▽ More Heterogeneous Bayesian decentralized data fusion captures the set of problems in which two robots must combine two probability density functions over non-equal, but overlapping sets of random variables. In the context of multi-robot dynamic systems, this enables robots to take a "divide and conquer" approach to reason and share data over complementary tasks instead of over the full joint state space. For example, in a target tracking application, this allows robots to track different subsets of targets and share data on only common targets. This paper presents a framework by which robots can each use a local factor graph to represent relevant partitions of a complex global joint probability distribution, thus allowing them to avoid reasoning over the entirety of a more complex model and saving communication as well as computation costs. From a theoretical point of view, this paper makes contributions by casting the heterogeneous decentralized fusion problem in terms of a factor graph, analyzing the challenges that arise due to dynamic filtering, and then developing a new conservative filtering algorithm that ensures statistical correctness. From a practical point of view, we show how this framework can be used to represent different multi-robot applications and then test it with simulations and hardware experiments to validate and demonstrate its statistical conservativeness, applicability, and robustness to real-world challenges. △ Less

Submitted 29 January, 2024; originally announced January 2024.

Comments: 15 pages, 13 figures, submitted for review at IEEE Transactions on Robotics (T-RO)

arXiv:2312.17420 [pdf, other]

Exact Consistency Tests for Gaussian Mixture Filters using Normalized Deviation Squared Statistics

Authors: Nisar Ahmed, Luke Burks, Kailah Cabral, Alyssa Bekai Rose

Abstract: We consider the problem of evaluating dynamic consistency in discrete time probabilistic filters that approximate stochastic system state densities with Gaussian mixtures. Dynamic consistency means that the estimated probability distributions correctly describe the actual uncertainties. As such, the problem of consistency testing naturally arises in applications with regards to estimator tuning an… ▽ More We consider the problem of evaluating dynamic consistency in discrete time probabilistic filters that approximate stochastic system state densities with Gaussian mixtures. Dynamic consistency means that the estimated probability distributions correctly describe the actual uncertainties. As such, the problem of consistency testing naturally arises in applications with regards to estimator tuning and validation. However, due to the general complexity of the density functions involved, straightforward approaches for consistency testing of mixture-based estimators have remained challenging to define and implement. This paper derives a new exact result for Gaussian mixture consistency testing within the framework of normalized deviation squared (NDS) statistics. It is shown that NDS test statistics for generic multivariate Gaussian mixture models exactly follow mixtures of generalized chi-square distributions, for which efficient computational tools are available. The accuracy and utility of the resulting consistency tests are numerically demonstrated on static and dynamic mixture estimation examples. △ Less

Submitted 14 March, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

Comments: 8 pages, 4 figures; final manuscript to be published 2024 American Control Conference (ACC 2024), corrected small typos and updated Fig. 1 for clarity

arXiv:2312.13322 [pdf, other]

Domain-Specific Code Language Models: Unraveling the Potential for HPC Codes and Tasks

Authors: Tal Kadosh, Niranjan Hasabnis, Vy A. Vo, Nadav Schneider, Neva Krien, Mihai Capota, Abdul Wasay, Nesreen Ahmed, Ted Willke, Guy Tamir, Yuval Pinter, Timothy Mattson, Gal Oren

Abstract: With easier access to powerful compute resources, there is a growing trend in AI for software development to develop larger language models (LLMs) to address a variety of programming tasks. Even LLMs applied to tasks from the high-performance computing (HPC) domain are huge in size and demand expensive compute resources for training. This is partly because these LLMs for HPC tasks are obtained by… ▽ More With easier access to powerful compute resources, there is a growing trend in AI for software development to develop larger language models (LLMs) to address a variety of programming tasks. Even LLMs applied to tasks from the high-performance computing (HPC) domain are huge in size and demand expensive compute resources for training. This is partly because these LLMs for HPC tasks are obtained by finetuning existing LLMs that support several natural and/or programming languages. We found this design choice confusing - why do we need large LMs trained on natural languages and programming languages unrelated to HPC for HPC-specific tasks? In this line of work, we aim to question choices made by existing LLMs by developing smaller LMs for specific domains - we call them domain-specific LMs. Specifically, we start off with HPC as a domain and build an HPC-specific LM, named MonoCoder, that is orders of magnitude smaller than existing LMs but delivers similar, if not better performance, on non-HPC and HPC tasks. Specifically, we pre-trained MonoCoder on an HPC-specific dataset (named HPCorpus) of C and C++ programs mined from GitHub. We evaluated the performance of MonoCoder against conventional multi-lingual LLMs. Results demonstrate that MonoCoder, although much smaller than existing LMs, achieves similar results on normalized-perplexity tests and much better ones in CodeBLEU competence for high-performance and parallel code generations. Furthermore, fine-tuning the base model for the specific task of parallel code generation (OpenMP parallel for pragmas) demonstrates outstanding results compared to GPT, especially when local misleading semantics are removed by our novel pre-processor Tokompiler, showcasing the ability of domain-specific models to assist in HPC-relevant tasks. △ Less

Submitted 20 December, 2023; originally announced December 2023.

arXiv:2312.12583 [pdf, other]

Observation-Augmented Contextual Multi-Armed Bandits for Robotic Exploration with Uncertain Semantic Data

Authors: Shohei Wakayama, Nisar Ahmed

Abstract: For robotic decision-making under uncertainty, the balance between exploitation and exploration of available options must be carefully taken into account. In this study, we introduce a new variant of contextual multi-armed bandits called observation-augmented CMABs (OA-CMABs) wherein a decision-making agent can utilize extra outcome observations from an external information source. CMABs model the… ▽ More For robotic decision-making under uncertainty, the balance between exploitation and exploration of available options must be carefully taken into account. In this study, we introduce a new variant of contextual multi-armed bandits called observation-augmented CMABs (OA-CMABs) wherein a decision-making agent can utilize extra outcome observations from an external information source. CMABs model the expected option outcomes as a function of context features and hidden parameters, which are inferred from previous option outcomes. In OA-CMABs, external observations are also a function of context features and thus provide additional evidence about the hidden parameters. Yet, if an external information source is error-prone, the resulting posterior updates can harm decision-making performance unless the presence of errors is considered. To this end, we propose a robust Bayesian inference process for OA-CMABs that is based on the concept of probabilistic data validation. Our approach handles complex mixture model parameter priors and hybrid observation likelihoods for semantic data sources, allowing us to develop validation algorithms based on recently develop probabilistic semantic data association techniques. Furthermore, to more effectively cope with the combined sources of uncertainty in OA-CMABs, we derive a new active inference algorithm for option selection based on expected free energy minimization. This generalizes previous work on active inference for bandit-based robotic decision-making by accounting for faulty observations and non-Gaussian inference. Our approaches are demonstrated on a simulated asynchronous search site selection problem for space exploration. The results show that even if incorrect observations are provided by external information sources, efficient decision-making and robust parameter inference are still achieved in a wide variety of experimental conditions. △ Less

Submitted 19 December, 2023; originally announced December 2023.

Comments: 8 pages, 9 figures

arXiv:2312.09033 [pdf, other]

Using Surprise Index for Competency Assessment in Autonomous Decision-Making

Authors: Akash Ratheesh, Ofer Dagan, Nisar R. Ahmed, Jay McMahon

Abstract: This paper considers the problem of evaluating an autonomous system's competency in performing a task, particularly when working in dynamic and uncertain environments. The inherent opacity of machine learning models, from the perspective of the user, often described as a `black box', poses a challenge. To overcome this, we propose using a measure called the Surprise index, which leverages availabl… ▽ More This paper considers the problem of evaluating an autonomous system's competency in performing a task, particularly when working in dynamic and uncertain environments. The inherent opacity of machine learning models, from the perspective of the user, often described as a `black box', poses a challenge. To overcome this, we propose using a measure called the Surprise index, which leverages available measurement data to quantify whether the dynamic system performs as expected. We show that the surprise index can be computed in closed form for dynamic systems when observed evidence in a probabilistic model if the joint distribution for that evidence follows a multivariate Gaussian marginal distribution. We then apply it to a nonlinear spacecraft maneuver problem, where actions are chosen by a reinforcement learning agent and show it can indicate how well the trajectory follows the required orbit. △ Less

Submitted 10 January, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

Comments: 10 pages, 5 figures, presented at AIAA SciTech 2024

arXiv:2312.05657 [pdf, other]

Leveraging Reinforcement Learning and Large Language Models for Code Optimization

Authors: Shukai Duan, Nikos Kanakaris, Xiongye Xiao, Heng Ping, Chenyu Zhou, Nesreen K. Ahmed, Guixiang Ma, Mihai Capota, Theodore L. Willke, Shahin Nazarian, Paul Bogdan

Abstract: Code optimization is a daunting task that requires a significant level of expertise from experienced programmers. This level of expertise is not sufficient when compared to the rapid development of new hardware architectures. Towards advancing the whole code optimization process, recent approaches rely on machine learning and artificial intelligence techniques. This paper introduces a new framewor… ▽ More Code optimization is a daunting task that requires a significant level of expertise from experienced programmers. This level of expertise is not sufficient when compared to the rapid development of new hardware architectures. Towards advancing the whole code optimization process, recent approaches rely on machine learning and artificial intelligence techniques. This paper introduces a new framework to decrease the complexity of code optimization. The proposed framework builds on large language models (LLMs) and reinforcement learning (RL) and enables LLMs to receive feedback from their environment (i.e., unit tests) during the fine-tuning process. We compare our framework with existing state-of-the-art models and show that it is more efficient with respect to speed and computational usage, as a result of the decrement in training steps and its applicability to models with fewer parameters. Additionally, our framework reduces the possibility of logical and syntactical errors. Toward evaluating our approach, we run several experiments on the PIE dataset using a CodeT5 language model and RRHF, a new reinforcement learning algorithm. We adopt a variety of evaluation metrics with regards to optimization quality, and speedup. The evaluation results demonstrate that the proposed framework has similar results in comparison with existing models using shorter training times and smaller pre-trained models. In particular, we accomplish an increase of 5.6% and 2.2 over the baseline models concerning the %OP T and SP metrics. △ Less

Submitted 9 December, 2023; originally announced December 2023.

arXiv:2311.17856 [pdf, other]

Leveraging Graph Diffusion Models for Network Refinement Tasks

Authors: Puja Trivedi, Ryan Rossi, David Arbour, Tong Yu, Franck Dernoncourt, Sungchul Kim, Nedim Lipka, Namyong Park, Nesreen K. Ahmed, Danai Koutra

Abstract: Most real-world networks are noisy and incomplete samples from an unknown target distribution. Refining them by correcting corruptions or inferring unobserved regions typically improves downstream performance. Inspired by the impressive generative capabilities that have been used to correct corruptions in images, and the similarities between "in-painting" and filling in missing nodes and edges con… ▽ More Most real-world networks are noisy and incomplete samples from an unknown target distribution. Refining them by correcting corruptions or inferring unobserved regions typically improves downstream performance. Inspired by the impressive generative capabilities that have been used to correct corruptions in images, and the similarities between "in-painting" and filling in missing nodes and edges conditioned on the observed graph, we propose a novel graph generative framework, SGDM, which is based on subgraph diffusion. Our framework not only improves the scalability and fidelity of graph diffusion models, but also leverages the reverse process to perform novel, conditional generation tasks. In particular, through extensive empirical analysis and a set of novel metrics, we demonstrate that our proposed model effectively supports the following refinement tasks for partially observable networks: T1: denoising extraneous subgraphs, T2: expanding existing subgraphs and T3: performing "style" transfer by regenerating a particular subgraph to match the characteristics of a different node or subgraph. △ Less

Submitted 29 November, 2023; originally announced November 2023.

Comments: Work in Progress. 21 pages, 7 figures

arXiv:2311.06505 [pdf, other]

CompCodeVet: A Compiler-guided Validation and Enhancement Approach for Code Dataset

Authors: Le Chen, Arijit Bhattacharjee, Nesreen K. Ahmed, Niranjan Hasabnis, Gal Oren, Bin Lei, Ali Jannesari

Abstract: Large language models (LLMs) have become increasingly prominent in academia and industry due to their remarkable performance in diverse applications. As these models evolve with increasing parameters, they excel in tasks like sentiment analysis and machine translation. However, even models with billions of parameters face challenges in tasks demanding multi-step reasoning. Code generation and comp… ▽ More Large language models (LLMs) have become increasingly prominent in academia and industry due to their remarkable performance in diverse applications. As these models evolve with increasing parameters, they excel in tasks like sentiment analysis and machine translation. However, even models with billions of parameters face challenges in tasks demanding multi-step reasoning. Code generation and comprehension, especially in C and C++, emerge as significant challenges. While LLMs trained on code datasets demonstrate competence in many tasks, they struggle with rectifying non-compilable C and C++ code. Our investigation attributes this subpar performance to two primary factors: the quality of the training dataset and the inherent complexity of the problem which demands intricate reasoning. Existing "Chain of Thought" (CoT) prompting techniques aim to enhance multi-step reasoning. This approach, however, retains the limitations associated with the latent drawbacks of LLMs. In this work, we propose CompCodeVet, a compiler-guided CoT approach to produce compilable code from non-compilable ones. Diverging from the conventional approach of utilizing larger LLMs, we employ compilers as a teacher to establish a more robust zero-shot thought process. The evaluation of CompCodeVet on two open-source code datasets shows that CompCodeVet has the ability to improve the training dataset quality for LLMs. △ Less

Submitted 11 November, 2023; originally announced November 2023.

arXiv:2310.04047 [pdf, other]

AUTOPARLLM: GNN-Guided Automatic Code Parallelization using Large Language Models

Authors: Quazi Ishtiaque Mahmud, Ali TehraniJamsaz, Hung D Phan, Nesreen K. Ahmed, Ali Jannesari

Abstract: Parallelizing sequentially written programs is a challenging task. Even experienced developers need to spend considerable time finding parallelism opportunities and then actually writing parallel versions of sequentially written programs. To address this issue, we present AUTOPARLLM, a framework for automatically discovering parallelism and generating the parallel version of the sequentially writt… ▽ More Parallelizing sequentially written programs is a challenging task. Even experienced developers need to spend considerable time finding parallelism opportunities and then actually writing parallel versions of sequentially written programs. To address this issue, we present AUTOPARLLM, a framework for automatically discovering parallelism and generating the parallel version of the sequentially written program. Our framework consists of two major components: i) a heterogeneous Graph Neural Network (GNN) based parallelism discovery and parallel pattern detection module, and ii) an LLM-based code generator to generate the parallel counterpart of the sequential programs. We use the GNN to learn the flow-aware characteristics of the programs to identify parallel regions in sequential programs and then construct an enhanced prompt using the GNN's results for the LLM-based generator to finally produce the parallel counterparts of the sequential programs. We evaluate AUTOPARLLM on 11 applications of 2 well-known benchmark suites: NAS Parallel Benchmark and Rodinia Benchmark. Our results show that AUTOPARLLM is indeed effective in improving the state-of-the-art LLM-based models for the task of parallel code generation in terms of multiple code generation metrics. AUTOPARLLM also improves the average runtime of the parallel code generated by the state-of-the-art LLMs by as high as 3.4% and 2.9% for the NAS Parallel Benchmark and Rodinia Benchmark respectively. Additionally, to overcome the issue that well-known metrics for translation evaluation have not been optimized to evaluate the quality of the generated parallel code, we propose OMPScore for evaluating the quality of the generated code. We show that OMPScore exhibits a better correlation with human judgment than existing metrics, measured by up to 75% improvement of Spearman correlation. △ Less

Submitted 8 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

Comments: 10 pages

arXiv:2310.01765 [pdf, other]

Data Cleaning and Machine Learning: A Systematic Literature Review

Authors: Pierre-Olivier Côté, Amin Nikanjam, Nafisa Ahmed, Dmytro Humeniuk, Foutse Khomh

Abstract: Context: Machine Learning (ML) is integrated into a growing number of systems for various applications. Because the performance of an ML model is highly dependent on the quality of the data it has been trained on, there is a growing interest in approaches to detect and repair data errors (i.e., data cleaning). Researchers are also exploring how ML can be used for data cleaning; hence creating a du… ▽ More Context: Machine Learning (ML) is integrated into a growing number of systems for various applications. Because the performance of an ML model is highly dependent on the quality of the data it has been trained on, there is a growing interest in approaches to detect and repair data errors (i.e., data cleaning). Researchers are also exploring how ML can be used for data cleaning; hence creating a dual relationship between ML and data cleaning. To the best of our knowledge, there is no study that comprehensively reviews this relationship. Objective: This paper's objectives are twofold. First, it aims to summarize the latest approaches for data cleaning for ML and ML for data cleaning. Second, it provides future work recommendations. Method: We conduct a systematic literature review of the papers published between 2016 and 2022 inclusively. We identify different types of data cleaning activities with and for ML: feature cleaning, label cleaning, entity matching, outlier detection, imputation, and holistic data cleaning. Results: We summarize the content of 101 papers covering various data cleaning activities and provide 24 future work recommendations. Our review highlights many promising data cleaning techniques that can be further extended. Conclusion: We believe that our review of the literature will help the community develop better approaches to clean data. △ Less

Submitted 30 May, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

Comments: Published in the Automated Software Engineering Journal

arXiv:2309.06395 [pdf, other]

Human-Centered Autonomy for UAS Target Search

Authors: Hunter M. Ray, Zakariya Laouar, Zachary Sunberg, Nisar Ahmed

Abstract: Current methods of deploying robots that operate in dynamic, uncertain environments, such as Uncrewed Aerial Systems in search \& rescue missions, require nearly continuous human supervision for vehicle guidance and operation. These methods do not consider high-level mission context resulting in cumbersome manual operation or inefficient exhaustive search patterns. We present a human-centered auto… ▽ More Current methods of deploying robots that operate in dynamic, uncertain environments, such as Uncrewed Aerial Systems in search \& rescue missions, require nearly continuous human supervision for vehicle guidance and operation. These methods do not consider high-level mission context resulting in cumbersome manual operation or inefficient exhaustive search patterns. We present a human-centered autonomous framework that infers geospatial mission context through dynamic feature sets, which then guides a probabilistic target search planner. Operators provide a set of diverse inputs, including priority definition, spatial semantic information about ad-hoc geographical areas, and reference waypoints, which are probabilistically fused with geographical database information and condensed into a geospatial distribution representing an operator's preferences over an area. An online, POMDP-based planner, optimized for target searching, is augmented with this reward map to generate an operator-constrained policy. Our results, simulated based on input from five professional rescuers, display effective task mental model alignment, 18\% more victim finds, and 15 times more efficient guidance plans then current operational methods. △ Less

Submitted 6 March, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

Comments: Accepted to ICRA 2024. 9 pages, 5 figures

arXiv:2309.06364 [pdf, other]

Framework-Based Qualitative Analysis of Free Responses of Large Language Models: Algorithmic Fidelity

Authors: Aliya Amirova, Theodora Fteropoulli, Nafiso Ahmed, Martin R. Cowie, Joel Z. Leibo

Abstract: Today, using Large-scale generative Language Models (LLMs) it is possible to simulate free responses to interview questions like those traditionally analyzed using qualitative research methods. Qualitative methodology encompasses a broad family of techniques involving manual analysis of open-ended interviews or conversations conducted freely in natural language. Here we consider whether artificial… ▽ More Today, using Large-scale generative Language Models (LLMs) it is possible to simulate free responses to interview questions like those traditionally analyzed using qualitative research methods. Qualitative methodology encompasses a broad family of techniques involving manual analysis of open-ended interviews or conversations conducted freely in natural language. Here we consider whether artificial "silicon participants" generated by LLMs may be productively studied using qualitative methods aiming to produce insights that could generalize to real human populations. The key concept in our analysis is algorithmic fidelity, a term introduced by Argyle et al. (2023) capturing the degree to which LLM-generated outputs mirror human sub-populations' beliefs and attitudes. By definition, high algorithmic fidelity suggests latent beliefs elicited from LLMs may generalize to real humans, whereas low algorithmic fidelity renders such research invalid. Here we used an LLM to generate interviews with silicon participants matching specific demographic characteristics one-for-one with a set of human participants. Using framework-based qualitative analysis, we showed the key themes obtained from both human and silicon participants were strikingly similar. However, when we analyzed the structure and tone of the interviews we found even more striking differences. We also found evidence of the hyper-accuracy distortion described by Aher et al. (2023). We conclude that the LLM we tested (GPT-3.5) does not have sufficient algorithmic fidelity to expect research on it to generalize to human populations. However, the rapid pace of LLM research makes it plausible this could change in the future. Thus we stress the need to establish epistemic norms now around how to assess validity of LLM-based qualitative research, especially concerning the need to ensure representation of heterogeneous lived experiences. △ Less

Submitted 4 February, 2024; v1 submitted 6 September, 2023; originally announced September 2023.

Comments: 52 pages, 5 tables, 5 figures

arXiv:2309.00848

Bengali Document Layout Analysis -- A YOLOV8 Based Ensembling Approach

Authors: Nazmus Sakib Ahmed, Saad Sakib Noor, Ashraful Islam Shanto Sikder, Abhijit Paul

Abstract: This paper focuses on enhancing Bengali Document Layout Analysis (DLA) using the YOLOv8 model and innovative post-processing techniques. We tackle challenges unique to the complex Bengali script by employing data augmentation for model robustness. After meticulous validation set evaluation, we fine-tune our approach on the complete dataset, leading to a two-stage prediction strategy for accurate e… ▽ More This paper focuses on enhancing Bengali Document Layout Analysis (DLA) using the YOLOv8 model and innovative post-processing techniques. We tackle challenges unique to the complex Bengali script by employing data augmentation for model robustness. After meticulous validation set evaluation, we fine-tune our approach on the complete dataset, leading to a two-stage prediction strategy for accurate element segmentation. Our ensemble model, combined with post-processing, outperforms individual base architectures, addressing issues identified in the BaDLAD dataset. By leveraging this approach, we aim to advance Bengali document analysis, contributing to improved OCR and document comprehension and BaDLAD serves as a foundational resource for this endeavor, aiding future research in the field. Furthermore, our experiments provided key insights to incorporate new strategies into the established solution. △ Less

Submitted 29 April, 2024; v1 submitted 2 September, 2023; originally announced September 2023.

Comments: Need to review and rework this

arXiv:2309.00770 [pdf, other]

Bias and Fairness in Large Language Models: A Survey

Authors: Isabel O. Gallegos, Ryan A. Rossi, Joe Barrow, Md Mehrab Tanjim, Sungchul Kim, Franck Dernoncourt, Tong Yu, Ruiyi Zhang, Nesreen K. Ahmed

Abstract: Rapid advancements of large language models (LLMs) have enabled the processing, understanding, and generation of human-like text, with increasing integration into systems that touch our social sphere. Despite this success, these models can learn, perpetuate, and amplify harmful social biases. In this paper, we present a comprehensive survey of bias evaluation and mitigation techniques for LLMs. We… ▽ More Rapid advancements of large language models (LLMs) have enabled the processing, understanding, and generation of human-like text, with increasing integration into systems that touch our social sphere. Despite this success, these models can learn, perpetuate, and amplify harmful social biases. In this paper, we present a comprehensive survey of bias evaluation and mitigation techniques for LLMs. We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing, defining distinct facets of harm and introducing several desiderata to operationalize fairness for LLMs. We then unify the literature by proposing three intuitive taxonomies, two for bias evaluation, namely metrics and datasets, and one for mitigation. Our first taxonomy of metrics for bias evaluation disambiguates the relationship between metrics and evaluation datasets, and organizes metrics by the different levels at which they operate in a model: embeddings, probabilities, and generated text. Our second taxonomy of datasets for bias evaluation categorizes datasets by their structure as counterfactual inputs or prompts, and identifies the targeted harms and social groups; we also release a consolidation of publicly-available datasets for improved access. Our third taxonomy of techniques for bias mitigation classifies methods by their intervention during pre-processing, in-training, intra-processing, and post-processing, with granular subcategories that elucidate research trends. Finally, we identify open problems and challenges for future work. Synthesizing a wide range of recent research, we aim to provide a clear guide of the existing literature that empowers researchers and practitioners to better understand and prevent the propagation of bias in LLMs. △ Less

Submitted 12 July, 2024; v1 submitted 1 September, 2023; originally announced September 2023.

Comments: Accepted at Computational Linguistics, Volume 50, Number 3

arXiv:2308.09440 [pdf, other]

Scope is all you need: Transforming LLMs for HPC Code

Authors: Tal Kadosh, Niranjan Hasabnis, Vy A. Vo, Nadav Schneider, Neva Krien, Abdul Wasay, Nesreen Ahmed, Ted Willke, Guy Tamir, Yuval Pinter, Timothy Mattson, Gal Oren

Abstract: With easier access to powerful compute resources, there is a growing trend in the field of AI for software development to develop larger and larger language models (LLMs) to address a variety of programming tasks. Even LLMs applied to tasks from the high-performance computing (HPC) domain are huge in size (e.g., billions of parameters) and demand expensive compute resources for training. We found… ▽ More With easier access to powerful compute resources, there is a growing trend in the field of AI for software development to develop larger and larger language models (LLMs) to address a variety of programming tasks. Even LLMs applied to tasks from the high-performance computing (HPC) domain are huge in size (e.g., billions of parameters) and demand expensive compute resources for training. We found this design choice confusing - why do we need large LLMs trained on natural languages and programming languages unrelated to HPC for HPC-specific tasks? In this line of work, we aim to question design choices made by existing LLMs by developing smaller LLMs for specific domains - we call them domain-specific LLMs. Specifically, we start off with HPC as a domain and propose a novel tokenizer named Tokompiler, designed specifically for preprocessing code in HPC and compilation-centric tasks. Tokompiler leverages knowledge of language primitives to generate language-oriented tokens, providing a context-aware understanding of code structure while avoiding human semantics attributed to code structures completely. We applied Tokompiler to pre-train two state-of-the-art models, SPT-Code and Polycoder, for a Fortran code corpus mined from GitHub. We evaluate the performance of these models against the conventional LLMs. Results demonstrate that Tokompiler significantly enhances code completion accuracy and semantic understanding compared to traditional tokenizers in normalized-perplexity tests, down to ~1 perplexity score. This research opens avenues for further advancements in domain-specific LLMs, catering to the unique demands of HPC and compilation tasks. △ Less

Submitted 29 September, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

arXiv:2307.10594 [pdf, other]

Exploiting Structure for Optimal Multi-Agent Bayesian Decentralized Estimation

Authors: Christopher Funk, Ofer Dagan, Benjamin Noack, Nisar R. Ahmed

Abstract: A key challenge in Bayesian decentralized data fusion is the `rumor propagation' or `double counting' phenomenon, where previously sent data circulates back to its sender. It is often addressed by approximate methods like covariance intersection (CI) which takes a weighted average of the estimates to compute the bound. The problem is that this bound is not tight, i.e. the estimate is often over-co… ▽ More A key challenge in Bayesian decentralized data fusion is the `rumor propagation' or `double counting' phenomenon, where previously sent data circulates back to its sender. It is often addressed by approximate methods like covariance intersection (CI) which takes a weighted average of the estimates to compute the bound. The problem is that this bound is not tight, i.e. the estimate is often over-conservative. In this paper, we show that by exploiting the probabilistic independence structure in multi-agent decentralized fusion problems a tighter bound can be found using (i) an expansion to the CI algorithm that uses multiple (non-monolithic) weighting factors instead of one (monolithic) factor in the original CI and (ii) a general optimization scheme that is able to compute optimal bounds and fully exploit an arbitrary dependency structure. We compare our methods and show that on a simple problem, they converge to the same solution. We then test our new non-monolithic CI algorithm on a large-scale target tracking simulation and show that it achieves a tighter bound and a more accurate estimate compared to the original monolithic CI. △ Less

Submitted 20 July, 2023; originally announced July 2023.

Comments: 4 pages, 4 figures. presented at the Inference and Decision Making for Autonomous Vehicles (IDMAV) RSS 2023 workshop

arXiv:2307.09857 [pdf]

Blind Image Quality Assessment Using Multi-Stream Architecture with Spatial and Channel Attention

Authors: Hassan Khalid, Nisar Ahmed

Abstract: BIQA (Blind Image Quality Assessment) is an important field of study that evaluates images automatically. Although significant progress has been made, blind image quality assessment remains a difficult task since images vary in content and distortions. Most algorithms generate quality without emphasizing the important region of interest. In order to solve this, a multi-stream spatial and channel a… ▽ More BIQA (Blind Image Quality Assessment) is an important field of study that evaluates images automatically. Although significant progress has been made, blind image quality assessment remains a difficult task since images vary in content and distortions. Most algorithms generate quality without emphasizing the important region of interest. In order to solve this, a multi-stream spatial and channel attention-based algorithm is being proposed. This algorithm generates more accurate predictions with a high correlation to human perceptual assessment by combining hybrid features from two different backbones, followed by spatial and channel attention to provide high weights to the region of interest. Four legacy image quality assessment datasets are used to validate the effectiveness of our proposed approach. Authentic and synthetic distortion image databases are used to demonstrate the effectiveness of the proposed method, and we show that it has excellent generalization properties with a particular focus on the perceptual foreground information. △ Less

Submitted 31 July, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

Comments: submitted in Expert Systems with Applications

arXiv:2307.03929 [pdf, other]

Fairness-Aware Graph Neural Networks: A Survey

Authors: April Chen, Ryan A. Rossi, Namyong Park, Puja Trivedi, Yu Wang, Tong Yu, Sungchul Kim, Franck Dernoncourt, Nesreen K. Ahmed

Abstract: Graph Neural Networks (GNNs) have become increasingly important due to their representational power and state-of-the-art predictive performance on many fundamental learning tasks. Despite this success, GNNs suffer from fairness issues that arise as a result of the underlying graph data and the fundamental aggregation mechanism that lies at the heart of the large class of GNN models. In this articl… ▽ More Graph Neural Networks (GNNs) have become increasingly important due to their representational power and state-of-the-art predictive performance on many fundamental learning tasks. Despite this success, GNNs suffer from fairness issues that arise as a result of the underlying graph data and the fundamental aggregation mechanism that lies at the heart of the large class of GNN models. In this article, we examine and categorize fairness techniques for improving the fairness of GNNs. Previous work on fair GNN models and techniques are discussed in terms of whether they focus on improving fairness during a preprocessing step, during training, or in a post-processing phase. Furthermore, we discuss how such techniques can be used together whenever appropriate, and highlight the advantages and intuition as well. We also introduce an intuitive taxonomy for fairness evaluation metrics including graph-level fairness, neighborhood-level fairness, embedding-level fairness, and prediction-level fairness metrics. In addition, graph datasets that are useful for benchmarking the fairness of GNN models are summarized succinctly. Finally, we highlight key open problems and challenges that remain to be addressed. △ Less

Submitted 8 July, 2023; originally announced July 2023.

arXiv:2306.09822 [pdf, other]

Lightweight Attribute Localizing Models for Pedestrian Attribute Recognition

Authors: Ashish Jha, Dimitrii Ermilov, Konstantin Sobolev, Anh Huy Phan, Salman Ahmadi-Asl, Naveed Ahmed, Imran Junejo, Zaher AL Aghbari, Thar Baker, Ahmed Mohamed Khedr, Andrzej Cichocki

Abstract: Pedestrian Attribute Recognition (PAR) deals with the problem of identifying features in a pedestrian image. It has found interesting applications in person retrieval, suspect re-identification and soft biometrics. In the past few years, several Deep Neural Networks (DNNs) have been designed to solve the task; however, the developed DNNs predominantly suffer from over-parameterization and high com… ▽ More Pedestrian Attribute Recognition (PAR) deals with the problem of identifying features in a pedestrian image. It has found interesting applications in person retrieval, suspect re-identification and soft biometrics. In the past few years, several Deep Neural Networks (DNNs) have been designed to solve the task; however, the developed DNNs predominantly suffer from over-parameterization and high computational complexity. These problems hinder them from being exploited in resource-constrained embedded devices with limited memory and computational capacity. By reducing a network's layers using effective compression techniques, such as tensor decomposition, neural network compression is an effective method to tackle these problems. We propose novel Lightweight Attribute Localizing Models (LWALM) for Pedestrian Attribute Recognition (PAR). LWALM is a compressed neural network obtained after effective layer-wise compression of the Attribute Localization Model (ALM) using the Canonical Polyadic Decomposition with Error Preserving Correction (CPD-EPC) algorithm. △ Less

Submitted 16 June, 2023; originally announced June 2023.

arXiv:2306.07225 [pdf, other]

Kalman Filter Auto-tuning through Enforcing Chi-Squared Normalized Error Distributions with Bayesian Optimization

Authors: Zhaozhong Chen, Harel Biggie, Nisar Ahmed, Simon Julier, Christoffer Heckman

Abstract: The nonlinear and stochastic relationship between noise covariance parameter values and state estimator performance makes optimal filter tuning a very challenging problem. Popular optimization-based tuning approaches can easily get trapped in local minima, leading to poor noise parameter identification and suboptimal state estimation. Recently, black box techniques based on Bayesian optimization w… ▽ More The nonlinear and stochastic relationship between noise covariance parameter values and state estimator performance makes optimal filter tuning a very challenging problem. Popular optimization-based tuning approaches can easily get trapped in local minima, leading to poor noise parameter identification and suboptimal state estimation. Recently, black box techniques based on Bayesian optimization with Gaussian processes (GPBO) have been shown to overcome many of these issues, using normalized estimation error squared (NEES) and normalized innovation error (NIS) statistics to derive cost functions for Kalman filter auto-tuning. While reliable noise parameter estimates are obtained in many cases, GPBO solutions obtained with these conventional cost functions do not always converge to optimal filter noise parameters and lack robustness to parameter ambiguities in time-discretized system models. This paper addresses these issues by making two main contributions. First, we show that NIS and NEES errors are only chi-squared distributed for tuned estimators. As a result, chi-square tests are not sufficient to ensure that an estimator has been correctly tuned. We use this to extend the familiar consistency tests for NIS and NEES to penalize if the distribution is not chi-squared distributed. Second, this cost measure is applied within a Student-t processes Bayesian Optimization (TPBO) to achieve robust estimator performance for time discretized state space models. The robustness, accuracy, and reliability of our approach are illustrated on classical state estimation problems. △ Less

Submitted 12 June, 2023; originally announced June 2023.

Comments: 34 pages, 9 figures, submitted to IEEE Transactions on Aerospace and Electronic Systems

arXiv:2306.04570 [pdf, other]

Towards Decentralized Heterogeneous Multi-Robot SLAM and Target Tracking

Authors: Ofer Dagan, Tycho L. Cinquini, Luke Morrissey, Kristen Such, Nisar R. Ahmed, Christoffer Heckman

Abstract: In many robotics problems, there is a significant gain in collaborative information sharing between multiple robots, for exploration, search and rescue, tracking multiple targets, or mapping large environments. One of the key implicit assumptions when solving cooperative multi-robot problems is that all robots use the same (homogeneous) underlying algorithm. However, in practice, we want to allow… ▽ More In many robotics problems, there is a significant gain in collaborative information sharing between multiple robots, for exploration, search and rescue, tracking multiple targets, or mapping large environments. One of the key implicit assumptions when solving cooperative multi-robot problems is that all robots use the same (homogeneous) underlying algorithm. However, in practice, we want to allow collaboration between robots possessing different capabilities and that therefore must rely on heterogeneous algorithms. We present a system architecture and the supporting theory, to enable collaboration in a decentralized network of robots, where each robot relies on different estimation algorithms. To develop our approach, we focus on multi-robot simultaneous localization and mapping (SLAM) with multi-target tracking. Our theoretical framework builds on our idea of exploiting the conditional independence structure inherent to many robotics applications to separate between each robot's local inference (estimation) tasks and fuse only relevant parts of their non-equal, but overlapping probability density function (pdfs). We present a new decentralized graph-based approach to the multi-robot SLAM and tracking problem. We leverage factor graphs to split between different parts of the problem for efficient data sharing between robots in the network while enabling robots to use different local sparse landmark/dense/metric-semantic SLAM algorithms. △ Less

Submitted 7 June, 2023; originally announced June 2023.

Comments: 5 pages, 2 figures, presented at the ICRA 2023 workshop on "Distributed Graph Algorithms for Robotics"

arXiv:2306.00210 [pdf, other]

PERFOGRAPH: A Numerical Aware Program Graph Representation for Performance Optimization and Program Analysis

Authors: Ali TehraniJamsaz, Quazi Ishtiaque Mahmud, Le Chen, Nesreen K. Ahmed, Ali Jannesari

Abstract: The remarkable growth and significant success of machine learning have expanded its applications into programming languages and program analysis. However, a key challenge in adopting the latest machine learning methods is the representation of programming languages, which directly impacts the ability of machine learning methods to reason about programs. The absence of numerical awareness, aggregat… ▽ More The remarkable growth and significant success of machine learning have expanded its applications into programming languages and program analysis. However, a key challenge in adopting the latest machine learning methods is the representation of programming languages, which directly impacts the ability of machine learning methods to reason about programs. The absence of numerical awareness, aggregate data structure information, and improper way of presenting variables in previous representation works have limited their performances. To overcome the limitations and challenges of current program representations, we propose a graph-based program representation called PERFOGRAPH. PERFOGRAPH can capture numerical information and the aggregate data structure by introducing new nodes and edges. Furthermore, we propose an adapted embedding method to incorporate numerical awareness. These enhancements make PERFOGRAPH a highly flexible and scalable representation that effectively captures programs intricate dependencies and semantics. Consequently, it serves as a powerful tool for various applications such as program analysis, performance optimization, and parallelism discovery. Our experimental results demonstrate that PERFOGRAPH outperforms existing representations and sets new state-of-the-art results by reducing the error rate by 7.4% (AMD dataset) and 10% (NVIDIA dataset) in the well-known Device Mapping challenge. It also sets new state-of-the-art results in various performance optimization tasks like Parallelism Discovery and NUMA and Prefetchers Configuration prediction. △ Less

Submitted 29 November, 2023; v1 submitted 31 May, 2023; originally announced June 2023.

arXiv:2305.18387 [pdf]

Augmenting Character Designers Creativity Using Generative Adversarial Networks

Authors: Mohammad Lataifeh, Xavier Carrasco, Ashraf Elnagar, Naveed Ahmed

Abstract: Recent advances in Generative Adversarial Networks (GANs) continue to attract the attention of researchers in different fields due to the wide range of applications devised to take advantage of their key features. Most recent GANs are focused on realism, however, generating hyper-realistic output is not a priority for some domains, as in the case of this work. The generated outcomes are used here… ▽ More Recent advances in Generative Adversarial Networks (GANs) continue to attract the attention of researchers in different fields due to the wide range of applications devised to take advantage of their key features. Most recent GANs are focused on realism, however, generating hyper-realistic output is not a priority for some domains, as in the case of this work. The generated outcomes are used here as cognitive components to augment character designers creativity while conceptualizing new characters for different multimedia projects. To select the best-suited GANs for such a creative context, we first present a comparison between different GAN architectures and their performance when trained from scratch on a new visual characters dataset using a single Graphics Processing Unit. We also explore alternative techniques, such as transfer learning and data augmentation, to overcome computational resource limitations, a challenge faced by many researchers in the domain. Additionally, mixed methods are used to evaluate the cognitive value of the generated visuals on character designers agency conceptualizing new characters. The results discussed proved highly effective for this context, as demonstrated by early adaptations to the characters design process. As an extension for this work, the presented approach will be further evaluated as a novel co-design process between humans and machines to investigate where and how the generated concepts are interacting with and influencing the design process outcome. △ Less

Submitted 28 May, 2023; originally announced May 2023.

Comments: 18 pages

Journal ref: Preprint- ICR'23 - The Second International Conference on Innovations in Computing Research, 2023

arXiv:2305.12932 [pdf, ps, other]

Forecasting Irregularly Sampled Time Series using Graphs

Authors: Vijaya Krishna Yalavarthi, Kiran Madhusudhanan, Randolf Sholz, Nourhan Ahmed, Johannes Burchert, Shayan Jawed, Stefan Born, Lars Schmidt-Thieme

Abstract: Forecasting irregularly sampled time series with missing values is a crucial task for numerous real-world applications such as healthcare, astronomy, and climate sciences. State-of-the-art approaches to this problem rely on Ordinary Differential Equations (ODEs) which are known to be slow and often require additional features to handle missing values. To address this issue, we propose a novel mode… ▽ More Forecasting irregularly sampled time series with missing values is a crucial task for numerous real-world applications such as healthcare, astronomy, and climate sciences. State-of-the-art approaches to this problem rely on Ordinary Differential Equations (ODEs) which are known to be slow and often require additional features to handle missing values. To address this issue, we propose a novel model using Graphs for Forecasting Irregularly Sampled Time Series with missing values which we call GraFITi. GraFITi first converts the time series to a Sparsity Structure Graph which is a sparse bipartite graph, and then reformulates the forecasting problem as the edge weight prediction task in the graph. It uses the power of Graph Neural Networks to learn the graph and predict the target edge weights. GraFITi has been tested on 3 real-world and 1 synthetic irregularly sampled time series dataset with missing values and compared with various state-of-the-art models. The experimental results demonstrate that GraFITi improves the forecasting accuracy by up to 17% and reduces the run time up to 5 times compared to the state-of-the-art forecasting models. △ Less

Submitted 10 August, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

arXiv:2305.09214 [pdf]

doi 10.1007/s11042-020-10286-w

PIQI: Perceptual Image Quality Index based on Ensemble of Gaussian Process Regression

Authors: Nisar Ahmed, Hafiz Muhammad Shahzad Asif, Hassan Khalid

Abstract: Digital images contain a lot of redundancies, therefore, compression techniques are applied to reduce the image size without loss of reasonable image quality. Same become more prominent in the case of videos which contains image sequences and higher compression ratios are achieved in low throughput networks. Assessment of quality of images in such scenarios has become of particular interest. Subje… ▽ More Digital images contain a lot of redundancies, therefore, compression techniques are applied to reduce the image size without loss of reasonable image quality. Same become more prominent in the case of videos which contains image sequences and higher compression ratios are achieved in low throughput networks. Assessment of quality of images in such scenarios has become of particular interest. Subjective evaluation in most of the scenarios is infeasible so objective evaluation is preferred. Among the three objective quality measures, full-reference and reduced-reference methods require an original image in some form to calculate the image quality which is unfeasible in scenarios such as broadcasting, acquisition or enhancement. Therefore, a no-reference Perceptual Image Quality Index (PIQI) is proposed in this paper to assess the quality of digital images which calculates luminance and gradient statistics along with mean subtracted contrast normalized products in multiple scales and color spaces. These extracted features are provided to a stacked ensemble of Gaussian Process Regression (GPR) to perform the perceptual quality evaluation. The performance of the PIQI is checked on six benchmark databases and compared with twelve state-of-the-art methods and competitive results are achieved. The comparison is made based on RMSE, Pearson and Spearman correlation coefficients between ground truth and predicted quality scores. The scores of 0.0552, 0.9802 and 0.9776 are achieved respectively for these metrics on CSIQ database. Two cross-dataset evaluation experiments are performed to check the generalization of PIQI. △ Less

Submitted 16 May, 2023; originally announced May 2023.

Journal ref: AMultimed Tools Appl 80, 15677 to 15700 (2021)

arXiv:2305.09141 [pdf]

doi 10.1007/s00500-021-06662-9

Deep Ensembling for Perceptual Image Quality Assessment

Authors: Nisar Ahmed, H. M. Shahzad Asif, Abdul Rauf Bhatti, Atif Khan

Abstract: Blind image quality assessment is a challenging task particularly due to the unavailability of reference information. Training a deep neural network requires a large amount of training data which is not readily available for image quality. Transfer learning is usually opted to overcome this limitation and different deep architectures are used for this purpose as they learn features differently. Af… ▽ More Blind image quality assessment is a challenging task particularly due to the unavailability of reference information. Training a deep neural network requires a large amount of training data which is not readily available for image quality. Transfer learning is usually opted to overcome this limitation and different deep architectures are used for this purpose as they learn features differently. After extensive experiments, we have designed a deep architecture containing two CNN architectures as its sub-units. Moreover, a self-collected image database BIQ2021 is proposed with 12,000 images having natural distortions. The self-collected database is subjectively scored and is used for model training and validation. It is demonstrated that synthetic distortion databases cannot provide generalization beyond the distortion types used in the database and they are not ideal candidates for general-purpose image quality assessment. Moreover, a large-scale database of 18.75 million images with synthetic distortions is used to pretrain the model and then retrain it on benchmark databases for evaluation. Experiments are conducted on six benchmark databases three of which are synthetic distortion databases (LIVE, CSIQ and TID2013) and three are natural distortion databases (LIVE Challenge Database, CID2013 and KonIQ-10 k). The proposed approach has provided a Pearson correlation coefficient of 0.8992, 0.8472 and 0.9452 subsequently and Spearman correlation coefficient of 0.8863, 0.8408 and 0.9421. Moreover, the performance is demonstrated using perceptually weighted rank correlation to indicate the perceptual superiority of the proposed approach. Multiple experiments are conducted to validate the generalization performance of the proposed model by training on different subsets of the databases and validating on the test subset of BIQ2021 database. △ Less

Submitted 15 May, 2023; originally announced May 2023.

Journal ref: Soft Comput 26, 7601 to 7622 (2022)

arXiv:2305.05779 [pdf, other]

Learning to Parallelize with OpenMP by Augmented Heterogeneous AST Representation

Authors: Le Chen, Quazi Ishtiaque Mahmud, Hung Phan, Nesreen K. Ahmed, Ali Jannesari

Abstract: Detecting parallelizable code regions is a challenging task, even for experienced developers. Numerous recent studies have explored the use of machine learning for code analysis and program synthesis, including parallelization, in light of the success of machine learning in natural language processing. However, applying machine learning techniques to parallelism detection presents several challeng… ▽ More Detecting parallelizable code regions is a challenging task, even for experienced developers. Numerous recent studies have explored the use of machine learning for code analysis and program synthesis, including parallelization, in light of the success of machine learning in natural language processing. However, applying machine learning techniques to parallelism detection presents several challenges, such as the lack of an adequate dataset for training, an effective code representation with rich information, and a suitable machine learning model to learn the latent features of code for diverse analyses. To address these challenges, we propose a novel graph-based learning approach called Graph2Par that utilizes a heterogeneous augmented abstract syntax tree (Augmented-AST) representation for code. The proposed approach primarily focused on loop-level parallelization with OpenMP. Moreover, we create an OMP\_Serial dataset with 18598 parallelizable and 13972 non-parallelizable loops to train the machine learning models. Our results show that our proposed approach achieves the accuracy of parallelizable code region detection with 85\% accuracy and outperforms the state-of-the-art token-based machine learning approach. These results indicate that our approach is competitive with state-of-the-art tools and capable of handling loops with complex structures that other tools may overlook. △ Less

Submitted 9 May, 2023; originally announced May 2023.

arXiv:2305.02061 [pdf, other]

Attention Based Feature Fusion For Multi-Agent Collaborative Perception

Authors: Ahmed N. Ahmed, Siegfried Mercelis, Ali Anwar

Abstract: In the domain of intelligent transportation systems (ITS), collaborative perception has emerged as a promising approach to overcome the limitations of individual perception by enabling multiple agents to exchange information, thus enhancing their situational awareness. Collaborative perception overcomes the limitations of individual sensors, allowing connected agents to perceive environments beyon… ▽ More In the domain of intelligent transportation systems (ITS), collaborative perception has emerged as a promising approach to overcome the limitations of individual perception by enabling multiple agents to exchange information, thus enhancing their situational awareness. Collaborative perception overcomes the limitations of individual sensors, allowing connected agents to perceive environments beyond their line-of-sight and field of view. However, the reliability of collaborative perception heavily depends on the data aggregation strategy and communication bandwidth, which must overcome the challenges posed by limited network resources. To improve the precision of object detection and alleviate limited network resources, we propose an intermediate collaborative perception solution in the form of a graph attention network (GAT). The proposed approach develops an attention-based aggregation strategy to fuse intermediate representations exchanged among multiple connected agents. This approach adaptively highlights important regions in the intermediate feature maps at both the channel and spatial levels, resulting in improved object detection precision. We propose a feature fusion scheme using attention-based architectures and evaluate the results quantitatively in comparison to other state-of-the-art collaborative perception approaches. Our proposed approach is validated using the V2XSim dataset. The results of this work demonstrate the efficacy of the proposed approach for intermediate collaborative perception in improving object detection average precision while reducing network resource usage. △ Less

Submitted 3 May, 2023; originally announced May 2023.

arXiv:2303.11476 [pdf, other]

Chance-Constrained Multi-Robot Motion Planning under Gaussian Uncertainties

Authors: Anne Theurkauf, Justin Kottinger, Nisar Ahmed, Morteza Lahijanian

Abstract: We consider a chance-constrained multi-robot motion planning problem in the presence of Gaussian motion and sensor noise. Our proposed algorithm, CC-K-CBS, leverages the scalability of kinodynamic conflict-based search (K-CBS) in conjunction with the efficiency of the Gaussian belief trees used in the Belief-A framework, and inherits the completeness guarantees of Belief-A's low-level sampling-bas… ▽ More We consider a chance-constrained multi-robot motion planning problem in the presence of Gaussian motion and sensor noise. Our proposed algorithm, CC-K-CBS, leverages the scalability of kinodynamic conflict-based search (K-CBS) in conjunction with the efficiency of the Gaussian belief trees used in the Belief-A framework, and inherits the completeness guarantees of Belief-A's low-level sampling-based planner. We also develop three different methods for robot-robot probabilistic collision checking, which trade off computation with accuracy. Our algorithm generates motion plans driving each robot from its initial state to its goal while accounting for the evolution of its uncertainty with chance-constrained safety guarantees. Benchmarks compare computation time to conservatism of the collision checkers, in addition to characterizing the performance of the planner as a whole. Results show that CC-K-CBS can scale up to 30 robots. △ Less

Submitted 7 August, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

Comments: Submitted to Robotics and Automation Letters in July 2023

arXiv:2303.07514 [pdf]

doi 10.1109/BigData55660.2022.10021025

Handwritten Word Recognition using Deep Learning Approach: A Novel Way of Generating Handwritten Words

Authors: Mst Shapna Akter, Hossain Shahriar, Alfredo Cuzzocrea, Nova Ahmed, Carson Leung

Abstract: A handwritten word recognition system comes with issues such as lack of large and diverse datasets. It is necessary to resolve such issues since millions of official documents can be digitized by training deep learning models using a large and diverse dataset. Due to the lack of data availability, the trained model does not give the expected result. Thus, it has a high chance of showing poor resul… ▽ More A handwritten word recognition system comes with issues such as lack of large and diverse datasets. It is necessary to resolve such issues since millions of official documents can be digitized by training deep learning models using a large and diverse dataset. Due to the lack of data availability, the trained model does not give the expected result. Thus, it has a high chance of showing poor results. This paper proposes a novel way of generating diverse handwritten word images using handwritten characters. The idea of our project is to train the BiLSTM-CTC architecture with generated synthetic handwritten words. The whole approach shows the process of generating two types of large and diverse handwritten word datasets: overlapped and non-overlapped. Since handwritten words also have issues like overlapping between two characters, we have tried to put it into our experimental part. We have also demonstrated the process of recognizing handwritten documents using the deep learning model. For the experiments, we have targeted the Bangla language, which lacks the handwritten word dataset, and can be followed for any language. Our approach is less complex and less costly than traditional GAN models. Finally, we have evaluated our model using Word Error Rate (WER), accuracy, f1-score, precision, and recall metrics. The model gives 39% WER score, 92% percent accuracy, and 92% percent f1 scores using non-overlapped data and 63% percent WER score, 83% percent accuracy, and 85% percent f1 scores using overlapped data. △ Less

Submitted 13 March, 2023; originally announced March 2023.

arXiv:2303.07484 [pdf]

doi 10.1109/BigData55660.2022.10020249

Deep Learning Approach for Classifying the Aggressive Comments on Social Media: Machine Translated Data Vs Real Life Data

Authors: Mst Shapna Akter, Hossain Shahriar, Nova Ahmed, Alfredo Cuzzocrea

Abstract: Aggressive comments on social media negatively impact human life. Such offensive contents are responsible for depression and suicidal-related activities. Since online social networking is increasing day by day, the hate content is also increasing. Several investigations have been done on the domain of cyberbullying, cyberaggression, hate speech, etc. The majority of the inquiry has been done in th… ▽ More Aggressive comments on social media negatively impact human life. Such offensive contents are responsible for depression and suicidal-related activities. Since online social networking is increasing day by day, the hate content is also increasing. Several investigations have been done on the domain of cyberbullying, cyberaggression, hate speech, etc. The majority of the inquiry has been done in the English language. Some languages (Hindi and Bangla) still lack proper investigations due to the lack of a dataset. This paper particularly worked on the Hindi, Bangla, and English datasets to detect aggressive comments and have shown a novel way of generating machine-translated data to resolve data unavailability issues. A fully machine-translated English dataset has been analyzed with the models such as the Long Short term memory model (LSTM), Bidirectional Long-short term memory model (BiLSTM), LSTM-Autoencoder, word2vec, Bidirectional Encoder Representations from Transformers (BERT), and generative pre-trained transformer (GPT-2) to make an observation on how the models perform on a machine-translated noisy dataset. We have compared the performance of using the noisy data with two more datasets such as raw data, which does not contain any noises, and semi-noisy data, which contains a certain amount of noisy data. We have classified both the raw and semi-noisy data using the aforementioned models. To evaluate the performance of the models, we have used evaluation metrics such as F1-score,accuracy, precision, and recall. We have achieved the highest accuracy on raw data using the gpt2 model, semi-noisy data using the BERT model, and fully machine-translated data using the BERT model. Since many languages do not have proper data availability, our approach will help researchers create machine-translated datasets for several analysis purposes. △ Less

Submitted 13 March, 2023; originally announced March 2023.

arXiv:2303.03581 [pdf, other]

Neural Compositional Rule Learning for Knowledge Graph Reasoning

Authors: Kewei Cheng, Nesreen K. Ahmed, Yizhou Sun

Abstract: Learning logical rules is critical to improving reasoning in KGs. This is due to their ability to provide logical and interpretable explanations when used for predictions, as well as their ability to generalize to other tasks, domains, and data. While recent methods have been proposed to learn logical rules, the majority of these methods are either restricted by their computational complexity and… ▽ More Learning logical rules is critical to improving reasoning in KGs. This is due to their ability to provide logical and interpretable explanations when used for predictions, as well as their ability to generalize to other tasks, domains, and data. While recent methods have been proposed to learn logical rules, the majority of these methods are either restricted by their computational complexity and can not handle the large search space of large-scale KGs, or show poor generalization when exposed to data outside the training set. In this paper, we propose an end-to-end neural model for learning compositional logical rules called NCRL. NCRL detects the best compositional structure of a rule body, and breaks it into small compositions in order to infer the rule head. By recurrently merging compositions in the rule body with a recurrent attention unit, NCRL finally predicts a single rule head. Experimental results show that NCRL learns high-quality rules, as well as being generalizable. Specifically, we show that NCRL is scalable, efficient, and yields state-of-the-art results for knowledge graph completion on large-scale KGs. Moreover, we test NCRL for systematic generalization by learning to reason on small-scale observed graphs and evaluating on larger unseen ones. △ Less

Submitted 6 March, 2023; originally announced March 2023.

Comments: ICLR 2023

arXiv:2303.01646 [pdf, other]

Dynamic Competency Self-Assessment for Autonomous Agents

Authors: Nicholas Conlon, Nisar R. Ahmed, Daniel Szafir

Abstract: As autonomous robots are deployed in increasingly complex environments, platform degradation, environmental uncertainties, and deviations from validated operation conditions can make it difficult for human partners to understand robot capabilities and limitations. The ability for a robot to self-assess its competency in dynamic and uncertain environments will be a crucial next step in successful h… ▽ More As autonomous robots are deployed in increasingly complex environments, platform degradation, environmental uncertainties, and deviations from validated operation conditions can make it difficult for human partners to understand robot capabilities and limitations. The ability for a robot to self-assess its competency in dynamic and uncertain environments will be a crucial next step in successful human-robot teaming. This work presents and evaluates an Event-Triggered Generalized Outcome Assessment (ET-GOA) algorithm for autonomous agents to dynamically assess task confidence during execution. The algorithm uses a fast online statistical test of the agent's observations and its model predictions to decide when competency assessment is needed. We provide experimental results using ET-GOA to generate competency reports during a simulated delivery task and suggest future research directions for self-assessing agents. △ Less

Submitted 2 March, 2023; originally announced March 2023.

Comments: Accepted to the HRI 2023 workshop on Variable Autonomy for human-robot Teaming (VAT-HRI)

arXiv:2302.08669 [pdf, other]

Learning to Forecast Aleatoric and Epistemic Uncertainties over Long Horizon Trajectories

Authors: Aastha Acharya, Rebecca Russell, Nisar R. Ahmed

Abstract: Giving autonomous agents the ability to forecast their own outcomes and uncertainty will allow them to communicate their competencies and be used more safely. We accomplish this by using a learned world model of the agent system to forecast full agent trajectories over long time horizons. Real world systems involve significant sources of both aleatoric and epistemic uncertainty that compound and i… ▽ More Giving autonomous agents the ability to forecast their own outcomes and uncertainty will allow them to communicate their competencies and be used more safely. We accomplish this by using a learned world model of the agent system to forecast full agent trajectories over long time horizons. Real world systems involve significant sources of both aleatoric and epistemic uncertainty that compound and interact over time in the trajectory forecasts. We develop a deep generative world model that quantifies aleatoric uncertainty while incorporating the effects of epistemic uncertainty during the learning process. We show on two reinforcement learning problems that our uncertainty model produces calibrated outcome uncertainty estimates over the full trajectory horizon. △ Less

Submitted 16 February, 2023; originally announced February 2023.

Comments: Accepted to ICRA 2023

Showing 1–50 of 174 results for author: Ahmed, N