subscribe to arXiv mailings

ChatEMG: Synthetic Data Generation to Control a Robotic Hand Orthosis for Stroke

Authors: Jingxi Xu, Runsheng Wang, Siqi Shang, Ava Chen, Lauren Winterbottom, To-Liang Hsu, Wenxi Chen, Khondoker Ahmed, Pedro Leandro La Rotta, Xinyue Zhu, Dawn M. Nilsen, Joel Stein, Matei Ciocarlie

Abstract: Intent inferral on a hand orthosis for stroke patients is challenging due to the difficulty of data collection from impaired subjects. Additionally, EMG signals exhibit significant variations across different conditions, sessions, and subjects, making it hard for classifiers to generalize. Traditional approaches require a large labeled dataset from the new condition, session, or subject to train i… ▽ More Intent inferral on a hand orthosis for stroke patients is challenging due to the difficulty of data collection from impaired subjects. Additionally, EMG signals exhibit significant variations across different conditions, sessions, and subjects, making it hard for classifiers to generalize. Traditional approaches require a large labeled dataset from the new condition, session, or subject to train intent classifiers; however, this data collection process is burdensome and time-consuming. In this paper, we propose ChatEMG, an autoregressive generative model that can generate synthetic EMG signals conditioned on prompts (i.e., a given sequence of EMG signals). ChatEMG enables us to collect only a small dataset from the new condition, session, or subject and expand it with synthetic samples conditioned on prompts from this new context. ChatEMG leverages a vast repository of previous data via generative training while still remaining context-specific via prompting. Our experiments show that these synthetic samples are classifier-agnostic and can improve intent inferral accuracy for different types of classifiers. We demonstrate that our complete approach can be integrated into a single patient session, including the use of the classifier for functional orthosis-assisted tasks. To the best of our knowledge, this is the first time an intent classifier trained partially on synthetic data has been deployed for functional control of an orthosis by a stroke survivor. Videos and additional information can be found at https://jxu.ai/chatemg. △ Less

Submitted 17 June, 2024; originally announced June 2024.

Comments: 8 pages

arXiv:2406.05109 [pdf, other]

Large Generative Graph Models

Authors: Yu Wang, Ryan A. Rossi, Namyong Park, Huiyuan Chen, Nesreen K. Ahmed, Puja Trivedi, Franck Dernoncourt, Danai Koutra, Tyler Derr

Abstract: Large Generative Models (LGMs) such as GPT, Stable Diffusion, Sora, and Suno are trained on a huge amount of language corpus, images, videos, and audio that are extremely diverse from numerous domains. This training paradigm over diverse well-curated data lies at the heart of generating creative and sensible content. However, all previous graph generative models (e.g., GraphRNN, MDVAE, MoFlow, GDS… ▽ More Large Generative Models (LGMs) such as GPT, Stable Diffusion, Sora, and Suno are trained on a huge amount of language corpus, images, videos, and audio that are extremely diverse from numerous domains. This training paradigm over diverse well-curated data lies at the heart of generating creative and sensible content. However, all previous graph generative models (e.g., GraphRNN, MDVAE, MoFlow, GDSS, and DiGress) have been trained only on one dataset each time, which cannot replicate the revolutionary success achieved by LGMs in other fields. To remedy this crucial gap, we propose a new class of graph generative model called Large Graph Generative Model (LGGM) that is trained on a large corpus of graphs (over 5000 graphs) from 13 different domains. We empirically demonstrate that the pre-trained LGGM has superior zero-shot generative capability to existing graph generative models. Furthermore, our pre-trained LGGM can be easily fine-tuned with graphs from target domains and demonstrate even better performance than those directly trained from scratch, behaving as a solid starting point for real-world customization. Inspired by Stable Diffusion, we further equip LGGM with the capability to generate graphs given text prompts (Text-to-Graph), such as the description of the network name and domain (i.e., "The power-1138-bus graph represents a network of buses in a power distribution system."), and network statistics (i.e., "The graph has a low average degree, suitable for modeling social media interactions."). This Text-to-Graph capability integrates the extensive world knowledge in the underlying language model, offering users fine-grained control of the generated graphs. We release the code, the model checkpoint, and the datasets at https://lggm-lg.github.io/. △ Less

Submitted 7 June, 2024; originally announced June 2024.

arXiv:2406.01953 [pdf, other]

On-Demand Routing in LEO Mega-Constellations with Dynamic Laser Inter-Satellite Links

Authors: Dhiraj Bhattacharjee, Pablo G. Madoery, Aizaz U. Chaudhry, Halim Yanikomeroglu, Gunes Karabulut Kurt, Peng Hu, Khaled Ahmed, Stephane Martel

Abstract: Low Earth orbit (LEO) satellite mega constellations are beginning to include laser inter-satellite links (LISLs) to extend the Internet to the most remote locations on Earth. Since the process of establishing these links incurs a setup delay on the order of seconds, a static network topology is generally established well in advance, which is then used for the routing calculations. However, this in… ▽ More Low Earth orbit (LEO) satellite mega constellations are beginning to include laser inter-satellite links (LISLs) to extend the Internet to the most remote locations on Earth. Since the process of establishing these links incurs a setup delay on the order of seconds, a static network topology is generally established well in advance, which is then used for the routing calculations. However, this involves keeping links active even when they are not being used to forward traffic, leading to poor energy efficiency. Motivated by technological advances that are gradually decreasing the LISL setup delays, we foresee scenarios where it will be possible to compute routes and establish dynamic LISLs on demand. This will require considering setup delays as penalties that will affect the end-to-end latency. In this paper, we present a nonlinear optimization model that considers these penalties in the cost function and propose three heuristic algorithms that solve the problem in a tractable way. The algorithms establish different trade-offs in terms of performance and computational complexity. We extensively analyze metrics including average latency, route change rate, outage probability, and jitter in Starlink's Phase I version 2 constellation. The results show the benefit of adaptive routing schemes according to the link setup delay. In particular, more complex schemes can decrease the average end-to-end latency in exchange for an increase in execution time. On the other hand, depending on the maximum tolerated latency, it is possible to use less computationally complex schemes which will be more scalable for the satellite mega constellations of the future. △ Less

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.00766 [pdf, other]

Scaling Tractable Probabilistic Circuits: A Systems Perspective

Authors: Anji Liu, Kareem Ahmed, Guy Van den Broeck

Abstract: Probabilistic Circuits (PCs) are a general framework for tractable deep generative models, which support exact and efficient probabilistic inference on their learned distributions. Recent modeling and training advancements have enabled their application to complex real-world tasks. However, the time and memory inefficiency of existing PC implementations hinders further scaling up. This paper propo… ▽ More Probabilistic Circuits (PCs) are a general framework for tractable deep generative models, which support exact and efficient probabilistic inference on their learned distributions. Recent modeling and training advancements have enabled their application to complex real-world tasks. However, the time and memory inefficiency of existing PC implementations hinders further scaling up. This paper proposes PyJuice, a general GPU implementation design for PCs that improves prior art in several regards. Specifically, PyJuice is 1-2 orders of magnitude faster than existing systems (including very recent ones) at training large-scale PCs. Moreover, PyJuice consumes 2-5x less GPU memory, which enables us to train larger models. At the core of our system is a compilation process that converts a PC into a compact representation amenable to efficient block-based parallelization, which significantly reduces IO and makes it possible to leverage Tensor Cores available in modern GPUs. Empirically, PyJuice can be used to improve state-of-the-art PCs trained on image (e.g., ImageNet32) and language (e.g., WikiText, CommonGen) datasets. We further establish a new set of baselines on natural image and language datasets by benchmarking existing PC structures but with much larger sizes and more training epochs, with the hope of incentivizing future research. Code is available at https://github.com/Tractables/pyjuice. △ Less

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2405.14185 [pdf, other]

A structure-aware framework for learning device placements on computation graphs

Authors: Shukai Duan, Heng Ping, Nikos Kanakaris, Xiongye Xiao, Peiyu Zhang, Panagiotis Kyriakis, Nesreen K. Ahmed, Guixiang Ma, Mihai Capota, Shahin Nazarian, Theodore L. Willke, Paul Bogdan

Abstract: Existing approaches for device placement ignore the topological features of computation graphs and rely mostly on heuristic methods for graph partitioning. At the same time, they either follow a grouper-placer or an encoder-placer architecture, which requires understanding the interaction structure between code operations. To bridge the gap between encoder-placer and grouper-placer techniques, we… ▽ More Existing approaches for device placement ignore the topological features of computation graphs and rely mostly on heuristic methods for graph partitioning. At the same time, they either follow a grouper-placer or an encoder-placer architecture, which requires understanding the interaction structure between code operations. To bridge the gap between encoder-placer and grouper-placer techniques, we propose a novel framework for the task of device placement, relying on smaller computation graphs extracted from the OpenVINO toolkit using reinforcement learning. The framework consists of five steps, including graph coarsening, node representation learning and policy optimization. It facilitates end-to-end training and takes into consideration the directed and acyclic nature of the computation graphs. We also propose a model variant, inspired by graph parsing networks and complex network analysis, enabling graph representation learning and personalized graph partitioning jointly, using an unspecified number of groups. To train the entire framework, we utilize reinforcement learning techniques by employing the execution time of the suggested device placements to formulate the reward. We demonstrate the flexibility and effectiveness of our approach through multiple experiments with three benchmark models, namely Inception-V3, ResNet, and BERT. The robustness of the proposed framework is also highlighted through an ablation study. The suggested placements improve the inference speed for the benchmark models by up to $58.2\%$ over CPU execution and by up to $60.24\%$ compared to other commonly used baselines. △ Less

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.14148 [pdf]

Real Time Deep Learning Weapon Detection Techniques for Mitigating Lone Wolf Attacks

Authors: Kambhatla Akhila, Khaled R Ahmed

Abstract: Firearm Shootings and stabbings attacks are intense and result in severe trauma and threat to public safety. Technology is needed to prevent lone-wolf attacks without human supervision. Hence designing an automatic weapon detection using deep learning, is an optimized solution to localize and detect the presence of weapon objects using Neural Networks. This research focuses on both unified and II-… ▽ More Firearm Shootings and stabbings attacks are intense and result in severe trauma and threat to public safety. Technology is needed to prevent lone-wolf attacks without human supervision. Hence designing an automatic weapon detection using deep learning, is an optimized solution to localize and detect the presence of weapon objects using Neural Networks. This research focuses on both unified and II-stage object detectors whose resultant model not only detects the presence of weapons but also classifies with respective to its weapon classes, including handgun, knife, revolver, and rifle, along with person detection. This research focuses on (You Look Only Once) family and Faster RCNN family for model validation and training. Pruning and Ensembling techniques were applied to YOLOv5 to enhance their speed and performance. models achieve the highest score of 78% with an inference speed of 8.1ms. However, Faster R-CNN models achieve the highest AP 89%. △ Less

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2405.07387 [pdf, other]

Semantic Loss Functions for Neuro-Symbolic Structured Prediction

Authors: Kareem Ahmed, Stefano Teso, Paolo Morettin, Luca Di Liello, Pierfrancesco Ardino, Jacopo Gobbi, Yitao Liang, Eric Wang, Kai-Wei Chang, Andrea Passerini, Guy Van den Broeck

Abstract: Structured output prediction problems are ubiquitous in machine learning. The prominent approach leverages neural networks as powerful feature extractors, otherwise assuming the independence of the outputs. These outputs, however, jointly encode an object, e.g. a path in a graph, and are therefore related through the structure underlying the output space. We discuss the semantic loss, which inject… ▽ More Structured output prediction problems are ubiquitous in machine learning. The prominent approach leverages neural networks as powerful feature extractors, otherwise assuming the independence of the outputs. These outputs, however, jointly encode an object, e.g. a path in a graph, and are therefore related through the structure underlying the output space. We discuss the semantic loss, which injects knowledge about such structure, defined symbolically, into training by minimizing the network's violation of such dependencies, steering the network towards predicting distributions satisfying the underlying structure. At the same time, it is agnostic to the arrangement of the symbols, and depends only on the semantics expressed thereby, while also enabling efficient end-to-end training and inference. We also discuss key improvements and applications of the semantic loss. One limitations of the semantic loss is that it does not exploit the association of every data point with certain features certifying its membership in a target class. We should therefore prefer minimum-entropy distributions over valid structures, which we obtain by additionally minimizing the neuro-symbolic entropy. We empirically demonstrate the benefits of this more refined formulation. Moreover, the semantic loss is designed to be modular and can be combined with both discriminative and generative neural models. This is illustrated by integrating it into generative adversarial networks, yielding constrained adversarial networks, a novel class of deep generative models able to efficiently synthesize complex objects obeying the structure of the underlying domain. △ Less

Submitted 12 May, 2024; originally announced May 2024.

Comments: Preprint of Ch. 22 "Semantic Loss Functions for Neuro-Symbolic Structured Prediction" in "Compendium of Neurosymbolic Artificial Intelligence", https://ebooks.iospress.nl/ISBN/978-1-64368-406-2. arXiv admin note: substantial text overlap with arXiv:2201.11250, arXiv:2007.13197

arXiv:2404.10841 [pdf, other]

Gasformer: A Transformer-based Architecture for Segmenting Methane Emissions from Livestock in Optical Gas Imaging

Authors: Toqi Tahamid Sarker, Mohamed G Embaby, Khaled R Ahmed, Amer AbuGhazaleh

Abstract: Methane emissions from livestock, particularly cattle, significantly contribute to climate change. Effective methane emission mitigation strategies are crucial as the global population and demand for livestock products increase. We introduce Gasformer, a novel semantic segmentation architecture for detecting low-flow rate methane emissions from livestock, and controlled release experiments using o… ▽ More Methane emissions from livestock, particularly cattle, significantly contribute to climate change. Effective methane emission mitigation strategies are crucial as the global population and demand for livestock products increase. We introduce Gasformer, a novel semantic segmentation architecture for detecting low-flow rate methane emissions from livestock, and controlled release experiments using optical gas imaging. We present two unique datasets captured with a FLIR GF77 OGI camera. Gasformer leverages a Mix Vision Transformer encoder and a Light-Ham decoder to generate multi-scale features and refine segmentation maps. Gasformer outperforms other state-of-the-art models on both datasets, demonstrating its effectiveness in detecting and segmenting methane plumes in controlled and real-world scenarios. On the livestock dataset, Gasformer achieves mIoU of 88.56%, surpassing other state-of-the-art models. Materials are available at: github.com/toqitahamid/Gasformer. △ Less

Submitted 16 April, 2024; originally announced April 2024.

Comments: 9 pages, 5 figures, this paper has been submitted and accepted for publication at CVPRW 2024

arXiv:2404.00470 [pdf]

Classification of Short Segment Pediatric Heart Sounds Based on a Transformer-Based Convolutional Neural Network

Authors: Md Hassanuzzaman, Nurul Akhtar Hasan, Mohammad Abdullah Al Mamun, Khawza I Ahmed, Ahsan H Khandoker, Raqibul Mostafa

Abstract: Congenital anomalies arising as a result of a defect in the structure of the heart and great vessels are known as congenital heart diseases or CHDs. A PCG can provide essential details about the mechanical conduction system of the heart and point out specific patterns linked to different kinds of CHD. This study aims to investigate the minimum signal duration required for the automatic classificat… ▽ More Congenital anomalies arising as a result of a defect in the structure of the heart and great vessels are known as congenital heart diseases or CHDs. A PCG can provide essential details about the mechanical conduction system of the heart and point out specific patterns linked to different kinds of CHD. This study aims to investigate the minimum signal duration required for the automatic classification of heart sounds. This study also investigated the optimum signal quality assessment indicator (Root Mean Square of Successive Differences) RMSSD and (Zero Crossings Rate) ZCR value. Mel-frequency cepstral coefficients (MFCCs) based feature is used as an input to build a Transformer-Based residual one-dimensional convolutional neural network, which is then used for classifying the heart sound. The study showed that 0.4 is the ideal threshold for getting suitable signals for the RMSSD and ZCR indicators. Moreover, a minimum signal length of 5s is required for effective heart sound classification. It also shows that a shorter signal (3 s heart sound) does not have enough information to categorize heart sounds accurately, and the longer signal (15 s heart sound) may contain more noise. The best accuracy, 93.69%, is obtained for the 5s signal to distinguish the heart sound. △ Less

Submitted 30 March, 2024; originally announced April 2024.

Comments: 16 pages,11 Figures

arXiv:2403.10722 [pdf, other]

Cannabis Seed Variant Detection using Faster R-CNN

Authors: Toqi Tahamid Sarker, Taminul Islam, Khaled R Ahmed

Abstract: Analyzing and detecting cannabis seed variants is crucial for the agriculture industry. It enables precision breeding, allowing cultivators to selectively enhance desirable traits. Accurate identification of seed variants also ensures regulatory compliance, facilitating the cultivation of specific cannabis strains with defined characteristics, ultimately improving agricultural productivity and mee… ▽ More Analyzing and detecting cannabis seed variants is crucial for the agriculture industry. It enables precision breeding, allowing cultivators to selectively enhance desirable traits. Accurate identification of seed variants also ensures regulatory compliance, facilitating the cultivation of specific cannabis strains with defined characteristics, ultimately improving agricultural productivity and meeting diverse market demands. This paper presents a study on cannabis seed variant detection by employing a state-of-the-art object detection model Faster R-CNN. This study implemented the model on a locally sourced cannabis seed dataset in Thailand, comprising 17 distinct classes. We evaluate six Faster R-CNN models by comparing performance on various metrics and achieving a mAP score of 94.08\% and an F1 score of 95.66\%. This paper presents the first known application of deep neural network object detection models to the novel task of visually identifying cannabis seed types. △ Less

Submitted 15 March, 2024; originally announced March 2024.

Comments: 6 pages, 2 figures, this has been submitted and accepted for publication at IEEE - ICACCS 2024

arXiv:2403.08264 [pdf, other]

GPT, Ontology, and CAABAC: A Tripartite Personalized Access Control Model Anchored by Compliance, Context and Attribute

Authors: Raza Nowrozy, Khandakar Ahmed, Hua Wang

Abstract: As digital healthcare evolves, the security of electronic health records (EHR) becomes increasingly crucial. This study presents the GPT-Onto-CAABAC framework, integrating Generative Pretrained Transformer (GPT), medical-legal ontologies and Context-Aware Attribute-Based Access Control (CAABAC) to enhance EHR access security. Unlike traditional models, GPT-Onto-CAABAC dynamically interprets polici… ▽ More As digital healthcare evolves, the security of electronic health records (EHR) becomes increasingly crucial. This study presents the GPT-Onto-CAABAC framework, integrating Generative Pretrained Transformer (GPT), medical-legal ontologies and Context-Aware Attribute-Based Access Control (CAABAC) to enhance EHR access security. Unlike traditional models, GPT-Onto-CAABAC dynamically interprets policies and adapts to changing healthcare and legal environments, offering customized access control solutions. Through empirical evaluation, this framework is shown to be effective in improving EHR security by accurately aligning access decisions with complex regulatory and situational requirements. The findings suggest its broader applicability in sectors where access control must meet stringent compliance and adaptability standards. △ Less

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2403.07483 [pdf, other]

A Deep Learning Approach to Diabetes Diagnosis

Authors: Zeyu Zhang, Khandaker Asif Ahmed, Md Rakibul Hasan, Tom Gedeon, Md Zakir Hossain

Abstract: Diabetes, resulting from inadequate insulin production or utilization, causes extensive harm to the body. Existing diagnostic methods are often invasive and come with drawbacks, such as cost constraints. Although there are machine learning models like Classwise k Nearest Neighbor (CkNN) and General Regression Neural Network (GRNN), they struggle with imbalanced data and result in under-performance… ▽ More Diabetes, resulting from inadequate insulin production or utilization, causes extensive harm to the body. Existing diagnostic methods are often invasive and come with drawbacks, such as cost constraints. Although there are machine learning models like Classwise k Nearest Neighbor (CkNN) and General Regression Neural Network (GRNN), they struggle with imbalanced data and result in under-performance. Leveraging advancements in sensor technology and machine learning, we propose a non-invasive diabetes diagnosis using a Back Propagation Neural Network (BPNN) with batch normalization, incorporating data re-sampling and normalization for class balancing. Our method addresses existing challenges such as limited performance associated with traditional machine learning. Experimental results on three datasets show significant improvements in overall accuracy, sensitivity, and specificity compared to traditional methods. Notably, we achieve accuracies of 89.81% in Pima diabetes dataset, 75.49% in CDC BRFSS2015 dataset, and 95.28% in Mesra Diabetes dataset. This underscores the potential of deep learning models for robust diabetes diagnosis. See project website https://steve-zeyu-zhang.github.io/DiabetesDiagnosis/ △ Less

Submitted 12 March, 2024; originally announced March 2024.

Comments: Accepted to ACIIDS 2024

arXiv:2402.17204 [pdf, other]

Advancing Generative Model Evaluation: A Novel Algorithm for Realistic Image Synthesis and Comparison in OCR System

Authors: Majid Memari, Khaled R. Ahmed, Shahram Rahimi, Noorbakhsh Amiri Golilarz

Abstract: This research addresses a critical challenge in the field of generative models, particularly in the generation and evaluation of synthetic images. Given the inherent complexity of generative models and the absence of a standardized procedure for their comparison, our study introduces a pioneering algorithm to objectively assess the realism of synthetic images. This approach significantly enhances… ▽ More This research addresses a critical challenge in the field of generative models, particularly in the generation and evaluation of synthetic images. Given the inherent complexity of generative models and the absence of a standardized procedure for their comparison, our study introduces a pioneering algorithm to objectively assess the realism of synthetic images. This approach significantly enhances the evaluation methodology by refining the Fréchet Inception Distance (FID) score, allowing for a more precise and subjective assessment of image quality. Our algorithm is particularly tailored to address the challenges in generating and evaluating realistic images of Arabic handwritten digits, a task that has traditionally been near-impossible due to the subjective nature of realism in image generation. By providing a systematic and objective framework, our method not only enables the comparison of different generative models but also paves the way for improvements in their design and output. This breakthrough in evaluation and comparison is crucial for advancing the field of OCR, especially for scripts that present unique complexities, and sets a new standard in the generation and assessment of high-quality synthetic images. △ Less

Submitted 1 March, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

Comments: The manuscript was submitted to IEEE Access on 29-Jan-2024 and is currently under review for publication in IEEE Access

arXiv:2402.13415 [pdf, other]

Structure Guided Prompt: Instructing Large Language Model in Multi-Step Reasoning by Exploring Graph Structure of the Text

Authors: Kewei Cheng, Nesreen K. Ahmed, Theodore Willke, Yizhou Sun

Abstract: Although Large Language Models (LLMs) excel at addressing straightforward reasoning tasks, they frequently struggle with difficulties when confronted by more complex multi-step reasoning due to a range of factors. Firstly, natural language often encompasses complex relationships among entities, making it challenging to maintain a clear reasoning chain over longer spans. Secondly, the abundance of… ▽ More Although Large Language Models (LLMs) excel at addressing straightforward reasoning tasks, they frequently struggle with difficulties when confronted by more complex multi-step reasoning due to a range of factors. Firstly, natural language often encompasses complex relationships among entities, making it challenging to maintain a clear reasoning chain over longer spans. Secondly, the abundance of linguistic diversity means that the same entities and relationships can be expressed using different terminologies and structures, complicating the task of identifying and establishing connections between multiple pieces of information. Graphs provide an effective solution to represent data rich in relational information and capture long-term dependencies among entities. To harness the potential of graphs, our paper introduces Structure Guided Prompt, an innovative three-stage task-agnostic prompting framework designed to improve the multi-step reasoning capabilities of LLMs in a zero-shot setting. This framework explicitly converts unstructured text into a graph via LLMs and instructs them to navigate this graph using task-specific strategies to formulate responses. By effectively organizing information and guiding navigation, it enables LLMs to provide more accurate and context-aware responses. Our experiments show that this framework significantly enhances the reasoning capabilities of LLMs, enabling them to excel in a broader spectrum of natural language scenarios. △ Less

Submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.09095 [pdf, other]

FedSiKD: Clients Similarity and Knowledge Distillation: Addressing Non-i.i.d. and Constraints in Federated Learning

Authors: Yousef Alsenani, Rahul Mishra, Khaled R. Ahmed, Atta Ur Rahman

Abstract: In recent years, federated learning (FL) has emerged as a promising technique for training machine learning models in a decentralized manner while also preserving data privacy. The non-independent and identically distributed (non-i.i.d.) nature of client data, coupled with constraints on client or edge devices, presents significant challenges in FL. Furthermore, learning across a high number of co… ▽ More In recent years, federated learning (FL) has emerged as a promising technique for training machine learning models in a decentralized manner while also preserving data privacy. The non-independent and identically distributed (non-i.i.d.) nature of client data, coupled with constraints on client or edge devices, presents significant challenges in FL. Furthermore, learning across a high number of communication rounds can be risky and potentially unsafe for model exploitation. Traditional FL approaches may suffer from these challenges. Therefore, we introduce FedSiKD, which incorporates knowledge distillation (KD) within a similarity-based federated learning framework. As clients join the system, they securely share relevant statistics about their data distribution, promoting intra-cluster homogeneity. This enhances optimization efficiency and accelerates the learning process, effectively transferring knowledge between teacher and student models and addressing device constraints. FedSiKD outperforms state-of-the-art algorithms by achieving higher accuracy, exceeding by 25\% and 18\% for highly skewed data at $α= {0.1,0.5}$ on the HAR and MNIST datasets, respectively. Its faster convergence is illustrated by a 17\% and 20\% increase in accuracy within the first five rounds on the HAR and MNIST datasets, respectively, highlighting its early-stage learning proficiency. Code is publicly available and hosted on GitHub (https://github.com/SimuEnv/FedSiKD) △ Less

Submitted 14 February, 2024; originally announced February 2024.

Comments: 11 pages, 10 figures Under Review - IEEE Transactions on Information Forensics & Security

arXiv:2402.02018 [pdf, other]

The Landscape and Challenges of HPC Research and LLMs

Authors: Le Chen, Nesreen K. Ahmed, Akash Dutta, Arijit Bhattacharjee, Sixing Yu, Quazi Ishtiaque Mahmud, Waqwoya Abebe, Hung Phan, Aishwarya Sarkar, Branden Butler, Niranjan Hasabnis, Gal Oren, Vy A. Vo, Juan Pablo Munoz, Theodore L. Willke, Tim Mattson, Ali Jannesari

Abstract: Recently, language models (LMs), especially large language models (LLMs), have revolutionized the field of deep learning. Both encoder-decoder models and prompt-based techniques have shown immense potential for natural language processing and code-based tasks. Over the past several years, many research labs and institutions have invested heavily in high-performance computing, approaching or breach… ▽ More Recently, language models (LMs), especially large language models (LLMs), have revolutionized the field of deep learning. Both encoder-decoder models and prompt-based techniques have shown immense potential for natural language processing and code-based tasks. Over the past several years, many research labs and institutions have invested heavily in high-performance computing, approaching or breaching exascale performance levels. In this paper, we posit that adapting and utilizing such language model-based techniques for tasks in high-performance computing (HPC) would be very beneficial. This study presents our reasoning behind the aforementioned position and highlights how existing ideas can be improved and adapted for HPC tasks. △ Less

Submitted 6 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

arXiv:2312.05657 [pdf, other]

Leveraging Reinforcement Learning and Large Language Models for Code Optimization

Authors: Shukai Duan, Nikos Kanakaris, Xiongye Xiao, Heng Ping, Chenyu Zhou, Nesreen K. Ahmed, Guixiang Ma, Mihai Capota, Theodore L. Willke, Shahin Nazarian, Paul Bogdan

Abstract: Code optimization is a daunting task that requires a significant level of expertise from experienced programmers. This level of expertise is not sufficient when compared to the rapid development of new hardware architectures. Towards advancing the whole code optimization process, recent approaches rely on machine learning and artificial intelligence techniques. This paper introduces a new framewor… ▽ More Code optimization is a daunting task that requires a significant level of expertise from experienced programmers. This level of expertise is not sufficient when compared to the rapid development of new hardware architectures. Towards advancing the whole code optimization process, recent approaches rely on machine learning and artificial intelligence techniques. This paper introduces a new framework to decrease the complexity of code optimization. The proposed framework builds on large language models (LLMs) and reinforcement learning (RL) and enables LLMs to receive feedback from their environment (i.e., unit tests) during the fine-tuning process. We compare our framework with existing state-of-the-art models and show that it is more efficient with respect to speed and computational usage, as a result of the decrement in training steps and its applicability to models with fewer parameters. Additionally, our framework reduces the possibility of logical and syntactical errors. Toward evaluating our approach, we run several experiments on the PIE dataset using a CodeT5 language model and RRHF, a new reinforcement learning algorithm. We adopt a variety of evaluation metrics with regards to optimization quality, and speedup. The evaluation results demonstrate that the proposed framework has similar results in comparison with existing models using shorter training times and smaller pre-trained models. In particular, we accomplish an increase of 5.6% and 2.2 over the baseline models concerning the %OP T and SP metrics. △ Less

Submitted 9 December, 2023; originally announced December 2023.

arXiv:2312.03905 [pdf, other]

A Pseudo-Semantic Loss for Autoregressive Models with Logical Constraints

Authors: Kareem Ahmed, Kai-Wei Chang, Guy Van den Broeck

Abstract: Neuro-symbolic AI bridges the gap between purely symbolic and neural approaches to learning. This often requires maximizing the likelihood of a symbolic constraint w.r.t the neural network's output distribution. Such output distributions are typically assumed to be fully-factorized. This limits the applicability of neuro-symbolic learning to the more expressive autoregressive distributions, e.g.,… ▽ More Neuro-symbolic AI bridges the gap between purely symbolic and neural approaches to learning. This often requires maximizing the likelihood of a symbolic constraint w.r.t the neural network's output distribution. Such output distributions are typically assumed to be fully-factorized. This limits the applicability of neuro-symbolic learning to the more expressive autoregressive distributions, e.g., transformers. Under such distributions, computing the likelihood of even simple constraints is #P-hard. Instead of attempting to enforce the constraint on the entire output distribution, we propose to do so on a random, local approximation thereof. More precisely, we optimize the likelihood of the constraint under a pseudolikelihood-based approximation centered around a model sample. Our approximation is factorized, allowing the reuse of solutions to sub-problems, a main tenet for efficiently computing neuro-symbolic losses. Moreover, it is a local, high-fidelity approximation of the likelihood, exhibiting low entropy and KL-divergence around the model sample. We evaluate our approach on Sudoku and shortest-path prediction cast as autoregressive generation, and observe that we greatly improve upon the base model's ability to predict logically-consistent outputs. We also evaluate on the task of detoxifying large language models. Using a simple constraint disallowing a list of toxic words, we are able to steer the model's outputs away from toxic generations, achieving SoTA detoxification compared to previous approaches. △ Less

Submitted 26 January, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

Comments: Updated detoxification experiments; moved example toxic generations to Github and added link

arXiv:2311.17856 [pdf, other]

Leveraging Graph Diffusion Models for Network Refinement Tasks

Authors: Puja Trivedi, Ryan Rossi, David Arbour, Tong Yu, Franck Dernoncourt, Sungchul Kim, Nedim Lipka, Namyong Park, Nesreen K. Ahmed, Danai Koutra

Abstract: Most real-world networks are noisy and incomplete samples from an unknown target distribution. Refining them by correcting corruptions or inferring unobserved regions typically improves downstream performance. Inspired by the impressive generative capabilities that have been used to correct corruptions in images, and the similarities between "in-painting" and filling in missing nodes and edges con… ▽ More Most real-world networks are noisy and incomplete samples from an unknown target distribution. Refining them by correcting corruptions or inferring unobserved regions typically improves downstream performance. Inspired by the impressive generative capabilities that have been used to correct corruptions in images, and the similarities between "in-painting" and filling in missing nodes and edges conditioned on the observed graph, we propose a novel graph generative framework, SGDM, which is based on subgraph diffusion. Our framework not only improves the scalability and fidelity of graph diffusion models, but also leverages the reverse process to perform novel, conditional generation tasks. In particular, through extensive empirical analysis and a set of novel metrics, we demonstrate that our proposed model effectively supports the following refinement tasks for partially observable networks: T1: denoising extraneous subgraphs, T2: expanding existing subgraphs and T3: performing "style" transfer by regenerating a particular subgraph to match the characteristics of a different node or subgraph. △ Less

Submitted 29 November, 2023; originally announced November 2023.

Comments: Work in Progress. 21 pages, 7 figures

arXiv:2311.13718 [pdf, other]

A Unified Approach to Count-Based Weakly-Supervised Learning

Authors: Vinay Shukla, Zhe Zeng, Kareem Ahmed, Guy Van den Broeck

Abstract: High-quality labels are often very scarce, whereas unlabeled data with inferred weak labels occurs more naturally. In many cases, these weak labels dictate the frequency of each respective class over a set of instances. In this paper, we develop a unified approach to learning from such weakly-labeled data, which we call count-based weakly-supervised learning. At the heart of our approach is the ab… ▽ More High-quality labels are often very scarce, whereas unlabeled data with inferred weak labels occurs more naturally. In many cases, these weak labels dictate the frequency of each respective class over a set of instances. In this paper, we develop a unified approach to learning from such weakly-labeled data, which we call count-based weakly-supervised learning. At the heart of our approach is the ability to compute the probability of exactly k out of n outputs being set to true. This computation is differentiable, exact, and efficient. Building upon the previous computation, we derive a count loss penalizing the model for deviations in its distribution from an arithmetic constraint defined over label counts. We evaluate our approach on three common weakly-supervised learning paradigms and observe that our proposed approach achieves state-of-the-art or highly competitive results across all three of the paradigms. △ Less

Submitted 22 November, 2023; originally announced November 2023.

arXiv:2311.06505 [pdf, other]

CompCodeVet: A Compiler-guided Validation and Enhancement Approach for Code Dataset

Authors: Le Chen, Arijit Bhattacharjee, Nesreen K. Ahmed, Niranjan Hasabnis, Gal Oren, Bin Lei, Ali Jannesari

Abstract: Large language models (LLMs) have become increasingly prominent in academia and industry due to their remarkable performance in diverse applications. As these models evolve with increasing parameters, they excel in tasks like sentiment analysis and machine translation. However, even models with billions of parameters face challenges in tasks demanding multi-step reasoning. Code generation and comp… ▽ More Large language models (LLMs) have become increasingly prominent in academia and industry due to their remarkable performance in diverse applications. As these models evolve with increasing parameters, they excel in tasks like sentiment analysis and machine translation. However, even models with billions of parameters face challenges in tasks demanding multi-step reasoning. Code generation and comprehension, especially in C and C++, emerge as significant challenges. While LLMs trained on code datasets demonstrate competence in many tasks, they struggle with rectifying non-compilable C and C++ code. Our investigation attributes this subpar performance to two primary factors: the quality of the training dataset and the inherent complexity of the problem which demands intricate reasoning. Existing "Chain of Thought" (CoT) prompting techniques aim to enhance multi-step reasoning. This approach, however, retains the limitations associated with the latent drawbacks of LLMs. In this work, we propose CompCodeVet, a compiler-guided CoT approach to produce compilable code from non-compilable ones. Diverging from the conventional approach of utilizing larger LLMs, we employ compilers as a teacher to establish a more robust zero-shot thought process. The evaluation of CompCodeVet on two open-source code datasets shows that CompCodeVet has the ability to improve the training dataset quality for LLMs. △ Less

Submitted 11 November, 2023; originally announced November 2023.

arXiv:2310.05768 [pdf]

doi 10.1109/ICCD59681.2023.10420622

DANet: Enhancing Small Object Detection through an Efficient Deformable Attention Network

Authors: Md Sohag Mia, Abdullah Al Bary Voban, Abu Bakor Hayat Arnob, Abdu Naim, Md Kawsar Ahmed, Md Shariful Islam

Abstract: Efficient and accurate detection of small objects in manufacturing settings, such as defects and cracks, is crucial for ensuring product quality and safety. To address this issue, we proposed a comprehensive strategy by synergizing Faster R-CNN with cutting-edge methods. By combining Faster R-CNN with Feature Pyramid Network, we enable the model to efficiently handle multi-scale features intrinsic… ▽ More Efficient and accurate detection of small objects in manufacturing settings, such as defects and cracks, is crucial for ensuring product quality and safety. To address this issue, we proposed a comprehensive strategy by synergizing Faster R-CNN with cutting-edge methods. By combining Faster R-CNN with Feature Pyramid Network, we enable the model to efficiently handle multi-scale features intrinsic to manufacturing environments. Additionally, Deformable Net is used that contorts and conforms to the geometric variations of defects, bringing precision in detecting even the minuscule and complex features. Then, we incorporated an attention mechanism called Convolutional Block Attention Module in each block of our base ResNet50 network to selectively emphasize informative features and suppress less useful ones. After that we incorporated RoI Align, replacing RoI Pooling for finer region-of-interest alignment and finally the integration of Focal Loss effectively handles class imbalance, crucial for rare defect occurrences. The rigorous evaluation of our model on both the NEU-DET and Pascal VOC datasets underscores its robust performance and generalization capabilities. On the NEU-DET dataset, our model exhibited a profound understanding of steel defects, achieving state-of-the-art accuracy in identifying various defects. Simultaneously, when evaluated on the Pascal VOC dataset, our model showcases its ability to detect objects across a wide spectrum of categories within complex and small scenes. △ Less

Submitted 13 October, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

Comments: ICCD-23

Report number: 10.1109/ICCD59681.2023

Journal ref: International Conference on the Cognitive Computing and Complex Data (ICCD) 2023

arXiv:2310.04047 [pdf, other]

AUTOPARLLM: GNN-Guided Automatic Code Parallelization using Large Language Models

Authors: Quazi Ishtiaque Mahmud, Ali TehraniJamsaz, Hung D Phan, Nesreen K. Ahmed, Ali Jannesari

Abstract: Parallelizing sequentially written programs is a challenging task. Even experienced developers need to spend considerable time finding parallelism opportunities and then actually writing parallel versions of sequentially written programs. To address this issue, we present AUTOPARLLM, a framework for automatically discovering parallelism and generating the parallel version of the sequentially writt… ▽ More Parallelizing sequentially written programs is a challenging task. Even experienced developers need to spend considerable time finding parallelism opportunities and then actually writing parallel versions of sequentially written programs. To address this issue, we present AUTOPARLLM, a framework for automatically discovering parallelism and generating the parallel version of the sequentially written program. Our framework consists of two major components: i) a heterogeneous Graph Neural Network (GNN) based parallelism discovery and parallel pattern detection module, and ii) an LLM-based code generator to generate the parallel counterpart of the sequential programs. We use the GNN to learn the flow-aware characteristics of the programs to identify parallel regions in sequential programs and then construct an enhanced prompt using the GNN's results for the LLM-based generator to finally produce the parallel counterparts of the sequential programs. We evaluate AUTOPARLLM on 11 applications of 2 well-known benchmark suites: NAS Parallel Benchmark and Rodinia Benchmark. Our results show that AUTOPARLLM is indeed effective in improving the state-of-the-art LLM-based models for the task of parallel code generation in terms of multiple code generation metrics. AUTOPARLLM also improves the average runtime of the parallel code generated by the state-of-the-art LLMs by as high as 3.4% and 2.9% for the NAS Parallel Benchmark and Rodinia Benchmark respectively. Additionally, to overcome the issue that well-known metrics for translation evaluation have not been optimized to evaluate the quality of the generated parallel code, we propose OMPScore for evaluating the quality of the generated code. We show that OMPScore exhibits a better correlation with human judgment than existing metrics, measured by up to 75% improvement of Spearman correlation. △ Less

Submitted 8 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

Comments: 10 pages

arXiv:2310.02156 [pdf, other]

Probabilistically Rewired Message-Passing Neural Networks

Authors: Chendi Qian, Andrei Manolache, Kareem Ahmed, Zhe Zeng, Guy Van den Broeck, Mathias Niepert, Christopher Morris

Abstract: Message-passing graph neural networks (MPNNs) emerged as powerful tools for processing graph-structured input. However, they operate on a fixed input graph structure, ignoring potential noise and missing information. Furthermore, their local aggregation mechanism can lead to problems such as over-squashing and limited expressive power in capturing relevant graph structures. Existing solutions to t… ▽ More Message-passing graph neural networks (MPNNs) emerged as powerful tools for processing graph-structured input. However, they operate on a fixed input graph structure, ignoring potential noise and missing information. Furthermore, their local aggregation mechanism can lead to problems such as over-squashing and limited expressive power in capturing relevant graph structures. Existing solutions to these challenges have primarily relied on heuristic methods, often disregarding the underlying data distribution. Hence, devising principled approaches for learning to infer graph structures relevant to the given prediction task remains an open challenge. In this work, leveraging recent progress in exact and differentiable $k$-subset sampling, we devise probabilistically rewired MPNNs (PR-MPNNs), which learn to add relevant edges while omitting less beneficial ones. For the first time, our theoretical analysis explores how PR-MPNNs enhance expressive power, and we identify precise conditions under which they outperform purely randomized approaches. Empirically, we demonstrate that our approach effectively mitigates issues like over-squashing and under-reaching. In addition, on established real-world datasets, our method exhibits competitive or superior predictive performance compared to traditional MPNN models and recent graph transformer architectures. △ Less

Submitted 26 March, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

Comments: ICLR 2024

arXiv:2309.00770 [pdf, other]

Bias and Fairness in Large Language Models: A Survey

Authors: Isabel O. Gallegos, Ryan A. Rossi, Joe Barrow, Md Mehrab Tanjim, Sungchul Kim, Franck Dernoncourt, Tong Yu, Ruiyi Zhang, Nesreen K. Ahmed

Abstract: Rapid advancements of large language models (LLMs) have enabled the processing, understanding, and generation of human-like text, with increasing integration into systems that touch our social sphere. Despite this success, these models can learn, perpetuate, and amplify harmful social biases. In this paper, we present a comprehensive survey of bias evaluation and mitigation techniques for LLMs. We… ▽ More Rapid advancements of large language models (LLMs) have enabled the processing, understanding, and generation of human-like text, with increasing integration into systems that touch our social sphere. Despite this success, these models can learn, perpetuate, and amplify harmful social biases. In this paper, we present a comprehensive survey of bias evaluation and mitigation techniques for LLMs. We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing, defining distinct facets of harm and introducing several desiderata to operationalize fairness for LLMs. We then unify the literature by proposing three intuitive taxonomies, two for bias evaluation, namely metrics and datasets, and one for mitigation. Our first taxonomy of metrics for bias evaluation disambiguates the relationship between metrics and evaluation datasets, and organizes metrics by the different levels at which they operate in a model: embeddings, probabilities, and generated text. Our second taxonomy of datasets for bias evaluation categorizes datasets by their structure as counterfactual inputs or prompts, and identifies the targeted harms and social groups; we also release a consolidation of publicly-available datasets for improved access. Our third taxonomy of techniques for bias mitigation classifies methods by their intervention during pre-processing, in-training, intra-processing, and post-processing, with granular subcategories that elucidate research trends. Finally, we identify open problems and challenges for future work. Synthesizing a wide range of recent research, we aim to provide a clear guide of the existing literature that empowers researchers and practitioners to better understand and prevent the propagation of bias in LLMs. △ Less

Submitted 12 July, 2024; v1 submitted 1 September, 2023; originally announced September 2023.

Comments: Accepted at Computational Linguistics, Volume 50, Number 3

arXiv:2307.03929 [pdf, other]

Fairness-Aware Graph Neural Networks: A Survey

Authors: April Chen, Ryan A. Rossi, Namyong Park, Puja Trivedi, Yu Wang, Tong Yu, Sungchul Kim, Franck Dernoncourt, Nesreen K. Ahmed

Abstract: Graph Neural Networks (GNNs) have become increasingly important due to their representational power and state-of-the-art predictive performance on many fundamental learning tasks. Despite this success, GNNs suffer from fairness issues that arise as a result of the underlying graph data and the fundamental aggregation mechanism that lies at the heart of the large class of GNN models. In this articl… ▽ More Graph Neural Networks (GNNs) have become increasingly important due to their representational power and state-of-the-art predictive performance on many fundamental learning tasks. Despite this success, GNNs suffer from fairness issues that arise as a result of the underlying graph data and the fundamental aggregation mechanism that lies at the heart of the large class of GNN models. In this article, we examine and categorize fairness techniques for improving the fairness of GNNs. Previous work on fair GNN models and techniques are discussed in terms of whether they focus on improving fairness during a preprocessing step, during training, or in a post-processing phase. Furthermore, we discuss how such techniques can be used together whenever appropriate, and highlight the advantages and intuition as well. We also introduce an intuitive taxonomy for fairness evaluation metrics including graph-level fairness, neighborhood-level fairness, embedding-level fairness, and prediction-level fairness metrics. In addition, graph datasets that are useful for benchmarking the fairness of GNN models are summarized succinctly. Finally, we highlight key open problems and challenges that remain to be addressed. △ Less

Submitted 8 July, 2023; originally announced July 2023.

arXiv:2306.00210 [pdf, other]

PERFOGRAPH: A Numerical Aware Program Graph Representation for Performance Optimization and Program Analysis

Authors: Ali TehraniJamsaz, Quazi Ishtiaque Mahmud, Le Chen, Nesreen K. Ahmed, Ali Jannesari

Abstract: The remarkable growth and significant success of machine learning have expanded its applications into programming languages and program analysis. However, a key challenge in adopting the latest machine learning methods is the representation of programming languages, which directly impacts the ability of machine learning methods to reason about programs. The absence of numerical awareness, aggregat… ▽ More The remarkable growth and significant success of machine learning have expanded its applications into programming languages and program analysis. However, a key challenge in adopting the latest machine learning methods is the representation of programming languages, which directly impacts the ability of machine learning methods to reason about programs. The absence of numerical awareness, aggregate data structure information, and improper way of presenting variables in previous representation works have limited their performances. To overcome the limitations and challenges of current program representations, we propose a graph-based program representation called PERFOGRAPH. PERFOGRAPH can capture numerical information and the aggregate data structure by introducing new nodes and edges. Furthermore, we propose an adapted embedding method to incorporate numerical awareness. These enhancements make PERFOGRAPH a highly flexible and scalable representation that effectively captures programs intricate dependencies and semantics. Consequently, it serves as a powerful tool for various applications such as program analysis, performance optimization, and parallelism discovery. Our experimental results demonstrate that PERFOGRAPH outperforms existing representations and sets new state-of-the-art results by reducing the error rate by 7.4% (AMD dataset) and 10% (NVIDIA dataset) in the well-known Device Mapping challenge. It also sets new state-of-the-art results in various performance optimization tasks like Parallelism Discovery and NUMA and Prefetchers Configuration prediction. △ Less

Submitted 29 November, 2023; v1 submitted 31 May, 2023; originally announced June 2023.

arXiv:2305.10449 [pdf, other]

Cooperation Is All You Need

Authors: Ahsan Adeel, Junaid Muzaffar, Khubaib Ahmed, Mohsin Raza

Abstract: Going beyond 'dendritic democracy', we introduce a 'democracy of local processors', termed Cooperator. Here we compare their capabilities when used in permutation-invariant neural networks for reinforcement learning (RL), with machine learning algorithms based on Transformers, such as ChatGPT. Transformers are based on the long-standing conception of integrate-and-fire 'point' neurons, whereas Coo… ▽ More Going beyond 'dendritic democracy', we introduce a 'democracy of local processors', termed Cooperator. Here we compare their capabilities when used in permutation-invariant neural networks for reinforcement learning (RL), with machine learning algorithms based on Transformers, such as ChatGPT. Transformers are based on the long-standing conception of integrate-and-fire 'point' neurons, whereas Cooperator is inspired by recent neurobiological breakthroughs suggesting that the cellular foundations of mental life depend on context-sensitive pyramidal neurons in the neocortex which have two functionally distinct points. We show that when used for RL, an algorithm based on Cooperator learns far quicker than that based on Transformer, even while having the same number of parameters. △ Less

Submitted 16 May, 2023; originally announced May 2023.

arXiv:2305.05779 [pdf, other]

Learning to Parallelize with OpenMP by Augmented Heterogeneous AST Representation

Authors: Le Chen, Quazi Ishtiaque Mahmud, Hung Phan, Nesreen K. Ahmed, Ali Jannesari

Abstract: Detecting parallelizable code regions is a challenging task, even for experienced developers. Numerous recent studies have explored the use of machine learning for code analysis and program synthesis, including parallelization, in light of the success of machine learning in natural language processing. However, applying machine learning techniques to parallelism detection presents several challeng… ▽ More Detecting parallelizable code regions is a challenging task, even for experienced developers. Numerous recent studies have explored the use of machine learning for code analysis and program synthesis, including parallelization, in light of the success of machine learning in natural language processing. However, applying machine learning techniques to parallelism detection presents several challenges, such as the lack of an adequate dataset for training, an effective code representation with rich information, and a suitable machine learning model to learn the latent features of code for diverse analyses. To address these challenges, we propose a novel graph-based learning approach called Graph2Par that utilizes a heterogeneous augmented abstract syntax tree (Augmented-AST) representation for code. The proposed approach primarily focused on loop-level parallelization with OpenMP. Moreover, we create an OMP\_Serial dataset with 18598 parallelizable and 13972 non-parallelizable loops to train the machine learning models. Our results show that our proposed approach achieves the accuracy of parallelizable code region detection with 85\% accuracy and outperforms the state-of-the-art token-based machine learning approach. These results indicate that our approach is competitive with state-of-the-art tools and capable of handling loops with complex structures that other tools may overlook. △ Less

Submitted 9 May, 2023; originally announced May 2023.

arXiv:2304.13501 [pdf, other]

doi 10.1109/JRFID.2023.3269887

Routing Heterogeneous Traffic in Delay-Tolerant Satellite Networks

Authors: Pablo G. Madoery, Gunes Karabulut Kurt, Halim Yanikomeroglu, Peng Hu, Khaled Ahmed, Stéphane Martel, Guillaume Lamontagne

Abstract: Delay-tolerant networking (DTN) offers a novel architecture that can be used to enhance store-carry-forward routing in satellite networks. Since these networks can take advantage of scheduled contact plans, distributed algorithms like the Contact Graph Routing (CGR) can be utilized to optimize data delivery performance. However, despite the numerous improvements made to CGR, there is a lack of pro… ▽ More Delay-tolerant networking (DTN) offers a novel architecture that can be used to enhance store-carry-forward routing in satellite networks. Since these networks can take advantage of scheduled contact plans, distributed algorithms like the Contact Graph Routing (CGR) can be utilized to optimize data delivery performance. However, despite the numerous improvements made to CGR, there is a lack of proposals to prioritize traffic with distinct quality of service (QoS) requirements. This study presents adaptations to CGR to improve QoS-compliant delivery ratio when transmitting traffic with different latency constraints, along with an integer linear programming optimization model that serves as a performance upper bound. The extensive results obtained by simulating different scenarios show that the proposed algorithms can effectively improve the delivery ratio and energy efficiency while meeting latency constraints. △ Less

Submitted 7 May, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

Journal ref: IEEE Journal of Radio Frequency Identification 2023

arXiv:2303.03581 [pdf, other]

Neural Compositional Rule Learning for Knowledge Graph Reasoning

Authors: Kewei Cheng, Nesreen K. Ahmed, Yizhou Sun

Abstract: Learning logical rules is critical to improving reasoning in KGs. This is due to their ability to provide logical and interpretable explanations when used for predictions, as well as their ability to generalize to other tasks, domains, and data. While recent methods have been proposed to learn logical rules, the majority of these methods are either restricted by their computational complexity and… ▽ More Learning logical rules is critical to improving reasoning in KGs. This is due to their ability to provide logical and interpretable explanations when used for predictions, as well as their ability to generalize to other tasks, domains, and data. While recent methods have been proposed to learn logical rules, the majority of these methods are either restricted by their computational complexity and can not handle the large search space of large-scale KGs, or show poor generalization when exposed to data outside the training set. In this paper, we propose an end-to-end neural model for learning compositional logical rules called NCRL. NCRL detects the best compositional structure of a rule body, and breaks it into small compositions in order to infer the rule head. By recurrently merging compositions in the rule body with a recurrent attention unit, NCRL finally predicts a single rule head. Experimental results show that NCRL learns high-quality rules, as well as being generalizable. Specifically, we show that NCRL is scalable, efficient, and yields state-of-the-art results for knowledge graph completion on large-scale KGs. Moreover, we test NCRL for systematic generalization by learning to reason on small-scale observed graphs and evaluating on larger unseen ones. △ Less

Submitted 6 March, 2023; originally announced March 2023.

Comments: ICLR 2023

arXiv:2302.14207 [pdf, other]

Semantic Strengthening of Neuro-Symbolic Learning

Authors: Kareem Ahmed, Kai-Wei Chang, Guy Van den Broeck

Abstract: Numerous neuro-symbolic approaches have recently been proposed typically with the goal of adding symbolic knowledge to the output layer of a neural network. Ideally, such losses maximize the probability that the neural network's predictions satisfy the underlying domain. Unfortunately, this type of probabilistic inference is often computationally infeasible. Neuro-symbolic approaches therefore com… ▽ More Numerous neuro-symbolic approaches have recently been proposed typically with the goal of adding symbolic knowledge to the output layer of a neural network. Ideally, such losses maximize the probability that the neural network's predictions satisfy the underlying domain. Unfortunately, this type of probabilistic inference is often computationally infeasible. Neuro-symbolic approaches therefore commonly resort to fuzzy approximations of this probabilistic objective, sacrificing sound probabilistic semantics, or to sampling which is very seldom feasible. We approach the problem by first assuming the constraint decomposes conditioned on the features learned by the network. We iteratively strengthen our approximation, restoring the dependence between the constraints most responsible for degrading the quality of the approximation. This corresponds to computing the mutual information between pairs of constraints conditioned on the network's learned features, and may be construed as a measure of how well aligned the gradients of two distributions are. We show how to compute this efficiently for tractable circuits. We test our approach on three tasks: predicting a minimum-cost path in Warcraft, predicting a minimum-cost perfect matching, and solving Sudoku puzzles, observing that it improves upon the baselines while sidestepping intractability. △ Less

Submitted 27 February, 2023; originally announced February 2023.

Comments: Accepted at AISTATS 2023

arXiv:2212.12040 [pdf, ps, other]

Graph Learning with Localized Neighborhood Fairness

Authors: April Chen, Ryan Rossi, Nedim Lipka, Jane Hoffswell, Gromit Chan, Shunan Guo, Eunyee Koh, Sungchul Kim, Nesreen K. Ahmed

Abstract: Learning fair graph representations for downstream applications is becoming increasingly important, but existing work has mostly focused on improving fairness at the global level by either modifying the graph structure or objective function without taking into account the local neighborhood of a node. In this work, we formally introduce the notion of neighborhood fairness and develop a computation… ▽ More Learning fair graph representations for downstream applications is becoming increasingly important, but existing work has mostly focused on improving fairness at the global level by either modifying the graph structure or objective function without taking into account the local neighborhood of a node. In this work, we formally introduce the notion of neighborhood fairness and develop a computational framework for learning such locally fair embeddings. We argue that the notion of neighborhood fairness is more appropriate since GNN-based models operate at the local neighborhood level of a node. Our neighborhood fairness framework has two main components that are flexible for learning fair graph representations from arbitrary data: the first aims to construct fair neighborhoods for any arbitrary node in a graph and the second enables adaption of these fair neighborhoods to better capture certain application or data-dependent constraints, such as allowing neighborhoods to be more biased towards certain attributes or neighbors in the graph.Furthermore, while link prediction has been extensively studied, we are the first to investigate the graph representation learning task of fair link classification. We demonstrate the effectiveness of the proposed neighborhood fairness framework for a variety of graph machine learning tasks including fair link prediction, link classification, and learning fair graph embeddings. Notably, our approach achieves not only better fairness but also increases the accuracy in the majority of cases across a wide variety of graphs, problem settings, and metrics. △ Less

Submitted 22 December, 2022; originally announced December 2022.

arXiv:2212.11790 [pdf, other]

Normalized Contrastive Learning for Text-Video Retrieval

Authors: Yookoon Park, Mahmoud Azab, Bo Xiong, Seungwhan Moon, Florian Metze, Gourab Kundu, Kirmani Ahmed

Abstract: Cross-modal contrastive learning has led the recent advances in multimodal retrieval with its simplicity and effectiveness. In this work, however, we reveal that cross-modal contrastive learning suffers from incorrect normalization of the sum retrieval probabilities of each text or video instance. Specifically, we show that many test instances are either over- or under-represented during retrieval… ▽ More Cross-modal contrastive learning has led the recent advances in multimodal retrieval with its simplicity and effectiveness. In this work, however, we reveal that cross-modal contrastive learning suffers from incorrect normalization of the sum retrieval probabilities of each text or video instance. Specifically, we show that many test instances are either over- or under-represented during retrieval, significantly hurting the retrieval performance. To address this problem, we propose Normalized Contrastive Learning (NCL) which utilizes the Sinkhorn-Knopp algorithm to compute the instance-wise biases that properly normalize the sum retrieval probabilities of each instance so that every text and video instance is fairly represented during cross-modal retrieval. Empirical study shows that NCL brings consistent and significant gains in text-video retrieval on different model architectures, with new state-of-the-art multimodal retrieval metrics on the ActivityNet, MSVD, and MSR-VTT datasets without any architecture engineering. △ Less

Submitted 30 November, 2022; originally announced December 2022.

Comments: Published in EMNLP 2022

arXiv:2211.12996 [pdf, other]

Converting OpenStreetMap Data to Road Networks for Downstream Applications

Authors: Md Kaisar Ahmed

Abstract: We study how to convert OpenStreetMap data to road networks for downstream applications. OpenStreetMap data has different formats. Extensible Markup Language (XML) is one of them. OSM data consist of nodes, ways, and relations. We process OSM XML data to extract the information of nodes and ways to obtain the map of streets of the Memphis area. We can use this map for different downstream applicat… ▽ More We study how to convert OpenStreetMap data to road networks for downstream applications. OpenStreetMap data has different formats. Extensible Markup Language (XML) is one of them. OSM data consist of nodes, ways, and relations. We process OSM XML data to extract the information of nodes and ways to obtain the map of streets of the Memphis area. We can use this map for different downstream applications. △ Less

Submitted 13 December, 2022; v1 submitted 22 November, 2022; originally announced November 2022.

arXiv:2211.12009 [pdf]

Deep-Learning-Based Computer Vision Approach For The Segmentation Of Ball Deliveries And Tracking In Cricket

Authors: Kumail Abbas, Muhammad Saeed, M. Imad Khan, Khandakar Ahmed, Hua Wang

Abstract: There has been a significant increase in the adoption of technology in cricket recently. This trend has created the problem of duplicate work being done in similar computer vision-based research works. Our research tries to solve one of these problems by segmenting ball deliveries in a cricket broadcast using deep learning models, MobileNet and YOLO, thus enabling researchers to use our work as a… ▽ More There has been a significant increase in the adoption of technology in cricket recently. This trend has created the problem of duplicate work being done in similar computer vision-based research works. Our research tries to solve one of these problems by segmenting ball deliveries in a cricket broadcast using deep learning models, MobileNet and YOLO, thus enabling researchers to use our work as a dataset for their research. The output from our research can be used by cricket coaches and players to analyze ball deliveries which are played during the match. This paper presents an approach to segment and extract video shots in which only the ball is being delivered. The video shots are a series of continuous frames that make up the whole scene of the video. Object detection models are applied to reach a high level of accuracy in terms of correctly extracting video shots. The proof of concept for building large datasets of video shots for ball deliveries is proposed which paves the way for further processing on those shots for the extraction of semantics. Ball tracking in these video shots is also done using a separate RetinaNet model as a sample of the usefulness of the proposed dataset. The position on the cricket pitch where the ball lands is also extracted by tracking the ball along the y-axis. The video shot is then classified as a full-pitched, good-length or short-pitched delivery. △ Less

Submitted 21 November, 2022; originally announced November 2022.

arXiv:2211.01950 [pdf, other]

Unlocking the potential of two-point cells for energy-efficient and resilient training of deep nets

Authors: Ahsan Adeel, Adewale Adetomi, Khubaib Ahmed, Amir Hussain, Tughrul Arslan, W. A. Phillips

Abstract: Context-sensitive two-point layer 5 pyramidal cells (L5PCs) were discovered as long ago as 1999. However, the potential of this discovery to provide useful neural computation has yet to be demonstrated. Here we show for the first time how a transformative L5PCs-driven deep neural network (DNN), termed the multisensory cooperative computing (MCC) architecture, can effectively process large amounts… ▽ More Context-sensitive two-point layer 5 pyramidal cells (L5PCs) were discovered as long ago as 1999. However, the potential of this discovery to provide useful neural computation has yet to be demonstrated. Here we show for the first time how a transformative L5PCs-driven deep neural network (DNN), termed the multisensory cooperative computing (MCC) architecture, can effectively process large amounts of heterogeneous real-world audio-visual (AV) data, using far less energy compared to best available 'point' neuron-driven DNNs. A novel highly-distributed parallel implementation on a Xilinx UltraScale+ MPSoC device estimates energy savings up to 245759 $ \times $ 50000 $μ$J (i.e., 62% less than the baseline model in a semi-supervised learning setup) where a single synapse consumes $8e^{-5}μ$J. In a supervised learning setup, the energy-saving can potentially reach up to 1250x less (per feedforward transmission) than the baseline model. The significantly reduced neural activity in MCC leads to inherently fast learning and resilience against sudden neural damage. This remarkable performance in pilot experiments demonstrates the embodied neuromorphic intelligence of our proposed cooperative L5PC that receives input from diverse neighbouring neurons as context to amplify the transmission of most salient and relevant information for onward transmission, from overwhelmingly large multimodal information utilised at the early stages of on-chip training. Our proposed approach opens new cross-disciplinary avenues for future on-chip DNN training implementations and posits a radical shift in current neuromorphic computing paradigms. △ Less

Submitted 22 December, 2022; v1 submitted 24 October, 2022; originally announced November 2022.

arXiv:2210.02998 [pdf, other]

doi 10.1109/ACCESS.2023.3346315

ThoraX-PriorNet: A Novel Attention-Based Architecture Using Anatomical Prior Probability Maps for Thoracic Disease Classification

Authors: Md. Iqbal Hossain, Mohammad Zunaed, Md. Kawsar Ahmed, S. M. Jawwad Hossain, Anwarul Hasan, Taufiq Hasan

Abstract: Objective: Computer-aided disease diagnosis and prognosis based on medical images is a rapidly emerging field. Many Convolutional Neural Network (CNN) architectures have been developed by researchers for disease classification and localization from chest X-ray images. It is known that different thoracic disease lesions are more likely to occur in specific anatomical regions compared to others. Thi… ▽ More Objective: Computer-aided disease diagnosis and prognosis based on medical images is a rapidly emerging field. Many Convolutional Neural Network (CNN) architectures have been developed by researchers for disease classification and localization from chest X-ray images. It is known that different thoracic disease lesions are more likely to occur in specific anatomical regions compared to others. This article aims to incorporate this disease and region-dependent prior probability distribution within a deep learning framework. Methods: We present the ThoraX-PriorNet, a novel attention-based CNN model for thoracic disease classification. We first estimate a disease-dependent spatial probability, i.e., an anatomical prior, that indicates the probability of occurrence of a disease in a specific region in a chest X-ray image. Next, we develop a novel attention-based classification model that combines information from the estimated anatomical prior and automatically extracted chest region of interest (ROI) masks to provide attention to the feature maps generated from a deep convolution network. Unlike previous works that utilize various self-attention mechanisms, the proposed method leverages the extracted chest ROI masks along with the probabilistic anatomical prior information, which selects the region of interest for different diseases to provide attention. Results: The proposed method shows superior performance in disease classification on the NIH ChestX-ray14 dataset compared to existing state-of-the-art methods while reaching an area under the ROC curve (%AUC) of 84.67. Regarding disease localization, the anatomy prior attention method shows competitive performance compared to state-of-the-art methods, achieving an accuracy of 0.80, 0.63, 0.49, 0.33, 0.28, 0.21, and 0.04 with an Intersection over Union (IoU) threshold of 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, and 0.7, respectively. △ Less

Submitted 21 December, 2023; v1 submitted 6 October, 2022; originally announced October 2022.

Comments: Accepted to IEEE ACCESS

arXiv:2210.01941 [pdf, other]

SIMPLE: A Gradient Estimator for $k$-Subset Sampling

Authors: Kareem Ahmed, Zhe Zeng, Mathias Niepert, Guy Van den Broeck

Abstract: $k$-subset sampling is ubiquitous in machine learning, enabling regularization and interpretability through sparsity. The challenge lies in rendering $k$-subset sampling amenable to end-to-end learning. This has typically involved relaxing the reparameterized samples to allow for backpropagation, with the risk of introducing high bias and high variance. In this work, we fall back to discrete $k… ▽ More $k$-subset sampling is ubiquitous in machine learning, enabling regularization and interpretability through sparsity. The challenge lies in rendering $k$-subset sampling amenable to end-to-end learning. This has typically involved relaxing the reparameterized samples to allow for backpropagation, with the risk of introducing high bias and high variance. In this work, we fall back to discrete $k$-subset sampling on the forward pass. This is coupled with using the gradient with respect to the exact marginals, computed efficiently, as a proxy for the true gradient. We show that our gradient estimator, SIMPLE, exhibits lower bias and variance compared to state-of-the-art estimators, including the straight-through Gumbel estimator when $k = 1$. Empirical results show improved performance on learning to explain and sparse linear regression. We provide an algorithm for computing the exact ELBO for the $k$-subset distribution, obtaining significantly lower loss compared to SOTA. △ Less

Submitted 6 June, 2024; v1 submitted 4 October, 2022; originally announced October 2022.

Comments: ICLR 2023; fixed typo in Theorem 1

arXiv:2209.12987 [pdf, other]

TrustToken, a Trusted SoC solution for Non-Trusted Intellectual Property (IP)s

Authors: Muhammed Kawser Ahmed, Sujan Kumar Saha, Christophe Bobda

Abstract: Secure and trustworthy execution in heterogeneous SoCs is a major priority in the modern computing system. Security of SoCs mainly addresses two broad layers of trust issues: 1. Protection against hardware security threats(Side-channel, IP Privacy, Cloning, Fault Injection, and Denial of Service); and 2. Protection against malicious software attacks running on SoC processors. To resist malicious s… ▽ More Secure and trustworthy execution in heterogeneous SoCs is a major priority in the modern computing system. Security of SoCs mainly addresses two broad layers of trust issues: 1. Protection against hardware security threats(Side-channel, IP Privacy, Cloning, Fault Injection, and Denial of Service); and 2. Protection against malicious software attacks running on SoC processors. To resist malicious software-level attackers from gaining unauthorized access and compromising security, we propose a root of trust-based trusted execution mechanism \textbf{\textit{(named as \textbf{TrustToken}) }}. TrustToken builds a security block to provide a root of trust-based IP security: secure key generation and truly random source. \textbf{TrustToken} only allows trusted communication between the non-trusted third-party IP and the rest of the SoC world by providing essential security features, i.e., secure, isolated execution, and trusted user interaction. The proposed design achieves this by interconnecting the third-party IP interface to \textbf{TrustToken} Controller and checking IP authorization(Token) signals \texttt{`correctness'} at run-time. \textbf{TrustToken} architecture shows a very low overhead resource utilization LUT (618, 1.16 \%), FF (44, 0.04 \%), and BUFG (2 , 6.25\%) in implementation. The experiment results show that TrustToken can provide a secure, low-cost, and trusted solution for non-trusted SoC IPs. △ Less

Submitted 26 September, 2022; originally announced September 2022.

arXiv:2209.11274 [pdf, other]

Trusted IP solution in multi-tenant cloud FPGA platform

Authors: Muhammed Kawser Ahmed, Sujan Kumar Saha, Christophe Bobda

Abstract: Because FPGAs outperform traditional processing cores like CPUs and GPUs in terms of performance per watt and flexibility, they are being used more and more in cloud and data center applications. There are growing worries about the security risks posed by multi-tenant sharing as the demand for hardware acceleration increases and gradually gives way to FPGA multi-tenancy in the cloud. The confident… ▽ More Because FPGAs outperform traditional processing cores like CPUs and GPUs in terms of performance per watt and flexibility, they are being used more and more in cloud and data center applications. There are growing worries about the security risks posed by multi-tenant sharing as the demand for hardware acceleration increases and gradually gives way to FPGA multi-tenancy in the cloud. The confidentiality, integrity, and availability of FPGA-accelerated applications may be compromised if space-shared FPGAs are made available to many cloud tenants. We propose a root of trust-based trusted execution mechanism called \textbf{TrustToken} to prevent harmful software-level attackers from getting unauthorized access and jeopardizing security. With safe key creation and truly random sources, \textbf{TrustToken} creates a security block that serves as the foundation of trust-based IP security. By offering crucial security characteristics, such as secure, isolated execution and trusted user interaction, \textbf{TrustToken} only permits trustworthy connection between the non-trusted third-party IP and the rest of the SoC environment. The suggested approach does this by connecting the third-party IP interface to the \textbf{TrustToken} Controller and running run-time checks on the correctness of the IP authorization(Token) signals. With an emphasis on software-based assaults targeting unauthorized access and information leakage, we offer a noble hardware/software architecture for trusted execution in FPGA-accelerated clouds and data centers. △ Less

Submitted 22 September, 2022; originally announced September 2022.

arXiv:2209.11158 [pdf, other]

Multi-Tenant Cloud FPGA: A Survey on Security

Authors: Muhammed Kawser Ahmed, Joel Mandebi, Sujan Kumar Saha, Christophe Bobda

Abstract: With the exponentially increasing demand for performance and scalability in cloud applications and systems, data center architectures evolved to integrate heterogeneous computing fabrics that leverage CPUs, GPUs, and FPGAs. FPGAs differ from traditional processing platforms such as CPUs and GPUs in that they are reconfigurable at run-time, providing increased and customized performance, flexibilit… ▽ More With the exponentially increasing demand for performance and scalability in cloud applications and systems, data center architectures evolved to integrate heterogeneous computing fabrics that leverage CPUs, GPUs, and FPGAs. FPGAs differ from traditional processing platforms such as CPUs and GPUs in that they are reconfigurable at run-time, providing increased and customized performance, flexibility, and acceleration. FPGAs can perform large-scale search optimization, acceleration, and signal processing tasks compared with power, latency, and processing speed. Many public cloud provider giants, including Amazon, Huawei, Microsoft, Alibaba, etc., have already started integrating FPGA-based cloud acceleration services. While FPGAs in cloud applications enable customized acceleration with low power consumption, it also incurs new security challenges that still need to be reviewed. Allowing cloud users to reconfigure the hardware design after deployment could open the backdoors for malicious attackers, potentially putting the cloud platform at risk. Considering security risks, public cloud providers still don't offer multi-tenant FPGA services. This paper analyzes the security concerns of multi-tenant cloud FPGAs, gives a thorough description of the security problems associated with them, and discusses upcoming future challenges in this field of study. △ Less

Submitted 22 September, 2022; originally announced September 2022.

arXiv:2209.10818 [pdf, other]

Memory-Augmented Graph Neural Networks: A Brain-Inspired Review

Authors: Guixiang Ma, Vy A. Vo, Theodore Willke, Nesreen K. Ahmed

Abstract: We provide a comprehensive review of the existing literature on memory-augmented GNNs. We review these works through the lens of psychology and neuroscience, which has several established theories on how multiple memory systems and mechanisms operate in biological brains. We propose a taxonomy of memory-augmented GNNs and a set of criteria for comparing their memory mechanisms. We also provide cri… ▽ More We provide a comprehensive review of the existing literature on memory-augmented GNNs. We review these works through the lens of psychology and neuroscience, which has several established theories on how multiple memory systems and mechanisms operate in biological brains. We propose a taxonomy of memory-augmented GNNs and a set of criteria for comparing their memory mechanisms. We also provide critical discussions on the limitations of these works. Finally, we discuss the challenges and future directions for this area. △ Less

Submitted 14 July, 2023; v1 submitted 22 September, 2022; originally announced September 2022.

arXiv:2209.07421 [pdf]

doi 10.1007/978-3-031-14054-9_4

Heart Attack Classification System using Neural Network Trained with Particle Swarm Optimization

Authors: Askandar H. Amin, Botan K. Ahmed, Bestan B. Maaroof, Tarik A. Rashid

Abstract: The prior detection of a heart attack could lead to the saving of one's life. Putting specific criteria into a system that provides an early warning of an imminent at-tack will be advantageous to a better prevention plan for an upcoming heart attack. Some studies have been conducted for this purpose, but yet the goal has not been reached to prevent a patient from getting such a disease. In this pa… ▽ More The prior detection of a heart attack could lead to the saving of one's life. Putting specific criteria into a system that provides an early warning of an imminent at-tack will be advantageous to a better prevention plan for an upcoming heart attack. Some studies have been conducted for this purpose, but yet the goal has not been reached to prevent a patient from getting such a disease. In this paper, Neural Network trained with Particle Swarm Optimization (PSONN) is used to analyze the input criteria and enhance heart attack anticipation. A real and novel dataset that has been recorded on the disease is used. After preprocessing the data, the features are fed into the system. As a result, the outcomes from PSONN have been evaluated against those from other algorithms. Decision Tree, Random Forest, Neural network trained with Backpropagation (BPNN), and Naive Bayes were among those employed. Then the results of 100%, 99.2424%, 99.2323%, 81.3131%, and 66.4141% are produced concerning the mentioned algorithms, which show that PSONN has recorded the highest accuracy rate among all other tested algorithms. △ Less

Submitted 21 August, 2022; originally announced September 2022.

Comments: 12 pages

Report number: Advances in Intelligent Systems and Computing, 2022

arXiv:2209.06324 [pdf, other]

Routing heterogeneous traffic in delay tolerant satellite networks

Authors: Pablo G. Madoery, Gunes Karabulut Kurt, Halim Yanikomeroglu, Peng Hu, Khaled Ahmed, Guillaume Lamontagne

Abstract: Delay Tolerant Networking (DTN) has been proposed as a new architecture to provide efficient store-carry-and-forward data transport in satellite networks. Since these networks relay on scheduled contact plans, the Contact Graph Routing (CGR) algorithm can be used to optimize routing and data delivery performance. However, in spite of the various improvements that have been made to CGR, there have… ▽ More Delay Tolerant Networking (DTN) has been proposed as a new architecture to provide efficient store-carry-and-forward data transport in satellite networks. Since these networks relay on scheduled contact plans, the Contact Graph Routing (CGR) algorithm can be used to optimize routing and data delivery performance. However, in spite of the various improvements that have been made to CGR, there have been no significant proposals to prioritize traffic with different quality of service requirements. In this work we propose adaptations to CGR that allow performance improvements when sending traffic with different latency constraints, and develop a linear programming optimization model that works as a performance upper bound. The simulation results of the proposed schemes are promising and open the debate on other ways to improve performance while meeting the particular needs of heterogeneous traffic. △ Less

Submitted 13 September, 2022; originally announced September 2022.

Comments: 6 pages, 2 figures

arXiv:2207.07338 [pdf, other]

Context-sensitive neocortical neurons transform the effectiveness and efficiency of neural information processing

Authors: Ahsan Adeel, Mario Franco, Mohsin Raza, Khubaib Ahmed

Abstract: Deep learning (DL) has big-data processing capabilities that are as good, or even better, than those of humans in many real-world domains, but at the cost of high energy requirements that may be unsustainable in some applications and of errors, that, though infrequent, can be large. We hypothesise that a fundamental weakness of DL lies in its intrinsic dependence on integrate-and-fire point neuron… ▽ More Deep learning (DL) has big-data processing capabilities that are as good, or even better, than those of humans in many real-world domains, but at the cost of high energy requirements that may be unsustainable in some applications and of errors, that, though infrequent, can be large. We hypothesise that a fundamental weakness of DL lies in its intrinsic dependence on integrate-and-fire point neurons that maximise information transmission irrespective of whether it is relevant in the current context or not. This leads to unnecessary neural firing and to the feedforward transmission of conflicting messages, which makes learning difficult and processing energy inefficient. Here we show how to circumvent these limitations by mimicking the capabilities of context-sensitive neocortical neurons that receive input from diverse sources as a context to amplify and attenuate the transmission of relevant and irrelevant information, respectively. We demonstrate that a deep network composed of such local processors seeks to maximise agreement between the active neurons, thus restricting the transmission of conflicting information to higher levels and reducing the neural activity required to process large amounts of heterogeneous real-world data. As shown to be far more effective and efficient than current forms of DL, this two-point neuron study offers a possible step-change in transforming the cellular foundations of deep network architectures. △ Less

Submitted 4 April, 2023; v1 submitted 15 July, 2022; originally announced July 2022.

arXiv:2206.00426 [pdf, other]

Semantic Probabilistic Layers for Neuro-Symbolic Learning

Authors: Kareem Ahmed, Stefano Teso, Kai-Wei Chang, Guy Van den Broeck, Antonio Vergari

Abstract: We design a predictive layer for structured-output prediction (SOP) that can be plugged into any neural network guaranteeing its predictions are consistent with a set of predefined symbolic constraints. Our Semantic Probabilistic Layer (SPL) can model intricate correlations, and hard constraints, over a structured output space all while being amenable to end-to-end learning via maximum likelihood.… ▽ More We design a predictive layer for structured-output prediction (SOP) that can be plugged into any neural network guaranteeing its predictions are consistent with a set of predefined symbolic constraints. Our Semantic Probabilistic Layer (SPL) can model intricate correlations, and hard constraints, over a structured output space all while being amenable to end-to-end learning via maximum likelihood. SPLs combine exact probabilistic inference with logical reasoning in a clean and modular way, learning complex distributions and restricting their support to solutions of the constraint. As such, they can faithfully, and efficiently, model complex SOP tasks beyond the reach of alternative neuro-symbolic approaches. We empirically demonstrate that SPLs outperform these competitors in terms of accuracy on challenging SOP tasks including hierarchical multi-label classification, pathfinding and preference learning, while retaining perfect constraint satisfaction. △ Less

Submitted 1 June, 2022; originally announced June 2022.

arXiv:2204.11981 [pdf, other]

End-to-end Mapping in Heterogeneous Systems Using Graph Representation Learning

Authors: Yao Xiao, Guixiang Ma, Nesreen K. Ahmed, Mihai Capota, Theodore Willke, Shahin Nazarian, Paul Bogdan

Abstract: To enable heterogeneous computing systems with autonomous programming and optimization capabilities, we propose a unified, end-to-end, programmable graph representation learning (PGL) framework that is capable of mining the complexity of high-level programs down to the universal intermediate representation, extracting the specific computational patterns and predicting which code segments would run… ▽ More To enable heterogeneous computing systems with autonomous programming and optimization capabilities, we propose a unified, end-to-end, programmable graph representation learning (PGL) framework that is capable of mining the complexity of high-level programs down to the universal intermediate representation, extracting the specific computational patterns and predicting which code segments would run best on a specific core in heterogeneous hardware platforms. The proposed framework extracts multi-fractal topological features from code graphs, utilizes graph autoencoders to learn how to partition the graph into computational kernels, and exploits graph neural networks (GNN) to predict the correct assignment to a processor type. In the evaluation, we validate the PGL framework and demonstrate a maximum speedup of 6.42x compared to the thread-based execution, and 2.02x compared to the state-of-the-art technique. △ Less

Submitted 25 April, 2022; originally announced April 2022.

arXiv:2204.04837 [pdf, other]

doi 10.1109/TII.2022.3164770

Dependable Intrusion Detection System for IoT: A Deep Transfer Learning-based Approach

Authors: Sk. Tanzir Mehedi, Adnan Anwar, Ziaur Rahman, Kawsar Ahmed, Rafiqul Islam

Abstract: Security concerns for IoT applications have been alarming because of their widespread use in different enterprise systems. The potential threats to these applications are constantly emerging and changing, and therefore, sophisticated and dependable defense solutions are necessary against such threats. With the rapid development of IoT networks and evolving threat types, the traditional machine lea… ▽ More Security concerns for IoT applications have been alarming because of their widespread use in different enterprise systems. The potential threats to these applications are constantly emerging and changing, and therefore, sophisticated and dependable defense solutions are necessary against such threats. With the rapid development of IoT networks and evolving threat types, the traditional machine learning-based IDS must update to cope with the security requirements of the current sustainable IoT environment. In recent years, deep learning, and deep transfer learning have progressed and experienced great success in different fields and have emerged as a potential solution for dependable network intrusion detection. However, new and emerging challenges have arisen related to the accuracy, efficiency, scalability, and dependability of the traditional IDS in a heterogeneous IoT setup. This manuscript proposes a deep transfer learning-based dependable IDS model that outperforms several existing approaches. The unique contributions include effective attribute selection, which is best suited to identify normal and attack scenarios for a small amount of labeled data, designing a dependable deep transfer learning-based ResNet model, and evaluating considering real-world data. To this end, a comprehensive experimental performance evaluation has been conducted. Extensive analysis and performance evaluation show that the proposed model is robust, more efficient, and has demonstrated better performance, ensuring dependability. △ Less

Submitted 10 April, 2022; originally announced April 2022.

Comments: 12 pages, 13 Figures, 4 tables IEEE Transaction

ACM Class: I.2; K.3.2

Journal ref: IEEE Transactions on Industrial Informatics, 2022

arXiv:2202.03427 [pdf]

Performance Evaluation of Infrared Image Enhancement Techniques

Authors: Rania Gaber, AbdElmgied Ali, Kareem Ahmed

Abstract: Infrared (IR) images are widely used in many fields such as medical imaging, object tracking, astronomy and military purposes for securing borders. Infrared images can be captured day or night based on the type of capturing device. The capturing devices use electromagnetic radiation with longer wavelengths. There are several types of IR radiation based on the range of wavelength and corresponding… ▽ More Infrared (IR) images are widely used in many fields such as medical imaging, object tracking, astronomy and military purposes for securing borders. Infrared images can be captured day or night based on the type of capturing device. The capturing devices use electromagnetic radiation with longer wavelengths. There are several types of IR radiation based on the range of wavelength and corresponding frequency. Due to noising and other artifacts, IR images are not clearly visible. In this paper, we present a complete up-todate survey on IR imaging enhancement techniques. The survey includes IR radiation types and devices and existing IR datasets. The survey covers spatial enhancement techniques, frequency-domain based enhancement techniques and Deep learning-based techniques. △ Less

Submitted 6 February, 2022; originally announced February 2022.

Showing 1–50 of 111 results for author: Ahmed, K