subscribe to arXiv mailings

Privacy-Preserving Data Deduplication for Enhancing Federated Learning of Language Models

Authors: Aydin Abadi, Vishnu Asutosh Dasu, Sumanta Sarkar

Abstract: Deduplication is a vital preprocessing step that enhances machine learning model performance and saves training time and energy. However, enhancing federated learning through deduplication poses challenges, especially regarding scalability and potential privacy violations if deduplication involves sharing all clients' data. In this paper, we address the problem of deduplication in a federated setu… ▽ More Deduplication is a vital preprocessing step that enhances machine learning model performance and saves training time and energy. However, enhancing federated learning through deduplication poses challenges, especially regarding scalability and potential privacy violations if deduplication involves sharing all clients' data. In this paper, we address the problem of deduplication in a federated setup by introducing a pioneering protocol, Efficient Privacy-Preserving Multi-Party Deduplication (EP-MPD). It efficiently removes duplicates from multiple clients' datasets without compromising data privacy. EP-MPD is constructed in a modular fashion, utilizing two novel variants of the Private Set Intersection protocol. Our extensive experiments demonstrate the significant benefits of deduplication in federated learning of large language models. For instance, we observe up to 19.61% improvement in perplexity and up to 27.95% reduction in running time. EP-MPD effectively balances privacy and performance in federated learning, making it a valuable solution for large-scale applications. △ Less

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.06538 [pdf, other]

Enhancing Low-Resource NMT with a Multilingual Encoder and Knowledge Distillation: A Case Study

Authors: Aniruddha Roy, Pretam Ray, Ayush Maheshwari, Sudeshna Sarkar, Pawan Goyal

Abstract: Neural Machine Translation (NMT) remains a formidable challenge, especially when dealing with low-resource languages. Pre-trained sequence-to-sequence (seq2seq) multi-lingual models, such as mBART-50, have demonstrated impressive performance in various low-resource NMT tasks. However, their pre-training has been confined to 50 languages, leaving out support for numerous low-resource languages, par… ▽ More Neural Machine Translation (NMT) remains a formidable challenge, especially when dealing with low-resource languages. Pre-trained sequence-to-sequence (seq2seq) multi-lingual models, such as mBART-50, have demonstrated impressive performance in various low-resource NMT tasks. However, their pre-training has been confined to 50 languages, leaving out support for numerous low-resource languages, particularly those spoken in the Indian subcontinent. Expanding mBART-50's language support requires complex pre-training, risking performance decline due to catastrophic forgetting. Considering these expanding challenges, this paper explores a framework that leverages the benefits of a pre-trained language model along with knowledge distillation in a seq2seq architecture to facilitate translation for low-resource languages, including those not covered by mBART-50. The proposed framework employs a multilingual encoder-based seq2seq model as the foundational architecture and subsequently uses complementary knowledge distillation techniques to mitigate the impact of imbalanced training. Our framework is evaluated on three low-resource Indic languages in four Indic-to-Indic directions, yielding significant BLEU-4 and chrF improvements over baselines. Further, we conduct human evaluation to confirm effectiveness of our approach. Our code is publicly available at https://github.com/raypretam/Two-step-low-res-NMT. △ Less

Submitted 9 July, 2024; originally announced July 2024.

Comments: Published at Seventh LoResMT Workshop at ACL 2024

arXiv:2406.17720 [pdf, other]

Arboretum: A Large Multimodal Dataset Enabling AI for Biodiversity

Authors: Chih-Hsuan Yang, Benjamin Feuer, Zaki Jubery, Zi K. Deng, Andre Nakkab, Md Zahid Hasan, Shivani Chiranjeevi, Kelly Marshall, Nirmal Baishnab, Asheesh K Singh, Arti Singh, Soumik Sarkar, Nirav Merchant, Chinmay Hegde, Baskar Ganapathysubramanian

Abstract: We introduce Arboretum, the largest publicly accessible dataset designed to advance AI for biodiversity applications. This dataset, curated from the iNaturalist community science platform and vetted by domain experts to ensure accuracy, includes 134.6 million images, surpassing existing datasets in scale by an order of magnitude. The dataset encompasses image-language paired data for a diverse set… ▽ More We introduce Arboretum, the largest publicly accessible dataset designed to advance AI for biodiversity applications. This dataset, curated from the iNaturalist community science platform and vetted by domain experts to ensure accuracy, includes 134.6 million images, surpassing existing datasets in scale by an order of magnitude. The dataset encompasses image-language paired data for a diverse set of species from birds (Aves), spiders/ticks/mites (Arachnida), insects (Insecta), plants (Plantae), fungus/mushrooms (Fungi), snails (Mollusca), and snakes/lizards (Reptilia), making it a valuable resource for multimodal vision-language AI models for biodiversity assessment and agriculture research. Each image is annotated with scientific names, taxonomic details, and common names, enhancing the robustness of AI model training. We showcase the value of Arboretum by releasing a suite of CLIP models trained using a subset of 40 million captioned images. We introduce several new benchmarks for rigorous assessment, report accuracy for zero-shot learning, and evaluations across life stages, rare species, confounding species, and various levels of the taxonomic hierarchy. We anticipate that Arboretum will spur the development of AI models that can enable a variety of digital tools ranging from pest control strategies, crop monitoring, and worldwide biodiversity assessment and environmental conservation. These advancements are critical for ensuring food security, preserving ecosystems, and mitigating the impacts of climate change. Arboretum is publicly available, easily accessible, and ready for immediate use. Please see the \href{https://baskargroup.github.io/Arboretum/}{project website} for links to our data, models, and code. △ Less

Submitted 25 June, 2024; originally announced June 2024.

Comments: Preprint under review

arXiv:2406.15211 [pdf, other]

How Effective is GPT-4 Turbo in Generating School-Level Questions from Textbooks Based on Bloom's Revised Taxonomy?

Authors: Subhankar Maity, Aniket Deroy, Sudeshna Sarkar

Abstract: We evaluate the effectiveness of GPT-4 Turbo in generating educational questions from NCERT textbooks in zero-shot mode. Our study highlights GPT-4 Turbo's ability to generate questions that require higher-order thinking skills, especially at the "understanding" level according to Bloom's Revised Taxonomy. While we find a notable consistency between questions generated by GPT-4 Turbo and those ass… ▽ More We evaluate the effectiveness of GPT-4 Turbo in generating educational questions from NCERT textbooks in zero-shot mode. Our study highlights GPT-4 Turbo's ability to generate questions that require higher-order thinking skills, especially at the "understanding" level according to Bloom's Revised Taxonomy. While we find a notable consistency between questions generated by GPT-4 Turbo and those assessed by humans in terms of complexity, there are occasional differences. Our evaluation also uncovers variations in how humans and machines evaluate question quality, with a trend inversely related to Bloom's Revised Taxonomy levels. These findings suggest that while GPT-4 Turbo is a promising tool for educational question generation, its efficacy varies across different cognitive levels, indicating a need for further refinement to fully meet educational standards. △ Less

Submitted 21 June, 2024; originally announced June 2024.

Comments: Accepted at Learnersourcing: Student-Generated Content @ Scale 2024

arXiv:2406.15128 [pdf, other]

A Wavelet Guided Attention Module for Skin Cancer Classification with Gradient-based Feature Fusion

Authors: Ayush Roy, Sujan Sarkar, Sohom Ghosal, Dmitrii Kaplun, Asya Lyanova, Ram Sarkar

Abstract: Skin cancer is a highly dangerous type of cancer that requires an accurate diagnosis from experienced physicians. To help physicians diagnose skin cancer more efficiently, a computer-aided diagnosis (CAD) system can be very helpful. In this paper, we propose a novel model, which uses a novel attention mechanism to pinpoint the differences in features across the spatial dimensions and symmetry of t… ▽ More Skin cancer is a highly dangerous type of cancer that requires an accurate diagnosis from experienced physicians. To help physicians diagnose skin cancer more efficiently, a computer-aided diagnosis (CAD) system can be very helpful. In this paper, we propose a novel model, which uses a novel attention mechanism to pinpoint the differences in features across the spatial dimensions and symmetry of the lesion, thereby focusing on the dissimilarities of various classes based on symmetry, uniformity in texture and color, etc. Additionally, to take into account the variations in the boundaries of the lesions for different classes, we employ a gradient-based fusion of wavelet and soft attention-aided features to extract boundary information of skin lesions. We have tested our model on the multi-class and highly class-imbalanced dataset, called HAM10000, and achieved promising results, with a 91.17\% F1-score and 90.75\% accuracy. The code is made available at: https://github.com/AyushRoy2001/WAGF-Fusion. △ Less

Submitted 21 June, 2024; originally announced June 2024.

arXiv:2406.13081 [pdf, other]

Class-specific Data Augmentation for Plant Stress Classification

Authors: Nasla Saleem, Aditya Balu, Talukder Zaki Jubery, Arti Singh, Asheesh K. Singh, Soumik Sarkar, Baskar Ganapathysubramanian

Abstract: Data augmentation is a powerful tool for improving deep learning-based image classifiers for plant stress identification and classification. However, selecting an effective set of augmentations from a large pool of candidates remains a key challenge, particularly in imbalanced and confounding datasets. We propose an approach for automated class-specific data augmentation using a genetic algorithm.… ▽ More Data augmentation is a powerful tool for improving deep learning-based image classifiers for plant stress identification and classification. However, selecting an effective set of augmentations from a large pool of candidates remains a key challenge, particularly in imbalanced and confounding datasets. We propose an approach for automated class-specific data augmentation using a genetic algorithm. We demonstrate the utility of our approach on soybean [Glycine max (L.) Merr] stress classification where symptoms are observed on leaves; a particularly challenging problem due to confounding classes in the dataset. Our approach yields substantial performance, achieving a mean-per-class accuracy of 97.61% and an overall accuracy of 98% on the soybean leaf stress dataset. Our method significantly improves the accuracy of the most challenging classes, with notable enhancements from 83.01% to 88.89% and from 85.71% to 94.05%, respectively. A key observation we make in this study is that high-performing augmentation strategies can be identified in a computationally efficient manner. We fine-tune only the linear layer of the baseline model with different augmentations, thereby reducing the computational burden associated with training classifiers from scratch for each augmentation policy while achieving exceptional performance. This research represents an advancement in automated data augmentation strategies for plant stress classification, particularly in the context of confounding datasets. Our findings contribute to the growing body of research in tailored augmentation techniques and their potential impact on disease management strategies, crop yields, and global food security. The proposed approach holds the potential to enhance the accuracy and efficiency of deep learning-based tools for managing plant stresses in agriculture. △ Less

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.11868 [pdf, other]

Ethical Framework for Responsible Foundational Models in Medical Imaging

Authors: Abhijit Das, Debesh Jha, Jasmer Sanjotra, Onkar Susladkar, Suramyaa Sarkar, Ashish Rauniyar, Nikhil Tomar, Vanshali Sharma, Ulas Bagci

Abstract: Foundational models (FMs) have tremendous potential to revolutionize medical imaging. However, their deployment in real-world clinical settings demands extensive ethical considerations. This paper aims to highlight the ethical concerns related to FMs and propose a framework to guide their responsible development and implementation within medicine. We meticulously examine ethical issues such as pri… ▽ More Foundational models (FMs) have tremendous potential to revolutionize medical imaging. However, their deployment in real-world clinical settings demands extensive ethical considerations. This paper aims to highlight the ethical concerns related to FMs and propose a framework to guide their responsible development and implementation within medicine. We meticulously examine ethical issues such as privacy of patient data, bias mitigation, algorithmic transparency, explainability and accountability. The proposed framework is designed to prioritize patient welfare, mitigate potential risks, and foster trust in AI-assisted healthcare. △ Less

Submitted 13 April, 2024; originally announced June 2024.

arXiv:2406.00039 [pdf]

How Ready Are Generative Pre-trained Large Language Models for Explaining Bengali Grammatical Errors?

Authors: Subhankar Maity, Aniket Deroy, Sudeshna Sarkar

Abstract: Grammatical error correction (GEC) tools, powered by advanced generative artificial intelligence (AI), competently correct linguistic inaccuracies in user input. However, they often fall short in providing essential natural language explanations, which are crucial for learning languages and gaining a deeper understanding of the grammatical rules. There is limited exploration of these tools in low-… ▽ More Grammatical error correction (GEC) tools, powered by advanced generative artificial intelligence (AI), competently correct linguistic inaccuracies in user input. However, they often fall short in providing essential natural language explanations, which are crucial for learning languages and gaining a deeper understanding of the grammatical rules. There is limited exploration of these tools in low-resource languages such as Bengali. In such languages, grammatical error explanation (GEE) systems should not only correct sentences but also provide explanations for errors. This comprehensive approach can help language learners in their quest for proficiency. Our work introduces a real-world, multi-domain dataset sourced from Bengali speakers of varying proficiency levels and linguistic complexities. This dataset serves as an evaluation benchmark for GEE systems, allowing them to use context information to generate meaningful explanations and high-quality corrections. Various generative pre-trained large language models (LLMs), including GPT-4 Turbo, GPT-3.5 Turbo, Text-davinci-003, Text-babbage-001, Text-curie-001, Text-ada-001, Llama-2-7b, Llama-2-13b, and Llama-2-70b, are assessed against human experts for performance comparison. Our research underscores the limitations in the automatic deployment of current state-of-the-art generative pre-trained LLMs for Bengali GEE. Advocating for human intervention, our findings propose incorporating manual checks to address grammatical errors and improve feedback quality. This approach presents a more suitable strategy to refine the GEC tools in Bengali, emphasizing the educational aspect of language learning. △ Less

Submitted 27 May, 2024; originally announced June 2024.

Comments: Accepted at Educational Data Mining 2024

arXiv:2405.11579 [pdf, ps, other]

Exploring the Capabilities of Prompted Large Language Models in Educational and Assessment Applications

Authors: Subhankar Maity, Aniket Deroy, Sudeshna Sarkar

Abstract: In the era of generative artificial intelligence (AI), the fusion of large language models (LLMs) offers unprecedented opportunities for innovation in the field of modern education. We embark on an exploration of prompted LLMs within the context of educational and assessment applications to uncover their potential. Through a series of carefully crafted research questions, we investigate the effect… ▽ More In the era of generative artificial intelligence (AI), the fusion of large language models (LLMs) offers unprecedented opportunities for innovation in the field of modern education. We embark on an exploration of prompted LLMs within the context of educational and assessment applications to uncover their potential. Through a series of carefully crafted research questions, we investigate the effectiveness of prompt-based techniques in generating open-ended questions from school-level textbooks, assess their efficiency in generating open-ended questions from undergraduate-level technical textbooks, and explore the feasibility of employing a chain-of-thought inspired multi-stage prompting approach for language-agnostic multiple-choice question (MCQ) generation. Additionally, we evaluate the ability of prompted LLMs for language learning, exemplified through a case study in the low-resource Indian language Bengali, to explain Bengali grammatical errors. We also evaluate the potential of prompted LLMs to assess human resource (HR) spoken interview transcripts. By juxtaposing the capabilities of LLMs with those of human experts across various educational tasks and domains, our aim is to shed light on the potential and limitations of LLMs in reshaping educational practices. △ Less

Submitted 19 May, 2024; originally announced May 2024.

Comments: Accepted at EDM 2024

arXiv:2405.10951 [pdf, other]

Block Selective Reprogramming for On-device Training of Vision Transformers

Authors: Sreetama Sarkar, Souvik Kundu, Kai Zheng, Peter A. Beerel

Abstract: The ubiquity of vision transformers (ViTs) for various edge applications, including personalized learning, has created the demand for on-device fine-tuning. However, training with the limited memory and computation power of edge devices remains a significant challenge. In particular, the memory required for training is much higher than that needed for inference, primarily due to the need to store… ▽ More The ubiquity of vision transformers (ViTs) for various edge applications, including personalized learning, has created the demand for on-device fine-tuning. However, training with the limited memory and computation power of edge devices remains a significant challenge. In particular, the memory required for training is much higher than that needed for inference, primarily due to the need to store activations across all layers in order to compute the gradients needed for weight updates. Previous works have explored reducing this memory requirement via frozen-weight training as well storing the activations in a compressed format. However, these methods are deemed inefficient due to their inability to provide training or inference speedup. In this paper, we first investigate the limitations of existing on-device training methods aimed at reducing memory and compute requirements. We then present block selective reprogramming (BSR) in which we fine-tune only a fraction of total blocks of a pre-trained model and selectively drop tokens based on self-attention scores of the frozen layers. To show the efficacy of BSR, we present extensive evaluations on ViT-B and DeiT-S with five different datasets. Compared to the existing alternatives, our approach simultaneously reduces training memory by up to 1.4x and compute cost by up to 2x while maintaining similar accuracy. We also showcase results for Mixture-of-Expert (MoE) models, demonstrating the effectiveness of our approach in multitask learning scenarios. △ Less

Submitted 25 March, 2024; originally announced May 2024.

arXiv:2405.08751 [pdf, other]

From Text to Context: An Entailment Approach for News Stakeholder Classification

Authors: Alapan Kuila, Sudeshna Sarkar

Abstract: Navigating the complex landscape of news articles involves understanding the various actors or entities involved, referred to as news stakeholders. These stakeholders, ranging from policymakers to opposition figures, citizens, and more, play pivotal roles in shaping news narratives. Recognizing their stakeholder types, reflecting their roles, political alignments, social standing, and more, is par… ▽ More Navigating the complex landscape of news articles involves understanding the various actors or entities involved, referred to as news stakeholders. These stakeholders, ranging from policymakers to opposition figures, citizens, and more, play pivotal roles in shaping news narratives. Recognizing their stakeholder types, reflecting their roles, political alignments, social standing, and more, is paramount for a nuanced comprehension of news content. Despite existing works focusing on salient entity extraction, coverage variations, and political affiliations through social media data, the automated detection of stakeholder roles within news content remains an underexplored domain. In this paper, we bridge this gap by introducing an effective approach to classify stakeholder types in news articles. Our method involves transforming the stakeholder classification problem into a natural language inference task, utilizing contextual information from news articles and external knowledge to enhance the accuracy of stakeholder type detection. Moreover, our proposed model showcases efficacy in zero-shot settings, further extending its applicability to diverse news contexts. △ Less

Submitted 14 May, 2024; originally announced May 2024.

Comments: Accepted in SIGIR 2024

arXiv:2405.00012 [pdf, other]

A quantum neural network framework for scalable quantum circuit approximation of unitary matrices

Authors: Rohit Sarma Sarkar, Bibhas Adhikari

Abstract: In this paper, we develop a Lie group theoretic approach for parametric representation of unitary matrices. This leads to develop a quantum neural network framework for quantum circuit approximation of multi-qubit unitary gates. Layers of the neural networks are defined by product of exponential of certain elements of the Standard Recursive Block Basis, which we introduce as an alternative to Paul… ▽ More In this paper, we develop a Lie group theoretic approach for parametric representation of unitary matrices. This leads to develop a quantum neural network framework for quantum circuit approximation of multi-qubit unitary gates. Layers of the neural networks are defined by product of exponential of certain elements of the Standard Recursive Block Basis, which we introduce as an alternative to Pauli string basis for matrix algebra of complex matrices of order $2^n$. The recursive construction of the neural networks implies that the quantum circuit approximation is scalable i.e. quantum circuit for an $(n+1)$-qubit unitary can be constructed from the circuit of $n$-qubit system by adding a few CNOT gates and single-qubit gates. △ Less

Submitted 7 February, 2024; originally announced May 2024.

Comments: 58 pages. arXiv admin note: substantial text overlap with arXiv:2304.14096

arXiv:2404.12498 [pdf]

A Configurable Pythonic Data Center Model for Sustainable Cooling and ML Integration

Authors: Avisek Naug, Antonio Guillen, Ricardo Luna Gutierrez, Vineet Gundecha, Sahand Ghorbanpour, Sajad Mousavi, Ashwin Ramesh Babu, Soumyendu Sarkar

Abstract: There have been growing discussions on estimating and subsequently reducing the operational carbon footprint of enterprise data centers. The design and intelligent control for data centers have an important impact on data center carbon footprint. In this paper, we showcase PyDCM, a Python library that enables extremely fast prototyping of data center design and applies reinforcement learning-enabl… ▽ More There have been growing discussions on estimating and subsequently reducing the operational carbon footprint of enterprise data centers. The design and intelligent control for data centers have an important impact on data center carbon footprint. In this paper, we showcase PyDCM, a Python library that enables extremely fast prototyping of data center design and applies reinforcement learning-enabled control with the purpose of evaluating key sustainability metrics including carbon footprint, energy consumption, and observing temperature hotspots. We demonstrate these capabilities of PyDCM and compare them to existing works in EnergyPlus for modeling data centers. PyDCM can also be used as a standalone Gymnasium environment for demonstrating sustainability-focused data center control. △ Less

Submitted 18 April, 2024; originally announced April 2024.

Comments: NeurIPS 2023 Workshop on Tackling Climate Change with Machine Learning https://www.climatechange.ai/papers/neurips2023/15. arXiv admin note: substantial text overlap with arXiv:2310.03906

arXiv:2404.10991 [pdf]

doi 10.24963/ijcai.2023/688

Function Approximation for Reinforcement Learning Controller for Energy from Spread Waves

Authors: Soumyendu Sarkar, Vineet Gundecha, Sahand Ghorbanpour, Alexander Shmakov, Ashwin Ramesh Babu, Avisek Naug, Alexandre Pichard, Mathieu Cocho

Abstract: The industrial multi-generator Wave Energy Converters (WEC) must handle multiple simultaneous waves coming from different directions called spread waves. These complex devices in challenging circumstances need controllers with multiple objectives of energy capture efficiency, reduction of structural stress to limit maintenance, and proactive protection against high waves. The Multi-Agent Reinforce… ▽ More The industrial multi-generator Wave Energy Converters (WEC) must handle multiple simultaneous waves coming from different directions called spread waves. These complex devices in challenging circumstances need controllers with multiple objectives of energy capture efficiency, reduction of structural stress to limit maintenance, and proactive protection against high waves. The Multi-Agent Reinforcement Learning (MARL) controller trained with the Proximal Policy Optimization (PPO) algorithm can handle these complexities. In this paper, we explore different function approximations for the policy and critic networks in modeling the sequential nature of the system dynamics and find that they are key to better performance. We investigated the performance of a fully connected neural network (FCN), LSTM, and Transformer model variants with varying depths and gated residual connections. Our results show that the transformer model of moderate depth with gated residual connections around the multi-head attention, multi-layer perceptron, and the transformer block (STrXL) proposed in this paper is optimal and boosts energy efficiency by an average of 22.1% for these complex spread waves over the existing spring damper (SD) controller. Furthermore, unlike the default SD controller, the transformer controller almost eliminated the mechanical stress from the rotational yaw motion for angled waves. Demo: https://tinyurl.com/yueda3jh △ Less

Submitted 16 April, 2024; originally announced April 2024.

Comments: IJCAI 2023, Proceedings of the Thirty-Second International Joint Conference on Artificial IntelligenceAugust 2023

Journal ref: IJCAI 2023, Proceedings of the Thirty-Second International Joint Conference on Artificial IntelligenceAugust 2023, Article No 688, Pages 6201 to 6209

arXiv:2404.10786 [pdf]

doi 10.1609/aaai.v38i20.30238

Sustainability of Data Center Digital Twins with Reinforcement Learning

Authors: Soumyendu Sarkar, Avisek Naug, Antonio Guillen, Ricardo Luna, Vineet Gundecha, Ashwin Ramesh Babu, Sajad Mousavi

Abstract: The rapid growth of machine learning (ML) has led to an increased demand for computational power, resulting in larger data centers (DCs) and higher energy consumption. To address this issue and reduce carbon emissions, intelligent design and control of DC components such as IT servers, cabinets, HVAC cooling, flexible load shifting, and battery energy storage are essential. However, the complexity… ▽ More The rapid growth of machine learning (ML) has led to an increased demand for computational power, resulting in larger data centers (DCs) and higher energy consumption. To address this issue and reduce carbon emissions, intelligent design and control of DC components such as IT servers, cabinets, HVAC cooling, flexible load shifting, and battery energy storage are essential. However, the complexity of designing and controlling them in tandem presents a significant challenge. While some individual components like CFD-based design and Reinforcement Learning (RL) based HVAC control have been researched, there's a gap in the holistic design and optimization covering all elements simultaneously. To tackle this, we've developed DCRL-Green, a multi-agent RL environment that empowers the ML community to design data centers and research, develop, and refine RL controllers for carbon footprint reduction in DCs. It is a flexible, modular, scalable, and configurable platform that can handle large High Performance Computing (HPC) clusters. Furthermore, in its default setup, DCRL-Green provides a benchmark for evaluating single as well as multi-agent RL algorithms. It easily allows users to subclass the default implementations and design their own control approaches, encouraging community development for sustainable data centers. Open Source Link: https://github.com/HewlettPackard/dc-rl △ Less

Submitted 16 April, 2024; originally announced April 2024.

Comments: 2024 Proceedings of the AAAI Conference on Artificial Intelligence

Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 20, pp. 22322-22330, Mar. 2024

arXiv:2404.10575 [pdf, other]

EMC$^2$: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence

Authors: Chung-Yiu Yau, Hoi-To Wai, Parameswaran Raman, Soumajyoti Sarkar, Mingyi Hong

Abstract: A key challenge in contrastive learning is to generate negative samples from a large sample set to contrast with positive samples, for learning better encoding of the data. These negative samples often follow a softmax distribution which are dynamically updated during the training process. However, sampling from this distribution is non-trivial due to the high computational costs in computing the… ▽ More A key challenge in contrastive learning is to generate negative samples from a large sample set to contrast with positive samples, for learning better encoding of the data. These negative samples often follow a softmax distribution which are dynamically updated during the training process. However, sampling from this distribution is non-trivial due to the high computational costs in computing the partition function. In this paper, we propose an Efficient Markov Chain Monte Carlo negative sampling method for Contrastive learning (EMC$^2$). We follow the global contrastive learning loss as introduced in SogCLR, and propose EMC$^2$ which utilizes an adaptive Metropolis-Hastings subroutine to generate hardness-aware negative samples in an online fashion during the optimization. We prove that EMC$^2$ finds an $\mathcal{O}(1/\sqrt{T})$-stationary point of the global contrastive loss in $T$ iterations. Compared to prior works, EMC$^2$ is the first algorithm that exhibits global convergence (to stationarity) regardless of the choice of batch size while exhibiting low computation and memory cost. Numerical experiments validate that EMC$^2$ is effective with small batch training and achieves comparable or better performance than baseline algorithms. We report the results for pre-training image encoders on STL-10 and Imagenet-100. △ Less

Submitted 16 April, 2024; originally announced April 2024.

Comments: 20 pages

arXiv:2404.10204 [pdf]

doi 10.13140/RG.2.2.18751.48804

The Impact of Machine Learning on Society: An Analysis of Current Trends and Future Implications

Authors: Md Kamrul Hossain Siam, Manidipa Bhattacharjee, Shakik Mahmud, Md. Saem Sarkar, Md. Masud Rana

Abstract: The Machine learning (ML) is a rapidly evolving field of technology that has the potential to greatly impact society in a variety of ways. However, there are also concerns about the potential negative effects of ML on society, such as job displacement and privacy issues. This research aimed to conduct a comprehensive analysis of the current and future impact of ML on society. The research included… ▽ More The Machine learning (ML) is a rapidly evolving field of technology that has the potential to greatly impact society in a variety of ways. However, there are also concerns about the potential negative effects of ML on society, such as job displacement and privacy issues. This research aimed to conduct a comprehensive analysis of the current and future impact of ML on society. The research included a thorough literature review, case studies, and surveys to gather data on the economic impact of ML, ethical and privacy implications, and public perceptions of the technology. The survey was conducted on 150 respondents from different areas. The case studies conducted were on the impact of ML on healthcare, finance, transportation, and manufacturing. The findings of this research revealed that the majority of respondents have a moderate level of familiarity with the concept of ML, believe that it has the potential to benefit society, and think that society should prioritize the development and use of ML. Based on these findings, it was recommended that more research is conducted on the impact of ML on society, stronger regulations and laws to protect the privacy and rights of individuals when it comes to ML should be developed, transparency and accountability in ML decision-making processes should be increased, and public education and awareness about ML should be enhanced. △ Less

Submitted 15 April, 2024; originally announced April 2024.

Comments: 12 pages

arXiv:2404.08079 [pdf, other]

DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models

Authors: Nastaran Saadati, Minh Pham, Nasla Saleem, Joshua R. Waite, Aditya Balu, Zhanhong Jiang, Chinmay Hegde, Soumik Sarkar

Abstract: Recent advances in decentralized deep learning algorithms have demonstrated cutting-edge performance on various tasks with large pre-trained models. However, a pivotal prerequisite for achieving this level of competitiveness is the significant communication and computation overheads when updating these models, which prohibits the applications of them to real-world scenarios. To address this issue,… ▽ More Recent advances in decentralized deep learning algorithms have demonstrated cutting-edge performance on various tasks with large pre-trained models. However, a pivotal prerequisite for achieving this level of competitiveness is the significant communication and computation overheads when updating these models, which prohibits the applications of them to real-world scenarios. To address this issue, drawing inspiration from advanced model merging techniques without requiring additional training, we introduce the Decentralized Iterative Merging-And-Training (DIMAT) paradigm--a novel decentralized deep learning framework. Within DIMAT, each agent is trained on their local data and periodically merged with their neighboring agents using advanced model merging techniques like activation matching until convergence is achieved. DIMAT provably converges with the best available rate for nonconvex functions with various first-order methods, while yielding tighter error bounds compared to the popular existing approaches. We conduct a comprehensive empirical analysis to validate DIMAT's superiority over baselines across diverse computer vision tasks sourced from multiple datasets. Empirical results validate our theoretical claims by showing that DIMAT attains faster and higher initial gain in accuracy with independent and identically distributed (IID) and non-IID data, incurring lower communication overhead. This DIMAT paradigm presents a new opportunity for the future decentralized learning, enhancing its adaptability to real-world with sparse and light-weight communication and computation. △ Less

Submitted 11 April, 2024; originally announced April 2024.

Comments: CVPR 2024 accepted paper, 22 pages, 12 figures

arXiv:2404.06423 [pdf, other]

Deep Reinforcement Learning-Based Approach for a Single Vehicle Persistent Surveillance Problem with Fuel Constraints

Authors: Manav Mishra, Hritik Bana, Saswata Sarkar, Sujeevraja Sanjeevi, PB Sujit, Kaarthik Sundar

Abstract: This article presents a deep reinforcement learning-based approach to tackle a persistent surveillance mission requiring a single unmanned aerial vehicle initially stationed at a depot with fuel or time-of-flight constraints to repeatedly visit a set of targets with equal priority. Owing to the vehicle's fuel or time-of-flight constraints, the vehicle must be regularly refueled, or its battery mus… ▽ More This article presents a deep reinforcement learning-based approach to tackle a persistent surveillance mission requiring a single unmanned aerial vehicle initially stationed at a depot with fuel or time-of-flight constraints to repeatedly visit a set of targets with equal priority. Owing to the vehicle's fuel or time-of-flight constraints, the vehicle must be regularly refueled, or its battery must be recharged at the depot. The objective of the problem is to determine an optimal sequence of visits to the targets that minimizes the maximum time elapsed between successive visits to any target while ensuring that the vehicle never runs out of fuel or charge. We present a deep reinforcement learning algorithm to solve this problem and present the results of numerical experiments that corroborate the effectiveness of this approach in comparison with common-sense greedy heuristics. △ Less

Submitted 2 May, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

Comments: 6 pages

Report number: LA-UR-24-23186

arXiv:2404.04361 [pdf, other]

Deciphering Political Entity Sentiment in News with Large Language Models: Zero-Shot and Few-Shot Strategies

Authors: Alapan Kuila, Sudeshna Sarkar

Abstract: Sentiment analysis plays a pivotal role in understanding public opinion, particularly in the political domain where the portrayal of entities in news articles influences public perception. In this paper, we investigate the effectiveness of Large Language Models (LLMs) in predicting entity-specific sentiment from political news articles. Leveraging zero-shot and few-shot strategies, we explore the… ▽ More Sentiment analysis plays a pivotal role in understanding public opinion, particularly in the political domain where the portrayal of entities in news articles influences public perception. In this paper, we investigate the effectiveness of Large Language Models (LLMs) in predicting entity-specific sentiment from political news articles. Leveraging zero-shot and few-shot strategies, we explore the capability of LLMs to discern sentiment towards political entities in news content. Employing a chain-of-thought (COT) approach augmented with rationale in few-shot in-context learning, we assess whether this method enhances sentiment prediction accuracy. Our evaluation on sentiment-labeled datasets demonstrates that LLMs, outperform fine-tuned BERT models in capturing entity-specific sentiment. We find that learning in-context significantly improves model performance, while the self-consistency mechanism enhances consistency in sentiment prediction. Despite the promising results, we observe inconsistencies in the effectiveness of the COT prompting method. Overall, our findings underscore the potential of LLMs in entity-centric sentiment analysis within the political news domain and highlight the importance of suitable prompting strategies and model architectures. △ Less

Submitted 5 April, 2024; originally announced April 2024.

Comments: Accepted in PoliticalNLP workshop co-located with LREC-COLING 2024

arXiv:2403.19062 [pdf, other]

GENESIS-RL: GEnerating Natural Edge-cases with Systematic Integration of Safety considerations and Reinforcement Learning

Authors: Hsin-Jung Yang, Joe Beck, Md Zahid Hasan, Ekin Beyazit, Subhadeep Chakraborty, Tichakorn Wongpiromsarn, Soumik Sarkar

Abstract: In the rapidly evolving field of autonomous systems, the safety and reliability of the system components are fundamental requirements. These components are often vulnerable to complex and unforeseen environments, making natural edge-case generation essential for enhancing system resilience. This paper presents GENESIS-RL, a novel framework that leverages system-level safety considerations and rein… ▽ More In the rapidly evolving field of autonomous systems, the safety and reliability of the system components are fundamental requirements. These components are often vulnerable to complex and unforeseen environments, making natural edge-case generation essential for enhancing system resilience. This paper presents GENESIS-RL, a novel framework that leverages system-level safety considerations and reinforcement learning techniques to systematically generate naturalistic edge cases. By simulating challenging conditions that mimic the real-world situations, our framework aims to rigorously test entire system's safety and reliability. Although demonstrated within the autonomous driving application, our methodology is adaptable across diverse autonomous systems. Our experimental validation, conducted on high-fidelity simulator underscores the overall effectiveness of this framework. △ Less

Submitted 27 March, 2024; originally announced March 2024.

arXiv:2403.18985 [pdf]

doi 10.1609/aaai.v38i21.30579

Robustness and Visual Explanation for Black Box Image, Video, and ECG Signal Classification with Reinforcement Learning

Authors: Soumyendu Sarkar, Ashwin Ramesh Babu, Sajad Mousavi, Vineet Gundecha, Avisek Naug, Sahand Ghorbanpour

Abstract: We present a generic Reinforcement Learning (RL) framework optimized for crafting adversarial attacks on different model types spanning from ECG signal analysis (1D), image classification (2D), and video classification (3D). The framework focuses on identifying sensitive regions and inducing misclassifications with minimal distortions and various distortion types. The novel RL method outperforms s… ▽ More We present a generic Reinforcement Learning (RL) framework optimized for crafting adversarial attacks on different model types spanning from ECG signal analysis (1D), image classification (2D), and video classification (3D). The framework focuses on identifying sensitive regions and inducing misclassifications with minimal distortions and various distortion types. The novel RL method outperforms state-of-the-art methods for all three applications, proving its efficiency. Our RL approach produces superior localization masks, enhancing interpretability for image classification and ECG analysis models. For applications such as ECG analysis, our platform highlights critical ECG segments for clinicians while ensuring resilience against prevalent distortions. This comprehensive tool aims to bolster both resilience with adversarial training and transparency across varied applications and data types. △ Less

Submitted 22 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

Comments: AAAI Proceedings reference: https://ojs.aaai.org/index.php/AAAI/article/view/30579

Journal ref: 2024 Proceedings of the AAAI Conference on Artificial Intelligence

arXiv:2403.14092 [pdf]

Carbon Footprint Reduction for Sustainable Data Centers in Real-Time

Authors: Soumyendu Sarkar, Avisek Naug, Ricardo Luna, Antonio Guillen, Vineet Gundecha, Sahand Ghorbanpour, Sajad Mousavi, Dejan Markovikj, Ashwin Ramesh Babu

Abstract: As machine learning workloads significantly increase energy consumption, sustainable data centers with low carbon emissions are becoming a top priority for governments and corporations worldwide. This requires a paradigm shift in optimizing power consumption in cooling and IT loads, shifting flexible loads based on the availability of renewable energy in the power grid, and leveraging battery stor… ▽ More As machine learning workloads significantly increase energy consumption, sustainable data centers with low carbon emissions are becoming a top priority for governments and corporations worldwide. This requires a paradigm shift in optimizing power consumption in cooling and IT loads, shifting flexible loads based on the availability of renewable energy in the power grid, and leveraging battery storage from the uninterrupted power supply in data centers, using collaborative agents. The complex association between these optimization strategies and their dependencies on variable external factors like weather and the power grid carbon intensity makes this a hard problem. Currently, a real-time controller to optimize all these goals simultaneously in a dynamic real-world setting is lacking. We propose a Data Center Carbon Footprint Reduction (DC-CFR) multi-agent Reinforcement Learning (MARL) framework that optimizes data centers for the multiple objectives of carbon footprint reduction, energy consumption, and energy cost. The results show that the DC-CFR MARL agents effectively resolved the complex interdependencies in optimizing cooling, load shifting, and energy storage in real-time for various locations under real-world dynamic weather and grid carbon intensity conditions. DC-CFR significantly outperformed the industry standard ASHRAE controller with a considerable reduction in carbon emissions (14.5%), energy usage (14.4%), and energy cost (13.7%) when evaluated over one year across multiple geographical regions. △ Less

Submitted 25 March, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

Journal ref: 2024 Proceedings of the AAAI Conference on Artificial Intelligence

arXiv:2403.07025 [pdf, other]

Enhancing Quantum Variational Algorithms with Zero Noise Extrapolation via Neural Networks

Authors: Subhasree Bhattacharjee, Soumyadip Sarkar, Kunal Das, Bikramjit Sarkar

Abstract: In the emergent realm of quantum computing, the Variational Quantum Eigensolver (VQE) stands out as a promising algorithm for solving complex quantum problems, especially in the noisy intermediate-scale quantum (NISQ) era. However, the ubiquitous presence of noise in quantum devices often limits the accuracy and reliability of VQE outcomes. This research introduces a novel approach to ameliorate t… ▽ More In the emergent realm of quantum computing, the Variational Quantum Eigensolver (VQE) stands out as a promising algorithm for solving complex quantum problems, especially in the noisy intermediate-scale quantum (NISQ) era. However, the ubiquitous presence of noise in quantum devices often limits the accuracy and reliability of VQE outcomes. This research introduces a novel approach to ameliorate this challenge by utilizing neural networks for zero noise extrapolation (ZNE) in VQE computations. By employing the Qiskit framework, we crafted parameterized quantum circuits using the RY-RZ ansatz and examined their behavior under varying levels of depolarizing noise. Our investigations spanned from determining the expectation values of a Hamiltonian, defined as a tensor product of Z operators, under different noise intensities to extracting the ground state energy. To bridge the observed outcomes under noise with the ideal noise-free scenario, we trained a Feed Forward Neural Network on the error probabilities and their associated expectation values. Remarkably, our model proficiently predicted the VQE outcome under hypothetical noise-free conditions. By juxtaposing the simulation results with real quantum device executions, we unveiled the discrepancies induced by noise and showcased the efficacy of our neural network-based ZNE technique in rectifying them. This integrative approach not only paves the way for enhanced accuracy in VQE computations on NISQ devices but also underlines the immense potential of hybrid quantum-classical paradigms in circumventing the challenges posed by quantum noise. Through this research, we envision a future where quantum algorithms can be reliably executed on noisy devices, bringing us one step closer to realizing the full potential of quantum computing. △ Less

Submitted 10 March, 2024; originally announced March 2024.

arXiv:2403.03920 [pdf, other]

Enhancing Instructional Quality: Leveraging Computer-Assisted Textual Analysis to Generate In-Depth Insights from Educational Artifacts

Authors: Zewei Tian, Min Sun, Alex Liu, Shawon Sarkar, Jing Liu

Abstract: This paper explores the transformative potential of computer-assisted textual analysis in enhancing instructional quality through in-depth insights from educational artifacts. We integrate Richard Elmore's Instructional Core Framework to examine how artificial intelligence (AI) and machine learning (ML) methods, particularly natural language processing (NLP), can analyze educational content, teach… ▽ More This paper explores the transformative potential of computer-assisted textual analysis in enhancing instructional quality through in-depth insights from educational artifacts. We integrate Richard Elmore's Instructional Core Framework to examine how artificial intelligence (AI) and machine learning (ML) methods, particularly natural language processing (NLP), can analyze educational content, teacher discourse, and student responses to foster instructional improvement. Through a comprehensive review and case studies within the Instructional Core Framework, we identify key areas where AI/ML integration offers significant advantages, including teacher coaching, student support, and content development. We unveil patterns that indicate AI/ML not only streamlines administrative tasks but also introduces novel pathways for personalized learning, providing actionable feedback for educators and contributing to a richer understanding of instructional dynamics. This paper emphasizes the importance of aligning AI/ML technologies with pedagogical goals to realize their full potential in educational settings, advocating for a balanced approach that considers ethical considerations, data quality, and the integration of human expertise. △ Less

Submitted 6 March, 2024; originally announced March 2024.

arXiv:2402.18751 [pdf, other]

Multi-Sensor and Multi-temporal High-Throughput Phenotyping for Monitoring and Early Detection of Water-Limiting Stress in Soybean

Authors: Sarah E. Jones, Timilehin Ayanlade, Benjamin Fallen, Talukder Z. Jubery, Arti Singh, Baskar Ganapathysubramanian, Soumik Sarkar, Asheesh K. Singh

Abstract: Soybean production is susceptible to biotic and abiotic stresses, exacerbated by extreme weather events. Water limiting stress, i.e. drought, emerges as a significant risk for soybean production, underscoring the need for advancements in stress monitoring for crop breeding and production. This project combines multi-modal information to identify the most effective and efficient automated methods t… ▽ More Soybean production is susceptible to biotic and abiotic stresses, exacerbated by extreme weather events. Water limiting stress, i.e. drought, emerges as a significant risk for soybean production, underscoring the need for advancements in stress monitoring for crop breeding and production. This project combines multi-modal information to identify the most effective and efficient automated methods to investigate drought response. We investigated a set of diverse soybean accessions using multiple sensors in a time series high-throughput phenotyping manner to: (1) develop a pipeline for rapid classification of soybean drought stress symptoms, and (2) investigate methods for early detection of drought stress. We utilized high-throughput time-series phenotyping using UAVs and sensors in conjunction with machine learning (ML) analytics, which offered a swift and efficient means of phenotyping. The red-edge and green bands were most effective to classify canopy wilting stress. The Red-Edge Chlorophyll Vegetation Index (RECI) successfully differentiated susceptible and tolerant soybean accessions prior to visual symptom development. We report pre-visual detection of soybean wilting using a combination of different vegetation indices. These results can contribute to early stress detection methodologies and rapid classification of drought responses in screening nurseries for breeding and production applications. △ Less

Submitted 28 February, 2024; originally announced February 2024.

Comments: 25 pages, 5 figures

arXiv:2402.17346 [pdf, other]

Understanding the training of PINNs for unsteady flow past a plunging foil through the lens of input subdomain level loss function gradients

Authors: Rahul Sundar, Didier Lucor, Sunetra Sarkar

Abstract: Recently immersed boundary method-inspired physics-informed neural networks (PINNs) including the moving boundary-enabled PINNs (MB-PINNs) have shown the ability to accurately reconstruct velocity and recover pressure as a hidden variable for unsteady flow past moving bodies. Considering flow past a plunging foil, MB-PINNs were trained with global physics loss relaxation and also in conjunction wi… ▽ More Recently immersed boundary method-inspired physics-informed neural networks (PINNs) including the moving boundary-enabled PINNs (MB-PINNs) have shown the ability to accurately reconstruct velocity and recover pressure as a hidden variable for unsteady flow past moving bodies. Considering flow past a plunging foil, MB-PINNs were trained with global physics loss relaxation and also in conjunction with a physics-based undersampling method, obtaining good accuracy. The purpose of this study was to investigate which input spatial subdomain contributes to the training under the effect of physics loss relaxation and physics-based undersampling. In the context of MB-PINNs training, three spatial zones: the moving body, wake, and outer zones were defined. To quantify which spatial zone drives the training, two novel metrics are computed from the zonal loss component gradient statistics and the proportion of sample points in each zone. Results confirm that the learning indeed depends on the combined effect of the zonal loss component gradients and the proportion of points in each zone. Moreover, the dominant input zones are also the ones that have the strongest solution gradients in some sense. △ Less

Submitted 27 February, 2024; originally announced February 2024.

arXiv:2402.17337 [pdf, other]

Massive parallelization and performance enhancement of an immersed boundary method based unsteady flow solver

Authors: Rahul Sundar, Dipanjan Majumdar, Chhote Lal Shah, Sunetra Sarkar

Abstract: High-fidelity simulations of unsteady fluid flow are now possible with advancements in high-performance computing hardware and software frameworks. Since computational fluid dynamics (CFD) computations are dominated by linear algebraic routines, they can be significantly accelerated through massive parallelization on graphics processing units (GPUs). Thus, GPU implementation of high-fidelity CFD s… ▽ More High-fidelity simulations of unsteady fluid flow are now possible with advancements in high-performance computing hardware and software frameworks. Since computational fluid dynamics (CFD) computations are dominated by linear algebraic routines, they can be significantly accelerated through massive parallelization on graphics processing units (GPUs). Thus, GPU implementation of high-fidelity CFD solvers is essential in reducing the turnaround time for quicker design space exploration. In the present work, an immersed boundary method (IBM) based in-house flow solver has been ported to the GPU using OpenACC, a compiler directive-based heterogeneous parallel programming framework. Out of various GPU porting pathways available, OpenACC was chosen because of its minimum code intrusion, low development time, and striking similarity with OpenMP, a similar directive-based shared memory programming framework. A detailed validation study and performance analysis of the parallel solver implementations on the CPU and GPU are presented. The GPU implementation shows a speedup up to the order $O(10)$ over the CPU parallel version and up to the order $O(10^2)$ over the serial code. The GPU implementation also scales well with increasing mesh size owing to the efficient utilization of the GPU processor cores. △ Less

Submitted 27 February, 2024; originally announced February 2024.

arXiv:2402.17008 [pdf, other]

Benchmarking LLMs on the Semantic Overlap Summarization Task

Authors: John Salvador, Naman Bansal, Mousumi Akter, Souvika Sarkar, Anupam Das, Shubhra Kanti Karmaker

Abstract: Semantic Overlap Summarization (SOS) is a constrained multi-document summarization task, where the constraint is to capture the common/overlapping information between two alternative narratives. While recent advancements in Large Language Models (LLMs) have achieved superior performance in numerous summarization tasks, a benchmarking study of the SOS task using LLMs is yet to be performed. As LLMs… ▽ More Semantic Overlap Summarization (SOS) is a constrained multi-document summarization task, where the constraint is to capture the common/overlapping information between two alternative narratives. While recent advancements in Large Language Models (LLMs) have achieved superior performance in numerous summarization tasks, a benchmarking study of the SOS task using LLMs is yet to be performed. As LLMs' responses are sensitive to slight variations in prompt design, a major challenge in conducting such a benchmarking study is to systematically explore a variety of prompts before drawing a reliable conclusion. Fortunately, very recently, the TELeR taxonomy has been proposed which can be used to design and explore various prompts for LLMs. Using this TELeR taxonomy and 15 popular LLMs, this paper comprehensively evaluates LLMs on the SOS Task, assessing their ability to summarize overlapping information from multiple alternative narratives. For evaluation, we report well-established metrics like ROUGE, BERTscore, and SEM-F1$ on two different datasets of alternative narratives. We conclude the paper by analyzing the strengths and limitations of various LLMs in terms of their capabilities in capturing overlapping information The code and datasets used to conduct this study are available at https://anonymous.4open.science/r/llm_eval-E16D. △ Less

Submitted 26 February, 2024; originally announced February 2024.

arXiv:2402.15589 [pdf, other]

Prompting LLMs to Compose Meta-Review Drafts from Peer-Review Narratives of Scholarly Manuscripts

Authors: Shubhra Kanti Karmaker Santu, Sanjeev Kumar Sinha, Naman Bansal, Alex Knipper, Souvika Sarkar, John Salvador, Yash Mahajan, Sri Guttikonda, Mousumi Akter, Matthew Freestone, Matthew C. Williams Jr

Abstract: One of the most important yet onerous tasks in the academic peer-reviewing process is composing meta-reviews, which involves understanding the core contributions, strengths, and weaknesses of a scholarly manuscript based on peer-review narratives from multiple experts and then summarizing those multiple experts' perspectives into a concise holistic overview. Given the latest major developments in… ▽ More One of the most important yet onerous tasks in the academic peer-reviewing process is composing meta-reviews, which involves understanding the core contributions, strengths, and weaknesses of a scholarly manuscript based on peer-review narratives from multiple experts and then summarizing those multiple experts' perspectives into a concise holistic overview. Given the latest major developments in generative AI, especially Large Language Models (LLMs), it is very compelling to rigorously study the utility of LLMs in generating such meta-reviews in an academic peer-review setting. In this paper, we perform a case study with three popular LLMs, i.e., GPT-3.5, LLaMA2, and PaLM2, to automatically generate meta-reviews by prompting them with different types/levels of prompts based on the recently proposed TELeR taxonomy. Finally, we perform a detailed qualitative study of the meta-reviews generated by the LLMs and summarize our findings and recommendations for prompting LLMs for this complex task. △ Less

Submitted 23 February, 2024; originally announced February 2024.

ACM Class: I.2.7

arXiv:2402.10344 [pdf, other]

Evaluating NeRFs for 3D Plant Geometry Reconstruction in Field Conditions

Authors: Muhammad Arbab Arshad, Talukder Jubery, James Afful, Anushrut Jignasu, Aditya Balu, Baskar Ganapathysubramanian, Soumik Sarkar, Adarsh Krishnamurthy

Abstract: We evaluate different Neural Radiance Fields (NeRFs) techniques for reconstructing (3D) plants in varied environments, from indoor settings to outdoor fields. Traditional techniques often struggle to capture the complex details of plants, which is crucial for botanical and agricultural understanding. We evaluate three scenarios with increasing complexity and compare the results with the point clou… ▽ More We evaluate different Neural Radiance Fields (NeRFs) techniques for reconstructing (3D) plants in varied environments, from indoor settings to outdoor fields. Traditional techniques often struggle to capture the complex details of plants, which is crucial for botanical and agricultural understanding. We evaluate three scenarios with increasing complexity and compare the results with the point cloud obtained using LiDAR as ground truth data. In the most realistic field scenario, the NeRF models achieve a 74.65% F1 score with 30 minutes of training on the GPU, highlighting the efficiency and accuracy of NeRFs in challenging environments. These findings not only demonstrate the potential of NeRF in detailed and realistic 3D plant modeling but also suggest practical approaches for enhancing the speed and efficiency of the 3D reconstruction process. △ Less

Submitted 15 February, 2024; originally announced February 2024.

arXiv:2402.08055 [pdf, other]

A Quantum Algorithm Based Heuristic to Hide Sensitive Itemsets

Authors: Abhijeet Ghoshal, Yan Li, Syam Menon, Sumit Sarkar

Abstract: Quantum devices use qubits to represent information, which allows them to exploit important properties from quantum physics, specifically superposition and entanglement. As a result, quantum computers have the potential to outperform the most advanced classical computers. In recent years, quantum algorithms have shown hints of this promise, and many algorithms have been proposed for the quantum do… ▽ More Quantum devices use qubits to represent information, which allows them to exploit important properties from quantum physics, specifically superposition and entanglement. As a result, quantum computers have the potential to outperform the most advanced classical computers. In recent years, quantum algorithms have shown hints of this promise, and many algorithms have been proposed for the quantum domain. There are two key hurdles to solving difficult real-world problems on quantum computers. The first is on the hardware front -- the number of qubits in the most advanced quantum systems is too small to make the solution of large problems practical. The second involves the algorithms themselves -- as quantum computers use qubits, the algorithms that work there are fundamentally different from those that work on traditional computers. As a result of these constraints, research has focused on developing approaches to solve small versions of problems as proofs of concept -- recognizing that it would be possible to scale these up once quantum devices with enough qubits become available. Our objective in this paper is along the same lines. We present a quantum approach to solve a well-studied problem in the context of data sharing. This heuristic uses the well-known Quantum Approximate Optimization Algorithm (QAOA). We present results on experiments involving small datasets to illustrate how the problem could be solved using quantum algorithms. The results show that the method has potential and provide answers close to optimal. At the same time, we realize there are opportunities for improving the method further. △ Less

Submitted 12 February, 2024; originally announced February 2024.

Journal ref: Workshop on Information Technologies and Systems WITS 2023

arXiv:2402.07281 [pdf, other]

Can Tree Based Approaches Surpass Deep Learning in Anomaly Detection? A Benchmarking Study

Authors: Santonu Sarkar, Shanay Mehta, Nicole Fernandes, Jyotirmoy Sarkar, Snehanshu Saha

Abstract: Detection of anomalous situations for complex mission-critical systems holds paramount importance when their service continuity needs to be ensured. A major challenge in detecting anomalies from the operational data arises due to the imbalanced class distribution problem since the anomalies are supposed to be rare events. This paper evaluates a diverse array of machine learning-based anomaly detec… ▽ More Detection of anomalous situations for complex mission-critical systems holds paramount importance when their service continuity needs to be ensured. A major challenge in detecting anomalies from the operational data arises due to the imbalanced class distribution problem since the anomalies are supposed to be rare events. This paper evaluates a diverse array of machine learning-based anomaly detection algorithms through a comprehensive benchmark study. The paper contributes significantly by conducting an unbiased comparison of various anomaly detection algorithms, spanning classical machine learning including various tree-based approaches to deep learning and outlier detection methods. The inclusion of 104 publicly available and a few proprietary industrial systems datasets enhances the diversity of the study, allowing for a more realistic evaluation of algorithm performance and emphasizing the importance of adaptability to real-world scenarios. The paper dispels the deep learning myth, demonstrating that though powerful, deep learning is not a universal solution in this case. We observed that recently proposed tree-based evolutionary algorithms outperform in many scenarios. We noticed that tree-based approaches catch a singleton anomaly in a dataset where deep learning methods fail. On the other hand, classical SVM performs the best on datasets with more than 10% anomalies, implying that such scenarios can be best modeled as a classification problem rather than anomaly detection. To our knowledge, such a study on a large number of state-of-the-art algorithms using diverse data sets, with the objective of guiding researchers and practitioners in making informed algorithmic choices, has not been attempted earlier. △ Less

Submitted 25 February, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

arXiv:2402.05521 [pdf, other]

Linearizing Models for Efficient yet Robust Private Inference

Authors: Sreetama Sarkar, Souvik Kundu, Peter A. Beerel

Abstract: The growing concern about data privacy has led to the development of private inference (PI) frameworks in client-server applications which protects both data privacy and model IP. However, the cryptographic primitives required yield significant latency overhead which limits its wide-spread application. At the same time, changing environments demand the PI service to be robust against various natur… ▽ More The growing concern about data privacy has led to the development of private inference (PI) frameworks in client-server applications which protects both data privacy and model IP. However, the cryptographic primitives required yield significant latency overhead which limits its wide-spread application. At the same time, changing environments demand the PI service to be robust against various naturally occurring and gradient-based perturbations. Despite several works focused on the development of latency-efficient models suitable for PI, the impact of these models on robustness has remained unexplored. Towards this goal, this paper presents RLNet, a class of robust linearized networks that can yield latency improvement via reduction of high-latency ReLU operations while improving the model performance on both clean and corrupted images. In particular, RLNet models provide a "triple win ticket" of improved classification accuracy on clean, naturally perturbed, and gradient-based perturbed images using a shared-mask shared-weight architecture with over an order of magnitude fewer ReLUs than baseline models. To demonstrate the efficacy of RLNet, we perform extensive experiments with ResNet and WRN model variants on CIFAR-10, CIFAR-100, and Tiny-ImageNet datasets. Our experimental evaluations show that RLNet can yield models with up to 11.14x fewer ReLUs, with accuracy close to the all-ReLU models, on clean, naturally perturbed, and gradient-based perturbed images. Compared with the SoTA non-robust linearized models at similar ReLU budgets, RLNet achieves an improvement in adversarial accuracy of up to ~47%, naturally perturbed accuracy up to ~16.4%, while improving clean image accuracy up to ~1.5%. △ Less

Submitted 8 February, 2024; originally announced February 2024.

arXiv:2402.02145 [pdf, other]

Analyzing Sentiment Polarity Reduction in News Presentation through Contextual Perturbation and Large Language Models

Authors: Alapan Kuila, Somnath Jena, Sudeshna Sarkar, Partha Pratim Chakrabarti

Abstract: In today's media landscape, where news outlets play a pivotal role in shaping public opinion, it is imperative to address the issue of sentiment manipulation within news text. News writers often inject their own biases and emotional language, which can distort the objectivity of reporting. This paper introduces a novel approach to tackle this problem by reducing the polarity of latent sentiments i… ▽ More In today's media landscape, where news outlets play a pivotal role in shaping public opinion, it is imperative to address the issue of sentiment manipulation within news text. News writers often inject their own biases and emotional language, which can distort the objectivity of reporting. This paper introduces a novel approach to tackle this problem by reducing the polarity of latent sentiments in news content. Drawing inspiration from adversarial attack-based sentence perturbation techniques and a prompt based method using ChatGPT, we employ transformation constraints to modify sentences while preserving their core semantics. Using three perturbation methods: replacement, insertion, and deletion coupled with a context-aware masked language model, we aim to maximize the desired sentiment score for targeted news aspects through a beam search algorithm. Our experiments and human evaluations demonstrate the effectiveness of these two models in achieving reduced sentiment polarity with minimal modifications while maintaining textual similarity, fluency, and grammatical correctness. Comparative analysis confirms the competitive performance of the adversarial attack based perturbation methods and prompt-based methods, offering a promising solution to foster more objective news reporting and combat emotional language bias in the media. △ Less

Submitted 3 February, 2024; originally announced February 2024.

Comments: Accepted in ICON 2023

arXiv:2401.16577 [pdf, other]

LLMs as On-demand Customizable Service

Authors: Souvika Sarkar, Mohammad Fakhruddin Babar, Monowar Hasan, Shubhra Kanti Karmaker

Abstract: Large Language Models (LLMs) have demonstrated remarkable language understanding and generation capabilities. However, training, deploying, and accessing these models pose notable challenges, including resource-intensive demands, extended training durations, and scalability issues. To address these issues, we introduce a concept of hierarchical, distributed LLM architecture that aims at enhancing… ▽ More Large Language Models (LLMs) have demonstrated remarkable language understanding and generation capabilities. However, training, deploying, and accessing these models pose notable challenges, including resource-intensive demands, extended training durations, and scalability issues. To address these issues, we introduce a concept of hierarchical, distributed LLM architecture that aims at enhancing the accessibility and deployability of LLMs across heterogeneous computing platforms, including general-purpose computers (e.g., laptops) and IoT-style devices (e.g., embedded systems). By introducing a "layered" approach, the proposed architecture enables on-demand accessibility to LLMs as a customizable service. This approach also ensures optimal trade-offs between the available computational resources and the user's application needs. We envision that the concept of hierarchical LLM will empower extensive, crowd-sourced user bases to harness the capabilities of LLMs, thereby fostering advancements in AI technology in general. △ Less

Submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.11023 [pdf, other]

Quantum circuit model for discrete-time three-state quantum walks on Cayley graphs

Authors: Rohit Sarma Sarkar, Bibhas Adhikari

Abstract: We develop qutrit circuit models for discrete-time three-state quantum walks on Cayley graphs corresponding to Dihedral groups $D_N$ and the additive groups of integers modulo any positive integer $N$. The proposed circuits comprise of elementary qutrit gates such as qutrit rotation gates, qutrit-$X$ gates and two-qutrit controlled-$X$ gates. First, we propose qutrit circuit representation of spec… ▽ More We develop qutrit circuit models for discrete-time three-state quantum walks on Cayley graphs corresponding to Dihedral groups $D_N$ and the additive groups of integers modulo any positive integer $N$. The proposed circuits comprise of elementary qutrit gates such as qutrit rotation gates, qutrit-$X$ gates and two-qutrit controlled-$X$ gates. First, we propose qutrit circuit representation of special unitary matrices of order three, and the block diagonal special unitary matrices with $3\times 3$ diagonal blocks, which correspond to multi-controlled $X$ gates and permutations of qutrit Toffoli gates. We show that one-layer qutrit circuit model need $O(3nN)$ two-qutrit control gates and $O(3N)$ one-qutrit rotation gates for these quantum walks when $N=3^n$. Finally, we numerically simulate these circuits to mimic its performance such as time-averaged probability of finding the walker at any vertex on noisy quantum computers. The simulated results for the time-averaged probability distributions for noisy and noiseless walks are further compared using KL-divergence and total variation distance. These results show that noise in gates in the circuits significantly impacts the distributions than amplitude damping or phase damping errors. △ Less

Submitted 19 January, 2024; originally announced January 2024.

arXiv:2401.07977 [pdf, ps, other]

Leveraging External Knowledge Resources to Enable Domain-Specific Comprehension

Authors: Saptarshi Sengupta, Connor Heaton, Prasenjit Mitra, Soumalya Sarkar

Abstract: Machine Reading Comprehension (MRC) has been a long-standing problem in NLP and, with the recent introduction of the BERT family of transformer based language models, it has come a long way to getting solved. Unfortunately, however, when BERT variants trained on general text corpora are applied to domain-specific text, their performance inevitably degrades on account of the domain shift i.e. genre… ▽ More Machine Reading Comprehension (MRC) has been a long-standing problem in NLP and, with the recent introduction of the BERT family of transformer based language models, it has come a long way to getting solved. Unfortunately, however, when BERT variants trained on general text corpora are applied to domain-specific text, their performance inevitably degrades on account of the domain shift i.e. genre/subject matter discrepancy between the training and downstream application data. Knowledge graphs act as reservoirs for either open or closed domain information and prior studies have shown that they can be used to improve the performance of general-purpose transformers in domain-specific applications. Building on existing work, we introduce a method using Multi-Layer Perceptrons (MLPs) for aligning and integrating embeddings extracted from knowledge graphs with the embeddings spaces of pre-trained language models (LMs). We fuse the aligned embeddings with open-domain LMs BERT and RoBERTa, and fine-tune them for two MRC tasks namely span detection (COVID-QA) and multiple-choice questions (PubMedQA). On the COVID-QA dataset, we see that our approach allows these models to perform similar to their domain-specific counterparts, Bio/Sci-BERT, as evidenced by the Exact Match (EM) metric. With regards to PubMedQA, we observe an overall improvement in accuracy while the F1 stays relatively the same over the domain-specific models. △ Less

Submitted 15 January, 2024; originally announced January 2024.

arXiv:2401.07098 [pdf, other]

A Novel Multi-Stage Prompting Approach for Language Agnostic MCQ Generation using GPT

Authors: Subhankar Maity, Aniket Deroy, Sudeshna Sarkar

Abstract: We introduce a multi-stage prompting approach (MSP) for the generation of multiple choice questions (MCQs), harnessing the capabilities of GPT models such as text-davinci-003 and GPT-4, renowned for their excellence across various NLP tasks. Our approach incorporates the innovative concept of chain-of-thought prompting, a progressive technique in which the GPT model is provided with a series of in… ▽ More We introduce a multi-stage prompting approach (MSP) for the generation of multiple choice questions (MCQs), harnessing the capabilities of GPT models such as text-davinci-003 and GPT-4, renowned for their excellence across various NLP tasks. Our approach incorporates the innovative concept of chain-of-thought prompting, a progressive technique in which the GPT model is provided with a series of interconnected cues to guide the MCQ generation process. Automated evaluations consistently demonstrate the superiority of our proposed MSP method over the traditional single-stage prompting (SSP) baseline, resulting in the production of high-quality distractors. Furthermore, the one-shot MSP technique enhances automatic evaluation results, contributing to improved distractor generation in multiple languages, including English, German, Bengali, and Hindi. In human evaluations, questions generated using our approach exhibit superior levels of grammaticality, answerability, and difficulty, highlighting its efficacy in various languages. △ Less

Submitted 13 January, 2024; originally announced January 2024.

Comments: Accepted at ECIR 2024(short paper)

arXiv:2401.06981 [pdf, ps, other]

Online Matroid Intersection: Submodular Water-Filling and Matroidal Welfare Maximization

Authors: Daniel Hathcock, Billy Jin, Kalen Patton, Sherry Sarkar, Michael Zlatin

Abstract: We study two problems in online matroid intersection. First, we consider the problem of maximizing the size of a common independent set between a general matroid and a partition matroid whose parts arrive online. This captures the classic online bipartite matching problem when both matroids are partition matroids. Our main result is a $(1 - \frac{1}{e})$-competitive algorithm for the fractional ve… ▽ More We study two problems in online matroid intersection. First, we consider the problem of maximizing the size of a common independent set between a general matroid and a partition matroid whose parts arrive online. This captures the classic online bipartite matching problem when both matroids are partition matroids. Our main result is a $(1 - \frac{1}{e})$-competitive algorithm for the fractional version of this problem. This applies even for the poly-matroid setting, where the rank function of the offline matroid is replaced with a general monotone submodular function. The key new ingredient for this result is the construction of a ''water level'' vector for poly-matroids, which allows us to generalize the classic water-filling algorithm for online bipartite matching. This construction reveals connections to submodular utility allocation markets and principal partition sequences of matroids. Our second result concerns the Online Submodular Welfare Maximization (OSWM) problem, in which items arriving online are allocated among a set of agents with the goal of maximizing their overall utility. If the utility function of each agent is a monotone, submodular function over the set of available items, then a simple greedy algorithm achieves a competitive ratio of $\frac{1}{2}$. Kapralov, Post, and Vondrák showed that in this case, no polynomial time algorithm achieves a competitive ratio of $\frac{1}{2} + \varepsilon$ for any $\varepsilon > 0$ unless NP = RP (SODA, 2013). We extend the RANKING algorithm of Karp, Vazirani, and Vazirani (STOC, 1990) to achieve an optimal $(1-\frac{1}{e})$-competitive algorithm for OSWM in the case that the utility function of each agent is the rank function of a matroid. △ Less

Submitted 13 January, 2024; originally announced January 2024.

arXiv:2312.12338 [pdf, other]

Smart Connected Farms and Networked Farmers to Tackle Climate Challenges Impacting Agricultural Production

Authors: Behzad J. Balabaygloo, Barituka Bekee, Samuel W. Blair, Suzanne Fey, Fateme Fotouhi, Ashish Gupta, Kevin Menke, Anusha Vangala, Jorge C. M. Palomares, Aaron Prestholt, Vishesh K. Tanwar, Xu Tao, Matthew E. Carroll, Sajal Das, Gil Depaula, Peter Kyveryga, Soumik Sarkar, Michelle Segovia, Simone Sylvestri, Corinne Valdivia, Asheesh K. Singh

Abstract: To meet the grand challenges of agricultural production including climate change impacts on crop production, a tight integration of social science, technology and agriculture experts including farmers are needed. There are rapid advances in information and communication technology, precision agriculture and data analytics, which are creating a fertile field for the creation of smart connected farm… ▽ More To meet the grand challenges of agricultural production including climate change impacts on crop production, a tight integration of social science, technology and agriculture experts including farmers are needed. There are rapid advances in information and communication technology, precision agriculture and data analytics, which are creating a fertile field for the creation of smart connected farms (SCF) and networked farmers. A network and coordinated farmer network provides unique advantages to farmers to enhance farm production and profitability, while tackling adverse climate events. The aim of this article is to provide a comprehensive overview of the state of the art in SCF including the advances in engineering, computer sciences, data sciences, social sciences and economics including data privacy, sharing and technology adoption. △ Less

Submitted 19 December, 2023; originally announced December 2023.

arXiv:2312.03088 [pdf, other]

LLMs for Multi-Modal Knowledge Extraction and Analysis in Intelligence/Safety-Critical Applications

Authors: Brett Israelsen, Soumalya Sarkar

Abstract: Large Language Models have seen rapid progress in capability in recent years; this progress has been accelerating and their capabilities, measured by various benchmarks, are beginning to approach those of humans. There is a strong demand to use such models in a wide variety of applications but, due to unresolved vulnerabilities and limitations, great care needs to be used before applying them to i… ▽ More Large Language Models have seen rapid progress in capability in recent years; this progress has been accelerating and their capabilities, measured by various benchmarks, are beginning to approach those of humans. There is a strong demand to use such models in a wide variety of applications but, due to unresolved vulnerabilities and limitations, great care needs to be used before applying them to intelligence and safety-critical applications. This paper reviews recent literature related to LLM assessment and vulnerabilities to synthesize the current research landscape and to help understand what advances are most critical to enable use of of these technologies in intelligence and safety-critical applications. The vulnerabilities are broken down into ten high-level categories and overlaid onto a high-level life cycle of an LLM. Some general categories of mitigations are reviewed. △ Less

Submitted 5 December, 2023; originally announced December 2023.

Comments: initial draft

arXiv:2312.02722 [pdf, other]

Improved Algorithms for Minimum-Membership Geometric Set Cover

Authors: Sathish Govindarajan, Siddhartha Sarkar

Abstract: Bandyapadhyay et al. introduced the generalized minimum-membership geometric set cover (GMMGSC) problem [SoCG, 2023], which is defined as follows. We are given two sets $P$ and $P'$ of points in $\mathbb{R}^{2}$, $n=\max(|P|, |P'|)$, and a set $\mathcal{S}$ of $m$ axis-parallel unit squares. The goal is to find a subset $\mathcal{S}^{*}\subseteq \mathcal{S}$ that covers all the points in $P$ while… ▽ More Bandyapadhyay et al. introduced the generalized minimum-membership geometric set cover (GMMGSC) problem [SoCG, 2023], which is defined as follows. We are given two sets $P$ and $P'$ of points in $\mathbb{R}^{2}$, $n=\max(|P|, |P'|)$, and a set $\mathcal{S}$ of $m$ axis-parallel unit squares. The goal is to find a subset $\mathcal{S}^{*}\subseteq \mathcal{S}$ that covers all the points in $P$ while minimizing $\mathsf{memb}(P', \mathcal{S}^{*})$, where $\mathsf{memb}(P', \mathcal{S}^{*})=\max_{p\in P'}|\{s\in \mathcal{S}^{*}: p\in s\}|$. We study GMMGSC problem and give a $16$-approximation algorithm that runs in $O(m^2\log m + m^2n)$ time. Our result is a significant improvement to the $144$-approximation given by Bandyapadhyay et al. that runs in $\tilde{O}(nm)$ time. GMMGSC problem is a generalization of another well-studied problem called Minimum Ply Geometric Set Cover (MPGSC), in which the goal is to minimize the ply of $\mathcal{S}^{*}$, where the ply is the maximum cardinality of a subset of the unit squares that have a non-empty intersection. The best-known result for the MPGSC problem is an $8$-approximation algorithm by Durocher et al. that runs in $O(n + m^{8}k^{4}\log k + m^{8}\log m\log k)$ time, where $k$ is the optimal ply value [WALCOM, 2023]. △ Less

Submitted 5 December, 2023; originally announced December 2023.

Comments: To appear in CALDAM 2024

arXiv:2312.01032 [pdf, other]

Harnessing the Power of Prompt-based Techniques for Generating School-Level Questions using Large Language Models

Authors: Subhankar Maity, Aniket Deroy, Sudeshna Sarkar

Abstract: Designing high-quality educational questions is a challenging and time-consuming task. In this work, we propose a novel approach that utilizes prompt-based techniques to generate descriptive and reasoning-based questions. However, current question-answering (QA) datasets are inadequate for conducting our experiments on prompt-based question generation (QG) in an educational setting. Therefore, we… ▽ More Designing high-quality educational questions is a challenging and time-consuming task. In this work, we propose a novel approach that utilizes prompt-based techniques to generate descriptive and reasoning-based questions. However, current question-answering (QA) datasets are inadequate for conducting our experiments on prompt-based question generation (QG) in an educational setting. Therefore, we curate a new QG dataset called EduProbe for school-level subjects, by leveraging the rich content of NCERT textbooks. We carefully annotate this dataset as quadruples of 1) Context: a segment upon which the question is formed; 2) Long Prompt: a long textual cue for the question (i.e., a longer sequence of words or phrases, covering the main theme of the context); 3) Short Prompt: a short textual cue for the question (i.e., a condensed representation of the key information or focus of the context); 4) Question: a deep question that aligns with the context and is coherent with the prompts. We investigate several prompt-based QG methods by fine-tuning pre-trained transformer-based large language models (LLMs), namely PEGASUS, T5, MBART, and BART. Moreover, we explore the performance of two general-purpose pre-trained LLMs such as Text-Davinci-003 and GPT-3.5-Turbo without any further training. By performing automatic evaluation, we show that T5 (with long prompt) outperforms all other models, but still falls short of the human baseline. Under human evaluation criteria, TextDavinci-003 usually shows better results than other models under various prompt settings. Even in the case of human evaluation criteria, QG models mostly fall short of the human baseline. Our code and dataset are available at: https://github.com/my625/PromptQG △ Less

Submitted 2 December, 2023; originally announced December 2023.

arXiv:2311.13152 [pdf, other]

Test-Time Augmentation for 3D Point Cloud Classification and Segmentation

Authors: Tuan-Anh Vu, Srinjay Sarkar, Zhiyuan Zhang, Binh-Son Hua, Sai-Kit Yeung

Abstract: Data augmentation is a powerful technique to enhance the performance of a deep learning task but has received less attention in 3D deep learning. It is well known that when 3D shapes are sparsely represented with low point density, the performance of the downstream tasks drops significantly. This work explores test-time augmentation (TTA) for 3D point clouds. We are inspired by the recent revoluti… ▽ More Data augmentation is a powerful technique to enhance the performance of a deep learning task but has received less attention in 3D deep learning. It is well known that when 3D shapes are sparsely represented with low point density, the performance of the downstream tasks drops significantly. This work explores test-time augmentation (TTA) for 3D point clouds. We are inspired by the recent revolution of learning implicit representation and point cloud upsampling, which can produce high-quality 3D surface reconstruction and proximity-to-surface, respectively. Our idea is to leverage the implicit field reconstruction or point cloud upsampling techniques as a systematic way to augment point cloud data. Mainly, we test both strategies by sampling points from the reconstructed results and using the sampled point cloud as test-time augmented data. We show that both strategies are effective in improving accuracy. We observed that point cloud upsampling for test-time augmentation can lead to more significant performance improvement on downstream tasks such as object classification and segmentation on the ModelNet40, ShapeNet, ScanObjectNN, and SemanticKITTI datasets, especially for sparse point clouds. △ Less

Submitted 21 November, 2023; originally announced November 2023.

Comments: This paper is accepted in 3DV 2024

arXiv:2311.12404 [pdf, other]

InterPrompt: Interpretable Prompting for Interrelated Interpersonal Risk Factors in Reddit Posts

Authors: MSVPJ Sathvik, Surjodeep Sarkar, Chandni Saxena, Sunghwan Sohn, Muskan Garg

Abstract: Mental health professionals and clinicians have observed the upsurge of mental disorders due to Interpersonal Risk Factors (IRFs). To simulate the human-in-the-loop triaging scenario for early detection of mental health disorders, we recognized textual indications to ascertain these IRFs : Thwarted Belongingness (TBe) and Perceived Burdensomeness (PBu) within personal narratives. In light of this,… ▽ More Mental health professionals and clinicians have observed the upsurge of mental disorders due to Interpersonal Risk Factors (IRFs). To simulate the human-in-the-loop triaging scenario for early detection of mental health disorders, we recognized textual indications to ascertain these IRFs : Thwarted Belongingness (TBe) and Perceived Burdensomeness (PBu) within personal narratives. In light of this, we use N-shot learning with GPT-3 model on the IRF dataset, and underscored the importance of fine-tuning GPT-3 model to incorporate the context-specific sensitivity and the interconnectedness of textual cues that represent both IRFs. In this paper, we introduce an Interpretable Prompting (InterPrompt)} method to boost the attention mechanism by fine-tuning the GPT-3 model. This allows a more sophisticated level of language modification by adjusting the pre-trained weights. Our model learns to detect usual patterns and underlying connections across both the IRFs, which leads to better system-level explainability and trustworthiness. The results of our research demonstrate that all four variants of GPT-3 model, when fine-tuned with InterPrompt, perform considerably better as compared to the baseline methods, both in terms of classification and explanation generation. △ Less

Submitted 21 November, 2023; originally announced November 2023.

Comments: 5 pages

arXiv:2311.10718 [pdf, ps, other]

Harnessing Deep Q-Learning for Enhanced Statistical Arbitrage in High-Frequency Trading: A Comprehensive Exploration

Authors: Soumyadip Sarkar

Abstract: The realm of High-Frequency Trading (HFT) is characterized by rapid decision-making processes that capitalize on fleeting market inefficiencies. As the financial markets become increasingly competitive, there is a pressing need for innovative strategies that can adapt and evolve with changing market dynamics. Enter Reinforcement Learning (RL), a branch of machine learning where agents learn by int… ▽ More The realm of High-Frequency Trading (HFT) is characterized by rapid decision-making processes that capitalize on fleeting market inefficiencies. As the financial markets become increasingly competitive, there is a pressing need for innovative strategies that can adapt and evolve with changing market dynamics. Enter Reinforcement Learning (RL), a branch of machine learning where agents learn by interacting with their environment, making it an intriguing candidate for HFT applications. This paper dives deep into the integration of RL in statistical arbitrage strategies tailored for HFT scenarios. By leveraging the adaptive learning capabilities of RL, we explore its potential to unearth patterns and devise trading strategies that traditional methods might overlook. We delve into the intricate exploration-exploitation trade-offs inherent in RL and how they manifest in the volatile world of HFT. Furthermore, we confront the challenges of applying RL in non-stationary environments, typical of financial markets, and investigate methodologies to mitigate associated risks. Through extensive simulations and backtests, our research reveals that RL not only enhances the adaptability of trading strategies but also shows promise in improving profitability metrics and risk-adjusted returns. This paper, therefore, positions RL as a pivotal tool for the next generation of HFT-based statistical arbitrage, offering insights for both researchers and practitioners in the field. △ Less

Submitted 13 September, 2023; originally announced November 2023.

arXiv:2311.10548 [pdf, other]

Efficient Profit Maximization in Reliability Concerned Static Vehicular Cloud System

Authors: Suvarthi Sarkar, Akshat Arun, Harshit Surekha, Aryabartta Sahu

Abstract: Modern electric VUs are equipped with a variety of increasingly potent computing, communication, and storage resources, and with this tremendous computation power in their arsenal can be used to enhance the computing power of regular cloud systems, which is termed as vehicular cloud. Unlike in the traditional cloud computing resources, these vehicular cloud resource moves around and participates i… ▽ More Modern electric VUs are equipped with a variety of increasingly potent computing, communication, and storage resources, and with this tremendous computation power in their arsenal can be used to enhance the computing power of regular cloud systems, which is termed as vehicular cloud. Unlike in the traditional cloud computing resources, these vehicular cloud resource moves around and participates in the vehicular cloud for a sporadic duration at parking places, shopping malls, etc. This introduces the dynamic nature of vehicular resource participation in the vehicular cloud. As the user-submitted task gets allocated on these vehicular units for execution and the dynamic stay nature of vehicular units, enforce the system to ensure the reliability of task execution by allocating multiple redundant vehicular units for the task. In this work, we are maximizing the profit of vehicular cloud by ensuring the reliability of task execution where user tasks come online manner with different revenue, execution, and deadline. We propose an efficient approach to solve this problem by considering (a) task classification based on the deadline and laxity of the task, (b) ordering of tasks for task admission based on the expected profit of the task, (c) classification of vehicular units based in expected residency time and reliability concerned redundant allocation of tasks of vehicular units considering this classification and (d) handing dynamic scenario of the vehicular unit leaving the cloud system by copying the maximum percentage of executed virtual machine of the task to the substitute unit. We compared our proposed profit maximization approach with the state of art approach and showed that our approach outperforms the state of art approach with an extra 10\% to 20\% profit margin. △ Less

Submitted 17 November, 2023; originally announced November 2023.

arXiv:2311.09024 [pdf, other]

Fast Certification of Vision-Language Models Using Incremental Randomized Smoothing

Authors: A K Nirala, A Joshi, C Hegde, S Sarkar

Abstract: A key benefit of deep vision-language models such as CLIP is that they enable zero-shot open vocabulary classification; the user has the ability to define novel class labels via natural language prompts at inference time. However, while CLIP-based zero-shot classifiers have demonstrated competitive performance across a range of domain shifts, they remain highly vulnerable to adversarial attacks. T… ▽ More A key benefit of deep vision-language models such as CLIP is that they enable zero-shot open vocabulary classification; the user has the ability to define novel class labels via natural language prompts at inference time. However, while CLIP-based zero-shot classifiers have demonstrated competitive performance across a range of domain shifts, they remain highly vulnerable to adversarial attacks. Therefore, ensuring the robustness of such models is crucial for their reliable deployment in the wild. In this work, we introduce Open Vocabulary Certification (OVC), a fast certification method designed for open-vocabulary models like CLIP via randomized smoothing techniques. Given a base "training" set of prompts and their corresponding certified CLIP classifiers, OVC relies on the observation that a classifier with a novel prompt can be viewed as a perturbed version of nearby classifiers in the base training set. Therefore, OVC can rapidly certify the novel classifier using a variation of incremental randomized smoothing. By using a caching trick, we achieve approximately two orders of magnitude acceleration in the certification process for novel prompts. To achieve further (heuristic) speedups, OVC approximates the embedding space at a given input using a multivariate normal distribution bypassing the need for sampling via forward passes through the vision backbone. We demonstrate the effectiveness of OVC on through experimental evaluation using multiple vision-language backbones on the CIFAR-10 and ImageNet test datasets. △ Less

Submitted 4 January, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

arXiv:2311.08438 [pdf, other]

doi 10.1007/978-3-031-25075-0_47

LocaliseBot: Multi-view 3D object localisation with differentiable rendering for robot grasping

Authors: Sujal Vijayaraghavan, Redwan Alqasemi, Rajiv Dubey, Sudeep Sarkar

Abstract: Robot grasp typically follows five stages: object detection, object localisation, object pose estimation, grasp pose estimation, and grasp planning. We focus on object pose estimation. Our approach relies on three pieces of information: multiple views of the object, the camera's extrinsic parameters at those viewpoints, and 3D CAD models of objects. The first step involves a standard deep learning… ▽ More Robot grasp typically follows five stages: object detection, object localisation, object pose estimation, grasp pose estimation, and grasp planning. We focus on object pose estimation. Our approach relies on three pieces of information: multiple views of the object, the camera's extrinsic parameters at those viewpoints, and 3D CAD models of objects. The first step involves a standard deep learning backbone (FCN ResNet) to estimate the object label, semantic segmentation, and a coarse estimate of the object pose with respect to the camera. Our novelty is using a refinement module that starts from the coarse pose estimate and refines it by optimisation through differentiable rendering. This is a purely vision-based approach that avoids the need for other information such as point cloud or depth images. We evaluate our object pose estimation approach on the ShapeNet dataset and show improvements over the state of the art. We also show that the estimated object pose results in 99.65% grasp accuracy with the ground truth grasp candidates on the Object Clutter Indoor Dataset (OCID) Grasp dataset, as computed using standard practice. △ Less

Submitted 14 November, 2023; originally announced November 2023.

Showing 1–50 of 343 results for author: Sarkar, S