Showing 1–50 of 333 results for author: Himanshu

  1. arXiv:2406.17968  [pdf, other]

    cs.IR cs.AI cs.LG stat.ML

    Efficient Document Ranking with Learnable Late Interactions

    Authors: Ziwei Ji, Himanshu Jain, Andreas Veit, Sashank J. Reddi, Sadeep Jayasumana, Ankit Singh Rawat, Aditya Krishna Menon, Felix Yu, Sanjiv Kumar

    Abstract: Cross-Encoder (CE) and Dual-Encoder (DE) models are two fundamental approaches for query-document relevance in information retrieval. To predict relevance, CE models use joint query-document embeddings, while DE models maintain factorized query and document embeddings; usually, the former has higher quality while the latter benefits from lower latency. Recently, late-interaction models have been p…

    Submitted 25 June, 2024; originally announced June 2024.

  2. arXiv:2406.15444  [pdf, other]

    cs.CL

    Investigating the Robustness of LLMs on Math Word Problems

    Authors: Ujjwala Anantheswaran, Himanshu Gupta, Kevin Scaria, Shreyas Verma, Chitta Baral, Swaroop Mishra

    Abstract: Large Language Models (LLMs) excel at various tasks, including solving math word problems (MWPs), but struggle with real-world problems containing irrelevant information. To address this, we propose a prompting framework that generates adversarial variants of MWPs by adding irrelevant variables. We introduce a dataset, ProbleMATHIC, containing both adversarial and non-adversarial MWPs. Our experim…

    Submitted 30 May, 2024; originally announced June 2024.

  3. arXiv:2406.14236  [pdf, other]

    quant-ph cs.DC

    NAC-QFL: Noise Aware Clustered Quantum Federated Learning

    Authors: Himanshu Sahu, Hari Prabhat Gupta

    Abstract: Recent advancements in quantum computing, alongside successful deployments of quantum communication, hold promises for revolutionizing mobile networks. While Quantum Machine Learning (QML) presents opportunities, it contends with challenges like noise in quantum devices and scalability. Furthermore, the high cost of quantum communication constrains the practical application of QML in real-world sc…

    Submitted 20 June, 2024; originally announced June 2024.

  4. arXiv:2406.10362  [pdf]

    cs.DC cs.PF

    A Comparison of the Performance of the Molecular Dynamics Simulation Package GROMACS Implemented in the SYCL and CUDA Programming Models

    Authors: L. Apanasevich, Yogesh Kale, Himanshu Sharma, Ana Marija Sokovic

    Abstract: For many years, systems running Nvidia-based GPU architectures have dominated the heterogeneous supercomputer landscape. However, recently GPU chipsets manufactured by Intel and AMD have cut into this market and can now be found in some of the world's fastest supercomputers. The June 2023 edition of the TOP500 list of supercomputers ranks the Frontier supercomputer at the Oak Ridge National Laborat…

    Submitted 14 June, 2024; originally announced June 2024.

  5. arXiv:2406.07486  [pdf, other]

    quant-ph cs.AR

    Novel Optimized Designs of Modulo $2n+1$ Adder for Quantum Computing

    Authors: Bhaskar Gaur, Himanshu Thapliyal

    Abstract: Quantum modular adders are one of the most fundamental yet versatile quantum computation operations. They help implement functions of higher complexity, such as subtraction and multiplication, which are used in applications such as quantum cryptanalysis, quantum image processing, and securing communication. To the best of our knowledge, there is no existing design of quantum modulo $(2n+1)$ adder.…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 5 Figures, 1 Table

  6. arXiv:2406.06601  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    A Human-in-the-Loop Approach to Improving Cross-Text Prosody Transfer

    Authors: Himanshu Maurya, Atli Sigurgeirsson

    Abstract: Text-To-Speech (TTS) prosody transfer models can generate varied prosodic renditions, for the same text, by conditioning on a reference utterance. These models are trained with a reference that is identical to the target utterance. But when the reference utterance differs from the target text, as in cross-text prosody transfer, these models struggle to separate prosody from text, resulting in redu…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 4 pages (+1 references), 4 figures, to be presented at Interspeech 2024

  7. arXiv:2406.06556  [pdf, other]

    cs.CL cs.AI

    Enhancing Presentation Slide Generation by LLMs with a Multi-Staged End-to-End Approach

    Authors: Sambaran Bandyopadhyay, Himanshu Maheshwari, Anandhavelu Natarajan, Apoorv Saxena

    Abstract: Generating presentation slides from a long document with multimodal elements such as text and images is an important task. This is time consuming and needs domain expertise if done manually. Existing approaches for generating a rich presentation from a document are often semi-automatic or only put a flat summary into the slides ignoring the importance of a good narrative. In this paper, we address…

    Submitted 1 June, 2024; originally announced June 2024.

  8. arXiv:2406.05294  [pdf, other]

    quant-ph cs.AR cs.DC

    Residue Number System (RNS) based Distributed Quantum Addition

    Authors: Bhaskar Gaur, Travis S. Humble, Himanshu Thapliyal

    Abstract: Quantum Arithmetic faces limitations such as noise and resource constraints in the current Noisy Intermediate Scale Quantum (NISQ) era quantum computers. We propose using Distributed Quantum Computing (DQC) to overcome these limitations by substituting a higher depth quantum addition circuit with Residue Number System (RNS) based quantum modulo adders. The RNS-based distributed quantum addition ci…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 6 pages, 5 figures, 2 tables

  9. arXiv:2405.18948  [pdf, other]

    cs.RO cs.LG

    Learning to Recover from Plan Execution Errors during Robot Manipulation: A Neuro-symbolic Approach

    Authors: Namasivayam Kalithasan, Arnav Tuli, Vishal Bindal, Himanshu Gaurav Singh, Parag Singla, Rohan Paul

    Abstract: Automatically detecting and recovering from failures is an important but challenging problem for autonomous robots. Most of the recent work on learning to plan from demonstrations lacks the ability to detect and recover from errors in the absence of an explicit state representation and/or a (sub-) goal check function. We propose an approach (blending learning with symbolic search) for automated er…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: This work has been submitted to the IEEE for possible publication

  10. arXiv:2405.13095  [pdf, other]

    cs.CL cs.AI

    Presentations are not always linear! GNN meets LLM for Document-to-Presentation Transformation with Attribution

    Authors: Himanshu Maheshwari, Sambaran Bandyopadhyay, Aparna Garimella, Anandhavelu Natarajan

    Abstract: Automatically generating a presentation from the text of a long document is a challenging and useful problem. In contrast to a flat summary, a presentation needs to have a better and non-linear narrative, i.e., the content of a slide can come from different and non-contiguous parts of the given document. However, it is difficult to incorporate such non-linear mapping of content to slides and ensur…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: This paper is under review in a conference

  11. arXiv:2405.08120  [pdf, other]

    cs.ET cs.AI

    From Questions to Insightful Answers: Building an Informed Chatbot for University Resources

    Authors: Subash Neupane, Elias Hossain, Jason Keith, Himanshu Tripathi, Farbod Ghiasi, Noorbakhsh Amiri Golilarz, Amin Amirlatifi, Sudip Mittal, Shahram Rahimi

    Abstract: This paper presents BARKPLUG V.2, a Large Language Model (LLM)-based chatbot system built using Retrieval Augmented Generation (RAG) pipelines to enhance the user experience and access to information within academic settings. The objective of BARKPLUG V.2 is to provide information to users about various campus resources, including academic departments, programs, campus facilities, and student resou…

    Submitted 13 May, 2024; originally announced May 2024.

  12. arXiv:2405.07499  [pdf, other]

    quant-ph cs.ET

    Distributed Quantum Computation with Minimum Circuit Execution Time over Quantum Networks

    Authors: Ranjani G Sundaram, Himanshu Gupta, C. R. Ramakrishnan

    Abstract: Present quantum computers are constrained by limited qubit capacity and restricted physical connectivity, leading to challenges in large-scale quantum computations. Distributing quantum computations across a network of quantum computers is a promising way to circumvent these challenges and facilitate large quantum computations. However, distributed quantum computations require entanglements (to ex…

    Submitted 13 May, 2024; originally announced May 2024.

  13. arXiv:2405.05777  [pdf, other]

    cs.CL cs.AI

    Towards a More Inclusive AI: Progress and Perspectives in Large Language Model Training for the Sámi Language

    Authors: Ronny Paul, Himanshu Buckchash, Shantipriya Parida, Dilip K. Prasad

    Abstract: Sámi, an indigenous language group comprising multiple languages, faces digital marginalization due to the limited availability of data and sophisticated language models designed for its linguistic intricacies. This work focuses on increasing technological participation for the Sámi language. We draw the attention of the ML community towards the language modeling problem of Ultra Low Resource (ULR…

    Submitted 9 May, 2024; originally announced May 2024.

  14. arXiv:2405.02774  [pdf, other]

    cs.LG cs.AI cs.CL

    Get more for less: Principled Data Selection for Warming Up Fine-Tuning in LLMs

    Authors: Feiyang Kang, Hoang Anh Just, Yifan Sun, Himanshu Jahagirdar, Yuanzhi Zhang, Rongxing Du, Anit Kumar Sahu, Ruoxi Jia

    Abstract: This work focuses on leveraging and selecting from vast, unlabeled, open data to pre-fine-tune a pre-trained language model. The goal is to minimize the need for costly domain-specific data for subsequent fine-tuning while achieving desired performance levels. While many data selection algorithms have been designed for small-scale applications, rendering them unsuitable for our context, some emerg…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: Published as a conference paper at ICLR 2024

  15. arXiv:2405.00222  [pdf, other]

    quant-ph cs.NI

    Optimized Distribution of Entanglement Graph States in Quantum Networks

    Authors: Xiaojie Fan, Caitao Zhan, Himanshu Gupta, C. R. Ramakrishnan

    Abstract: Building large-scale quantum computers, essential to demonstrating quantum advantage, is a key challenge. Quantum Networks (QNs) can help address this challenge by enabling the construction of large, robust, and more capable quantum computing platforms by connecting smaller quantum computers. Moreover, unlike classical systems, QNs can enable fully secured long-distance communication. Thus, quantu…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: 11 pages, 13 figures

  16. arXiv:2404.17977  [pdf, other]

    cs.AI cs.MA

    Advancing Healthcare Automation: Multi-Agent System for Medical Necessity Justification

    Authors: Himanshu Pandey, Akhil Amod, Shivang

    Abstract: Prior Authorization delivers safe, appropriate, and cost-effective care that is medically justified with evidence-based guidelines. However, the process often requires labor-intensive manual comparisons between patient medical records and clinical guidelines, which is both repetitive and time-consuming. Recent developments in Large Language Models (LLMs) have shown potential in addressing complex m…

    Submitted 6 July, 2024; v1 submitted 27 April, 2024; originally announced April 2024.

    Comments: Accepted at BioNLP2024

  17. arXiv:2404.13933  [pdf]

    cs.HC

    Comparison of On-Orbit Manual Attitude Control Methods for Non-Docking Spacecraft Through Virtual Reality Simulation

    Authors: Ajit Krishnan, Himanshu Vishwakarma, Maharudra Kharsade, Pradipta Biswas

    Abstract: On-orbit manual attitude control of manned spacecraft is accomplished using external visual references and some method of three axis attitude control. All past, present, and developmental spacecraft feature the capability to manually control attitude for deorbit. National Aeronautics and Space Administration (NASA) spacecraft permit an aircraft windshield type front view, wherein an arc of the Ear…

    Submitted 22 April, 2024; originally announced April 2024.

    ACM Class: H.5.2

  18. arXiv:2404.12402  [pdf, other]

    cs.LG cs.AI cs.NE

    Sup3r: A Semi-Supervised Algorithm for increasing Sparsity, Stability, and Separability in Hierarchy Of Time-Surfaces architectures

    Authors: Marco Rasetto, Himanshu Akolkar, Ryad Benosman

    Abstract: The Hierarchy Of Time-Surfaces (HOTS) algorithm, a neuromorphic approach for feature extraction from event data, presents promising capabilities but faces challenges in accuracy and compatibility with neuromorphic hardware. In this paper, we introduce Sup3r, a Semi-Supervised algorithm aimed at addressing these challenges. Sup3r enhances sparsity, stability, and separability in the HOTS networks.…

    Submitted 30 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

  19. arXiv:2404.12063  [pdf, other]

    cs.LG cs.CE cs.NE math.NA

    FastVPINNs: Tensor-Driven Acceleration of VPINNs for Complex Geometries

    Authors: Thivin Anandh, Divij Ghose, Himanshu Jain, Sashikumaar Ganesan

    Abstract: Variational Physics-Informed Neural Networks (VPINNs) utilize a variational loss function to solve partial differential equations, mirroring Finite Element Analysis techniques. Traditional hp-VPINNs, while effective for high-frequency problems, are computationally intensive and scale poorly with increasing element counts, limiting their use in complex geometries. This work introduces FastVPINNs, a…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 31 pages, 19 figures, 4 algorithms

  20. arXiv:2404.07774  [pdf, other]

    cs.LG cs.RO

    Sketch-Plan-Generalize: Continual Few-Shot Learning of Inductively Generalizable Spatial Concepts

    Authors: Namasivayam Kalithasan, Sachit Sachdeva, Himanshu Gaurav Singh, Vishal Bindal, Arnav Tuli, Gurarmaan Singh Panjeta, Divyanshu Aggarwal, Rohan Paul, Parag Singla

    Abstract: Our goal is to enable embodied agents to learn inductively generalizable spatial concepts, e.g., learning staircase as an inductive composition of towers of increasing height. Given a human demonstration, we seek a learning architecture that infers a succinct ${program}$ representation that explains the observed instance. Additionally, the approach should generalize inductively to novel structures…

    Submitted 29 May, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

  21. arXiv:2403.13230  [pdf, other]

    cs.NI

    BFT-PoLoc: A Byzantine Fortified Trigonometric Proof of Location Protocol using Internet Delays

    Authors: Peiyao Sheng, Vishal Sevani, Ranvir Rana, Himanshu Tyagi, Pramod Viswanath

    Abstract: Internet platforms depend on accurately determining the geographical locations of online users to deliver targeted services (e.g., advertising). The advent of decentralized platforms (blockchains) emphasizes the importance of geographically distributed nodes, making the validation of locations more crucial. In these decentralized settings, mutually non-trusting participants need to prove the…

    Submitted 28 March, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

  22. arXiv:2403.12938  [pdf, other]

    cs.LG

    Neural Differential Algebraic Equations

    Authors: James Koch, Madelyn Shapiro, Himanshu Sharma, Draguna Vrabie, Jan Drgona

    Abstract: Differential-Algebraic Equations (DAEs) describe the temporal evolution of systems that obey both differential and algebraic constraints. Of particular interest are systems that contain implicit relationships between their components, such as conservation relationships. Here, we present Neural Differential-Algebraic Equations (NDAEs) suitable for data-driven modeling of DAEs. This methodology is b…

    Submitted 19 March, 2024; originally announced March 2024.

  23. arXiv:2403.11769  [pdf]

    cs.SE

    EmpowerAbility: A portal for employment & scholarships for differently-abled

    Authors: Himanshu Raj, Shubham Kumar, Dr. J Kalaivani

    Abstract: The internet has become a vital resource for job seekers in today's technologically advanced world, particularly for those with impairments. They mainly rely on internet resources to find jobs that fit their particular requirements and skill set. Though some disabled candidates receive prompt responses and job offers, others find it difficult to traverse the intricate world of job portals, the eff…

    Submitted 18 March, 2024; originally announced March 2024.

  24. arXiv:2403.09724  [pdf, other]

    cs.CL cs.CY cs.LG

    ClaimVer: Explainable Claim-Level Verification and Evidence Attribution of Text Through Knowledge Graphs

    Authors: Preetam Prabhu Srikar Dammu, Himanshu Naidu, Mouly Dewan, YoungMin Kim, Tanya Roosta, Aman Chadha, Chirag Shah

    Abstract: In the midst of widespread misinformation and disinformation through social media and the proliferation of AI-generated texts, it has become increasingly difficult for people to validate and trust information they encounter. Many fact-checking approaches and tools have been developed, but they often lack appropriate explainability or granularity to be useful in various contexts. A text validation…

    Submitted 23 June, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

  25. arXiv:2403.08086  [pdf, other]

    cs.CV

    Flow-Based Visual Stream Compression for Event Cameras

    Authors: Daniel C. Stumpp, Himanshu Akolkar, Alan D. George, Ryad Benosman

    Abstract: As the use of neuromorphic, event-based vision sensors expands, the need for compression of their output streams has increased. While their operational principle ensures event streams are spatially sparse, the high temporal resolution of the sensors can result in high data rates from the sensor depending on scene dynamics. For systems operating in communication-bandwidth-constrained and power-cons…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 13 pages, 7 figures, 2 tables

  26. arXiv:2403.06888  [pdf]

    physics.data-an cs.LG physics.app-ph

    Process signature-driven high spatio-temporal resolution alignment of multimodal data

    Authors: Abhishek Hanchate, Himanshu Balhara, Vishal S. Chindepalli, Satish T. S. Bukkapatnam

    Abstract: We present HiRA-Pro, a novel procedure to align, at high spatio-temporal resolutions, multimodal signals from real-world processes and systems that exhibit diverse transient, nonlinear stochastic dynamics, such as manufacturing machines. It is based on discerning and synchronizing the process signatures of salient kinematic and dynamic events in these disparate signals. HiRA-Pro addresses the chal…

    Submitted 12 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

  27. arXiv:2403.02909  [pdf, other]

    cs.CV cs.HC eess.IV

    Gaze-Vector Estimation in the Dark with Temporally Encoded Event-driven Neural Networks

    Authors: Abeer Banerjee, Naval K. Mehta, Shyam S. Prasad, Himanshu, Sumeet Saurav, Sanjay Singh

    Abstract: In this paper, we address the intricate challenge of gaze vector prediction, a pivotal task with applications ranging from human-computer interaction to driver monitoring systems. Our innovative approach is designed for the demanding setting of extremely low-light conditions, leveraging a novel temporal event encoding scheme, and a dedicated neural network architecture. The temporal encoding metho…

    Submitted 5 March, 2024; originally announced March 2024.

  28. GNSS Positioning using Cost Function Regulated Multilateration and Graph Neural Networks

    Authors: Amir Jalalirad, Davide Belli, Bence Major, Songwon Jee, Himanshu Shah, Will Morrison

    Abstract: In urban environments, where line-of-sight signals from GNSS satellites are frequently blocked by high-rise objects, GNSS receivers are subject to large errors in measuring satellite ranges. Heuristic methods are commonly used to estimate these errors and reduce the impact of noisy measurements on localization accuracy. In our work, we replace these error estimation heuristics with a deep learning…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: Published in The Proceedings of the Institute of Navigation GNSS+ 2023

  29. arXiv:2402.18032  [pdf, other]

    cs.CV

    Human Shape and Clothing Estimation

    Authors: Aayush Gupta, Aditya Gulati, Himanshu, Lakshya LNU

    Abstract: Human shape and clothing estimation has gained significant prominence in various domains, including online shopping, fashion retail, augmented reality (AR), virtual reality (VR), and gaming. The visual representation of human shape and clothing has become a focal point for computer vision researchers in recent years. This paper presents a comprehensive survey of the major works in the field, focus…

    Submitted 27 February, 2024; originally announced February 2024.

  30. arXiv:2402.15115  [pdf, other]

    stat.ML cs.LG physics.data-an

    Physics-constrained polynomial chaos expansion for scientific machine learning and uncertainty quantification

    Authors: Himanshu Sharma, Lukáš Novák, Michael D. Shields

    Abstract: We present a novel physics-constrained polynomial chaos expansion as a surrogate modeling method capable of performing both scientific machine learning (SciML) and uncertainty quantification (UQ) tasks. The proposed method possesses a unique capability: it seamlessly integrates SciML into UQ and vice versa, which allows it to quantify the uncertainties in SciML tasks effectively and leverage SciML…

    Submitted 11 May, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: 34 pages, 15 figures

  31. arXiv:2402.11997  [pdf, other]

    cs.CL cs.AI cs.LG

    Remember This Event That Year? Assessing Temporal Information and Reasoning in Large Language Models

    Authors: Himanshu Beniwal, Dishant Patel, Kowsik Nandagopan D, Hritik Ladia, Ankit Yadav, Mayank Singh

    Abstract: Large Language Models (LLMs) are increasingly ubiquitous, yet their ability to retain and reason about temporal information remains limited, hindering their application in real-world scenarios where understanding the sequential nature of events is crucial. Our study experiments with 12 state-of-the-art models (ranging from 2B to 70B+ parameters) on a novel numerical-temporal dataset, \textbf{TempU…

    Submitted 5 July, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  32. arXiv:2402.07241  [pdf, other]

    cs.CR

    Proof of Diligence: Cryptoeconomic Security for Rollups

    Authors: Peiyao Sheng, Ranvir Rana, Himanshu Tyagi, Pramod Viswanath

    Abstract: Layer 1 (L1) blockchains such as Ethereum are secured under an "honest supermajority of stake" assumption for a large pool of validators who verify each and every transaction on it. This high security comes at a scalability cost which not only affects the throughput of the blockchain but also results in high gas fees for executing transactions on chain. The most successful solution for this proble…

    Submitted 11 February, 2024; originally announced February 2024.

  33. arXiv:2402.06689  [pdf, other]

    q-fin.ST cs.LG

    A Study on Stock Forecasting Using Deep Learning and Statistical Models

    Authors: Himanshu Gupta, Aditya Jaiswal

    Abstract: Building a fast and accurate model for stock price forecasting has been a challenging task, and it remains an active area of research in which the best way to forecast the stock price is yet to be found. Machine learning, deep learning and statistical analysis techniques are used here to get accurate results so that investors can see the future trend and maximize the return on investment i…

    Submitted 8 February, 2024; originally announced February 2024.

  34. arXiv:2402.03796  [pdf, other]

    cs.CV cs.AI cs.LG

    Face Detection: Present State and Research Directions

    Authors: Purnendu Prabhat, Himanshu Gupta, Ajeet Kumar Vishwakarma

    Abstract: The majority of computer vision applications that handle images featuring humans use face detection as a core component. Face detection still has issues, despite much research on the topic. Face detection's accuracy and speed might yet be increased. This review paper shows the progress made in this area as well as the substantial issues that still need to be tackled. The paper provides research di…

    Submitted 6 February, 2024; originally announced February 2024.

  35. TrICy: Trigger-guided Data-to-text Generation with Intent aware Attention-Copy

    Authors: Vibhav Agarwal, Sourav Ghosh, Harichandana BSS, Himanshu Arora, Barath Raj Kandur Raja

    Abstract: Data-to-text (D2T) generation is a crucial task in many natural language understanding (NLU) applications and forms the foundation of task-oriented dialog systems. In the context of conversational AI solutions that can work directly with local data on the user's device, architectures utilizing large pre-trained language models (PLMs) are impractical for on-device deployment due to a high memory fo…

    Submitted 25 January, 2024; originally announced February 2024.

    Comments: Published in the IEEE/ACM Transactions on Audio, Speech, and Language Processing. (Sourav Ghosh and Vibhav Agarwal contributed equally to this work.)

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 1173-1184, 2024

  36. arXiv:2401.16227  [pdf, other]

    cs.CV eess.IV

    A Volumetric Saliency Guided Image Summarization for RGB-D Indoor Scene Classification

    Authors: Preeti Meena, Himanshu Kumar, Sandeep Yadav

    Abstract: Image summary, an abridged version of the original visual content, can be used to represent the scene. Thus, tasks such as scene classification, identification, indexing, etc., can be performed efficiently using the unique summary. Saliency is the most commonly used technique for generating the relevant image summary. However, the definition of saliency is subjective in nature and depends upon the…

    Submitted 19 January, 2024; originally announced January 2024.

  37. arXiv:2401.10521  [pdf, other]

    cs.CL cs.AI

    Cross-lingual Editing in Multilingual Language Models

    Authors: Himanshu Beniwal, Kowsik Nandagopan D, Mayank Singh

    Abstract: The training of large language models (LLMs) necessitates substantial data and computational resources, and updating outdated LLMs entails significant efforts and resources. While numerous model editing techniques (METs) have emerged to efficiently update model outputs without retraining, their effectiveness in multilingual LLMs, where knowledge is stored in diverse languages, remains an underexpl…

    Submitted 3 February, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

    Comments: Accepted at EACL 2024

  38. arXiv:2401.03855  [pdf, other]

    cs.CL cs.AI

    PythonSaga: Redefining the Benchmark to Evaluate Code Generating LLMs

    Authors: Ankit Yadav, Himanshu Beniwal, Mayank Singh

    Abstract: Driven by the surge in code generation using large language models (LLMs), numerous benchmarks have emerged to evaluate these LLMs' capabilities. We conducted a large-scale human evaluation of HumanEval and MBPP, two popular benchmarks for Python code generation, analyzing their diversity and difficulty. Our findings unveil a critical bias towards a limited set of programming concepts, neglecting m…

    Submitted 4 July, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

  39. arXiv:2401.01637  [pdf, other]

    cs.CL

    Social Media Ready Caption Generation for Brands

    Authors: Himanshu Maheshwari, Koustava Goswami, Apoorv Saxena, Balaji Vasan Srinivasan

    Abstract: Social media advertisements are key for brand marketing, aiming to attract consumers with captivating captions and pictures or logos. While previous research has focused on generating captions for general images, incorporating brand personalities into social media captioning remains unexplored. Brand personalities are shown to be affecting consumers' behaviours and social interactions and thus are…

    Submitted 3 January, 2024; originally announced January 2024.

  40. arXiv:2312.17300  [pdf, other]

    cs.CR cs.LG

    Improving Intrusion Detection with Domain-Invariant Representation Learning in Latent Space

    Authors: Padmaksha Roy, Tyler Cody, Himanshu Singhal, Kevin Choi, Ming Jin

    Abstract: Domain generalization focuses on leveraging knowledge from multiple related domains with ample training data and labels to enhance inference on unseen in-distribution (IN) and out-of-distribution (OOD) domains. In our study, we introduce a two-phase representation learning technique using multi-task learning. This approach aims to cultivate a latent space from features spanning multiple domains, e…

    Submitted 23 April, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

  41. arXiv:2312.16596  [pdf, other]

    cs.LG

    Enhancing Traffic Flow Prediction using Outlier-Weighted AutoEncoders: Handling Real-Time Changes

    Authors: Himanshu Choudhary, Marwan Hassani

    Abstract: In today's urban landscape, traffic congestion poses a critical challenge, especially during outlier scenarios. These outliers can indicate abrupt traffic peaks, drops, or irregular trends, often arising from factors such as accidents, events, or roadwork. Moreover, given the dynamic nature of traffic, the need for real-time traffic modeling also becomes crucial to ensure accurate and up-to-date t…

    Submitted 27 December, 2023; originally announced December 2023.

    Comments: 10 pages

  42. arXiv:2312.12630  [pdf, other]

    math.DS cs.LG math.CV math.FA math.SP

    Data-driven discovery with Limited Data Acquisition for fluid flow across cylinder

    Authors: Dr. Himanshu Singh

    Abstract: One of the central challenges in extracting the governing principles of a dynamical system via Dynamic Mode Decomposition (DMD) is limited data availability, formally called Limited Data Acquisition in the present paper. In the interest of discovering the governing principles for a dynamical system with limited data acquisition, we provide a variant of Kernelized Extended DMD (KeDMD) based…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: 52 Pages, 16 Figures, JULIA Coding Result for Dynamic Mode Decomposition, Part of this work selected for 42nd Annual Dynamic Days 2024 Conference (January 8 to 10) at University of California, Davis

    MSC Class: 37N10 76D05 76D25 47N50 47A25 68T01 28A10 28A35

  43. arXiv:2312.11996  [pdf, other]

    cs.HC

    Toward Responsible AI Use: Considerations for Sustainability Impact Assessment

    Authors: Eva Thelisson, Grzegorz Mika, Quentin Schneiter, Kirtan Padh, Himanshu Verma

    Abstract: As AI/ML models, including Large Language Models, continue to scale with massive datasets, so does their consumption of undeniably limited natural resources, and impact on society. In this collaboration between AI, Sustainability, HCI and legal researchers, we aim to enable a transition to sustainable AI development by enabling stakeholders across the AI value chain to assess and quantify the env…

    Submitted 19 December, 2023; originally announced December 2023.

  44. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  45. arXiv:2312.10884  [pdf, other]

    eess.SY cs.AI cs.LG math.OC

    Contextual Reinforcement Learning for Offshore Wind Farm Bidding

    Authors: David Cole, Himanshu Sharma, Wei Wang

    Abstract: We propose a framework for applying reinforcement learning to contextual two-stage stochastic optimization and apply this framework to the problem of energy market bidding of an off-shore wind farm. Reinforcement learning could potentially be used to learn close to optimal solutions for first stage variables of a two-stage stochastic program under different contexts. Under the proposed framework,…

    Submitted 17 December, 2023; originally announced December 2023.

  46. arXiv:2312.10693  [pdf, other]

    cs.LG math.FA

    An appointment with Reproducing Kernel Hilbert Space generated by Generalized Gaussian RBF as $L^2-$measure

    Authors: Himanshu Singh

    Abstract: Gaussian Radial Basis Function (RBF) Kernels are the most-often-employed kernels in artificial intelligence and machine learning routines for providing optimally-best results in contrast to their respective counterparts. However, little is known about the application of the Generalized Gaussian Radial Basis Function in various machine learning algorithms, namely kernel regression, support vecto…

    Submitted 17 December, 2023; originally announced December 2023.

    Comments: 20 pages, MATLAB code, 11 figures, Results presented at the AMS Spring Eastern Sectional Meeting in April 2023

    MSC Class: NUMBER 68-Computer Science; 68T-Artificial Intelligence and 68T07-Artificial Neural Networks and Deep Learning

  47. arXiv:2312.09187  [pdf, other]

    cs.LG

    Vision-Language Models as a Source of Rewards

    Authors: Kate Baumli, Satinder Baveja, Feryal Behbahani, Harris Chan, Gheorghe Comanici, Sebastian Flennerhag, Maxime Gazeau, Kristian Holsheimer, Dan Horgan, Michael Laskin, Clare Lyle, Hussain Masoom, Kay McKinney, Volodymyr Mnih, Alexander Neitz, Dmitry Nikulin, Fabio Pardo, Jack Parker-Holder, John Quan, Tim Rocktäschel, Himanshu Sahni, Tom Schaul, Yannick Schroecker, Stephen Spencer, Richie Steigerwald, et al. (2 additional authors not shown)

    Abstract: Building generalist agents that can accomplish many goals in rich open-ended environments is one of the research frontiers for reinforcement learning. A key limiting factor for building generalist agents with RL has been the need for a large number of reward functions for achieving different goals. We investigate the feasibility of using off-the-shelf vision-language models, or VLMs, as sources of…

    Submitted 12 July, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: 10 pages, 5 figures

  48. arXiv:2312.05463  [pdf]

    cs.SI physics.soc-ph

    Do Socialization Restrictions Prevent Restaurants from Becoming Covid Hotspots?

    Authors: Aviral Bhatnagar, Himanshu Kharkwal, Jaideep Srivastava

    Abstract: Simulation models for infection spread can help understand what factors play a major role in infection spread. Health agencies like the Centers for Disease Control and Prevention (CDC) can accordingly mandate effective guidelines to curb the spread. We built an infection spread model to simulate disease propagation through airborne transmission to study the impact of restaurant operational policies on the Covid-1…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: 6 pages, 5 figures

    Journal ref: Computer Sciences ACTA SCIENTIFIC Journal Volume 3 Issue 9, 2021

  49. On Gradient Boosted Decision Trees and Neural Rankers: A Case-Study on Short-Video Recommendations at ShareChat

    Authors: Olivier Jeunen, Hitesh Sagtani, Himanshu Doi, Rasul Karimov, Neeti Pokharna, Danish Kalim, Aleksei Ustimenko, Christopher Green, Wenzhe Shi, Rishabh Mehrotra

    Abstract: Practitioners who wish to build real-world applications that rely on ranking models need to decide which modelling paradigm to follow. This is not an easy choice to make, as the research literature on this topic has been shifting in recent years. In particular, whilst Gradient Boosted Decision Trees (GBDTs) have reigned supreme for more than a decade, the flexibility of neural networks has allowe…

    Submitted 4 December, 2023; originally announced December 2023.

    Comments: Appearing in the Industry Track Proceedings of the Forum for Information Retrieval Evaluation (FIRE '23)

  50. arXiv:2312.00766  [pdf, other]

    cs.CV cs.AI

    Automated Material Properties Extraction For Enhanced Beauty Product Discovery and Makeup Virtual Try-on

    Authors: Fatemeh Taheri Dezaki, Himanshu Arora, Rahul Suresh, Amin Banitalebi-Dehkordi

    Abstract: The multitude of makeup products available can make it challenging to find the ideal match for desired attributes. An intelligent approach for product discovery is required to enhance the makeup shopping experience to make it more convenient and satisfying. However, enabling accurate and efficient product discovery requires extracting detailed attributes like color and finish type. Our work introd…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: Presented at the Fifth Workshop on Recommender Systems in Fashion (fashionxrecsys) of the ACM Conference on Recommender Systems