Showing 1–50 of 327 results for author: Kumar, M

  1. arXiv:2407.08726  [pdf, other]

    cs.CV

    Map It Anywhere (MIA): Empowering Bird's Eye View Mapping using Large-scale Public Data

    Authors: Cherie Ho, Jiaye Zou, Omar Alama, Sai Mitheran Jagadesh Kumar, Benjamin Chiang, Taneesh Gupta, Chen Wang, Nikhil Keetha, Katia Sycara, Sebastian Scherer

    Abstract: Top-down Bird's Eye View (BEV) maps are a popular representation for ground robot navigation due to their richness and flexibility for downstream tasks. While recent methods have shown promise for predicting BEV maps from First-Person View (FPV) images, their generalizability is limited to small regions captured by current autonomous vehicle-based datasets. In this context, we show that a more sca…

    Submitted 11 July, 2024; originally announced July 2024.

  2. arXiv:2407.07726  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    PaliGemma: A versatile 3B VLM for transfer

    Authors: Lucas Beyer, Andreas Steiner, André Susano Pinto, Alexander Kolesnikov, Xiao Wang, Daniel Salz, Maxim Neumann, Ibrahim Alabdulmohsin, Michael Tschannen, Emanuele Bugliarello, Thomas Unterthiner, Daniel Keysers, Skanda Koppula, Fangyu Liu, Adam Grycner, Alexey Gritsenko, Neil Houlsby, Manoj Kumar, Keran Rong, Julian Eisenschlos, Rishabh Kabra, Matthias Bauer, Matko Bošnjak, Xi Chen, Matthias Minderer, et al. (10 additional authors not shown)

    Abstract: PaliGemma is an open Vision-Language Model (VLM) that is based on the SigLIP-So400m vision encoder and the Gemma-2B language model. It is trained to be a versatile and broadly knowledgeable base model that is effective to transfer. It achieves strong performance on a wide variety of open-world tasks. We evaluate PaliGemma on almost 40 diverse tasks including standard VLM benchmarks, but also more…

    Submitted 10 July, 2024; originally announced July 2024.

  3. arXiv:2407.07128  [pdf, other]

    cs.LG cs.SI stat.ML

    Modularity aided consistent attributed graph clustering via coarsening

    Authors: Samarth Bhatia, Yukti Makhija, Manoj Kumar, Sandeep Kumar

    Abstract: Graph clustering is an important unsupervised learning technique for partitioning graphs with attributes and detecting communities. However, current methods struggle to accurately capture true community structures and intra-cluster relations, be computationally efficient, and identify smaller communities. We address these challenges by integrating coarsening and modularity maximization, effectivel…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: The first two authors contributed equally to this work

  4. arXiv:2407.04664  [pdf, other]

    cs.GT cs.DS

    The Degree of Fairness in Efficient House Allocation

    Authors: Hadi Hosseini, Medha Kumar, Sanjukta Roy

    Abstract: The classic house allocation problem is primarily concerned with finding a matching between a set of agents and a set of houses that guarantees some notion of economic efficiency (e.g. utilitarian welfare). While recent works have shifted focus on achieving fairness (e.g. minimizing the number of envious agents), they often come with notable costs on efficiency notions such as utilitarian or egali…

    Submitted 5 July, 2024; originally announced July 2024.

    ACM Class: F.2.0

  5. arXiv:2407.04651  [pdf, other]

    cs.CV

    SAM Fewshot Finetuning for Anatomical Segmentation in Medical Images

    Authors: Weiyi Xie, Nathalie Willems, Shubham Patil, Yang Li, Mayank Kumar

    Abstract: We propose a straightforward yet highly effective few-shot fine-tuning strategy for adapting the Segment Anything (SAM) to anatomical segmentation tasks in medical images. Our novel approach revolves around reformulating the mask decoder within SAM, leveraging few-shot embeddings derived from a limited set of labeled images (few-shot collection) as prompts for querying anatomical objects captured…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 9 pages, Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2024

    ACM Class: I.4.6; I.5.4; I.5.1

  6. arXiv:2407.04335  [pdf, ps, other]

    cs.LG cs.AI

    Geometrically Inspired Kernel Machines for Collaborative Learning Beyond Gradient Descent

    Authors: Mohit Kumar, Alexander Valentinitsch, Magdalena Fuchs, Mathias Brucker, Juliana Bowles, Adnan Husakovic, Ali Abbas, Bernhard A. Moser

    Abstract: This paper develops a novel mathematical framework for collaborative learning by means of geometrically inspired kernel machines which includes statements on the bounds of generalisation and approximation errors, and sample complexity. For classification problems, this approach allows us to learn bounded geometric structures around given data points and hence solve the global model learning proble…

    Submitted 5 July, 2024; originally announced July 2024.

  7. arXiv:2407.04053  [pdf, other]

    cs.DC

    Edge AI: A Taxonomy, Systematic Review and Future Directions

    Authors: Sukhpal Singh Gill, Muhammed Golec, Jianmin Hu, Minxian Xu, Junhui Du, Huaming Wu, Guneet Kaur Walia, Subramaniam Subramanian Murugesan, Babar Ali, Mohit Kumar, Kejiang Ye, Prabal Verma, Surendra Kumar, Felix Cuadrado, Steve Uhlig

    Abstract: Edge Artificial Intelligence (AI) incorporates a network of interconnected systems and devices that receive, cache, process, and analyse data in close communication with the location where the data is captured with AI technology. Recent advancements in AI efficiency, the widespread use of Internet of Things (IoT) devices, and the emergence of edge computing have unlocked the enormous scope of Edge…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Preprint Version, 18 Figures

  8. arXiv:2406.14908  [pdf, other]

    cs.HC

    Can we say a cat is a cat? Understanding the challenges in annotating physiological signal-based emotion data

    Authors: Pragya Singh, Mohan Kumar, Pushpendra Singh

    Abstract: Artificial Intelligence (AI) algorithms, trained on emotion data extracted from physiological signals, provide a promising approach to monitoring emotions, affect, and mental well-being. However, the field encounters challenges because there is a lack of effective methods for collecting high-quality data in everyday settings that genuinely reflect changes in emotion or affect. This paper presents…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 7 pages, To be published at PhysioCHI: Towards Best Practices for Integrating Physiological Signals in HCI, May 11, 2024, Honolulu, HI, USA

  9. arXiv:2406.10288  [pdf, other]

    cs.CL cs.LG

    Mimicking User Data: On Mitigating Fine-Tuning Risks in Closed Large Language Models

    Authors: Francisco Eiras, Aleksandar Petrov, Phillip H. S. Torr, M. Pawan Kumar, Adel Bibi

    Abstract: Fine-tuning large language models on small, high-quality datasets can enhance their performance on specific downstream tasks. Recent research shows that fine-tuning on benign, instruction-following data can inadvertently undo the safety alignment process and increase a model's propensity to comply with harmful queries. Although critical, understanding and mitigating safety risks in well-defined ta…

    Submitted 1 July, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  10. arXiv:2406.01698  [pdf, other]

    cs.AR cs.AI cs.DC cs.LG

    Demystifying Platform Requirements for Diverse LLM Inference Use Cases

    Authors: Abhimanyu Bambhaniya, Ritik Raj, Geonhwa Jeong, Souvik Kundu, Sudarshan Srinivasan, Midhilesh Elavazhagan, Madhu Kumar, Tushar Krishna

    Abstract: Large language models (LLMs) have shown remarkable performance across a wide range of applications, often outperforming human experts. However, deploying these parameter-heavy models efficiently for diverse inference use cases requires carefully designed hardware platforms with ample computing, memory, and network resources. With LLM deployment scenarios and models evolving at breakneck speed, the…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 12 Pages, https://github.com/abhibambhaniya/GenZ-LLM-Analyzer

  11. arXiv:2405.18780  [pdf, other]

    cs.AI cs.LG

    Quantitative Certification of Bias in Large Language Models

    Authors: Isha Chaudhary, Qian Hu, Manoj Kumar, Morteza Ziyadi, Rahul Gupta, Gagandeep Singh

    Abstract: Large Language Models (LLMs) can produce responses that exhibit social biases and support stereotypes. However, conventional benchmarking is insufficient to thoroughly evaluate LLM bias, as it cannot scale to large sets of prompts and provides no guarantees. Therefore, we propose a novel certification framework QuaCer-B (Quantitative Certification of Bias) that provides formal guarantees on obtai…

    Submitted 29 May, 2024; originally announced May 2024.

  12. arXiv:2405.15341  [pdf, other]

    cs.AI cs.CV

    V-Zen: Efficient GUI Understanding and Precise Grounding With A Novel Multimodal LLM

    Authors: Abdur Rahman, Rajat Chawla, Muskaan Kumar, Arkajit Datta, Adarsh Jha, Mukunda NS, Ishaan Bhola

    Abstract: In the rapidly evolving landscape of AI research and application, Multimodal Large Language Models (MLLMs) have emerged as a transformative force, adept at interpreting and integrating information from diverse modalities such as text, images, and Graphical User Interfaces (GUIs). Despite these advancements, the nuanced interaction and understanding of GUIs pose a significant challenge, limiting th…

    Submitted 24 May, 2024; originally announced May 2024.

  13. arXiv:2405.14857  [pdf, other]

    cs.CV cs.AI cs.LG

    Semantica: An Adaptable Image-Conditioned Diffusion Model

    Authors: Manoj Kumar, Neil Houlsby, Emiel Hoogeboom

    Abstract: We investigate the task of adapting image generative models to different datasets without finetuning. To this end, we introduce Semantica, an image-conditioned diffusion model capable of generating images based on the semantics of a conditioning image. Semantica is trained exclusively on web-scale image pairs, that is, it receives a random image from a webpage as conditional input and models anoth…

    Submitted 10 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  14. arXiv:2405.04260  [pdf, other]

    cs.LG cs.AI

    Verified Neural Compressed Sensing

    Authors: Rudy Bunel, Krishnamurthy Dvijotham, M. Pawan Kumar, Alessandro De Palma, Robert Stanforth

    Abstract: We develop the first (to the best of our knowledge) provably correct neural networks for a precise computational task, with the proof of correctness generated by an automated verification algorithm without any human input. Prior work on neural network verification has focused on partial specifications that, even when satisfied, are not sufficient to ensure that a neural network never makes errors.…

    Submitted 8 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  15. arXiv:2405.02002  [pdf, ps, other]

    cs.DC

    Optimizing Robot Dispersion on Grids: with and without Fault Tolerance

    Authors: Rik Banerjee, Manish Kumar, Anisur Rahaman Molla

    Abstract: The introduction and study of dispersing mobile robots across the nodes of an anonymous graph have recently gained traction and have been explored within various graph classes and settings. While optimal dispersion solution was established for oriented grids [Kshemkalyani et al., WALCOM 2020], a significant unresolved question pertains to whether achieving optimal dispersion is feasible on a…

    Submitted 3 May, 2024; originally announced May 2024.

  16. Towards Building Autonomous Data Services on Azure

    Authors: Yiwen Zhu, Yuanyuan Tian, Joyce Cahoon, Subru Krishnan, Ankita Agarwal, Rana Alotaibi, Jesús Camacho-Rodríguez, Bibin Chundatt, Andrew Chung, Niharika Dutta, Andrew Fogarty, Anja Gruenheid, Brandon Haynes, Matteo Interlandi, Minu Iyer, Nick Jurgens, Sumeet Khushalani, Brian Kroth, Manoj Kumar, Jyoti Leeka, Sergiy Matusevych, Minni Mittal, Andreas Mueller, Kartheek Muthyala, Harsha Nagulapalli, et al. (13 additional authors not shown)

    Abstract: Modern cloud has turned data services into easily accessible commodities. With just a few clicks, users are now able to access a catalog of data processing systems for a wide range of tasks. However, the cloud brings in both complexity and opportunity. While cloud users can quickly start an application by using various data services, it can be difficult to configure and optimize these services to…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: SIGMOD Companion of the 2023 International Conference on Management of Data. 2023

  17. arXiv:2404.16060  [pdf]

    cs.HC physics.ed-ph physics.optics

    Pocket Schlieren: a background oriented schlieren imaging platform on a smartphone

    Authors: Diganta Rabha, Vimod Kumar, Akshay Kumar, Dinesh Saini, Manish Kumar

    Abstract: Background-oriented schlieren (BOS) is a powerful technique for flow visualization. Nevertheless, the widespread dissemination of BOS is impeded by its dependence on scientific cameras, computing hardware, and dedicated analysis software. In this work, we aim to democratize BOS by providing a smartphone based scientific tool called "Pocket Schlieren". Pocket Schlieren enables users to directly cap…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 24 pages, 6 figures, 4 Supplementary figures

  18. arXiv:2404.16048  [pdf, other]

    cs.HC cs.AI

    GUIDE: Graphical User Interface Data for Execution

    Authors: Rajat Chawla, Adarsh Jha, Muskaan Kumar, Mukunda NS, Ishaan Bhola

    Abstract: In this paper, we introduce GUIDE, a novel dataset tailored for the advancement of Multimodal Large Language Model (MLLM) applications, particularly focusing on Robotic Process Automation (RPA) use cases. Our dataset encompasses diverse data from various websites including Apollo (62.67%), Gmail (3.43%), Calendar (10.98%) and Canva (22.92%). Each data entry includes an image, a task description, t…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 11 pages, 8 figures, 3 Tables and 1 Algorithm

  19. Visualizing Intelligent Tutor Interactions for Responsive Pedagogy

    Authors: Grace Guo, Aishwarya Mudgal Sunil Kumar, Adit Gupta, Adam Coscia, Chris MacLellan, Alex Endert

    Abstract: Intelligent tutoring systems leverage AI models of expert learning and student knowledge to deliver personalized tutoring to students. While these intelligent tutors have demonstrated improved student learning outcomes, it is still unclear how teachers might integrate them into curriculum and course planning to support responsive pedagogy. In this paper, we conducted a design study with five teach…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 9 pages, 5 figures, ACM AVI 2024

  20. arXiv:2404.07533  [pdf, other]

    cs.LG cs.AI cs.ET

    IITP-VDLand: A Comprehensive Dataset on Decentraland Parcels

    Authors: Ankit K. Bhagat, Dipika Jha, Raju Halder, Rajendra N. Paramanik, Chandra M. Kumar

    Abstract: This paper presents IITP-VDLand, a comprehensive dataset of Decentraland parcels sourced from diverse platforms. Unlike existing datasets which have limited attributes and records, IITP-VDLand offers a rich array of attributes, encompassing parcel characteristics, trading history, past activities, transactions, and social media interactions. Alongside, we introduce a key attribute in the dataset,…

    Submitted 11 April, 2024; originally announced April 2024.

  21. arXiv:2403.17849  [pdf, other]

    math.OC cs.RO

    Multi Agent Pathfinding for Noise Restricted Hybrid Fuel Unmanned Aerial Vehicles

    Authors: Drew Scott, Satyanarayana G. Manyam, David W. Casbeer, Manish Kumar, Isaac E. Weintraub

    Abstract: Multi Agent Path Finding (MAPF) seeks the optimal set of paths for multiple agents from respective start to goal locations such that no paths conflict. We address the MAPF problem for a fleet of hybrid-fuel unmanned aerial vehicles which are subject to location-dependent noise restrictions. We solve this problem by searching a constraint tree for which the subproblem at each node is a set of short…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: 6 pages, 7 figures

  22. arXiv:2403.13716  [pdf, ps, other]

    cs.DC cs.DS cs.MA

    Agent-based Leader Election, MST, and Beyond

    Authors: Ajay D. Kshemkalyani, Manish Kumar, Anisur Rahaman Molla, Gokarna Sharma

    Abstract: Leader election is one of the fundamental and well-studied problems in distributed computing. In this paper, we initiate the study of leader election using mobile agents. Suppose $n$ agents are positioned initially arbitrarily on the nodes of an arbitrary, anonymous, $n$-node, $m$-edge graph $G$. The agents relocate themselves autonomously on the nodes of $G$ and elect an agent as a leader such th…

    Submitted 22 May, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: 25 pages

  23. arXiv:2403.12335  [pdf, other]

    cs.LG

    Temporally-Consistent Koopman Autoencoders for Forecasting Dynamical Systems

    Authors: Indranil Nayak, Debdipta Goswami, Mrinal Kumar, Fernando Teixeira

    Abstract: Absence of sufficiently high-quality data often poses a key challenge in data-driven modeling of high-dimensional spatio-temporal dynamical systems. Koopman Autoencoders (KAEs) harness the expressivity of deep neural networks (DNNs), the dimension reduction capabilities of autoencoders, and the spectral properties of the Koopman operator to learn a reduced-order feature space with simpler, linear…

    Submitted 18 March, 2024; originally announced March 2024.

  24. arXiv:2403.10519  [pdf, other]

    cs.CV

    Frozen Feature Augmentation for Few-Shot Image Classification

    Authors: Andreas Bär, Neil Houlsby, Mostafa Dehghani, Manoj Kumar

    Abstract: Training a linear classifier or lightweight model on top of pretrained vision model outputs, so-called 'frozen features', leads to impressive performance on a number of downstream few-shot tasks. Currently, frozen features are not modified during training. On the other hand, when networks are trained directly on images, data augmentation is a standard recipe that improves performance with no subst…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: CVPR 2024 (18 pages, main paper + supplementary material)

  25. arXiv:2403.01965  [pdf, ps, other]

    cs.CC cs.DS

    Towards Deterministic Algorithms for Constant-Depth Factors of Constant-Depth Circuits

    Authors: Mrinal Kumar, Varun Ramanathan, Ramprasad Saptharishi, Ben Lee Volk

    Abstract: We design a deterministic subexponential time algorithm that takes as input a multivariate polynomial $f$ computed by a constant-depth circuit over rational numbers, and outputs a list $L$ of circuits (of unbounded depth and possibly with division gates) that contains all irreducible factors of $f$ computable by constant-depth circuits. This list $L$ might also include circuits that are spurious:…

    Submitted 4 March, 2024; originally announced March 2024.

  26. arXiv:2402.14817  [pdf, other]

    cs.CV cs.LG

    Cameras as Rays: Pose Estimation via Ray Diffusion

    Authors: Jason Y. Zhang, Amy Lin, Moneish Kumar, Tzu-Hsuan Yang, Deva Ramanan, Shubham Tulsiani

    Abstract: Estimating camera poses is a fundamental task for 3D reconstruction and remains challenging given sparsely sampled views (<10). In contrast to existing approaches that pursue top-down prediction of global parametrizations of camera extrinsics, we propose a distributed representation of camera pose that treats a camera as a bundle of rays. This representation allows for a tight coupling with spatia…

    Submitted 4 April, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: In ICLR 2024 (oral). v2-3: updated references. Project webpage: https://jasonyzhang.com/RayDiffusion

  27. arXiv:2311.17841  [pdf, ps, other]

    cs.IT cs.CC

    Fast list-decoding of univariate multiplicity and folded Reed-Solomon codes

    Authors: Rohan Goyal, Prahladh Harsha, Mrinal Kumar, Ashutosh Shankar

    Abstract: We show that the known list-decoding algorithms for univariate multiplicity and folded Reed-Solomon codes can be made to run in $\tilde{O}(n)$ time. Univariate multiplicity codes and FRS codes are natural variants of Reed-Solomon codes that were discovered and studied for their applications to list decoding. It is known that for every $ε>0$, and rate $r \in (0,1)$, there exist explicit families of…

    Submitted 12 March, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: Modified abstract and included references for HRW and KRRSS. At the time of Version [v1], we were unaware of the nearly-linear-time list-decoders of [HRW] and [KRRSW]. This has been addressed in the subsequent versions

  28. arXiv:2311.14335  [pdf, other]

    cs.LG cs.AI

    Comparative Analysis of Transformers for Modeling Tabular Data: A Casestudy using Industry Scale Dataset

    Authors: Usneek Singh, Piyush Arora, Shamika Ganesan, Mohit Kumar, Siddhant Kulkarni, Salil R. Joshi

    Abstract: We perform a comparative analysis of transformer-based models designed for modeling tabular data, specifically on an industry-scale dataset. While earlier studies demonstrated promising outcomes on smaller public or synthetic datasets, the effectiveness did not extend to larger industry-scale datasets. The challenges identified include handling high-dimensional data, the necessity for efficient pr…

    Submitted 24 November, 2023; originally announced November 2023.

    Comments: Accepted at 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD)

  29. arXiv:2311.12752  [pdf, ps, other]

    cs.CC

    An Improved Line-Point Low-Degree Test

    Authors: Prahladh Harsha, Mrinal Kumar, Ramprasad Saptharishi, Madhu Sudan

    Abstract: We prove that the most natural low-degree test for polynomials over finite fields is "robust" in the high-error regime for linear-sized fields. Specifically we consider the "local" agreement of a function $f: \mathbb{F}_q^m \to \mathbb{F}_q$ from the space of degree-$d$ polynomials, i.e., the expected agreement of the function from univariate degree-$d$ polynomials over a randomly chosen line…

    Submitted 21 November, 2023; originally announced November 2023.

    MSC Class: 68Q25; 11T06

    ACM Class: I.1.2; E.4

  30. arXiv:2311.07888  [pdf, other]

    cs.RO cs.AI cs.LG

    RoboSense At Edge: Detecting Slip, Crumple and Shape of the Object in Robotic Hand for Teleoprations

    Authors: Sudev Kumar Padhi, Mohit Kumar, Debanka Giri, Subidh Ali

    Abstract: Slip and crumple detection is essential for performing robust manipulation tasks with a robotic hand (RH), such as remote surgery. It has been one of the challenging problems in the robotics manipulation community. In this work, we propose a machine learning (ML) based technique to detect slip and crumple as well as the shape of an object that is currently held in the robotic…

    Submitted 13 November, 2023; originally announced November 2023.

  31. arXiv:2310.08437  [pdf, other]

    cs.DC

    Cold Start Latency in Serverless Computing: A Systematic Review, Taxonomy, and Future Directions

    Authors: Muhammed Golec, Guneet Kaur Walia, Mohit Kumar, Felix Cuadrado, Sukhpal Singh Gill, Steve Uhlig

    Abstract: Recently, academics and the corporate sector have paid attention to serverless computing, which enables dynamic scalability and an economic model. In serverless computing, users pay only for the time they actually spend using the resources. Although zero scaling optimises cost and resource utilisation, it is the fundamental reason for the serverless cold start problem. Various academic and corpora…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: 34 Pages, 16 Figures

  32. arXiv:2310.08004  [pdf, other]

    cs.CC quant-ph

    On the Rational Degree of Boolean Functions and Applications

    Authors: Vishnu Iyer, Siddhartha Jain, Matt Kovacs-Deak, Vinayak M. Kumar, Luke Schaeffer, Daochen Wang, Michael Whitmeyer

    Abstract: We study a natural complexity measure of Boolean functions known as the (exact) rational degree. For total functions $f$, it is conjectured that $\mathrm{rdeg}(f)$ is polynomially related to $\mathrm{deg}(f)$, where $\mathrm{deg}(f)$ is the Fourier degree. Towards this conjecture, we show that symmetric functions have rational degree at least $\mathrm{deg}(f)/2$ and monotone functions have rationa…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: 17 pages, 3 figures

  33. arXiv:2310.03884  [pdf, other]

    cs.IT cs.LG eess.SP math.DG stat.ML

    Information Geometry for the Working Information Theorist

    Authors: Kumar Vijay Mishra, M. Ashok Kumar, Ting-Kam Leonard Wong

    Abstract: Information geometry is a study of statistical manifolds, that is, spaces of probability distributions from a geometric perspective. Its classical information-theoretic applications relate to statistical concepts such as Fisher information, sufficient statistics, and efficient estimators. Today, information geometry has emerged as an interdisciplinary field that finds applications in diverse areas…

    Submitted 5 October, 2023; originally announced October 2023.

    Comments: 12 pages, 3 figures, 1 table

  34. arXiv:2309.09701  [pdf, ps, other]

    cs.CC cs.DS

    Deterministic Algorithms for Low Degree Factors of Constant Depth Circuits

    Authors: Mrinal Kumar, Varun Ramanathan, Ramprasad Saptharishi

    Abstract: For every constant $d$, we design a subexponential time deterministic algorithm that takes as input a multivariate polynomial $f$ given as a constant depth algebraic circuit over the field of rational numbers, and outputs all irreducible factors of $f$ of degree at most $d$ together with their respective multiplicities. Moreover, if $f$ is a sparse polynomial, then the algorithm runs in quasipolyn…

    Submitted 18 September, 2023; originally announced September 2023.

  35. arXiv:2309.04504  [pdf, other]

    cs.LG cs.AI

    Compositional Learning of Visually-Grounded Concepts Using Reinforcement

    Authors: Zijun Lin, Haidi Azaman, M Ganesh Kumar, Cheston Tan

    Abstract: Children can rapidly generalize compositionally-constructed rules to unseen test sets. On the other hand, deep reinforcement learning (RL) agents need to be trained over millions of episodes, and their ability to generalize to unseen combinations remains unclear. Hence, we investigate the compositional abilities of RL agents, using the task of navigating to specified color-shape targets in synthet…

    Submitted 3 May, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

  36. arXiv:2309.03483  [pdf, other]

    cs.CV

    DetermiNet: A Large-Scale Diagnostic Dataset for Complex Visually-Grounded Referencing using Determiners

    Authors: Clarence Lee, M Ganesh Kumar, Cheston Tan

    Abstract: State-of-the-art visual grounding models can achieve high detection accuracy, but they are not designed to distinguish between all objects versus only certain objects of interest. In natural language, in order to specify a particular object or set of objects of interest, humans use determiners such as "my", "either" and "those". Determiners, as an important word class, are a type of schema in natu…

    Submitted 7 September, 2023; originally announced September 2023.

    Comments: 10 pages, 6 figures

  37. arXiv:2309.01252  [pdf, other]

    cs.CV

    S2RF: Semantically Stylized Radiance Fields

    Authors: Dishani Lahiri, Neeraj Panse, Moneish Kumar

    Abstract: We present our method for transferring style from any arbitrary image(s) to object(s) within a 3D scene. Our primary objective is to offer more control in 3D scene stylization, facilitating the creation of customizable and stylized scene images from arbitrary viewpoints. To achieve this, we propose a novel approach that incorporates nearest neighborhood-based loss, allowing for flexible 3D scene r…

    Submitted 3 September, 2023; originally announced September 2023.

    Comments: AI for 3D Content Creation at International Conference on Computer Vision 2023

  38. arXiv:2308.15960  [pdf, ps, other]

    cs.CV

    Fusing Pseudo Labels with Weak Supervision for Dynamic Traffic Scenarios

    Authors: Harshith Mohan Kumar, Sean Lawrence

    Abstract: Advanced Driver Assistance Systems (ADAS) have made significant strides, capitalizing on computer vision to enhance perception and decision-making capabilities. Nonetheless, the adaptation of these systems to diverse traffic scenarios poses challenges due to shifts in data distribution stemming from factors such as location, weather, and road infrastructure. To tackle this, we introduce a weakly-s…

    Submitted 30 August, 2023; originally announced August 2023.

    Comments: This work was accepted as an extended abstract at the International Conference on Computer Vision (ICCV) 2023 BRAVO Workshop, Paris, France

  39. arXiv:2308.06082  [pdf, other]

    cs.CR

    Security of XCB and HCTR

    Authors: Manish Kumar

    Abstract: Tweakable Enciphering Scheme (TES) is a length preserving scheme which provides confidentiality and admissible integrity. XCB (Extended Code Book) is a TES which was introduced in 2004. In 2007, it was modified and security bound was provided. Later, these two versions were referred to as XCBv1 and XCBv2 respectively. XCBv2 was proposed as the IEEE-std 1619.2 2010 for encryption of sector oriented…

    Submitted 11 August, 2023; originally announced August 2023.

    Comments: M.Tech Dissertation. Indian Statistical Institute, Kolkata, July 2018

  40. arXiv:2308.04599  [pdf, ps, other]

    cs.CC

    Determinants vs. Algebraic Branching Programs

    Authors: Abhranil Chatterjee, Mrinal Kumar, Ben Lee Volk

    Abstract: We show that for every homogeneous polynomial of degree $d$, if it has determinantal complexity at most $s$, then it can be computed by a homogeneous algebraic branching program (ABP) of size at most $O(d^5s)$. Moreover, we show that for $\textit{most}$ homogeneous polynomials, the width of the resulting homogeneous ABP is just $s-1$ and the size is at most $O(ds)$. Thus, for constant degree hom…

    Submitted 8 August, 2023; originally announced August 2023.

  41. Blockchain inspired secure and reliable data exchange architecture for cyber-physical healthcare system 4.0

    Authors: Mohit Kumar, Hritu Raj, Nisha Chaurasia, Sukhpal Singh Gill

    Abstract: A cyber-physical system is considered to be a collection of strongly coupled communication systems and devices that poses numerous security trials in various industrial applications including healthcare. The security and privacy of patient data is still a big concern because healthcare data is sensitive and valuable, and it is most targeted over the internet. Moreover, from the industrial perspect…

    Submitted 28 June, 2023; originally announced July 2023.

    Journal ref: Internet of Things and Cyber-Physical Systems, Volume 3, 2023, Pages 309-322

  42. arXiv:2307.13602  [pdf]

    physics.soc-ph cs.DC

    Fortaleza: The emergence of a network hub

    Authors: Eric Bragion, Habiba Akter, Mohit Kumar, Minxian Xu, Ahmed M. Abdelmoniem, Sukhpal Singh Gill

    Abstract: Digitalisation, accelerated by the pandemic, has brought the opportunity for companies to expand their businesses beyond their geographic location and has considerably affected networks around the world. Cloud services have a better acceptance nowadays, and it is foreseen that this industry will grow exponentially in the following years. With more distributed networks that need to support customer…

    Submitted 28 June, 2023; originally announced July 2023.

    Journal ref: Published in Internet of Things and Cyber-Physical Systems, Volume 3, 2023, Pages 272-279

  43. arXiv:2307.10934  [pdf, other]

    cs.CV

    OCTraN: 3D Occupancy Convolutional Transformer Network in Unstructured Traffic Scenarios

    Authors: Aditya Nalgunda Ganesh, Dhruval Pobbathi Badrinath, Harshith Mohan Kumar, Priya SS, Surabhi Narayan

    Abstract: Modern approaches for vision-centric environment perception for autonomous navigation make extensive use of self-supervised monocular depth estimation algorithms that output disparity maps. However, when this disparity map is projected onto 3D space, the errors in disparity are magnified, resulting in a depth estimation error that increases quadratically as the distance from the camera increases.…

    Submitted 20 July, 2023; originally announced July 2023.

    Comments: This work was accepted as a spotlight presentation at the Transformers for Vision Workshop @CVPR 2023

  44. arXiv:2307.09610  [pdf, other]

    cs.CR

    Design and Analysis of Pairing-Friendly Elliptic Curves for Cryptographic Primitives

    Authors: Mahender Kumar

    Abstract: Elliptic curve cryptography (ECC) is a remarkable mathematical tool that offers the same level of security as traditional public-key cryptography (PKC) with a significantly smaller key size and lower computational requirements. The use of pairing on elliptic curves has emerged as a vibrant field of research that provides enhanced security measures for the next generation of cryptographic systems.…

    Submitted 13 July, 2023; originally announced July 2023.

  45. arXiv:2307.05922  [pdf, ps, other]

    cs.DC cs.CR

    Sublinear Message Bounds of Authenticated Implicit Byzantine Agreement

    Authors: Manish Kumar, Anisur Rahaman Molla

    Abstract: This paper studies the message complexity of authenticated Byzantine agreement (BA) in synchronous, fully-connected distributed networks under an honest majority. We focus on the so-called {\em implicit} Byzantine agreement problem where each node starts with an input value and at the end a non-empty subset of the honest nodes should agree on a common input value by satisfying the BA properties (i…

    Submitted 12 July, 2023; originally announced July 2023.

  46. arXiv:2307.02411  [pdf]

    cs.CR

    An Approach to Remove Key Escrow Problem in ID-Based Encryption From Pairing

    Authors: Mahender Kumar

    Abstract: Key escrow refers to storing a copy of a cryptographic key with a trusted third party, typically a government agency or some other organization. Key escrow aims to ensure that law enforcement agencies can access encrypted data when necessary, for example, in criminal investigations or national security matters. However, key escrow also raises concerns about privacy and security. If the trusted thi…

    Submitted 6 May, 2023; originally announced July 2023.

    Comments: M.Tech Dissertation

  47. Relaxed Local Correctability from Local Testing

    Authors: Vinayak M. Kumar, Geoffrey Mon

    Abstract: We construct the first asymptotically good relaxed locally correctable codes with polylogarithmic query complexity, bringing the upper bound polynomially close to the lower bound of Gur and Lachish (SICOMP 2021). Our result follows from showing that a high-rate locally testable code can boost the block length of a smaller relaxed locally correctable code, while preserving the correcting radius and…

    Submitted 1 April, 2024; v1 submitted 29 June, 2023; originally announced June 2023.

    Comments: 15 pages. Improved exposition, changed notation; to appear in STOC 2024

  48. arXiv:2306.07915  [pdf, other]

    cs.CV

    Image Captioners Are Scalable Vision Learners Too

    Authors: Michael Tschannen, Manoj Kumar, Andreas Steiner, Xiaohua Zhai, Neil Houlsby, Lucas Beyer

    Abstract: Contrastive pretraining on image-text pairs from the web is one of the most popular large-scale pretraining strategies for vision backbones, especially in the context of large multimodal models. At the same time, image captioning on this type of data is commonly considered an inferior pretraining strategy. In this paper, we perform a fair comparison of these two pretraining strategies, carefully m…

    Submitted 21 December, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: Accepted at NeurIPS 2023. v2 adds SugarCrepe results and more ablations, v3 has minor fixes. v4 adds a code link ( https://github.com/google-research/big_vision ). v5 has minor fixes

  49. arXiv:2306.04431  [pdf, other]

    cs.LG

    Faithful Knowledge Distillation

    Authors: Tom A. Lamb, Rudy Brunel, Krishnamurthy DJ Dvijotham, M. Pawan Kumar, Philip H. S. Torr, Francisco Eiras

    Abstract: Knowledge distillation (KD) has received much attention due to its success in compressing networks to allow for their deployment in resource-constrained systems. While the problem of adversarial robustness has been studied before in the KD setting, previous works overlook what we term the relative calibration of the student network with respect to its teacher in terms of soft confidences. In parti…

    Submitted 11 August, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: 7 pages (main content), 4 figures

  50. arXiv:2305.16583  [pdf, other]

    stat.ML cs.LG

    Detecting Errors in a Numerical Response via any Regression Model

    Authors: Hang Zhou, Jonas Mueller, Mayank Kumar, Jane-Ling Wang, Jing Lei

    Abstract: Noise plagues many numerical datasets, where the recorded values in the data may fail to match the true underlying values due to reasons including: erroneous sensors, data entry/processing mistakes, or imperfect human estimates. We consider general regression settings with covariates and a potentially corrupted response whose observed values may contain errors. By accounting for various uncertaint…

    Submitted 12 March, 2024; v1 submitted 25 May, 2023; originally announced May 2023.