Showing 1–50 of 76 results for author: Prasad, A

  1. arXiv:2407.09180  [pdf, other]

    cs.AR

    iMIV: in-Memory Integrity Verification for NVM

    Authors: Rajat Jain, Aravinda Prasad, Sreenivas Subramoney, Arkaprava Basu

    Abstract: Non-volatile Memory (NVM) could bridge the gap between memory and storage. However, NVMs are susceptible to data remanence attacks. Thus, multiple security metadata must persist along with the data to protect the confidentiality and integrity of NVM-resident data. Persisting Bonsai Merkle Tree (BMT) nodes, critical for data integrity, can add significant overheads due to the need to write large amount…

    Submitted 12 July, 2024; originally announced July 2024.

  2. arXiv:2407.05467  [pdf, other]

    cs.DC cs.AI

    The infrastructure powering IBM's Gen AI model development

    Authors: Talia Gershon, Seetharami Seelam, Brian Belgodere, Milton Bonilla, Lan Hoang, Danny Barnett, I-Hsin Chung, Apoorve Mohan, Ming-Hung Chen, Lixiang Luo, Robert Walkup, Constantinos Evangelinos, Shweta Salaria, Marc Dombrowa, Yoonho Park, Apo Kayi, Liran Schour, Alim Alim, Ali Sydney, Pavlos Maniotis, Laurent Schares, Bernard Metzler, Bengi Karacali-Akyamac, Sophia Wen, Tatsuhiro Chiba, et al. (121 additional authors not shown)

    Abstract: AI Infrastructure plays a key role in the speed and cost-competitiveness of developing and deploying advanced AI models. The current demand for powerful AI infrastructure for model training is driven by the emergence of generative AI and foundational models, where on occasion thousands of GPUs must cooperate on a single training job for the model to be trained in a reasonable time. Delivering effi…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: Corresponding Authors: Talia Gershon, Seetharami Seelam, Brian Belgodere, Milton Bonilla

  3. arXiv:2405.08486  [pdf, other]

    cs.LG

    Gradient Boosting Mapping for Dimensionality Reduction and Feature Extraction

    Authors: Anri Patron, Ayush Prasad, Hoang Phuc Hau Luu, Kai Puolamäki

    Abstract: A fundamental problem in supervised learning is to find a good set of features or distance measures. If the new set of features is of lower dimensionality and can be obtained by a simple transformation of the original data, they can make the model understandable, reduce overfitting, and even help to detect distribution drift. We propose a supervised dimensionality reduction method Gradient Boostin…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: 32 pages, 8 figures, 5 tables

  4. arXiv:2405.07503  [pdf, other]

    cs.RO cs.AI

    Consistency Policy: Accelerated Visuomotor Policies via Consistency Distillation

    Authors: Aaditya Prasad, Kevin Lin, Jimmy Wu, Linqi Zhou, Jeannette Bohg

    Abstract: Many robotic systems, such as mobile manipulators or quadrotors, cannot be equipped with high-end GPUs due to space, weight, and power constraints. These constraints prevent these systems from leveraging recent developments in visuomotor policy architectures that require high-end GPUs to achieve fast policy inference. In this paper, we propose Consistency Policy, a faster and similarly powerful al…

    Submitted 28 June, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

    Comments: https://consistency-policy.github.io/

  5. arXiv:2405.04324  [pdf, other]

    cs.AI cs.CL cs.SE

    Granite Code Models: A Family of Open Foundation Models for Code Intelligence

    Authors: Mayank Mishra, Matt Stallone, Gaoyuan Zhang, Yikang Shen, Aditya Prasad, Adriana Meza Soria, Michele Merler, Parameswaran Selvam, Saptha Surendran, Shivdeep Singh, Manish Sethi, Xuan-Hong Dang, Pengyuan Li, Kun-Lung Wu, Syed Zawad, Andrew Coleman, Matthew White, Mark Lewis, Raju Pavuluri, Yan Koyfman, Boris Lublinsky, Maximilien de Bayser, Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, et al. (21 additional authors not shown)

    Abstract: Large Language Models (LLMs) trained on code are revolutionizing the software development process. Increasingly, code LLMs are being integrated into software development environments to improve the productivity of human programmers, and LLM-based agents are beginning to show promise for handling complex tasks autonomously. Realizing the full potential of code LLMs requires a wide range of capabili…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Corresponding Authors: Rameswar Panda, Ruchir Puri; Equal Contributors: Mayank Mishra, Matt Stallone, Gaoyuan Zhang

  6. arXiv:2404.13886  [pdf, other]

    cs.OS cs.ET

    Taming Server Memory TCO with Multiple Software-Defined Compressed Tiers

    Authors: Sandeep Kumar, Aravinda Prasad, Sreenivas Subramoney

    Abstract: Memory accounts for 33 - 50% of the total cost of ownership (TCO) in modern data centers. We propose a novel solution to tame memory TCO through the novel creation and judicious management of multiple software-defined compressed memory tiers. As opposed to the state-of-the-art solutions that employ a 2-Tier solution, a single compressed tier along with DRAM, we define multiple compressed tiers i…

    Submitted 22 April, 2024; originally announced April 2024.

  7. arXiv:2402.13212  [pdf, other]

    cs.CL cs.AI cs.LG

    Soft Self-Consistency Improves Language Model Agents

    Authors: Han Wang, Archiki Prasad, Elias Stengel-Eskin, Mohit Bansal

    Abstract: Generations from large language models (LLMs) can be improved by sampling and scoring multiple solutions to select a final answer. Current "sample and select" methods such as self-consistency (SC) rely on majority voting to score answers. However, when tasks have many distinct and valid answers, selection by voting requires a large number of samples. This makes SC prohibitively expensive for inter…

    Submitted 5 June, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: ACL 2024 Camera-Ready, the first three authors contributed equally; Code: https://github.com/HanNight/soft_self_consistency

  8. arXiv:2401.16467  [pdf, other]

    cs.SE cs.AI cs.CL cs.LG cs.PL

    ReGAL: Refactoring Programs to Discover Generalizable Abstractions

    Authors: Elias Stengel-Eskin, Archiki Prasad, Mohit Bansal

    Abstract: While large language models (LLMs) are increasingly being used for program synthesis, they lack the global view needed to develop useful abstractions; they generally predict programs one at a time, often repeating the same functionality. Generating redundant code from scratch is both inefficient and error-prone. To address this, we propose Refactoring for Generalizable Abstraction Learning (ReGAL)…

    Submitted 6 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: ICML 2024 Camera-Ready; First two authors contributed equally; Code: https://github.com/esteng/regal_program_learning

  9. arXiv:2312.15006  [pdf, other]

    cs.AI cs.CL cs.LG

    Assessing the Impact of Prompting Methods on ChatGPT's Mathematical Capabilities

    Authors: Yuhao Chen, Chloe Wong, Hanwen Yang, Juan Aguenza, Sai Bhujangari, Benthan Vu, Xun Lei, Amisha Prasad, Manny Fluss, Eric Phuong, Minghao Liu, Raja Kumar, Vanshika Vats, James Davis

    Abstract: This study critically evaluates the efficacy of prompting methods in enhancing the mathematical reasoning capability of large language models (LLMs). The investigation uses three prescriptive prompting methods - simple, persona, and conversational prompting - known for their effectiveness in enhancing the linguistic tasks of LLMs. We conduct this analysis on OpenAI's LLM chatbot, ChatGPT-3.5, on e…

    Submitted 20 February, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

  10. arXiv:2312.14750  [pdf, other]

    cs.AR

    Siracusa: A 16 nm Heterogenous RISC-V SoC for Extended Reality with At-MRAM Neural Engine

    Authors: Arpan Suravi Prasad, Moritz Scherer, Francesco Conti, Davide Rossi, Alfio Di Mauro, Manuel Eggimann, Jorge Tómas Gómez, Ziyun Li, Syed Shakib Sarwar, Zhao Wang, Barbara De Salvo, Luca Benini

    Abstract: Extended reality (XR) applications are Machine Learning (ML)-intensive, featuring deep neural networks (DNNs) with millions of weights, tightly latency-bound (10-20 ms end-to-end), and power-constrained (low tens of mW average power). While ML performance and efficiency can be achieved by introducing neural engines within low-power systems-on-chip (SoCs), system-level power for nontrivial DNNs dep…

    Submitted 14 April, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

    Comments: Final accepted manuscript pre-print submitted to the IEEE Journal of Solid-State Circuits

  11. arXiv:2311.13821  [pdf, other]

    cs.LG cs.AI cs.CE stat.AP

    HypUC: Hyperfine Uncertainty Calibration with Gradient-boosted Corrections for Reliable Regression on Imbalanced Electrocardiograms

    Authors: Uddeshya Upadhyay, Sairam Bade, Arjun Puranik, Shahir Asfahan, Melwin Babu, Francisco Lopez-Jimenez, Samuel J. Asirvatham, Ashim Prasad, Ajit Rajasekharan, Samir Awasthi, Rakesh Barve

    Abstract: The automated analysis of medical time series, such as the electrocardiogram (ECG), electroencephalogram (EEG), pulse oximetry, etc., has the potential to serve as a valuable tool for diagnostic decisions, allowing for remote monitoring of patients and more efficient use of expensive and time-consuming medical procedures. Deep neural networks (DNNs) have been demonstrated to process such signals ef…

    Submitted 23 November, 2023; originally announced November 2023.

    Comments: Published at TMLR

    Journal ref: Transactions on Machine Learning Research (TMLR), 2023

  12. arXiv:2311.10275  [pdf, other]

    cs.OS cs.AR cs.DB cs.DC

    Telescope: Telemetry at Terabyte Scale

    Authors: Alan Nair, Sandeep Kumar, Aravinda Prasad, Andy Rudoff, Sreenivas Subramoney

    Abstract: Data-hungry applications that require terabytes of memory have become widespread in recent years. To meet the memory needs of these applications, data centers are embracing tiered memory architectures with near and far memory tiers. Precise, efficient, and timely identification of hot and cold data and their placement in appropriate tiers is critical for performance in such systems. Unfortunately,…

    Submitted 29 November, 2023; v1 submitted 16 November, 2023; originally announced November 2023.

  13. arXiv:2311.05772  [pdf, other]

    cs.AI cs.CL cs.LG

    ADaPT: As-Needed Decomposition and Planning with Language Models

    Authors: Archiki Prasad, Alexander Koller, Mareike Hartmann, Peter Clark, Ashish Sabharwal, Mohit Bansal, Tushar Khot

    Abstract: Large Language Models (LLMs) are increasingly being used for interactive decision-making tasks requiring planning and adapting to the environment. Recent works employ LLMs-as-agents in broadly two ways: iteratively determining the next action (iterative executors) or generating plans and executing sub-tasks using LLMs (plan-and-execute). However, these methods struggle with task complexity, as the…

    Submitted 8 April, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: NAACL 2024 (findings) camera-ready. Project Page: https://allenai.github.io/adaptllm

  14. arXiv:2310.05861  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG

    Rephrase, Augment, Reason: Visual Grounding of Questions for Vision-Language Models

    Authors: Archiki Prasad, Elias Stengel-Eskin, Mohit Bansal

    Abstract: An increasing number of vision-language tasks can be handled with little to no training, i.e., in a zero- and few-shot manner, by marrying large language models (LLMs) to vision encoders, resulting in large vision-language models (LVLMs). While this has huge upsides, such as not requiring training data or custom architectures, how an input is presented to an LVLM can have a major impact on zero-sho…

    Submitted 2 April, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: ICLR 2024 camera-ready (23 pages), Code: https://github.com/archiki/RepARe

  15. arXiv:2310.03370  [pdf, other]

    cs.OS

    Motivating Next-Generation OS Physical Memory Management for Terabyte-Scale NVMMs

    Authors: Shivank Garg, Aravinda Prasad, Debadatta Mishra, Sreenivas Subramoney

    Abstract: Software-managed byte-addressable hybrid memory systems consisting of DRAMs and NVMMs offer a lot of flexibility to design efficient large-scale data processing applications. Operating systems (OS) play an important role in enabling the applications to realize the integrated benefits of DRAMs' low access latency and NVMMs' large capacity along with their persistent characteristics. In this paper, we…

    Submitted 5 October, 2023; originally announced October 2023.

    Comments: 14 pages, 24 figures, 2 tables

    ACM Class: D.4.8

  16. arXiv:2308.07473  [pdf, other]

    cs.GT

    On Supermodular Contracts and Dense Subgraphs

    Authors: Ramiro Deo-Campo Vuong, Shaddin Dughmi, Neel Patel, Aditya Prasad

    Abstract: We study the combinatorial contract design problem, introduced and studied by Dutting et al. (2021, 2022), in both the single and multi-agent settings. Prior work has examined the problem when the principal's utility function is submodular in the actions chosen by the agent(s). We complement this emerging literature with an examination of the problem when the principal's utility is supermodular…

    Submitted 14 August, 2023; originally announced August 2023.

    Comments: 31 pages, 2 figures

  17. arXiv:2307.05911  [pdf, other]

    cond-mat.mtrl-sci cs.LG

    Grain and Grain Boundary Segmentation using Machine Learning with Real and Generated Datasets

    Authors: Peter Warren, Nandhini Raju, Abhilash Prasad, Shajahan Hossain, Ramesh Subramanian, Jayanta Kapat, Navin Manjooran, Ranajay Ghosh

    Abstract: We report significantly improved accuracy of grain boundary segmentation using Convolutional Neural Networks (CNN) trained on a combination of real and generated data. Manual segmentation is accurate but time-consuming, and existing computational methods are faster but often inaccurate. To combat this dilemma, machine learning models can be used to achieve the accuracy of manual segmentation and h…

    Submitted 12 July, 2023; originally announced July 2023.

  18. arXiv:2305.16798  [pdf, other]

    cs.CL cs.AI

    Schema-Guided User Satisfaction Modeling for Task-Oriented Dialogues

    Authors: Yue Feng, Yunlong Jiao, Animesh Prasad, Nikolaos Aletras, Emine Yilmaz, Gabriella Kazai

    Abstract: User Satisfaction Modeling (USM) is one of the popular choices for task-oriented dialogue systems evaluation, where user satisfaction typically depends on whether the user's task goals were fulfilled by the system. Task-oriented dialogue systems use a task schema, which is a set of task attributes, to encode the user's task goals. Existing studies on USM neglect explicitly modeling the user's task g…

    Submitted 26 May, 2023; originally announced May 2023.

  19. arXiv:2305.01155  [pdf, other]

    eess.AS cs.CL cs.HC cs.SD

    Lessons Learned in ATCO2: 5000 hours of Air Traffic Control Communications for Robust Automatic Speech Recognition and Understanding

    Authors: Juan Zuluaga-Gomez, Iuliia Nigmatulina, Amrutha Prasad, Petr Motlicek, Driss Khalil, Srikanth Madikeri, Allan Tart, Igor Szoke, Vincent Lenders, Mickael Rigault, Khalid Choukri

    Abstract: Voice communication between air traffic controllers (ATCos) and pilots is critical for ensuring safe and efficient air traffic control (ATC). This task requires high levels of awareness from ATCos and can be tedious and error-prone. Recent attempts have been made to integrate artificial intelligence (AI) into ATC in order to reduce the workload of ATCos. However, the development of data-driven AI…

    Submitted 1 May, 2023; originally announced May 2023.

    Comments: Manuscript under review

  20. arXiv:2304.10703  [pdf, other]

    cs.CL cs.AI cs.LG

    ReCEval: Evaluating Reasoning Chains via Correctness and Informativeness

    Authors: Archiki Prasad, Swarnadeep Saha, Xiang Zhou, Mohit Bansal

    Abstract: Multi-step reasoning ability is fundamental to many natural language tasks, yet it is unclear what constitutes a good reasoning chain and how to evaluate them. Most existing methods focus solely on whether the reasoning chain leads to the correct conclusion, but this answer-oriented view may confound reasoning quality with other spurious shortcuts to predict the answer. To bridge this gap, we eval…

    Submitted 30 November, 2023; v1 submitted 20 April, 2023; originally announced April 2023.

    Comments: EMNLP 2023 camera-ready (21 pages)

  21. arXiv:2304.07842  [pdf, other]

    eess.AS cs.AI cs.HC

    A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers

    Authors: Juan Zuluaga-Gomez, Amrutha Prasad, Iuliia Nigmatulina, Petr Motlicek, Matthias Kleinert

    Abstract: In this paper we propose a novel virtual simulation-pilot engine for speeding up air traffic controller (ATCo) training by integrating different state-of-the-art artificial intelligence (AI) based tools. The virtual simulation-pilot engine receives spoken communications from ATCo trainees, and it performs automatic speech recognition and understanding. Thus, it goes beyond only transcribing the co…

    Submitted 16 April, 2023; originally announced April 2023.

    Comments: Under review

  22. arXiv:2302.05941  [pdf, other]

    cs.SE cs.AI

    Rapid Development of Compositional AI

    Authors: Lee Martie, Jessie Rosenberg, Veronique Demers, Gaoyuan Zhang, Onkar Bhardwaj, John Henning, Aditya Prasad, Matt Stallone, Ja Young Lee, Lucy Yip, Damilola Adesina, Elahe Paikari, Oscar Resendiz, Sarah Shaw, David Cox

    Abstract: Compositional AI systems, which combine multiple artificial intelligence components together with other application components to solve a larger problem, have no known pattern of development and are often approached in a bespoke and ad hoc style. This makes development slower and harder to reuse for future applications. To support the full rapid development cycle of compositional AI applications,…

    Submitted 12 February, 2023; originally announced February 2023.

    Comments: Accepted to ICSE 2023, NIER track

    Journal ref: 2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Technologies Results Track (ICSE-NIER), Melbourne, Australia, 2023, pp. (forthcoming)

  23. arXiv:2302.00095  [pdf, ps, other]

    cs.AR cs.CR cs.ET

    XCRYPT: Accelerating Lattice Based Cryptography with Memristor Crossbar Arrays

    Authors: Sarabjeet Singh, Xiong Fan, Ananth Krishna Prasad, Lin Jia, Anirban Nag, Rajeev Balasubramonian, Mahdi Nazm Bojnordi, Elaine Shi

    Abstract: This paper makes a case for accelerating lattice-based post-quantum cryptography (PQC) with memristor-based crossbars, and shows that these inherently error-tolerant algorithms are a good fit for noisy analog MAC operations in crossbars. We compare different NIST round-3 lattice-based candidates for PQC, and identify that SABER is not only a front-runner when executing on traditional systems, but…

    Submitted 31 January, 2023; originally announced February 2023.

  24. arXiv:2212.08754  [pdf, other]

    cs.CR

    A systematic literature review on Internet of Vehicles Security

    Authors: Priyank Sharma, Meet Patel, Apoorva Prasad

    Abstract: The Internet of Vehicles (IoV), commonly referred to as connected automobiles, is a vast network that connects various entities, including users, sensors, and vehicles. They will connect across a network to lessen traffic accidents and improve both the security and safety of smart vehicles. The Internet of Vehicles is subject to a wide variety of threats, including spoofing attacks, recognition attacks, priva…

    Submitted 16 December, 2022; originally announced December 2022.

    Comments: This article has 10 pages and 6 figures

    ACM Class: A.1

  25. arXiv:2212.07164  [pdf, other]

    cs.CL cs.AI cs.LG eess.AS

    Speech and Natural Language Processing Technologies for Pseudo-Pilot Simulator

    Authors: Amrutha Prasad, Juan Zuluaga-Gomez, Petr Motlicek, Saeed Sarfjoo, Iuliia Nigmatulina, Karel Vesely

    Abstract: This paper describes a simple yet efficient repetition-based modular system for speeding up air-traffic controller (ATCo) training. For example, a human pilot is still required in EUROCONTROL's ESCAPE lite simulator (see https://www.eurocontrol.int/simulator/escape) during ATCo training. However, this need can be substituted by an automatic system that could act as a pilot. In this paper, we aim to dev…

    Submitted 14 December, 2022; originally announced December 2022.

    Comments: Presented at Sesar Innovation Days 2022. https://www.sesarju.eu/sesarinnovationdays

  26. arXiv:2211.04054  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications

    Authors: Juan Zuluaga-Gomez, Karel Veselý, Igor Szöke, Alexander Blatt, Petr Motlicek, Martin Kocour, Mickael Rigault, Khalid Choukri, Amrutha Prasad, Seyyed Saeed Sarfjoo, Iuliia Nigmatulina, Claudia Cevenini, Pavel Kolčárek, Allan Tart, Jan Černocký, Dietrich Klakow

    Abstract: Personal assistants, automatic speech recognizers and dialogue understanding systems are becoming more critical in our interconnected digital world. A clear example is air traffic control (ATC) communications. ATC aims at guiding aircraft and controlling the airspace in a safe and optimal manner. These voice-based dialogues are carried between an air traffic controller (ATCO) and pilots via very-h…

    Submitted 15 June, 2023; v1 submitted 8 November, 2022; originally announced November 2022.

    Comments: Manuscript under review; The code is available at: https://github.com/idiap/atco2-corpus

  27. arXiv:2208.00102  [pdf, other]

    cs.HC cs.IR

    An Open Source Interactive Visual Analytics Tool for Comparative Programming Comprehension

    Authors: Ayush Kumar, Ashish Kumar, Aakanksha Prasad, Michael Burch, Shenghui Cheng, Klaus Mueller

    Abstract: This paper proposes an open source visual analytics tool consisting of several views and perspectives on eye movement data collected during code reading tasks when writing computer programs. Hence the focus of this work is on code and program comprehension. The source code is shown as a visual stimulus. It can be inspected in combination with overlaid scanpaths in which the saccades can be visuall…

    Submitted 29 July, 2022; originally announced August 2022.

    Comments: 9 pages, 15 figures

  28. arXiv:2204.10687  [pdf, other]

    cs.AR

    SNE: an Energy-Proportional Digital Accelerator for Sparse Event-Based Convolutions

    Authors: Alfio Di Mauro, Arpan Suravi Prasad, Zhikai Huang, Matteo Spallanzani, Francesco Conti, Luca Benini

    Abstract: Event-based sensors are drawing increasing attention due to their high temporal resolution, low power consumption, and low bandwidth. To efficiently extract semantically meaningful information from sparse data streams produced by such sensors, we present a 4.5 TOP/s/W digital accelerator capable of performing 4-bit-quantized event-based convolutional neural networks (eCNN). Compared to standard co…

    Submitted 29 April, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

    Comments: Accepted at DATE22

  29. arXiv:2203.16822  [pdf, other]

    eess.AS cs.CL cs.LG

    How Does Pre-trained Wav2Vec 2.0 Perform on Domain Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications

    Authors: Juan Zuluaga-Gomez, Amrutha Prasad, Iuliia Nigmatulina, Saeed Sarfjoo, Petr Motlicek, Matthias Kleinert, Hartmut Helmke, Oliver Ohneiser, Qingran Zhan

    Abstract: Recent work on self-supervised pre-training focuses on leveraging large-scale unlabeled speech data to build robust end-to-end (E2E) acoustic models (AM) that can be later fine-tuned on downstream tasks, e.g., automatic speech recognition (ASR). Yet, few works investigated the impact on performance when the data properties substantially differ between the pre-training and fine-tuning phases, termed d…

    Submitted 17 October, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: To be published in the 2022 IEEE Spoken Language Technology Workshop (SLT) (SLT 2022)

  30. arXiv:2203.07281  [pdf, other]

    cs.CL cs.AI cs.LG

    GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models

    Authors: Archiki Prasad, Peter Hase, Xiang Zhou, Mohit Bansal

    Abstract: Providing natural language instructions in prompts is a useful new paradigm for improving task performance of large language models in a zero-shot setting. Recent work has aimed to improve such prompts via manual rewriting or gradient-based tuning. However, manual rewriting is time-consuming and requires subjective interpretation, while gradient-based tuning can be extremely computationally demand…

    Submitted 26 April, 2023; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: EACL 2023 (20 pages)

  31. arXiv:2202.06409  [pdf, other]

    eess.AS cs.CL cs.LG

    Distribution augmentation for low-resource expressive text-to-speech

    Authors: Mateusz Lajszczak, Animesh Prasad, Arent van Korlaar, Bajibabu Bollepalli, Antonio Bonafonte, Arnaud Joly, Marco Nicolis, Alexis Moinet, Thomas Drugman, Trevor Wood, Elena Sokolova

    Abstract: This paper presents a novel data augmentation technique for text-to-speech (TTS) that allows generating new (text, audio) training examples without requiring any additional data. Our goal is to increase diversity of text conditionings available during training. This helps to reduce overfitting, especially in low-resource settings. Our method relies on substituting text and audio fragments in a w…

    Submitted 19 February, 2022; v1 submitted 13 February, 2022; originally announced February 2022.

    Comments: ICASSP 2022: camera-ready

  32. arXiv:2202.03725  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    A two-step approach to leverage contextual data: speech recognition in air-traffic communications

    Authors: Iuliia Nigmatulina, Juan Zuluaga-Gomez, Amrutha Prasad, Seyyed Saeed Sarfjoo, Petr Motlicek

    Abstract: Automatic Speech Recognition (ASR), as an aid to speech communication between pilots and air-traffic controllers, can significantly reduce the complexity of the task and increase the reliability of transmitted information. ASR application can lead to a lower number of incidents caused by misunderstanding and improve air traffic management (ATM) efficiency. Evidently, high accuracy predicti…

    Submitted 8 February, 2022; originally announced February 2022.

    Comments: 20XX IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. arXiv admin note: text overlap with arXiv:2108.12156

    Journal ref: ICASSP 2022

  33. arXiv:2202.03406  [pdf, other]

    stat.ML cs.LG q-fin.CP q-fin.RM stat.AP stat.CO

    Dependence model assessment and selection with DecoupleNets

    Authors: Marius Hofert, Avinash Prasad, Mu Zhu

    Abstract: Neural networks are suggested for learning a map from $d$-dimensional samples with any underlying dependence structure to multivariate uniformity in $d'$ dimensions. This map, termed DecoupleNet, is used for dependence model assessment and selection. If the data-generating dependence model was known, and if it was among the few analytically tractable ones, one such transformation for $d'=d$ is Ros…

    Submitted 5 October, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

    MSC Class: 62H99; 65C60; 60E05; 62M45; 00A72; 65C10; 62M10

  34. arXiv:2112.03377  [pdf, other]

    cs.LG stat.CO stat.ME

    RafterNet: Probabilistic predictions in multi-response regression

    Authors: Marius Hofert, Avinash Prasad, Mu Zhu

    Abstract: A fully nonparametric approach for making probabilistic predictions in multi-response regression problems is introduced. Random forests are used as marginal models for each response variable and, as a novel contribution of the present work, the dependence between the multiple response variables is modeled by a generative neural network. This combined modeling approach of random forests, correspondin…

    Submitted 11 October, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

  35. arXiv:2110.05781  [pdf, other]

    eess.AS cs.CL cs.LG

    BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications

    Authors: Juan Zuluaga-Gomez, Seyyed Saeed Sarfjoo, Amrutha Prasad, Iuliia Nigmatulina, Petr Motlicek, Karel Ondrej, Oliver Ohneiser, Hartmut Helmke

    Abstract: Automatic speech recognition (ASR) allows transcribing the communications between air traffic controllers (ATCOs) and aircraft pilots. The transcriptions are used later to extract ATC named entities, e.g., aircraft callsigns. One common challenge is speech activity detection (SAD) and speaker diarization (SD). In the failure condition, two or more segments remain in the same recording, jeopardizin…

    Submitted 14 October, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: To be published in the 2022 IEEE Spoken Language Technology Workshop (SLT) (SLT 2022)

  36. arXiv:2110.01532  [pdf, other]

    cs.LG stat.ML

    Differentiable Spline Approximations

    Authors: Minsu Cho, Aditya Balu, Ameya Joshi, Anjana Deva Prasad, Biswajit Khara, Soumik Sarkar, Baskar Ganapathysubramanian, Adarsh Krishnamurthy, Chinmay Hegde

    Abstract: The paradigm of differentiable programming has significantly enhanced the scope of machine learning via the judicious use of gradient-based optimization. However, standard differentiable programming methods (such as autodiff) typically require that the machine learning models be differentiable, limiting their applicability. Our goal in this paper is to use a new, principled approach to extend grad…

    Submitted 4 October, 2021; originally announced October 2021.

    Comments: 9 pages, accepted in NeurIPS 2021

  37. arXiv:2109.11066  [pdf, other]

    cs.CV cs.LG

    A two-step machine learning approach for crop disease detection: an application of GAN and UAV technology

    Authors: Aaditya Prasad, Nikhil Mehta, Matthew Horak, Wan D. Bae

    Abstract: Automated plant diagnosis is a technology that promises large increases in cost-efficiency for agriculture. However, multiple problems reduce the effectiveness of drones, including the inverse relationship between resolution and speed and the lack of adequate labeled training data. This paper presents a two-step machine learning approach that analyzes low-fidelity and high-fidelity images in seque…

    Submitted 18 September, 2021; originally announced September 2021.

    Comments: 13 pages, 5 figures. Preprint of an article submitted for consideration in the International Journal on Artificial Intelligence Tools, 2021, World Scientific Publishing Company, https://www.worldscientific.com/worldscinet/ijait

    ACM Class: I.2.6; I.2.10

  38. arXiv:2108.12175  [pdf, other]

    cs.CL cs.LG eess.AS

    Grammar Based Speaker Role Identification for Air Traffic Control Speech Recognition

    Authors: Amrutha Prasad, Juan Zuluaga-Gomez, Petr Motlicek, Saeed Sarfjoo, Iuliia Nigmatulina, Oliver Ohneiser, Hartmut Helmke

    Abstract: Automatic Speech Recognition (ASR) for air traffic control is generally trained by pooling Air Traffic Controller (ATCO) and pilot data into one set. This is motivated by the fact that pilots' voice communications are scarcer than ATCOs'. Due to this data imbalance and other reasons (e.g., varying acoustic conditions), the speech from ATCOs is usually recognized more accurately than from pilots…

    Submitted 14 December, 2022; v1 submitted 27 August, 2021; originally announced August 2021.

    Comments: Presented at Sesar Innovation Days - 2022. See https://www.sesarju.eu/sesarinnovationdays

  39. arXiv:2108.11483  [pdf, other]

    cs.LG math.OC stat.ML

    Heavy-tailed Streaming Statistical Estimation

    Authors: Che-Ping Tsai, Adarsh Prasad, Sivaraman Balakrishnan, Pradeep Ravikumar

    Abstract: We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples. This could also be viewed as stochastic optimization under heavy-tailed distributions, with an additional $O(p)$ space complexity constraint. We design a clipped stochastic gradient descent algorithm and provide an improved analysis, under a more nuanced condition on the noise of the stochastic gra…

    Submitted 25 February, 2022; v1 submitted 25 August, 2021; originally announced August 2021.

  40. arXiv:2108.08497  [pdf, other]

    cs.AR eess.SY

    Monarch: A Durable Polymorphic Memory For Data Intensive Applications

    Authors: Ananth Krishna Prasad, Mahdi Nazm Bojnordi

    Abstract: 3D die stacking has often been proposed to build large-scale DRAM-based caches. Unfortunately, the power and performance overheads of DRAM limit the efficiency of high-bandwidth memories. Also, DRAM is facing serious scalability challenges that make alternative technologies more appealing. This paper examines Monarch, a resistive 3D stacked memory based on a novel reconfigurable crosspoint array c…

    Submitted 19 August, 2021; originally announced August 2021.

    Comments: Submitted to IEEE TC

    ACM Class: B.3; E.2

  41. arXiv:2107.09931  [pdf, other]

    cs.CL cs.LG

    The Effectiveness of Intermediate-Task Training for Code-Switched Natural Language Understanding

    Authors: Archiki Prasad, Mohammad Ali Rehan, Shreya Pathak, Preethi Jyothi

    Abstract: While recent benchmarks have spurred a lot of new work on improving the generalization of pretrained multilingual language models on multilingual tasks, techniques to improve code-switched natural language understanding tasks have been far less explored. In this work, we propose the use of bilingual intermediate pretraining as a reliable technique to derive large and consistent performance gains o…

    Submitted 21 July, 2021; originally announced July 2021.

  42. arXiv:2105.07656  [pdf, other]

    physics.flu-dyn cs.GR

    Thin-Film Smoothed Particle Hydrodynamics Fluid

    Authors: Mengdi Wang, Yitong Deng, Xiangxin Kong, Aditya H. Prasad, Shiying Xiong, Bo Zhu

    Abstract: We propose a particle-based method to simulate thin-film fluid that jointly facilitates aggressive surface deformation and vigorous tangential flows. We build our dynamics model from the surface tension driven Navier-Stokes equation with the dimensionality reduced using the asymptotic lubrication theory and customize a set of differential operators based on the weakly compressible Smoothed Particl…

    Submitted 17 May, 2021; originally announced May 2021.

    Comments: SIGGRAPH 2021 Technical Paper

  43. arXiv:2104.14547  [pdf, other]

    cs.LG cs.CV

    NURBS-Diff: A Differentiable Programming Module for NURBS

    Authors: Anjana Deva Prasad, Aditya Balu, Harshil Shah, Soumik Sarkar, Chinmay Hegde, Adarsh Krishnamurthy

    Abstract: Boundary representations (B-reps) using Non-Uniform Rational B-splines (NURBS) are the de facto standard used in CAD, but their utility in deep learning-based approaches is not well researched. We propose a differentiable NURBS module to integrate NURBS representations of CAD models with deep learning methods. We mathematically define the derivatives of the NURBS curves or surfaces with respect to…

    Submitted 13 January, 2022; v1 submitted 29 April, 2021; originally announced April 2021.

  44. arXiv:2104.03643  [pdf, other]

    cs.CL cs.CV cs.LG eess.AS

    Contextual Semi-Supervised Learning: An Approach To Leverage Air-Surveillance and Untranscribed ATC Data in ASR Systems

    Authors: Juan Zuluaga-Gomez, Iuliia Nigmatulina, Amrutha Prasad, Petr Motlicek, Karel Veselý, Martin Kocour, Igor Szöke

    Abstract: Air traffic management and specifically air-traffic control (ATC) rely mostly on voice communications between Air Traffic Controllers (ATCos) and pilots. In most cases, these voice communications follow a well-defined grammar that could be leveraged in Automatic Speech Recognition (ASR) technologies. The callsign used to address an airplane is an essential part of all ATCo-pilot communications. We…

    Submitted 27 August, 2021; v1 submitted 8 April, 2021; originally announced April 2021.

    Comments: Presented at: Interspeech conference 2021 (Brno, Czechia, August 30 - September 3)

  45. arXiv:2103.10779  [pdf, other]

    cs.DC cs.OS cs.PF

    Page Table Management for Heterogeneous Memory Systems

    Authors: Sandeep Kumar, Aravinda Prasad, Smruti R. Sarangi, Sreenivas Subramoney

    Abstract: Modern enterprise servers are increasingly embracing tiered memory systems with a combination of low latency DRAMs and large capacity but high latency non-volatile main memories (NVMMs) such as Intel's Optane DC PMM. Prior works have focused on efficient placement and migration of data on a tiered memory system, but have not studied the optimal placement of page tables. Explicit and efficient pl…

    Submitted 16 March, 2021; originally announced March 2021.

  46. arXiv:2102.10264  [pdf, other]

    cs.LG cs.RO stat.ML

    On Proximal Policy Optimization's Heavy-tailed Gradients

    Authors: Saurabh Garg, Joshua Zhanson, Emilio Parisotto, Adarsh Prasad, J. Zico Kolter, Zachary C. Lipton, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Pradeep Ravikumar

    Abstract: Modern policy gradient algorithms such as Proximal Policy Optimization (PPO) rely on an arsenal of heuristics, including loss clipping and gradient clipping, to ensure successful learning. These heuristics are reminiscent of techniques from robust statistics, commonly used for estimation in outlier-rich ("heavy-tailed") regimes. In this paper, we present a detailed empirical study to characteriz…

    Submitted 12 July, 2021; v1 submitted 20 February, 2021; originally announced February 2021.

    Comments: ICML 2021

  47. arXiv:2102.06237  [pdf, other]

    eess.AS cs.LG cs.SD

    An Investigation of End-to-End Models for Robust Speech Recognition

    Authors: Archiki Prasad, Preethi Jyothi, Rajbabu Velmurugan

    Abstract: End-to-end models for robust automatic speech recognition (ASR) have not been sufficiently well-explored in prior work. With end-to-end models, one could choose to preprocess the input speech using speech enhancement techniques and train the model using enhanced speech. Another alternative is to pass the noisy speech as input and modify the model architecture to adapt to noisy speech. A systematic…

    Submitted 11 February, 2021; originally announced February 2021.

    Comments: Accepted to appear at ICASSP 2021

  48. arXiv:2101.04248  [pdf, other]

    cs.CG math.GN

    Photo2CAD: Automated 3D solid reconstruction from 2D drawings using OpenCV

    Authors: Ajay B. Harish, Abhishek Rajendra Prasad

    Abstract: This study showcases the utilisation of OpenCV for extracting features from photos of 2D engineering drawings. These features are then employed to reconstruct 3D CAD models in SCAD format and generate 3D point cloud data similar to LIDAR scans. Many historical mechanical, aerospace, and civil engineering designs exist only as drawings, lacking software-generated CAD or BIM models. While 2D to 3D c…

    Submitted 8 September, 2023; v1 submitted 11 January, 2021; originally announced January 2021.

  49. arXiv:2012.08036  [pdf, other]

    stat.ML cs.LG stat.AP

    Applications of multivariate quasi-random sampling with neural networks

    Authors: Marius Hofert, Avinash Prasad, Mu Zhu

    Abstract: Generative moment matching networks (GMMNs) are suggested for modeling the cross-sectional dependence between stochastic processes. The stochastic processes considered are geometric Brownian motions and ARMA-GARCH models. Geometric Brownian motions lead to an application of pricing American basket call options under dependence and ARMA-GARCH models lead to an application of simulating predictive d…

    Submitted 27 August, 2021; v1 submitted 14 December, 2020; originally announced December 2020.

    Comments: 17 pages, 5 figures

  50. arXiv:2009.12961  [pdf, other]

    eess.SY cs.IT

    Decentralized Age-of-Information Bandits

    Authors: Archiki Prasad, Vishal Jain, Sharayu Moharir

    Abstract: Age-of-Information (AoI) is a performance metric for scheduling systems that measures the freshness of the data available at the intended destination. AoI is formally defined as the time elapsed since the destination received the most recent update from the source. We consider the problem of scheduling to minimize the cumulative AoI in a multi-source multi-channel setting. Our focus is on the sett…

    Submitted 18 January, 2021; v1 submitted 27 September, 2020; originally announced September 2020.

    Comments: Long-form version of paper accepted at IEEE WCNC 2021