
Showing 1–18 of 18 results for author: Batra, S

  1. arXiv:2404.01295  [pdf, other]

    cs.CL cs.AI

    Towards Safety and Helpfulness Balanced Responses via Controllable Large Language Models

    Authors: Yi-Lin Tuan, Xilun Chen, Eric Michael Smith, Louis Martin, Soumya Batra, Asli Celikyilmaz, William Yang Wang, Daniel M. Bikel

    Abstract: As large language models (LLMs) become easily accessible nowadays, the trade-off between safety and helpfulness can significantly impact user experience. A model that prioritizes safety will cause users to feel less engaged and assisted while prioritizing helpfulness will potentially cause harm. Possible harms include teaching people how to build a bomb, exposing youth to inappropriate content, an…

    Submitted 1 April, 2024; originally announced April 2024.

  2. arXiv:2312.03567  [pdf, other]

    cs.CL

    XAIQA: Explainer-Based Data Augmentation for Extractive Question Answering

    Authors: Joel Stremmel, Ardavan Saeedi, Hamid Hassanzadeh, Sanjit Batra, Jeffrey Hertzberg, Jaime Murillo, Eran Halperin

    Abstract: Extractive question answering (QA) systems can enable physicians and researchers to query medical records, a foundational capability for designing clinical studies and understanding patient medical history. However, building these systems typically requires expert-annotated QA pairs. Large language models (LLMs), which can perform extractive QA, depend on high quality data in their prompts, specia…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 8 pages

    MSC Class: I.2.7

  3. arXiv:2311.13735  [pdf, other]

    cs.CL

    Surpassing GPT-4 Medical Coding with a Two-Stage Approach

    Authors: Zhichao Yang, Sanjit Singh Batra, Joel Stremmel, Eran Halperin

    Abstract: Recent advances in large language models (LLMs) show potential for clinical applications, such as clinical decision support and trial recommendations. However, the GPT-4 LLM predicts an excessive number of ICD codes for medical coding tasks, leading to high recall but low precision. To tackle this challenge, we introduce LLM-codex, a two-stage approach to predict ICD codes that first generates evi…

    Submitted 22 November, 2023; originally announced November 2023.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 19 pages

  4. arXiv:2309.13285  [pdf, other]

    cs.RO cs.AI cs.MA

    Collision Avoidance and Navigation for a Quadrotor Swarm Using End-to-end Deep Reinforcement Learning

    Authors: Zhehui Huang, Zhaojing Yang, Rahul Krupani, Baskın Şenbaşlar, Sumeet Batra, Gaurav S. Sukhatme

    Abstract: End-to-end deep reinforcement learning (DRL) for quadrotor control promises many benefits -- easy deployment, task generalization and real-time execution capability. Prior end-to-end DRL-based methods have showcased the ability to deploy learned controllers onto single quadrotors or quadrotor teams maneuvering in simple, obstacle-free environments. However, the addition of obstacles increases the…

    Submitted 5 May, 2024; v1 submitted 23 September, 2023; originally announced September 2023.

    Comments: Accepted to ICRA 2024

  5. arXiv:2307.09288  [pdf, other]

    cs.CL cs.AI

    Llama 2: Open Foundation and Fine-Tuned Chat Models

    Authors: Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, et al. (43 additional authors not shown)

    Abstract: In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be…

    Submitted 19 July, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

  6. arXiv:2306.09537  [pdf, other]

    cs.RO cs.AI cs.LG cs.MA eess.SY

    QuadSwarm: A Modular Multi-Quadrotor Simulator for Deep Reinforcement Learning with Direct Thrust Control

    Authors: Zhehui Huang, Sumeet Batra, Tao Chen, Rahul Krupani, Tushar Kumar, Artem Molchanov, Aleksei Petrenko, James A. Preiss, Zhaojing Yang, Gaurav S. Sukhatme

    Abstract: Reinforcement learning (RL) has shown promise in creating robust policies for robotics tasks. However, contemporary RL algorithms are data-hungry, often requiring billions of environment transitions to train successful policies. This necessitates the use of fast and highly-parallelizable simulators. In addition to speed, such simulators need to model the physics of the robots and their interaction…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: Paper published in ICRA 2023 Workshop: The Role of Robotics Simulators for Unmanned Aerial Vehicles. The workshop can be found at https://imrclab.github.io/workshop-uav-sims-icra2023/

  7. arXiv:2305.18738  [pdf, other]

    cs.LG cs.AI cs.RO

    Generating Behaviorally Diverse Policies with Latent Diffusion Models

    Authors: Shashank Hegde, Sumeet Batra, K. R. Zentner, Gaurav S. Sukhatme

    Abstract: Recent progress in Quality Diversity Reinforcement Learning (QD-RL) has enabled learning a collection of behaviorally diverse, high performing policies. However, these methods typically involve storing thousands of policies, which results in high space-complexity and poor scaling to additional behaviors. Condensing the archive into a single model while retaining the performance and coverage of the…

    Submitted 23 June, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

  8. arXiv:2305.13795  [pdf, other]

    cs.LG cs.AI

    Proximal Policy Gradient Arborescence for Quality Diversity Reinforcement Learning

    Authors: Sumeet Batra, Bryon Tjanaka, Matthew C. Fontaine, Aleksei Petrenko, Stefanos Nikolaidis, Gaurav Sukhatme

    Abstract: Training generally capable agents that thoroughly explore their environment and learn new and diverse skills is a long-term goal of robot learning. Quality Diversity Reinforcement Learning (QD-RL) is an emerging research area that blends the best aspects of both fields -- Quality Diversity (QD) provides a principled form of exploration and produces collections of behaviorally diverse agents, while…

    Submitted 29 January, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted as a spotlight paper at ICLR 2024

  9. arXiv:2305.04890  [pdf]

    cs.IR

    Steam Recommendation System

    Authors: Samin Batra, Varun Sharma, Yurou Sun, Xinyao Wang, Yinyu Wang

    Abstract: We aim to leverage the interactions between users and items in the Steam community to build a game recommendation system that makes personalized suggestions to players in order to boost Steam's revenue as well as improve the users' gaming experience. The whole project is built on Apache Spark and deals with Big Data. The final output of the project is a recommendation system that gives a list of t…

    Submitted 3 May, 2023; originally announced May 2023.

    Comments: 6 pages, 7 figures, 8 tables

  10. arXiv:2204.06454  [pdf, other]

    cs.CV

    DMCNet: Diversified Model Combination Network for Understanding Engagement from Video Screengrabs

    Authors: Sarthak Batra, Hewei Wang, Avishek Nag, Philippe Brodeur, Marianne Checkley, Annette Klinkert, Soumyabrata Dev

    Abstract: Engagement is an essential indicator of the Quality-of-Learning Experience (QoLE) and plays a major role in developing intelligent educational interfaces. The number of people learning through Massively Open Online Courses (MOOCs) and other online resources has been increasing rapidly because they provide us with the flexibility to learn from anywhere at any time. This provides a good learning exp…

    Submitted 13 April, 2022; originally announced April 2022.

    Comments: Published in Systems and Soft Computing, 2022

  11. arXiv:2109.07735  [pdf, other]

    cs.RO

    Decentralized Control of Quadrotor Swarms with End-to-end Deep Reinforcement Learning

    Authors: Sumeet Batra, Zhehui Huang, Aleksei Petrenko, Tushar Kumar, Artem Molchanov, Gaurav S. Sukhatme

    Abstract: We demonstrate the possibility of learning drone swarm controllers that are zero-shot transferable to real quadrotors via large-scale multi-agent end-to-end reinforcement learning. We train policies parameterized by neural networks that are capable of controlling individual drones in a swarm in a fully decentralized manner. Our policies, trained in simulated environments with realistic quadrotor p…

    Submitted 20 November, 2021; v1 submitted 16 September, 2021; originally announced September 2021.

    Comments: 14 pages, 11 figures

  12. arXiv:2109.06723  [pdf, other]

    cs.LG

    Simulations in Recommender Systems: An industry perspective

    Authors: Lucas Bernardi, Sakshi Batra, Cintia Alicia Bruscantini

    Abstract: The construction of effective Recommender Systems (RS) is a complex process, mainly due to the nature of RSs which involves large scale software-systems and human interactions. Iterative development processes require deep understanding of a current baseline as well as the ability to estimate the impact of changes in multiple variables of interest. Simulations are well suited to address both challe…

    Submitted 14 September, 2021; originally announced September 2021.

    Comments: G pages

  13. arXiv:2011.03877  [pdf, other]

    cs.CL

    Best Practices for Data-Efficient Modeling in NLG: How to Train Production-Ready Neural Models with Less Data

    Authors: Ankit Arun, Soumya Batra, Vikas Bhardwaj, Ashwini Challa, Pinar Donmez, Peyman Heidari, Hakan Inan, Shashank Jain, Anuj Kumar, Shawn Mei, Karthik Mohan, Michael White

    Abstract: Natural language generation (NLG) is a critical component in conversational systems, owing to its role of formulating a correct and natural text response. Traditionally, NLG components have been deployed using template-based solutions. Although neural network solutions recently developed in the research community have been shown to provide several benefits, deployment of such model-based solutions…

    Submitted 7 November, 2020; originally announced November 2020.

    Comments: Accepted for publication in COLING 2020

  14. arXiv:1912.00951  [pdf, other]

    cs.RO cs.CV

    Augmented Reality for Human-Swarm Interaction in a Swarm-Robotic Chemistry Simulation

    Authors: Sumeet Batra, John Klingner, Nikolaus Correll

    Abstract: We present a method to register individual members of a robotic swarm in an augmented reality display while showing relevant information about swarm dynamics to the user that would be otherwise hidden. Individual swarm members and clusters of the same group are identified by their color, and by blinking at a specific time interval that is distinct from the time interval at which their neighbors bl…

    Submitted 2 December, 2019; originally announced December 2019.

  15. arXiv:1509.08112  [pdf, other]

    cs.LG

    Feature Selection for classification of hyperspectral data by minimizing a tight bound on the VC dimension

    Authors: Phool Preet, Sanjit Singh Batra, Jayadeva

    Abstract: Hyperspectral data consists of a large number of features which require sophisticated analysis to be extracted. A popular approach to reduce computational cost, facilitate information representation and accelerate knowledge discovery is to eliminate bands that do not improve the classification and analysis methods being applied. In particular, algorithms that perform band elimination should be desig…

    Submitted 27 September, 2015; originally announced September 2015.

    Comments: basic papers are on http://www.jayadeva.net

    MSC Class: 68T05; 68T10; 68Q32 ACM Class: I.5.1, I.5.2, I.4

  16. arXiv:1501.02432  [pdf, ps, other]

    cs.LG

    Learning a Fuzzy Hyperplane Fat Margin Classifier with Minimum VC dimension

    Authors: Jayadeva, Sanjit Singh Batra, Siddarth Sabharwal

    Abstract: The Vapnik-Chervonenkis (VC) dimension measures the complexity of a learning machine, and a low VC dimension leads to good generalization. The recently proposed Minimal Complexity Machine (MCM) learns a hyperplane classifier by minimizing an exact bound on the VC dimension. This paper extends the MCM classifier to the fuzzy domain. The use of a fuzzy membership is known to reduce the effect of out…

    Submitted 11 January, 2015; originally announced January 2015.

    Comments: arXiv admin note: text overlap with arXiv:1410.4573

    MSC Class: 68T05; 68T10; 68Q32 ACM Class: I.5.1; I.5.2

  17. arXiv:1410.7372  [pdf, ps, other]

    cs.LG

    Feature Selection through Minimization of the VC dimension

    Authors: Jayadeva, Sanjit S. Batra, Siddharth Sabharwal

    Abstract: Feature selection involves identifying the most relevant subset of input features, with a view to improving generalization of predictive models by reducing overfitting. Directly searching for the most relevant combination of attributes is NP-hard. Variable selection is of critical importance in many applications, such as micro-array data analysis, where selecting a small number of discriminative fe…

    Submitted 27 October, 2014; originally announced October 2014.

    Comments: arXiv admin note: text overlap with arXiv:1410.4573

    MSC Class: 68T05; 68T10; 68Q32 ACM Class: I.5.1; I.5.2

  18. Learning a hyperplane regressor by minimizing an exact bound on the VC dimension

    Authors: Jayadeva, Suresh Chandra, Siddarth Sabharwal, Sanjit S. Batra

    Abstract: The capacity of a learning machine is measured by its Vapnik-Chervonenkis dimension, and learning machines with a low VC dimension generalize better. It is well known that the VC dimension of SVMs can be very large or unbounded, even though they generally yield state-of-the-art learning performance. In this paper, we show how to learn a hyperplane regressor by minimizing an exact, or \boldmath{…

    Submitted 16 October, 2014; originally announced October 2014.

    Comments: see http://www.sciencedirect.com/science/article/pii/S0925231214010194 or arXiv:1408.2803 for background information

    MSC Class: 68T05; 68T10; 68Q32 ACM Class: I.5.1, I.5.2

    Journal ref: Neurocomputing, Volume 171, 1 January 2016, Pages 1610-1616, ISSN 0925-2312