Showing 1–27 of 27 results for author: Padhi, I

  1. arXiv:2407.06323

    cs.CL

    When in Doubt, Cascade: Towards Building Efficient and Capable Guardrails

    Authors: Manish Nagireddy, Inkit Padhi, Soumya Ghosh, Prasanna Sattigeri

    Abstract: Large language models (LLMs) have convincing performance in a variety of downstream tasks. However, these systems are prone to generating undesirable outputs such as harmful and biased text. In order to remedy such generations, the development of guardrail (or detector) models has gained traction. Motivated by findings from developing a detector for social bias, we adopt the notion of a use-mentio…

    Submitted 8 July, 2024; originally announced July 2024.

  2. arXiv:2406.13805

    cs.CL cs.AI cs.LG

    WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts from Wikipedia

    Authors: Yufang Hou, Alessandra Pascale, Javier Carnerero-Cano, Tigran Tchrakian, Radu Marinescu, Elizabeth Daly, Inkit Padhi, Prasanna Sattigeri

    Abstract: Retrieval-augmented generation (RAG) has emerged as a promising solution to mitigate the limitations of large language models (LLMs), such as hallucinations and outdated information. However, it remains unclear how LLMs handle knowledge conflicts arising from different augmented retrieved passages, especially when these passages originate from the same source and have equal trustworthiness. In thi…

    Submitted 19 June, 2024; originally announced June 2024.

  3. arXiv:2406.11780

    cs.LG cs.AI cs.CL

    Split, Unlearn, Merge: Leveraging Data Attributes for More Effective Unlearning in LLMs

    Authors: Swanand Ravindra Kadhe, Farhan Ahmed, Dennis Wei, Nathalie Baracaldo, Inkit Padhi

    Abstract: Large language models (LLMs) have shown to pose social and ethical risks such as generating toxic language or facilitating malicious use of hazardous knowledge. Machine unlearning is a promising approach to improve LLM safety by directly removing harmful behaviors and knowledge. In this paper, we propose "SPlit, UNlearn, MerGE" (SPUNGE), a framework that can be used with any unlearning method to a…

    Submitted 17 June, 2024; originally announced June 2024.

  4. arXiv:2403.12805

    cs.AI cs.CL

    Contextual Moral Value Alignment Through Context-Based Aggregation

    Authors: Pierre Dognin, Jesus Rios, Ronny Luss, Inkit Padhi, Matthew D Riemer, Miao Liu, Prasanna Sattigeri, Manish Nagireddy, Kush R. Varshney, Djallel Bouneffouf

    Abstract: Developing value-aligned AI agents is a complex undertaking and an ongoing challenge in the field of AI. Specifically within the domain of Large Language Models (LLMs), the capability to consolidate multiple independently trained dialogue agents, each aligned with a distinct moral value, into a unified system that can adapt to and be aligned with multiple moral values is of paramount importance. I…

    Submitted 19 March, 2024; originally announced March 2024.

  5. arXiv:2403.09704

    cs.CL cs.AI cs.LG

    Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations

    Authors: Swapnaja Achintalwar, Ioana Baldini, Djallel Bouneffouf, Joan Byamugisha, Maria Chang, Pierre Dognin, Eitan Farchi, Ndivhuwo Makondo, Aleksandra Mojsilovic, Manish Nagireddy, Karthikeyan Natesan Ramamurthy, Inkit Padhi, Orna Raz, Jesus Rios, Prasanna Sattigeri, Moninder Singh, Siphiwe Thwala, Rosario A. Uceda-Sosa, Kush R. Varshney

    Abstract: The alignment of large language models is usually done by model providers to add or control behaviors that are common or universally understood across use cases and contexts. In contrast, in this article, we present an approach and architecture that empowers application developers to tune a model to their particular values, social norms, laws and other regulations, and orchestrate between potentia…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 7 pages, 5 figures

  6. arXiv:2403.06009

    cs.LG

    Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations

    Authors: Swapnaja Achintalwar, Adriana Alvarado Garcia, Ateret Anaby-Tavor, Ioana Baldini, Sara E. Berger, Bishwaranjan Bhattacharjee, Djallel Bouneffouf, Subhajit Chaudhury, Pin-Yu Chen, Lamogha Chiazor, Elizabeth M. Daly, Kirushikesh DB, Rogério Abreu de Paula, Pierre Dognin, Eitan Farchi, Soumya Ghosh, Michael Hind, Raya Horesh, George Kour, Ja Young Lee, Nishtha Madaan, Sameep Mehta, Erik Miehling, Keerthiram Murugesan, Manish Nagireddy, et al. (13 additional authors not shown)

    Abstract: Large language models (LLMs) are susceptible to a variety of risks, from non-faithful output to biased and toxic generations. Due to several limiting factors surrounding LLMs (training cost, API access, data availability, etc.), it may not always be feasible to impose direct safety constraints on a deployed model. Therefore, an efficient and reliable alternative is required. To this end, we presen…

    Submitted 13 June, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

  7. arXiv:2305.19466

    cs.CL cs.AI cs.LG

    The Impact of Positional Encoding on Length Generalization in Transformers

    Authors: Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Payel Das, Siva Reddy

    Abstract: Length generalization, the ability to generalize from small training context sizes to larger ones, is a critical challenge in the development of Transformer-based language models. Positional encoding (PE) has been identified as a major factor influencing length generalization, but the exact impact of different PE schemes on extrapolation in downstream tasks remains unclear. In this paper, we condu…

    Submitted 6 November, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted at NeurIPS 2023; 15 pages and 22 pages Appendix

  8. arXiv:2304.10819

    cs.LG cs.AI stat.ML

    Auditing and Generating Synthetic Data with Controllable Trust Trade-offs

    Authors: Brian Belgodere, Pierre Dognin, Adam Ivankay, Igor Melnyk, Youssef Mroueh, Aleksandra Mojsilovic, Jiri Navratil, Apoorva Nitsure, Inkit Padhi, Mattia Rigotti, Jerret Ross, Yair Schiff, Radhika Vedpathak, Richard A. Young

    Abstract: Real-world data often exhibits bias, imbalance, and privacy risks. Synthetic datasets have emerged to address these issues. This paradigm relies on generative AI models to generate unbiased, privacy-preserving data while maintaining fidelity to the original data. However, assessing the trustworthiness of synthetic datasets and models is a critical challenge. We introduce a holistic auditing framew…

    Submitted 9 June, 2024; v1 submitted 21 April, 2023; originally announced April 2023.

    Comments: submitted

  9. arXiv:2212.06803

    cs.LG cs.CY stat.ML

    Fair Infinitesimal Jackknife: Mitigating the Influence of Biased Training Data Points Without Refitting

    Authors: Prasanna Sattigeri, Soumya Ghosh, Inkit Padhi, Pierre Dognin, Kush R. Varshney

    Abstract: In consequential decision-making applications, mitigating unwanted biases in machine learning models that yield systematic disadvantage to members of groups delineated by sensitive attributes such as race and gender is one key intervention to strive for equity. Focusing on demographic parity and equality of opportunity, in this paper we propose an algorithm that improves the fairness of a pre-trai…

    Submitted 13 December, 2022; originally announced December 2022.

    Comments: Accepted at NeurIPS 2022

  10. arXiv:2210.07144

    q-bio.BM cs.LG

    Reprogramming Pretrained Language Models for Antibody Sequence Infilling

    Authors: Igor Melnyk, Vijil Chenthamarakshan, Pin-Yu Chen, Payel Das, Amit Dhurandhar, Inkit Padhi, Devleena Das

    Abstract: Antibodies comprise the most versatile class of binding molecules, with numerous applications in biomedicine. Computational design of antibodies involves generating novel and diverse sequences, while maintaining structural consistency. Unique to antibodies, designing the complementarity-determining region (CDR), which determines the antigen binding affinity and specificity, creates its own unique…

    Submitted 19 June, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: ICML 2023

  11. arXiv:2208.06665

    cs.LG

    Cloud-Based Real-Time Molecular Screening Platform with MolFormer

    Authors: Brian Belgodere, Vijil Chenthamarakshan, Payel Das, Pierre Dognin, Toby Kurien, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff, Richard A. Young

    Abstract: With the prospect of automating a number of chemical tasks with high fidelity, chemical language processing models are emerging at a rapid speed. Here, we present a cloud-based real-time platform that allows users to virtually screen molecules of interest. For this purpose, molecular embeddings inferred from a recently proposed large chemical language model, named MolFormer, are leveraged. The pla…

    Submitted 13 August, 2022; originally announced August 2022.

    Comments: Paper accepted at ECML PKDD 2022 demo track

  12. Accelerating Material Design with the Generative Toolkit for Scientific Discovery

    Authors: Matteo Manica, Jannis Born, Joris Cadow, Dimitrios Christofidellis, Ashish Dave, Dean Clarke, Yves Gaetan Nana Teukam, Giorgio Giannone, Samuel C. Hoffman, Matthew Buchan, Vijil Chenthamarakshan, Timothy Donovan, Hsiang Han Hsu, Federico Zipoli, Oliver Schilter, Akihiro Kishimoto, Lisa Hamada, Inkit Padhi, Karl Wehden, Lauren McHugh, Alexy Khrabrov, Payel Das, Seiji Takeda, John R. Smith

    Abstract: With the growing availability of data within various scientific domains, generative models hold enormous potential to accelerate scientific discovery. They harness powerful representations learned from datasets to speed up the formulation of novel hypotheses with the potential to impact material discovery broadly. We present the Generative Toolkit for Scientific Discovery (GT4SD). This extensible…

    Submitted 31 January, 2023; v1 submitted 8 July, 2022; originally announced July 2022.

    Comments: 15 pages, 2 figures

    Journal ref: Nature Partner Journals (npj) Computational Materials 9, 69 (2023)

  13. arXiv:2108.12472

    cs.CL cs.LG

    ReGen: Reinforcement Learning for Text and Knowledge Base Generation using Pretrained Language Models

    Authors: Pierre L. Dognin, Inkit Padhi, Igor Melnyk, Payel Das

    Abstract: Automatic construction of relevant Knowledge Bases (KBs) from text, and generation of semantically meaningful text from KBs are both long-standing goals in Machine Learning. In this paper, we present ReGen, a bidirectional generation of text and graph leveraging Reinforcement Learning (RL) to improve performance. Graph linearization enables us to re-frame both tasks as a sequence to sequence gener…

    Submitted 27 August, 2021; originally announced August 2021.

    Comments: Accepted to appear in the main conference of EMNLP 2021

  14. arXiv:2106.09553

    cs.LG cs.CL q-bio.BM

    Large-Scale Chemical Language Representations Capture Molecular Structure and Properties

    Authors: Jerret Ross, Brian Belgodere, Vijil Chenthamarakshan, Inkit Padhi, Youssef Mroueh, Payel Das

    Abstract: Models based on machine learning can enable accurate and fast molecular property predictions, which is of interest in drug discovery and material design. Various supervised machine learning models have demonstrated promising performance, but the vast chemical space and the limited availability of property labels make supervised learning challenging. Recently, unsupervised transformer-based languag…

    Submitted 14 December, 2022; v1 submitted 17 June, 2021; originally announced June 2021.

    Comments: NMI 2022

  15. arXiv:2012.11696

    cs.CV cs.LG

    Image Captioning as an Assistive Technology: Lessons Learned from VizWiz 2020 Challenge

    Authors: Pierre Dognin, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff, Richard A. Young, Brian Belgodere

    Abstract: Image captioning has recently demonstrated impressive progress largely owing to the introduction of neural network algorithms trained on curated dataset like MS-COCO. Often work in this field is motivated by the promise of deployment of captioning systems in practical applications. However, the scarcity of data and contexts in many competition datasets renders the utility of systems trained on the…

    Submitted 18 June, 2021; v1 submitted 21 December, 2020; originally announced December 2020.

    Comments: In submission to JAIR. Copyright may be transferred without notice, after which this version may no longer be accessible

  16. arXiv:2012.11691

    cs.CV cs.LG

    Alleviating Noisy Data in Image Captioning with Cooperative Distillation

    Authors: Pierre Dognin, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff

    Abstract: Image captioning systems have made substantial progress, largely due to the availability of curated datasets like Microsoft COCO or Vizwiz that have accurate descriptions of their corresponding images. Unfortunately, scarce availability of such cleanly labeled data results in trained algorithms producing captions that can be terse and idiosyncratically specific to details in the image. We propose…

    Submitted 21 December, 2020; originally announced December 2020.

    Comments: CVPR 2020 VizWiz Challenge

  17. arXiv:2012.04698

    cs.CL cs.AI cs.LG

    Generate Your Counterfactuals: Towards Controlled Counterfactual Generation for Text

    Authors: Nishtha Madaan, Inkit Padhi, Naveen Panwar, Diptikalyan Saha

    Abstract: Machine Learning has seen tremendous growth recently, which has led to larger adoption of ML systems for educational assessments, credit risk, healthcare, employment, criminal justice, to name a few. The trustworthiness of ML and NLP systems is a crucial aspect and requires a guarantee that the decisions they make are fair and robust. Aligned with this, we propose a framework GYC, to generate a se…

    Submitted 17 March, 2021; v1 submitted 8 December, 2020; originally announced December 2020.

    Comments: Accepted at AAAI Conference on Artificial Intelligence (AAAI 2021)

  18. arXiv:2011.01843

    cs.LG cs.AI

    Tabular Transformers for Modeling Multivariate Time Series

    Authors: Inkit Padhi, Yair Schiff, Igor Melnyk, Mattia Rigotti, Youssef Mroueh, Pierre Dognin, Jerret Ross, Ravi Nair, Erik Altman

    Abstract: Tabular datasets are ubiquitous in data science applications. Given their importance, it seems natural to apply state-of-the-art deep learning algorithms in order to fully unlock their potential. Here we propose neural network models that represent tabular time series that can optionally leverage their hierarchical structure. This results in two architectures for tabular time series: one for learn…

    Submitted 11 February, 2021; v1 submitted 3 November, 2020; originally announced November 2020.

    Comments: Accepted to ICASSP, 2021; https://github.com/IBM/TabFormer

  19. arXiv:2010.14660

    cs.CL cs.LG

    DualTKB: A Dual Learning Bridge between Text and Knowledge Base

    Authors: Pierre L. Dognin, Igor Melnyk, Inkit Padhi, Cicero Nogueira dos Santos, Payel Das

    Abstract: In this work, we present a dual learning approach for unsupervised text to path and path to text transfers in Commonsense Knowledge Bases (KBs). We investigate the impact of weak supervision by creating a weakly supervised dataset and show that even a slight amount of supervision can significantly improve the model performance and enable better-quality transfers. We examine different model archite…

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: Equal Contributions of Authors Pierre L. Dognin, Igor Melnyk, and Inkit Padhi. Accepted at EMNLP'20

  20. arXiv:2005.11248

    cs.LG q-bio.QM stat.ML

    Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics

    Authors: Payel Das, Tom Sercu, Kahini Wadhawan, Inkit Padhi, Sebastian Gehrmann, Flaviu Cipcigan, Vijil Chenthamarakshan, Hendrik Strobelt, Cicero dos Santos, Pin-Yu Chen, Yi Yan Yang, Jeremy Tan, James Hedrick, Jason Crain, Aleksandra Mojsilovic

    Abstract: De novo therapeutic design is challenged by a vast chemical repertoire and multiple constraints, e.g., high broad-spectrum potency and low toxicity. We propose CLaSS (Controlled Latent attribute Space Sampling) - an efficient computational method for attribute-controlled generation of molecules, which leverages guidance from classifiers trained on an informative latent space of molecules modeled u…

    Submitted 25 February, 2021; v1 submitted 22 May, 2020; originally announced May 2020.

    Journal ref: Nature Biomedical Engineering (2021)

  21. arXiv:2005.03588

    cs.CL cs.LG

    Learning Implicit Text Generation via Feature Matching

    Authors: Inkit Padhi, Pierre Dognin, Ke Bai, Cicero Nogueira dos Santos, Vijil Chenthamarakshan, Youssef Mroueh, Payel Das

    Abstract: Generative feature matching network (GFMN) is an approach for training implicit generative models for images by performing moment matching on features from pre-trained neural networks. In this paper, we present new GFMN formulations that are effective for sequential data. Our experimental results show the effectiveness of the proposed method, SeqGFMN, for three distinct generation tasks in English…

    Submitted 8 May, 2020; v1 submitted 7 May, 2020; originally announced May 2020.

    Comments: ACL 2020

  22. arXiv:2004.01215

    cs.LG q-bio.QM stat.ML

    CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models

    Authors: Vijil Chenthamarakshan, Payel Das, Samuel C. Hoffman, Hendrik Strobelt, Inkit Padhi, Kar Wai Lim, Benjamin Hoover, Matteo Manica, Jannis Born, Teodoro Laino, Aleksandra Mojsilovic

    Abstract: The novel nature of SARS-CoV-2 calls for the development of efficient de novo drug design approaches. In this study, we propose an end-to-end framework, named CogMol (Controlled Generation of Molecules), for designing new drug-like small molecules targeting novel viral proteins with high affinity and off-target selectivity. CogMol combines adaptive pre-training of a molecular SMILES Variational Au…

    Submitted 23 June, 2020; v1 submitted 2 April, 2020; originally announced April 2020.

  23. arXiv:1910.14212

    cs.LG stat.ML

    Sobolev Independence Criterion

    Authors: Youssef Mroueh, Tom Sercu, Mattia Rigotti, Inkit Padhi, Cicero Dos Santos

    Abstract: We propose the Sobolev Independence Criterion (SIC), an interpretable dependency measure between a high dimensional random variable X and a response variable Y. SIC decomposes to the sum of feature importance scores and hence can be used for nonlinear feature selection. SIC can be seen as a gradient regularized Integral Probability Metric (IPM) between the joint distribution of the two random var…

    Submitted 30 October, 2019; originally announced October 2019.

    Comments: NeurIPS 2019

  24. arXiv:1904.02762

    cs.CV cs.LG

    Learning Implicit Generative Models by Matching Perceptual Features

    Authors: Cicero Nogueira dos Santos, Youssef Mroueh, Inkit Padhi, Pierre Dognin

    Abstract: Perceptual features (PFs) have been used with great success in tasks such as transfer learning, style transfer, and super-resolution. However, the efficacy of PFs as key source of information for learning generative models is not well studied. We investigate here the use of PFs in the context of learning implicit generative models through moment matching (MM). More specifically, we propose a new e…

    Submitted 4 April, 2019; originally announced April 2019.

    Comments: 16 pages

    Journal ref: ICCV 2019

  25. arXiv:1810.07743

    q-bio.QM cs.LG stat.ML

    PepCVAE: Semi-Supervised Targeted Design of Antimicrobial Peptide Sequences

    Authors: Payel Das, Kahini Wadhawan, Oscar Chang, Tom Sercu, Cicero Dos Santos, Matthew Riemer, Vijil Chenthamarakshan, Inkit Padhi, Aleksandra Mojsilovic

    Abstract: Given the emerging global threat of antimicrobial resistance, new methods for next-generation antimicrobial design are urgently needed. We report a peptide generation framework PepCVAE, based on a semi-supervised variational autoencoder (VAE) model, for designing novel antimicrobial peptide (AMP) sequences. Our model learns a rich latent space of the biological peptide context by taking advantage…

    Submitted 13 November, 2018; v1 submitted 17 October, 2018; originally announced October 2018.

  26. arXiv:1805.07685

    cs.CL cs.LG

    Fighting Offensive Language on Social Media with Unsupervised Text Style Transfer

    Authors: Cicero Nogueira dos Santos, Igor Melnyk, Inkit Padhi

    Abstract: We introduce a new approach to tackle the problem of offensive language in online social media. Our approach uses unsupervised text style transfer to translate offensive sentences into non-offensive ones. We propose a new method for training encoder-decoders using non-parallel data that combines a collaborative classifier, attention and the cycle consistency loss. Experimental results on data from…

    Submitted 19 May, 2018; originally announced May 2018.

    Comments: ACL 2018

  27. arXiv:1711.09395

    cs.CL cs.AI cs.LG

    Improved Neural Text Attribute Transfer with Non-parallel Data

    Authors: Igor Melnyk, Cicero Nogueira dos Santos, Kahini Wadhawan, Inkit Padhi, Abhishek Kumar

    Abstract: Text attribute transfer using non-parallel data requires methods that can perform disentanglement of content and linguistic attributes. In this work, we propose multiple improvements over the existing approaches that enable the encoder-decoder framework to cope with the text attribute transfer from non-parallel data. We perform experiments on the sentiment transfer task using two datasets. For bot…

    Submitted 4 December, 2017; v1 submitted 26 November, 2017; originally announced November 2017.

    Comments: NIPS 2017 Workshop on Learning Disentangled Representations: from Perception to Control