Showing 1–37 of 37 results for author: Rajani, N

  1. arXiv:2402.05160  [pdf, other]

    cs.SE cs.AI cs.LG

    What's documented in AI? Systematic Analysis of 32K AI Model Cards

    Authors: Weixin Liang, Nazneen Rajani, Xinyu Yang, Ezinwanne Ozoani, Eric Wu, Yiqun Chen, Daniel Scott Smith, James Zou

    Abstract: The rapid proliferation of AI models has underscored the importance of thorough documentation, as it enables users to understand, trust, and effectively utilize these models in various applications. Although developers are encouraged to produce model cards, it's not clear how much information or what information these cards contain. In this study, we conduct a comprehensive analysis of 32,111 AI m…

    Submitted 7 February, 2024; originally announced February 2024.

  2. arXiv:2310.16944  [pdf, other]

    cs.LG cs.CL

    Zephyr: Direct Distillation of LM Alignment

    Authors: Lewis Tunstall, Edward Beeching, Nathan Lambert, Nazneen Rajani, Kashif Rasul, Younes Belkada, Shengyi Huang, Leandro von Werra, Clémentine Fourrier, Nathan Habib, Nathan Sarrazin, Omar Sanseviero, Alexander M. Rush, Thomas Wolf

    Abstract: We aim to produce a smaller language model that is aligned to user intent. Previous research has shown that applying distilled supervised fine-tuning (dSFT) on larger models significantly improves task accuracy; however, these models are unaligned, i.e. they do not respond well to natural prompts. To distill this property, we experiment with the use of preference data from AI Feedback (AIF). Start…

    Submitted 25 October, 2023; originally announced October 2023.

  3. arXiv:2308.00862  [pdf, ps, other]

    cs.CY

    Confidence-Building Measures for Artificial Intelligence: Workshop Proceedings

    Authors: Sarah Shoker, Andrew Reddie, Sarah Barrington, Ruby Booth, Miles Brundage, Husanjot Chahal, Michael Depp, Bill Drexel, Ritwik Gupta, Marina Favaro, Jake Hecla, Alan Hickey, Margarita Konaev, Kirthi Kumar, Nathan Lambert, Andrew Lohn, Cullen O'Keefe, Nazneen Rajani, Michael Sellitto, Robert Trager, Leah Walker, Alexa Wehsener, Jessica Young

    Abstract: Foundation models could eventually introduce several pathways for undermining state security: accidents, inadvertent escalation, unintentional conflict, the proliferation of weapons, and the interference with human diplomacy are just a few on a long list. The Confidence-Building Measures for Artificial Intelligence workshop hosted by the Geopolitics Team at OpenAI and the Berkeley Risk and Securit…

    Submitted 3 August, 2023; v1 submitted 1 August, 2023; originally announced August 2023.

  4. arXiv:2212.05129  [pdf, other]

    cs.AI cs.LG

    Measuring Data

    Authors: Margaret Mitchell, Alexandra Sasha Luccioni, Nathan Lambert, Marissa Gerchick, Angelina McMillan-Major, Ezinwanne Ozoani, Nazneen Rajani, Tristan Thrush, Yacine Jernite, Douwe Kiela

    Abstract: We identify the task of measuring data to quantitatively characterize the composition of machine learning data and datasets. Similar to an object's height, width, and volume, data measurements quantify different attributes of data along common dimensions that support comparison. Several lines of research have proposed what we refer to as measurements, with differing terminology; we bring some of t…

    Submitted 13 February, 2023; v1 submitted 9 December, 2022; originally announced December 2022.

  5. arXiv:2211.07517  [pdf, other]

    cs.CL cs.AI

    Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations

    Authors: Swarnadeep Saha, Peter Hase, Nazneen Rajani, Mohit Bansal

    Abstract: Recent work on explainable NLP has shown that few-shot prompting can enable large pretrained language models (LLMs) to generate grammatical and factual natural language explanations for data labels. In this work, we study the connection between explainability and sample hardness by investigating the following research question - "Are LLMs and humans equally good at explaining data labels for both…

    Submitted 14 November, 2022; originally announced November 2022.

    Comments: EMNLP 2022 (11 pages)

  6. arXiv:2211.05100  [pdf, other]

    cs.CL

    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

    Authors: BigScience Workshop: Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, et al. (369 additional authors not shown)

    Abstract: Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access…

    Submitted 27 June, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

  7. arXiv:2210.12619  [pdf, other]

    cs.CL

    Conformal Predictor for Improving Zero-shot Text Classification Efficiency

    Authors: Prafulla Kumar Choubey, Yu Bai, Chien-Sheng Wu, Wenhao Liu, Nazneen Rajani

    Abstract: Pre-trained language models (PLMs) have been shown effective for zero-shot (0shot) text classification. 0shot models based on natural language inference (NLI) and next sentence prediction (NSP) employ cross-encoder architecture and infer by making a forward pass through the model for each label-text pair separately. This increases the computational cost to make inferences linearly in the number of…

    Submitted 23 October, 2022; originally announced October 2022.

    Comments: EMNLP 2022

  8. arXiv:2210.05839  [pdf, other]

    cs.CL cs.HC

    SEAL: Interactive Tool for Systematic Error Analysis and Labeling

    Authors: Nazneen Rajani, Weixin Liang, Lingjiao Chen, Meg Mitchell, James Zou

    Abstract: With the advent of Transformers, large language models (LLMs) have saturated well-known NLP benchmarks and leaderboards with high aggregate performance. However, many times these models systematically fail on tail data or rare groups not obvious in aggregate evaluation. Identifying such problematic data groups is even more challenging when there are no explicit labels (e.g., ethnicity, gender, etc…

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: Accepted at EMNLP 2022 demo track

  9. arXiv:2210.01970  [pdf, other]

    cs.LG

    Evaluate & Evaluation on the Hub: Better Best Practices for Data and Model Measurements

    Authors: Leandro von Werra, Lewis Tunstall, Abhishek Thakur, Alexandra Sasha Luccioni, Tristan Thrush, Aleksandra Piktus, Felix Marty, Nazneen Rajani, Victor Mustar, Helen Ngo, Omar Sanseviero, Mario Šaško, Albert Villanova, Quentin Lhoest, Julien Chaumond, Margaret Mitchell, Alexander M. Rush, Thomas Wolf, Douwe Kiela

    Abstract: Evaluation is a key part of machine learning (ML), yet there is a lack of support and tooling to enable its informed and systematic practice. We introduce Evaluate and Evaluation on the Hub --a set of tools to facilitate the evaluation of models and datasets in ML. Evaluate is a library to support best practices for measurements, metrics, and comparisons of data and models. Its goal is to support…

    Submitted 6 October, 2022; v1 submitted 30 September, 2022; originally announced October 2022.

  10. arXiv:2205.02894  [pdf, other]

    cs.HC cs.AI cs.CL

    Interactive Model Cards: A Human-Centered Approach to Model Documentation

    Authors: Anamaria Crisan, Margaret Drouhard, Jesse Vig, Nazneen Rajani

    Abstract: Deep learning models for natural language processing (NLP) are increasingly adopted and deployed by analysts without formal training in NLP or machine learning (ML). However, the documentation intended to convey the model's details and appropriate use is tailored primarily to individuals with ML or NLP expertise. To address this gap, we conduct a design inquiry into interactive model cards, which…

    Submitted 5 May, 2022; originally announced May 2022.

    Comments: To appear at ACM FAccT'22

    MSC Class: 68T01

  11. iSEA: An Interactive Pipeline for Semantic Error Analysis of NLP Models

    Authors: Jun Yuan, Jesse Vig, Nazneen Rajani

    Abstract: Error analysis in NLP models is essential to successful model development and deployment. One common approach for diagnosing errors is to identify subpopulations in the dataset where the model produces the most errors. However, existing approaches typically define subpopulations based on pre-defined features, which requires users to form hypotheses of errors in advance. To complement these approac…

    Submitted 8 March, 2022; originally announced March 2022.

    Comments: Accepted at IUI 2022, 11 pages, 6 figures

  12. arXiv:2110.07280  [pdf, other]

    cs.CL

    P-Adapters: Robustly Extracting Factual Information from Language Models with Diverse Prompts

    Authors: Benjamin Newman, Prafulla Kumar Choubey, Nazneen Rajani

    Abstract: Recent work (e.g. LAMA (Petroni et al., 2019)) has found that the quality of the factual information extracted from Large Language Models (LLMs) depends on the prompts used to query them. This inconsistency is problematic because different users will query LLMs for the same information using different wording, but should receive the same, accurate responses regardless. In this work we aim to addre…

    Submitted 19 April, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: 15 pages, 6 figures, 4 tables

  13. arXiv:2110.07166  [pdf, other]

    cs.CL

    CaPE: Contrastive Parameter Ensembling for Reducing Hallucination in Abstractive Summarization

    Authors: Prafulla Kumar Choubey, Alexander R. Fabbri, Jesse Vig, Chien-Sheng Wu, Wenhao Liu, Nazneen Fatema Rajani

    Abstract: Hallucination is a known issue for neural abstractive summarization models. Recent work suggests that the degree of hallucination may depend on errors in the training data. In this work, we propose a new method called Contrastive Parameter Ensembling (CaPE) to use training data more effectively, utilizing variations in noise in training samples to reduce hallucination. We first select clean and no…

    Submitted 20 May, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

  14. arXiv:2110.04400  [pdf, other]

    cs.CL

    HydraSum: Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models

    Authors: Tanya Goyal, Nazneen Fatema Rajani, Wenhao Liu, Wojciech Kryściński

    Abstract: Summarization systems make numerous "decisions" about summary properties during inference, e.g. degree of copying, specificity and length of outputs, etc. However, these are implicitly encoded within model parameters and specific styles cannot be enforced. To address this, we introduce HydraSum, a new summarization architecture that extends the single decoder framework of current models to a mixtu…

    Submitted 21 October, 2022; v1 submitted 8 October, 2021; originally announced October 2021.

    Comments: EMNLP 2022

  15. arXiv:2105.08209  [pdf, other]

    cs.CL

    BookSum: A Collection of Datasets for Long-form Narrative Summarization

    Authors: Wojciech Kryściński, Nazneen Rajani, Divyansh Agarwal, Caiming Xiong, Dragomir Radev

    Abstract: The majority of available text summarization datasets include short-form source documents that lack long-range causal and temporal dependencies, and often contain strong layout and stylistic biases. While relevant, such datasets will offer limited challenges for future generations of text summarization systems. We address these issues by introducing BookSum, a collection of datasets for long-form…

    Submitted 6 December, 2022; v1 submitted 17 May, 2021; originally announced May 2021.

    Comments: 19 pages, 12 tables, 3 figures

  16. arXiv:2105.08021  [pdf, other]

    cs.CL cs.AI

    Stage-wise Fine-tuning for Graph-to-Text Generation

    Authors: Qingyun Wang, Semih Yavuz, Victoria Lin, Heng Ji, Nazneen Rajani

    Abstract: Graph-to-text generation has benefited from pre-trained language models (PLMs) in achieving better performance than structured graph encoders. However, they fail to fully utilize the structure information of the input graph. In this paper, we aim to further improve the performance of the pre-trained language model by proposing a structured graph-to-text model with a two-step fine-tuning mechanism…

    Submitted 30 May, 2021; v1 submitted 17 May, 2021; originally announced May 2021.

    Comments: 10 pages, Accepted by Proceedings of ACL-IJCNLP 2021 Student Research Workshop, Code and Resources at https://github.com/EagleW/Stage-wise-Fine-tuning

  17. arXiv:2104.07605  [pdf, other]

    cs.CL

    SummVis: Interactive Visual Analysis of Models, Data, and Evaluation for Text Summarization

    Authors: Jesse Vig, Wojciech Kryściński, Karan Goel, Nazneen Fatema Rajani

    Abstract: Novel neural architectures, training strategies, and the availability of large-scale corpora have been the driving force behind recent progress in abstractive text summarization. However, due to the black-box nature of neural models, uninformative evaluation metrics, and scarce tooling for model and data analysis, the true performance and failure modes of summarization models remain largely unkno…

    Submitted 26 July, 2021; v1 submitted 15 April, 2021; originally announced April 2021.

    Comments: Accepted to ACL 2021 System Demonstrations

  18. arXiv:2101.04840  [pdf, other]

    cs.CL cs.AI cs.LG

    Robustness Gym: Unifying the NLP Evaluation Landscape

    Authors: Karan Goel, Nazneen Rajani, Jesse Vig, Samson Tan, Jason Wu, Stephan Zheng, Caiming Xiong, Mohit Bansal, Christopher Ré

    Abstract: Despite impressive performance on standard benchmarks, deep neural networks are often brittle when deployed in real-world systems. Consequently, recent research has focused on testing the robustness of such models, resulting in a diverse set of evaluation methodologies ranging from adversarial attacks to rule-based data transformations. In this work, we identify challenges with evaluating NLP syst…

    Submitted 12 January, 2021; originally announced January 2021.

    Comments: 34 pages, 8 figures, 6 tables

  19. arXiv:2012.15781  [pdf, other]

    cs.LG cs.AI cs.CL

    FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging

    Authors: Han Guo, Nazneen Fatema Rajani, Peter Hase, Mohit Bansal, Caiming Xiong

    Abstract: Influence functions approximate the "influences" of training data-points for test predictions and have a wide variety of applications. Despite the popularity, their computational cost does not scale well with model and training data size. We present FastIF, a set of simple modifications to influence functions that significantly improves their run-time. We use k-Nearest Neighbors (kNN) to narrow th…

    Submitted 9 September, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

    Comments: 18 pages

  20. arXiv:2012.04281  [pdf, other]

    cs.CL

    CTRLsum: Towards Generic Controllable Text Summarization

    Authors: Junxian He, Wojciech Kryściński, Bryan McCann, Nazneen Rajani, Caiming Xiong

    Abstract: Current summarization systems yield generic summaries that are disconnected from users' preferences and expectations. To address this limitation, we present CTRLsum, a novel framework for controllable summarization. Our approach enables users to control multiple aspects of generated summaries by interacting with the summarization system through textual input in the form of a set of keywords or des…

    Submitted 8 December, 2020; originally announced December 2020.

    Comments: Preprint

  21. arXiv:2012.00195  [pdf, other]

    cs.LG q-bio.BM

    Profile Prediction: An Alignment-Based Pre-Training Task for Protein Sequence Models

    Authors: Pascal Sturmfels, Jesse Vig, Ali Madani, Nazneen Fatema Rajani

    Abstract: For protein sequence datasets, unlabeled data has greatly outpaced labeled data due to the high cost of wet-lab characterization. Recent deep-learning approaches to protein prediction have shown that pre-training on unlabeled data can yield useful representations for downstream tasks. However, the optimal pre-training strategy remains an open question. Instead of strictly borrowing from natural la…

    Submitted 30 November, 2020; originally announced December 2020.

  22. arXiv:2011.03161  [pdf, other]

    cs.CL

    What's New? Summarizing Contributions in Scientific Literature

    Authors: Hiroaki Hayashi, Wojciech Kryściński, Bryan McCann, Nazneen Rajani, Caiming Xiong

    Abstract: With thousands of academic articles shared on a daily basis, it has become increasingly difficult to keep up with the latest scientific findings. To overcome this problem, we introduce a new task of disentangled paper summarization, which seeks to generate separate summaries for the paper contributions and the context of the work, making it easier to identify the key findings shared in articles. F…

    Submitted 9 November, 2020; v1 submitted 5 November, 2020; originally announced November 2020.

    Comments: 9 pages, 5 tables, 2 figures

  23. arXiv:2010.12850  [pdf, other]

    cs.CL

    CoCo: Controllable Counterfactuals for Evaluating Dialogue State Trackers

    Authors: Shiyang Li, Semih Yavuz, Kazuma Hashimoto, Jia Li, Tong Niu, Nazneen Rajani, Xifeng Yan, Yingbo Zhou, Caiming Xiong

    Abstract: Dialogue state trackers have made significant progress on benchmark datasets, but their generalization capability to novel and realistic scenarios beyond the held-out conversations is less understood. We propose controllable counterfactuals (CoCo) to bridge this gap and evaluate dialogue state tracking (DST) models on novel scenarios, i.e., would the system successfully tackle the request if the u…

    Submitted 26 March, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

    Comments: ICLR 2021

  24. arXiv:2010.12730  [pdf, other]

    cs.CL

    Char2Subword: Extending the Subword Embedding Space Using Robust Character Compositionality

    Authors: Gustavo Aguilar, Bryan McCann, Tong Niu, Nazneen Rajani, Nitish Keskar, Thamar Solorio

    Abstract: Byte-pair encoding (BPE) is a ubiquitous algorithm in the subword tokenization process of language models as it provides multiple benefits. However, this process is solely based on pre-training data statistics, making it hard for the tokenizer to handle infrequent spellings. On the other hand, though robust to misspellings, pure character-level models often lead to unreasonably long sequences and…

    Submitted 23 September, 2021; v1 submitted 23 October, 2020; originally announced October 2020.

    Comments: Findings of EMNLP 2020

  25. arXiv:2010.09030  [pdf, other]

    cs.CL cs.LG

    Explaining and Improving Model Behavior with k Nearest Neighbor Representations

    Authors: Nazneen Fatema Rajani, Ben Krause, Wenpeng Yin, Tong Niu, Richard Socher, Caiming Xiong

    Abstract: Interpretability techniques in NLP have mainly focused on understanding individual predictions using attention visualization or gradient-based saliency maps over tokens. We propose using k nearest neighbor (kNN) representations to identify training examples responsible for a model's predictions and obtain a corpus-level understanding of the model's behavior. Apart from interpretability, we show th…

    Submitted 18 October, 2020; originally announced October 2020.

  26. arXiv:2010.07126  [pdf]

    cs.AI

    Explaining Creative Artifacts

    Authors: Lav R. Varshney, Nazneen Fatema Rajani, Richard Socher

    Abstract: Human creativity is often described as the mental process of combining associative elements into a new form, but emerging computational creativity algorithms may not operate in this manner. Here we develop an inverse problem formulation to deconstruct the products of combinatorial and compositional creativity into associative chains as a form of post-hoc interpretation that matches the human creat…

    Submitted 14 October, 2020; originally announced October 2020.

    Comments: 2020 Workshop on Human Interpretability in Machine Learning (WHI), at ICML 2020

  27. arXiv:2010.06119  [pdf, other]

    cs.CL cs.AI

    ReviewRobot: Explainable Paper Review Generation based on Knowledge Synthesis

    Authors: Qingyun Wang, Qi Zeng, Lifu Huang, Kevin Knight, Heng Ji, Nazneen Fatema Rajani

    Abstract: To assist human review process, we build a novel ReviewRobot to automatically assign a review score and write comments for multiple categories such as novelty and meaningful comparison. A good review needs to be knowledgeable, namely that the comments should be constructive and informative to help improve the paper; and explainable by providing detailed evidence. ReviewRobot achieves these goals v…

    Submitted 3 December, 2020; v1 submitted 12 October, 2020; originally announced October 2020.

    Comments: 14 pages. Accepted by The 14th International Conference on Natural Language Generation (INLG 2020). Code and resources are available at https://github.com/EagleW/ReviewRobot

  28. arXiv:2010.02584  [pdf, other]

    cs.CL

    Universal Natural Language Processing with Limited Annotations: Try Few-shot Textual Entailment as a Start

    Authors: Wenpeng Yin, Nazneen Fatema Rajani, Dragomir Radev, Richard Socher, Caiming Xiong

    Abstract: A standard way to address different NLP problems is by first constructing a problem-specific dataset, then building a model to fit this dataset. To build the ultimate artificial intelligence, we desire a single machine that can handle diverse new problems, for which task-specific annotations are limited. We bring up textual entailment as a unified solver for such NLP problems. However, current res…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020 Long, camera-ready

  29. arXiv:2009.06367  [pdf, other]

    cs.CL cs.LG

    GeDi: Generative Discriminator Guided Sequence Generation

    Authors: Ben Krause, Akhilesh Deepak Gotmare, Bryan McCann, Nitish Shirish Keskar, Shafiq Joty, Richard Socher, Nazneen Fatema Rajani

    Abstract: While large-scale language models (LMs) are able to imitate the distribution of natural language well enough to generate realistic text, it is difficult to control which regions of the distribution they generate. This is especially problematic because datasets used for training large LMs usually contain significant toxicity, hate, bias, and negativity. We propose GeDi as an efficient method for us…

    Submitted 22 October, 2020; v1 submitted 14 September, 2020; originally announced September 2020.

  30. arXiv:2007.02871  [pdf, other]

    cs.CL

    DART: Open-Domain Structured Data Record to Text Generation

    Authors: Linyong Nan, Dragomir Radev, Rui Zhang, Amrit Rau, Abhinand Sivaprasad, Chiachun Hsieh, Xiangru Tang, Aadit Vyas, Neha Verma, Pranav Krishna, Yangxiaokang Liu, Nadia Irwanto, Jessica Pan, Faiaz Rahman, Ahmad Zaidi, Mutethia Mutuma, Yasin Tarabar, Ankit Gupta, Tao Yu, Yi Chern Tan, Xi Victoria Lin, Caiming Xiong, Richard Socher, Nazneen Fatema Rajani

    Abstract: We present DART, an open domain structured DAta Record to Text generation dataset with over 82k instances (DARTs). Data-to-Text annotations can be a costly process, especially when dealing with tables which are the major source of structured data and contain nontrivial structures. To this end, we propose a procedure of extracting semantic triples from tables that encodes their structures by exploi…

    Submitted 12 April, 2021; v1 submitted 6 July, 2020; originally announced July 2020.

    Comments: NAACL 2021

  31. arXiv:2006.15222  [pdf, other]

    cs.CL cs.LG q-bio.BM

    BERTology Meets Biology: Interpreting Attention in Protein Language Models

    Authors: Jesse Vig, Ali Madani, Lav R. Varshney, Caiming Xiong, Richard Socher, Nazneen Fatema Rajani

    Abstract: Transformer architectures have proven to learn useful representations for protein classification and generation tasks. However, these representations present challenges in interpretability. In this work, we demonstrate a set of methods for analyzing protein Transformer models through the lens of attention. We show that attention: (1) captures the folding structure of proteins, connecting amino aci…

    Submitted 28 March, 2021; v1 submitted 26 June, 2020; originally announced June 2020.

    Comments: To appear in ICLR 2021

    ACM Class: I.2

  32. arXiv:2005.00965  [pdf, other]

    cs.CL cs.LG

    Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation

    Authors: Tianlu Wang, Xi Victoria Lin, Nazneen Fatema Rajani, Bryan McCann, Vicente Ordonez, Caiming Xiong

    Abstract: Word embeddings derived from human-generated corpora inherit strong gender bias which can be further amplified by downstream models. Some commonly adopted debiasing approaches, including the seminal Hard Debias algorithm, apply post-processing procedures that project pre-trained word embeddings into a subspace orthogonal to an inferred gender subspace. We discover that semantic-agnostic corpus reg…

    Submitted 2 May, 2020; originally announced May 2020.

    Comments: Accepted to ACL 2020

  33. arXiv:2005.00730  [pdf, other]

    cs.CL cs.LG

    ESPRIT: Explaining Solutions to Physical Reasoning Tasks

    Authors: Nazneen Fatema Rajani, Rui Zhang, Yi Chern Tan, Stephan Zheng, Jeremy Weiss, Aadit Vyas, Abhijit Gupta, Caiming Xiong, Richard Socher, Dragomir Radev

    Abstract: Neural networks lack the ability to reason about qualitative physics and so cannot generalize to scenarios and tasks unseen during training. We propose ESPRIT, a framework for commonsense reasoning about qualitative physics in natural language that generates interpretable descriptions of physical events. We use a two-step approach of first identifying the pivotal physical events in an environment…

    Submitted 13 May, 2020; v1 submitted 2 May, 2020; originally announced May 2020.

    Comments: ACL 2020

  34. arXiv:1911.03429  [pdf, other]

    cs.CL cs.AI cs.LG

    ERASER: A Benchmark to Evaluate Rationalized NLP Models

    Authors: Jay DeYoung, Sarthak Jain, Nazneen Fatema Rajani, Eric Lehman, Caiming Xiong, Richard Socher, Byron C. Wallace

    Abstract: State-of-the-art models in NLP are now predominantly based on deep neural networks that are opaque in terms of how they come to make predictions. This limitation has increased interest in designing more interpretable deep models for NLP that reveal the `reasoning' behind model outputs. But work in this direction has been conducted on different datasets and tasks with correspondingly unique aims an…

    Submitted 24 April, 2020; v1 submitted 8 November, 2019; originally announced November 2019.

    Comments: Accepted as a long paper to ACL 2020. Website and leaderboard available at http://www.eraserbenchmark.com/. Code available at https://github.com/jayded/eraserbenchmark

  35. arXiv:1906.02361  [pdf, other]

    cs.CL

    Explain Yourself! Leveraging Language Models for Commonsense Reasoning

    Authors: Nazneen Fatema Rajani, Bryan McCann, Caiming Xiong, Richard Socher

    Abstract: Deep learning models perform poorly on tasks that require commonsense reasoning, which often necessitates some form of world-knowledge or reasoning over information not immediately present in the input. We collect human explanations for commonsense reasoning in the form of natural language sequences and highlighted annotations in a new dataset called Common Sense Explanations (CoS-E). We use CoS-E…

    Submitted 5 June, 2019; originally announced June 2019.

    Comments: Accepted at ACL, 11 pages total

    Journal ref: In Proceedings of the Association for Computational Linguistics (ACL), 2019. Florence, Italy

  36. arXiv:1605.08764  [pdf, other]

    cs.CL cs.CV cs.LG

    Stacking With Auxiliary Features

    Authors: Nazneen Fatema Rajani, Raymond J. Mooney

    Abstract: Ensembling methods are well known for improving prediction accuracy. However, they are limited in the sense that they cannot discriminate among component models effectively. In this paper, we propose stacking with auxiliary features that learns to fuse relevant information from multiple systems to improve performance. Auxiliary features enable the stacker to rely on systems that not just agree on…

    Submitted 27 May, 2016; originally announced May 2016.

    Comments: arXiv admin note: substantial text overlap with arXiv:1604.04802

  37. arXiv:1604.04802  [pdf, other]

    cs.CL cs.LG

    Supervised and Unsupervised Ensembling for Knowledge Base Population

    Authors: Nazneen Fatema Rajani, Raymond J. Mooney

    Abstract: We present results on combining supervised and unsupervised methods to ensemble multiple systems for two popular Knowledge Base Population (KBP) tasks, Cold Start Slot Filling (CSSF) and Tri-lingual Entity Discovery and Linking (TEDL). We demonstrate that our combined system along with auxiliary features outperforms the best performing system for both tasks in the 2015 competition, several ensembl…

    Submitted 16 April, 2016; originally announced April 2016.