Showing 1–50 of 79 results for author: Mckeown, K

  1. arXiv:2407.06501  [pdf, other]

    cs.AI cs.CL

    STORYSUMM: Evaluating Faithfulness in Story Summarization

    Authors: Melanie Subbiah, Faisal Ladhak, Akankshya Mishra, Griffin Adams, Lydia B. Chilton, Kathleen McKeown

    Abstract: Human evaluation has been the gold standard for checking faithfulness in abstractive summarization. However, with a challenging source domain like narrative, multiple annotators can agree a summary is faithful, while missing details that are obvious errors only once pointed out. We therefore introduce a new dataset, STORYSUMM, comprising LLM summaries of short stories with localized faithfulness l…

    Submitted 8 July, 2024; originally announced July 2024.

  2. arXiv:2407.03956  [pdf, other]

    cs.MA cs.CL

    Solving Zebra Puzzles Using Constraint-Guided Multi-Agent Systems

    Authors: Shmuel Berman, Kathleen McKeown, Baishakhi Ray

    Abstract: Prior research has enhanced the ability of Large Language Models (LLMs) to solve logic puzzles using techniques such as chain-of-thought prompting or introducing a symbolic representation. These frameworks are still usually insufficient to solve complicated logical problems, such as Zebra puzzles, due to the inherent complexity of translating natural language clues into logical statements. We intr…

    Submitted 9 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

    MSC Class: 68T01; 68T20; 68T27; ACM Class: I.2.3; I.2.6; I.2.7; I.2.11

  3. arXiv:2406.15586  [pdf, other]

    cs.CL

    TinyStyler: Efficient Few-Shot Text Style Transfer with Authorship Embeddings

    Authors: Zachary Horvitz, Ajay Patel, Kanishk Singh, Chris Callison-Burch, Kathleen McKeown, Zhou Yu

    Abstract: The goal of text style transfer is to transform the style of texts while preserving their original meaning, often with only a few examples of the target style. Existing style transfer methods generally rely on the few-shot capabilities of large language models or on complex controllable text generation approaches that are inefficient and underperform on fluency metrics. We introduce TinyStyler, a…

    Submitted 21 June, 2024; originally announced June 2024.

  4. arXiv:2406.11665  [pdf, other]

    cs.CL cs.AI cs.CV

    See It from My Perspective: Diagnosing the Western Cultural Bias of Large Vision-Language Models in Image Understanding

    Authors: Amith Ananthram, Elias Stengel-Eskin, Carl Vondrick, Mohit Bansal, Kathleen McKeown

    Abstract: Vision-language models (VLMs) can respond to queries about images in many languages. However, beyond language, culture affects how we see things. For example, individuals from Western cultures focus more on the central figure in an image while individuals from Eastern cultures attend more to scene context. In this work, we present a novel investigation that demonstrates and localizes VLMs' Western…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 17 pages, 7 figures. Code/models: https://github.com/amith-ananthram/see-it-from-my-perspective

  5. arXiv:2403.04770  [pdf, other]

    cs.CL cs.LG

    Social Orientation: A New Feature for Dialogue Analysis

    Authors: Todd Morrill, Zhaoyuan Deng, Yanda Chen, Amith Ananthram, Colin Wayne Leach, Kathleen McKeown

    Abstract: There are many settings where it is useful to predict and explain the success or failure of a dialogue. Circumplex theory from psychology models the social orientations (e.g., Warm-Agreeable, Arrogant-Calculating) of conversation participants and can be used to predict and explain the outcome of social interactions. Our work is novel in its systematic application of social orientation tags to mode…

    Submitted 25 February, 2024; originally announced March 2024.

    Comments: Accepted to LREC-COLING 2024

  6. arXiv:2403.01061  [pdf, other]

    cs.CL

    Reading Subtext: Evaluating Large Language Models on Short Story Summarization with Writers

    Authors: Melanie Subbiah, Sean Zhang, Lydia B. Chilton, Kathleen McKeown

    Abstract: We evaluate recent Large Language Models (LLMs) on the challenging task of summarizing short stories, which can be lengthy, and include nuanced subtext or scrambled timelines. Importantly, we work directly with authors to ensure that the stories have not been shared online (and therefore are unseen by the models), and to obtain informed evaluations of summary quality using judgments from the autho…

    Submitted 11 July, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

    Comments: pre-MIT Press publication version

  7. arXiv:2403.00794  [pdf, other]

    cs.CL cs.AI cs.LG

    Getting Serious about Humor: Crafting Humor Datasets with Unfunny Large Language Models

    Authors: Zachary Horvitz, Jingru Chen, Rahul Aditya, Harshvardhan Srivastava, Robert West, Zhou Yu, Kathleen McKeown

    Abstract: Humor is a fundamental facet of human cognition and interaction. Yet, despite recent advances in natural language processing, humor detection remains a challenging task that is complicated by the scarcity of datasets that pair humorous texts with similar non-humorous counterparts. In our work, we investigate whether large language models (LLMs) can generate synthetic data for humor detection via…

    Submitted 21 June, 2024; v1 submitted 22 February, 2024; originally announced March 2024.

  8. arXiv:2402.18479  [pdf, other]

    cs.CL

    NewsQs: Multi-Source Question Generation for the Inquiring Mind

    Authors: Alyssa Hwang, Kalpit Dixit, Miguel Ballesteros, Yassine Benajiba, Vittorio Castelli, Markus Dreyer, Mohit Bansal, Kathleen McKeown

    Abstract: We present NewsQs (news-cues), a dataset that provides question-answer pairs for multiple news documents. To create NewsQs, we augment a traditional multi-document summarization dataset with questions automatically generated by a T5-Large model fine-tuned on FAQ-style news articles from the News On the Web corpus. We show that fine-tuning a model with control codes produces questions that are judg…

    Submitted 15 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: minor wording change

  9. arXiv:2402.13249  [pdf, other]

    cs.CL cs.AI

    TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

    Authors: Liyan Tang, Igor Shalyminov, Amy Wing-mei Wong, Jon Burnsky, Jake W. Vincent, Yu'an Yang, Siffi Singh, Song Feng, Hwanjun Song, Hang Su, Lijia Sun, Yi Zhang, Saab Mansour, Kathleen McKeown

    Abstract: Single document news summarization has seen substantial progress on faithfulness in recent years, driven by research on the evaluation of factual consistency, or hallucinations. We ask whether these advances carry over to other text summarization domains. We propose a new evaluation benchmark on topic-focused dialogue summarization, generated by LLMs of varying sizes. We provide binary sentence-le…

    Submitted 31 March, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: NAACL 2024; Linguistic annotations available at https://github.com/amazon-science/tofueval

  10. arXiv:2402.12530  [pdf, other]

    cs.CL cs.AI cs.LG

    Parallel Structures in Pre-training Data Yield In-Context Learning

    Authors: Yanda Chen, Chen Zhao, Zhou Yu, Kathleen McKeown, He He

    Abstract: Pre-trained language models (LMs) are capable of in-context learning (ICL): they can adapt to a task with only a few examples given in the prompt without any parameter update. However, it is unclear where this capability comes from as there is a stark distribution shift between pre-training text and ICL prompts. In this work, we study what patterns of the pre-training data contribute to ICL. We fi…

    Submitted 19 February, 2024; originally announced February 2024.

  11. arXiv:2311.07884  [pdf, other]

    cs.CL

    Fair Abstractive Summarization of Diverse Perspectives

    Authors: Yusen Zhang, Nan Zhang, Yixin Liu, Alexander Fabbri, Junru Liu, Ryo Kamoi, Xiaoxin Lu, Caiming Xiong, Jieyu Zhao, Dragomir Radev, Kathleen McKeown, Rui Zhang

    Abstract: People from different social and demographic groups express diverse perspectives and conflicting opinions on a broad set of topics such as product reviews, healthcare, law, and politics. A fair summary should provide a comprehensive coverage of diverse perspectives without underrepresenting certain groups. However, current work in summarization metrics and Large Language Models (LLMs) evaluation h…

    Submitted 29 March, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: NAACL 2024

  12. arXiv:2308.15459  [pdf, other]

    cs.CL cs.AI

    ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style Transfer

    Authors: Zachary Horvitz, Ajay Patel, Chris Callison-Burch, Zhou Yu, Kathleen McKeown

    Abstract: Textual style transfer is the task of transforming stylistic properties of text while preserving meaning. Target "styles" can be defined in numerous ways, ranging from single attributes (e.g., formality) to authorship (e.g., Shakespeare). Previous unsupervised style-transfer approaches generally rely on significant amounts of labeled data for only a fixed set of styles or require large language mode…

    Submitted 22 February, 2024; v1 submitted 29 August, 2023; originally announced August 2023.

  13. arXiv:2308.05317  [pdf, other]

    cs.CL

    Few-Shot Data-to-Text Generation via Unified Representation and Multi-Source Learning

    Authors: Alexander Hanbo Li, Mingyue Shang, Evangelia Spiliopoulou, Jie Ma, Patrick Ng, Zhiguo Wang, Bonan Min, William Wang, Kathleen McKeown, Vittorio Castelli, Dan Roth, Bing Xiang

    Abstract: We present a novel approach for structured data-to-text generation that addresses the limitations of existing methods that primarily focus on specific types of structured data. Our proposed method aims to improve performance in multi-task training, zero-shot and few-shot scenarios by providing a unified representation that can handle various forms of structured data such as tables, knowledge graph…

    Submitted 9 August, 2023; originally announced August 2023.

  14. arXiv:2307.08678  [pdf, other]

    cs.CL cs.AI cs.LG

    Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations

    Authors: Yanda Chen, Ruiqi Zhong, Narutatsu Ri, Chen Zhao, He He, Jacob Steinhardt, Zhou Yu, Kathleen McKeown

    Abstract: Large language models (LLMs) are trained to imitate humans to explain human decisions. However, do LLMs explain themselves? Can they help humans build mental models of how LLMs process different inputs? To answer these questions, we propose to evaluate counterfactual simulatability of natural language explanations: whether an explanation can enable humans to precisely infer the model's…

    Submitted 17 July, 2023; originally announced July 2023.

  15. arXiv:2305.18265  [pdf, other]

    cs.CL cs.AI cs.CY

    Check-COVID: Fact-Checking COVID-19 News Claims with Scientific Evidence

    Authors: Gengyu Wang, Kate Harwood, Lawrence Chillrud, Amith Ananthram, Melanie Subbiah, Kathleen McKeown

    Abstract: We present a new fact-checking benchmark, Check-COVID, that requires systems to verify claims about COVID-19 from news using evidence from scientific articles. This approach to fact-checking is particularly challenging as it requires checking internet text written in everyday language against evidence from journal articles written in formal academic language. Check-COVID contains 1,504 expert-ann…

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: Accepted as ACL 2023 Findings

  16. arXiv:2305.17779  [pdf, other]

    cs.CL

    Generating EDU Extracts for Plan-Guided Summary Re-Ranking

    Authors: Griffin Adams, Alexander R. Fabbri, Faisal Ladhak, Kathleen McKeown, Noémie Elhadad

    Abstract: Two-step approaches, in which summary candidates are generated-then-reranked to return a single summary, can improve ROUGE scores over the standard single-step approach. Yet, standard decoding methods (i.e., beam search, nucleus sampling, and diverse beam search) produce candidates with redundant, and often low quality, content. In this paper, we design a novel method to generate candidates for re…

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: ACL 2023

  17. arXiv:2305.17534  [pdf, other]

    cs.CL cs.AI cs.LG

    Unsupervised Selective Rationalization with Noise Injection

    Authors: Adam Storek, Melanie Subbiah, Kathleen McKeown

    Abstract: A major issue with using deep learning models in sensitive applications is that they provide no explanation for their output. To address this problem, unsupervised selective rationalization produces rationales alongside predictions by chaining two jointly-trained components, a rationale generator and a predictor. Although this architecture guarantees that the prediction relies solely on the ration…

    Submitted 27 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023

  18. arXiv:2305.14291  [pdf, other]

    cs.CL

    Evaluation of African American Language Bias in Natural Language Generation

    Authors: Nicholas Deas, Jessi Grieser, Shana Kleiner, Desmond Patton, Elsbeth Turcan, Kathleen McKeown

    Abstract: We evaluate how well LLMs understand African American Language (AAL) in comparison to their performance on White Mainstream English (WME), the encouraged "standard" form of English taught in American classrooms. We measure LLM performance using automatic metrics and human judgments for two tasks: a counterpart generation task, where a model generates AAL (or WME) given WME (or AAL), and a masked s…

    Submitted 12 November, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023 Camera-Ready

  19. arXiv:2305.14225  [pdf, other]

    cs.CL

    ManiTweet: A New Benchmark for Identifying Manipulation of News on Social Media

    Authors: Kung-Hsiang Huang, Hou Pong Chan, Kathleen McKeown, Heng Ji

    Abstract: Considerable advancements have been made to tackle the misrepresentation of information derived from reference articles in the domains of fact-checking and faithful summarization. However, an unaddressed aspect remains: the identification of social media posts that manipulate information within associated news articles. This task presents a significant challenge, primarily due to the prevalence o…

    Submitted 12 June, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

  20. arXiv:2305.12696  [pdf, other]

    cs.CL

    Learning Interpretable Style Embeddings via Prompting LLMs

    Authors: Ajay Patel, Delip Rao, Ansh Kothary, Kathleen McKeown, Chris Callison-Burch

    Abstract: Style representation learning builds content-independent representations of author style in text. Stylometry, the analysis of style in text, is often performed by expert forensic linguists and no large dataset of stylometric annotations exists for training. Current style representation learning uses neural methods to disentangle style from content to create style vectors; however, these approaches…

    Submitted 9 October, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

  21. arXiv:2303.03278  [pdf, other]

    cs.CL cs.AI cs.LG

    Faithfulness-Aware Decoding Strategies for Abstractive Summarization

    Authors: David Wan, Mengwen Liu, Kathleen McKeown, Markus Dreyer, Mohit Bansal

    Abstract: Despite significant progress in understanding and improving faithfulness in abstractive summarization, the question of how decoding strategies affect faithfulness is less studied. We present a systematic study of the effect of generation techniques such as beam search and nucleus sampling on faithfulness in abstractive summarization. We find a consistent trend where beam search with large beam siz…

    Submitted 6 March, 2023; originally announced March 2023.

    Comments: EACL 2023 (17 pages)

  22. arXiv:2302.00102  [pdf, other]

    cs.CL cs.LG

    Towards Detecting Harmful Agendas in News Articles

    Authors: Melanie Subbiah, Amrita Bhattacharjee, Yilun Hua, Tharindu Kumarage, Huan Liu, Kathleen McKeown

    Abstract: Manipulated news online is a growing problem which necessitates the use of automated systems to curtail its spread. We argue that while misinformation and disinformation detection have been studied, there has been a lack of investment in the important open challenge of detecting harmful agendas in news articles; identifying harmful agendas is critical to flag news campaigns with the greatest poten…

    Submitted 2 August, 2023; v1 submitted 31 January, 2023; originally announced February 2023.

    Comments: Camera-ready for ACL-WASSA 2023. First two authors contributed equally

  23. arXiv:2301.13848  [pdf, other]

    cs.CL cs.AI cs.LG

    Benchmarking Large Language Models for News Summarization

    Authors: Tianyi Zhang, Faisal Ladhak, Esin Durmus, Percy Liang, Kathleen McKeown, Tatsunori B. Hashimoto

    Abstract: Large language models (LLMs) have shown promise for automatic summarization but the reasons behind their successes are poorly understood. By conducting a human evaluation on ten LLMs across different pretraining methods, prompts, and model scales, we make two important observations. First, we find instruction tuning, and not model size, is the key to the LLM's zero-shot summarization capability. S…

    Submitted 31 January, 2023; originally announced January 2023.

  24. arXiv:2301.10483  [pdf, other]

    cs.CL

    SWING: Balancing Coverage and Faithfulness for Dialogue Summarization

    Authors: Kung-Hsiang Huang, Siffi Singh, Xiaofei Ma, Wei Xiao, Feng Nan, Nicholas Dingwall, William Yang Wang, Kathleen McKeown

    Abstract: Missing information is a common issue of dialogue summarization where some information in the reference summaries is not covered in the generated summaries. To address this issue, we propose to utilize natural language inference (NLI) models to improve coverage while avoiding introducing factual inconsistencies. Specifically, we use NLI to compute fine-grained training signals to encourage the mod…

    Submitted 25 January, 2023; originally announced January 2023.

    Comments: Accepted by Findings of EACL 2023

  25. arXiv:2212.10670  [pdf, other]

    cs.CL cs.LG

    In-context Learning Distillation: Transferring Few-shot Learning Ability of Pre-trained Language Models

    Authors: Yukun Huang, Yanda Chen, Zhou Yu, Kathleen McKeown

    Abstract: Given the success with in-context learning of large pre-trained language models, we introduce in-context learning distillation to transfer in-context few-shot learning ability from large models to smaller models. We propose to combine in-context learning objectives with language modeling objectives to distill both the ability to read in-context examples and task knowledge to the smaller models. We…

    Submitted 20 December, 2022; originally announced December 2022.

  26. arXiv:2211.11724  [pdf, other]

    cs.CL

    Legal and Political Stance Detection of SCOTUS Language

    Authors: Noah Bergam, Emily Allaway, Kathleen McKeown

    Abstract: We analyze publicly available US Supreme Court documents using automated stance detection. In the first phase of our work, we investigate the extent to which the Court's public-facing language is political. We propose and calculate two distinct ideology metrics of SCOTUS justices using oral argument transcripts. We then compare these language-based metrics to existing social scientific measures of…

    Submitted 21 November, 2022; originally announced November 2022.

    Comments: Natural Legal Language Processing Workshop at EMNLP 2022

  27. arXiv:2211.05886  [pdf, ps, other]

    cs.CL

    CREATIVESUMM: Shared Task on Automatic Summarization for Creative Writing

    Authors: Divyansh Agarwal, Alexander R. Fabbri, Simeng Han, Wojciech Kryściński, Faisal Ladhak, Bryan Li, Kathleen McKeown, Dragomir Radev, Tianyi Zhang, Sam Wiseman

    Abstract: This paper introduces the shared task of summarizing documents in several creative domains, namely literary texts, movie scripts, and television scripts. Summarizing these creative documents requires making complex literary interpretations, as well as understanding non-trivial temporal dependencies in texts containing varied styles of plot development and narrative structure. This poses unique cha…

    Submitted 6 December, 2022; v1 submitted 10 November, 2022; originally announced November 2022.

    Comments: 4 pages + 3 for references and appendix

  28. arXiv:2211.04903  [pdf, other]

    cs.CL

    Novel Chapter Abstractive Summarization using Spinal Tree Aware Sub-Sentential Content Selection

    Authors: Hardy Hardy, Miguel Ballesteros, Faisal Ladhak, Muhammad Khalifa, Vittorio Castelli, Kathleen McKeown

    Abstract: Summarizing novel chapters is a difficult task due to the input length and the fact that sentences that appear in the desired summaries draw content from multiple places throughout the chapter. We present a pipelined extractive-abstractive approach where the extractive step filters the content that is passed to the abstractive component. Extremely lengthy input also results in a highly skewed data…

    Submitted 9 November, 2022; originally announced November 2022.

  29. arXiv:2210.10045  [pdf, other]

    cs.CL cs.AI

    SafeText: A Benchmark for Exploring Physical Safety in Language Models

    Authors: Sharon Levy, Emily Allaway, Melanie Subbiah, Lydia Chilton, Desmond Patton, Kathleen McKeown, William Yang Wang

    Abstract: Understanding what constitutes safe text is an important issue in natural language processing and can often prevent the deployment of models deemed harmful and unsafe. One such type of safety that has been scarcely studied is commonsense physical safety, i.e. text that is not explicitly violent and requires additional commonsense knowledge to comprehend that it leads to physical harm. We create th…

    Submitted 18 October, 2022; originally announced October 2022.

    Comments: Accepted to EMNLP 2022

  30. arXiv:2210.09306  [pdf, other]

    cs.AI cs.CL cs.LG

    Mitigating Covertly Unsafe Text within Natural Language Systems

    Authors: Alex Mei, Anisha Kabir, Sharon Levy, Melanie Subbiah, Emily Allaway, John Judge, Desmond Patton, Bruce Bimber, Kathleen McKeown, William Yang Wang

    Abstract: An increasingly prevalent problem for intelligent technologies is text safety, as uncontrolled systems may generate recommendations to their users that lead to injury or life-threatening consequences. However, the degree of explicitness of a generated statement that can cause physical harm varies. In this paper, we distinguish types of text that can lead to physical harm and establish one particul…

    Submitted 20 March, 2023; v1 submitted 17 October, 2022; originally announced October 2022.

    Comments: In Findings of the 2022 Conference on Empirical Methods in Natural Language Processing

  31. arXiv:2209.07661  [pdf, other]

    cs.CL cs.AI cs.LG

    On the Relation between Sensitivity and Accuracy in In-context Learning

    Authors: Yanda Chen, Chen Zhao, Zhou Yu, Kathleen McKeown, He He

    Abstract: In-context learning (ICL) suffers from oversensitivity to the prompt, making it unreliable in real-world scenarios. We study the sensitivity of ICL with respect to multiple perturbation types. First, we find that label bias obscures the true sensitivity, and therefore prior work may have significantly underestimated ICL sensitivity. Second, we observe a strong negative correlation between ICL sens…

    Submitted 27 January, 2024; v1 submitted 15 September, 2022; originally announced September 2022.

    Comments: EMNLP 2023 camera-ready

  32. arXiv:2205.11658  [pdf, other]

    cs.CL

    Penguins Don't Fly: Reasoning about Generics through Instantiations and Exceptions

    Authors: Emily Allaway, Jena D. Hwang, Chandra Bhagavatula, Kathleen McKeown, Doug Downey, Yejin Choi

    Abstract: Generics express generalizations about the world (e.g., birds can fly) that are not universally true (e.g., newborn birds and penguins cannot fly). Commonsense knowledge bases, used extensively in NLP, encode some generic knowledge but rarely enumerate such exceptions, and knowing when a generic statement holds or does not hold true is crucial for developing a comprehensive understanding of generic…

    Submitted 24 March, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: EACL 2023

  33. arXiv:2205.11602  [pdf, other]

    cs.CL

    Seeded Hierarchical Clustering for Expert-Crafted Taxonomies

    Authors: Anish Saha, Amith Ananthram, Emily Allaway, Heng Ji, Kathleen McKeown

    Abstract: Practitioners from many disciplines (e.g., political science) use expert-crafted taxonomies to make sense of large, unlabeled corpora. In this work, we study Seeded Hierarchical Clustering (SHC): the task of automatically fitting unlabeled data to such taxonomies using only a small set of labeled examples. We propose HierSeed, a novel weakly supervised algorithm for this task that uses only a smal…

    Submitted 23 May, 2022; originally announced May 2022.

  34. arXiv:2204.10290  [pdf, other]

    cs.CL

    Learning to Revise References for Faithful Summarization

    Authors: Griffin Adams, Han-Chin Shing, Qing Sun, Christopher Winestock, Kathleen McKeown, Noémie Elhadad

    Abstract: In real-world scenarios with naturally occurring datasets, reference summaries are noisy and may contain information that cannot be inferred from the source text. On large news corpora, removing low quality samples has been shown to reduce model hallucinations. Yet, for smaller, and/or noisier corpora, filtering is detrimental to performance. To improve reference quality while retaining all data,…

    Submitted 11 October, 2022; v1 submitted 13 April, 2022; originally announced April 2022.

    Comments: Findings of EMNLP 2022

  35. arXiv:2203.10254  [pdf, other]

    cs.CL

    Read Top News First: A Document Reordering Approach for Multi-Document News Summarization

    Authors: Chao Zhao, Tenghao Huang, Somnath Basu Roy Chowdhury, Muthu Kumar Chandrasekaran, Kathleen McKeown, Snigdha Chaturvedi

    Abstract: A common method for extractive multi-document news summarization is to re-formulate it as a single-document summarization problem by concatenating all documents as a single meta-document. However, this method neglects the relative importance of documents. We propose a simple approach to reorder the documents according to their relative importance before concatenating and summarizing them. The reor…

    Submitted 19 March, 2022; originally announced March 2022.

    Comments: Accepted at Findings of ACL 2022

  36. arXiv:2203.05386  [pdf, other]

    cs.CL

    Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation

    Authors: Kung-Hsiang Huang, Kathleen McKeown, Preslav Nakov, Yejin Choi, Heng Ji

    Abstract: Despite recent advances in detecting fake news generated by neural models, their results are not readily applicable to effective detection of human-written disinformation. What limits the successful transfer between them is the sizable gap between machine-generated fake news and human-authored ones, including the notable differences in terms of style and underlying intent. With this in mind, we pr…

    Submitted 15 May, 2023; v1 submitted 10 March, 2022; originally announced March 2022.

    Comments: Accepted by ACL 2023

  37. arXiv:2111.13993  [pdf, other]

    cs.CL

    An analysis of document graph construction methods for AMR summarization

    Authors: Fei-Tzin Lee, Chris Kedzie, Nakul Verma, Kathleen McKeown

    Abstract: Abstract Meaning Representation (AMR) is a graph-based semantic representation for sentences, composed of collections of concepts linked by semantic relations. AMR-based approaches have found success in a variety of applications, but a challenge to using it in tasks that require document-level context is that it only represents individual sentences. Prior work in AMR-based summarization has automatically m…

    Submitted 27 November, 2021; originally announced November 2021.

  38. arXiv:2109.08232  [pdf, other]

    cs.CL

    A Bag of Tricks for Dialogue Summarization

    Authors: Muhammad Khalifa, Miguel Ballesteros, Kathleen McKeown

    Abstract: Dialogue summarization comes with its own peculiar challenges as opposed to news or scientific articles summarization. In this work, we explore four different challenges of the task: handling and differentiating parts of the dialogue belonging to multiple speakers, negation understanding, reasoning about the situation, and informal language understanding. Using a pretrained sequence-to-sequence la…

    Submitted 16 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021 - short paper

  39. arXiv:2109.05168  [pdf, other]

    cs.CL

    Semantic Categorization of Social Knowledge for Commonsense Question Answering

    Authors: Gengyu Wang, Xiaochen Hou, Diyi Yang, Kathleen McKeown, Jing Huang

    Abstract: Large pre-trained language models (PLMs) have led to great success on various commonsense question answering (QA) tasks in an end-to-end fashion. However, little attention has been paid to what commonsense knowledge is needed to deeply characterize these QA tasks. In this work, we proposed to categorize the semantics needed for these tasks using the SocialIQA as an example. Building upon our label…

    Submitted 10 September, 2021; originally announced September 2021.

    Comments: Accepted by SustaiNLP 2021 on EMNLP 2021

  40. arXiv:2108.13684  [pdf, other]

    cs.CL

    Faithful or Extractive? On Mitigating the Faithfulness-Abstractiveness Trade-off in Abstractive Summarization

    Authors: Faisal Ladhak, Esin Durmus, He He, Claire Cardie, Kathleen McKeown

    Abstract: Despite recent progress in abstractive summarization, systems still suffer from faithfulness errors. While prior work has proposed models that improve faithfulness, it is unclear whether the improvement comes from an increased level of extractiveness of the model outputs as one naive way to improve faithfulness is to make summarization models more extractive. In this work, we present a framework f…

    Submitted 21 April, 2022; v1 submitted 31 August, 2021; originally announced August 2021.

    Comments: Published in ACL 2022 main conference

  41. arXiv:2106.02293  [pdf, other]

    cs.CL cs.IR

    Cross-language Sentence Selection via Data Augmentation and Rationale Training

    Authors: Yanda Chen, Chris Kedzie, Suraj Nair, Petra Galuščáková, Rui Zhang, Douglas W. Oard, Kathleen McKeown

    Abstract: This paper proposes an approach to cross-language sentence selection in a low-resource setting. It uses data augmentation and negative sampling techniques on noisy parallel sentence data to directly learn a cross-lingual embedding-based query relevance model. Results show that this approach performs as well as or better than multiple state-of-the-art machine translation + monolingual retrieval sys…

    Submitted 4 June, 2021; originally announced June 2021.

    Comments: ACL 2021 main conference

  42. arXiv:2105.06603  [pdf, other]

    cs.CL

    Adversarial Learning for Zero-Shot Stance Detection on Social Media

    Authors: Emily Allaway, Malavika Srikanth, Kathleen McKeown

    Abstract: Stance detection on social media can help to identify and understand slanted news or commentary in everyday life. In this work, we propose a new model for zero-shot stance detection on Twitter that uses adversarial learning to generalize across topics. Our model achieves state-of-the-art performance on a number of unseen test topics with minimal computational costs. In addition, we extend zero-sho…

    Submitted 13 May, 2021; originally announced May 2021.

    Comments: To appear in NAACL 2021

  43. arXiv:2105.04623  [pdf, other]

    cs.CL cs.AI

    Improving Factual Consistency of Abstractive Summarization via Question Answering

    Authors: Feng Nan, Cicero Nogueira dos Santos, Henghui Zhu, Patrick Ng, Kathleen McKeown, Ramesh Nallapati, Dejiao Zhang, Zhiguo Wang, Andrew O. Arnold, Bing Xiang

    Abstract: A commonly observed problem with the state-of-the-art abstractive summarization models is that the generated summaries can be factually inconsistent with the input documents. The fact that automatic summarization may produce plausible-sounding yet inaccurate summaries is a major concern that limits its wide application. In this paper we present an approach to address factual consistency in summari…

    Submitted 10 May, 2021; originally announced May 2021.

    Comments: ACL-IJCNLP 2021

  44. arXiv:2104.07868  [pdf, other]

    cs.CL

    Segmenting Subtitles for Correcting ASR Segmentation Errors

    Authors: David Wan, Chris Kedzie, Faisal Ladhak, Elsbeth Turcan, Petra Galuščáková, Elena Zotkina, Zhengping Jiang, Peter Bell, Kathleen McKeown

    Abstract: Typical ASR systems segment the input audio into utterances using purely acoustic information, which may not resemble the sentence-like units that are expected by conventional machine translation (MT) systems for Spoken Language Translation. In this work, we propose a model for correcting the acoustic segmentation of ASR models for low-resource languages to improve performance on downstream tasks.…

    Submitted 15 April, 2021; originally announced April 2021.

  45. arXiv:2103.12953  [pdf, other]

    cs.LG cs.CL

    Supporting Clustering with Contrastive Learning

    Authors: Dejiao Zhang, Feng Nan, Xiaokai Wei, Shangwen Li, Henghui Zhu, Kathleen McKeown, Ramesh Nallapati, Andrew Arnold, Bing Xiang

    Abstract: Unsupervised clustering aims at discovering the semantic categories of data according to some distance measured in the representation space. However, different categories often overlap with each other in the representation space at the beginning of the learning process, which poses a significant challenge for distance-based clustering in achieving good separation between different categories. To t…

    Submitted 28 May, 2021; v1 submitted 23 March, 2021; originally announced March 2021.

    Comments: NAACL 2021

  46. arXiv:2102.09130  [pdf, other]

    cs.CL cs.AI

    Entity-level Factual Consistency of Abstractive Text Summarization

    Authors: Feng Nan, Ramesh Nallapati, Zhiguo Wang, Cicero Nogueira dos Santos, Henghui Zhu, Dejiao Zhang, Kathleen McKeown, Bing Xiang

    Abstract: A key challenge for abstractive summarization is ensuring factual consistency of the generated summary with respect to the original document. For example, state-of-the-art models trained on existing datasets exhibit entity hallucination, generating names of entities that are not present in the source document. We propose a set of new metrics to quantify the entity-level factual consistency of gene…

    Submitted 17 February, 2021; originally announced February 2021.

    Comments: EACL 2021

  47. arXiv:2101.11059  [pdf, other]

    cs.CL cs.AI cs.IR

    Event-Driven News Stream Clustering using Entity-Aware Contextual Embeddings

    Authors: Kailash Karthik Saravanakumar, Miguel Ballesteros, Muthu Kumar Chandrasekaran, Kathleen McKeown

    Abstract: We propose a method for online news stream clustering that is a variant of the non-parametric streaming K-means algorithm. Our model uses a combination of sparse and dense document representations, aggregates document-cluster similarity along these multiple representations and makes the clustering decision using a neural classifier. The weighted document-cluster similarity model is learned using a…

    Submitted 26 January, 2021; originally announced January 2021.

    Comments: To appear in Proceedings of The 16th Conference of the European Chapter of the Association for Computational Linguistics

    ACM Class: I.2.7

  48. arXiv:2012.02721  [pdf, other]

    cs.CL

    Event Guided Denoising for Multilingual Relation Learning

    Authors: Amith Ananthram, Emily Allaway, Kathleen McKeown

    Abstract: General purpose relation extraction has recently seen considerable gains in part due to a massively data-intensive distant supervision technique from Soares et al. (2019) that produces state-of-the-art results across many benchmarks. In this work, we present a methodology for collecting high quality training data for relation extraction from unlabeled text that achieves a near-recreation of their…

    Submitted 4 December, 2020; originally announced December 2020.

    Comments: COLING 2020, short paper

  49. arXiv:2010.09693  [pdf, other]

    cs.CL

    Subtitles to Segmentation: Improving Low-Resource Speech-to-Text Translation Pipelines

    Authors: David Wan, Zhengping Jiang, Chris Kedzie, Elsbeth Turcan, Peter Bell, Kathleen McKeown

    Abstract: In this work, we focus on improving ASR output segmentation in the context of low-resource language speech-to-text translation. ASR output segmentation is crucial, as ASR systems segment the input audio using purely acoustic information and are not guaranteed to output sentence-like segments. Since most MT systems expect sentences as input, feeding in longer unsegmented passages can lead to sub-op…

    Submitted 19 October, 2020; originally announced October 2020.

    Journal ref: CLSST@LREC 2020 68-73

  50. arXiv:2010.09608  [pdf, other]

    cs.CL

    Incorporating Terminology Constraints in Automatic Post-Editing

    Authors: David Wan, Chris Kedzie, Faisal Ladhak, Marine Carpuat, Kathleen McKeown

    Abstract: Users of machine translation (MT) may want to ensure the use of specific lexical terminologies. While there exist techniques for incorporating terminology constraints during inference for MT, current APE approaches cannot ensure that they will appear in the final translation. In this paper, we present both autoregressive and non-autoregressive models for lexically constrained APE, demonstrating th…

    Submitted 19 October, 2020; originally announced October 2020.

    Comments: To appear in WMT, 2020