Showing 1–10 of 10 results for author: Vishnubhotla, K

  1. arXiv:2406.16767  [pdf, other]

    cs.CL

    The GPT-WritingPrompts Dataset: A Comparative Analysis of Character Portrayal in Short Stories

    Authors: Xi Yu Huang, Krishnapriya Vishnubhotla, Frank Rudzicz

    Abstract: The improved generative capabilities of large language models have made them a powerful tool for creative writing and storytelling. It is therefore important to quantitatively understand the nature of generated stories, and how they differ from human storytelling. We augment the Reddit WritingPrompts dataset with short stories generated by GPT-3.5, given the same prompts. We quantify and compare t…

    Submitted 24 June, 2024; originally announced June 2024.

  2. arXiv:2403.18933  [pdf, other]

    cs.CL

    SemEval-2024 Task 1: Semantic Textual Relatedness for African and Asian Languages

    Authors: Nedjma Ousidhoum, Shamsuddeen Hassan Muhammad, Mohamed Abdalla, Idris Abdulmumin, Ibrahim Said Ahmad, Sanchit Ahuja, Alham Fikri Aji, Vladimir Araujo, Meriem Beloucif, Christine De Kock, Oumaima Hourrane, Manish Shrivastava, Thamar Solorio, Nirmal Surange, Krishnapriya Vishnubhotla, Seid Muhie Yimam, Saif M. Mohammad

    Abstract: We present the first shared task on Semantic Textual Relatedness (STR). While earlier shared tasks primarily focused on semantic similarity, we instead investigate the broader phenomenon of semantic relatedness across 14 languages: Afrikaans, Algerian Arabic, Amharic, English, Hausa, Hindi, Indonesian, Kinyarwanda, Marathi, Moroccan Arabic, Modern Standard Arabic, Punjabi, Spanish, and Telugu. The…

    Submitted 17 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: SemEval 2024 Task Description Paper. arXiv admin note: text overlap with arXiv:2402.08638

  3. arXiv:2403.02474  [pdf, other]

    cs.CL

    The Emotion Dynamics of Literary Novels

    Authors: Krishnapriya Vishnubhotla, Adam Hammond, Graeme Hirst, Saif M. Mohammad

    Abstract: Stories are rich in the emotions they exhibit in their narratives and evoke in the readers. The emotional journeys of the various characters within a story are central to their appeal. Computational analysis of the emotions of novels, however, has rarely examined the variation in the emotional trajectories of the different characters within them, instead considering the entire novel to represent a…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 8 pages plus appendices

  4. arXiv:2403.02281  [pdf, other]

    cs.CL

    Emotion Granularity from Text: An Aggregate-Level Indicator of Mental Health

    Authors: Krishnapriya Vishnubhotla, Daniela Teodorescu, Mallory J. Feldman, Kristen A. Lindquist, Saif M. Mohammad

    Abstract: We are united in how emotions are central to shaping our experiences; and yet, individuals differ greatly in how we each identify, categorize, and express emotions. In psychology, variation in the ability of individuals to differentiate between emotion concepts is called emotion granularity (determined through self-reports of one's emotions). High emotion granularity has been linked with better me…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 9 pages plus appendices

  5. arXiv:2402.08638  [pdf, other]

    cs.CL

    SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 13 Languages

    Authors: Nedjma Ousidhoum, Shamsuddeen Hassan Muhammad, Mohamed Abdalla, Idris Abdulmumin, Ibrahim Said Ahmad, Sanchit Ahuja, Alham Fikri Aji, Vladimir Araujo, Abinew Ali Ayele, Pavan Baswani, Meriem Beloucif, Chris Biemann, Sofia Bourhim, Christine De Kock, Genet Shanko Dekebo, Oumaima Hourrane, Gopichand Kanumolu, Lokesh Madasu, Samuel Rutunda, Manish Shrivastava, Thamar Solorio, Nirmal Surange, Hailegnaw Getaneh Tilaye, Krishnapriya Vishnubhotla, Genta Winata, et al. (2 additional authors not shown)

    Abstract: Exploring and quantifying semantic relatedness is central to representing language and holds significant implications across various NLP tasks. While earlier NLP research primarily focused on semantic similarity, often within the English language context, we instead investigate the broader phenomenon of semantic relatedness. In this paper, we present SemRel, a new semantic relatedness dat…

    Submitted 31 May, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

    Comments: Accepted to the Findings of ACL 2024

  6. arXiv:2307.03734  [pdf, other]

    cs.CL

    Improving Automatic Quotation Attribution in Literary Novels

    Authors: Krishnapriya Vishnubhotla, Frank Rudzicz, Graeme Hirst, Adam Hammond

    Abstract: Current models for quotation attribution in literary novels assume varying levels of available information in their training and test data, which poses a challenge for in-the-wild inference. Here, we approach quotation attribution as a set of four interconnected sub-tasks: character identification, coreference resolution, quotation identification, and speaker attribution. We benchmark state-of-the…

    Submitted 7 July, 2023; originally announced July 2023.

    Comments: Accepted to ACL 2023, short paper

  7. arXiv:2204.05836  [pdf, other]

    cs.CL

    The Project Dialogism Novel Corpus: A Dataset for Quotation Attribution in Literary Texts

    Authors: Krishnapriya Vishnubhotla, Adam Hammond, Graeme Hirst

    Abstract: We present the Project Dialogism Novel Corpus, or PDNC, an annotated dataset of quotations for English literary texts. PDNC contains annotations for 35,978 quotations across 22 full-length novels, and is by an order of magnitude the largest corpus of its kind. Each quotation is annotated for the speaker, addressees, type of quotation, referring expression, and character mentions within the quotati…

    Submitted 12 April, 2022; originally announced April 2022.

    Comments: Accepted for publication at LREC 2022

  8. arXiv:2204.04862  [pdf, other]

    cs.CL

    Tweet Emotion Dynamics: Emotion Word Usage in Tweets from US and Canada

    Authors: Krishnapriya Vishnubhotla, Saif M. Mohammad

    Abstract: Over the last decade, Twitter has emerged as one of the most influential forums for social, political, and health discourse. In this paper, we introduce a massive dataset of more than 45 million geo-located tweets posted between 2015 and 2021 from US and Canada (TUSC), especially curated for natural language analysis. We also introduce Tweet Emotion Dynamics (TED) -- metrics to capture patterns of…

    Submitted 4 May, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: Accepted for publication at LREC 2022 (camera-ready)

  9. arXiv:2110.04845  [pdf, other]

    cs.CL

    What Makes Sentences Semantically Related: A Textual Relatedness Dataset and Empirical Study

    Authors: Mohamed Abdalla, Krishnapriya Vishnubhotla, Saif M. Mohammad

    Abstract: The degree of semantic relatedness of two units of language has long been considered fundamental to understanding meaning. Additionally, automatically determining relatedness has many applications such as question answering and summarization. However, prior NLP work has largely focused on semantic similarity, a subset of relatedness, because of a lack of relatedness datasets. In this paper, we int…

    Submitted 20 March, 2023; v1 submitted 10 October, 2021; originally announced October 2021.

    Comments: Accepted to EACL 2023; Our dataset, data statement, and annotation questionnaire can be found at: https://doi.org/10.5281/zenodo.7599667

  10. arXiv:1904.02293  [pdf, other]

    cs.CL cs.AI cs.LG

    Generative Adversarial Networks for text using word2vec intermediaries

    Authors: Akshay Budhkar, Krishnapriya Vishnubhotla, Safwan Hossain, Frank Rudzicz

    Abstract: Generative adversarial networks (GANs) have shown considerable success, especially in the realistic generation of images. In this work, we apply similar techniques for the generation of text. We propose a novel approach to handle the discrete nature of text, during training, using word embeddings. Our method is agnostic to vocabulary size and achieves competitive results relative to methods with v…

    Submitted 3 April, 2019; originally announced April 2019.