Showing 1–50 of 72 results for author: Akhtar, M

  1. arXiv:2407.08842  [pdf, ps, other]

    cs.CL

    Evaluating Nuanced Bias in Large Language Model Free Response Answers

    Authors: Jennifer Healey, Laurie Byrum, Md Nadeem Akhtar, Moumita Sinha

    Abstract: Pre-trained large language models (LLMs) can now be easily adapted for specific business purposes using custom prompts or fine tuning. These customizations are often iteratively re-engineered to improve some aspect of performance, but after each change businesses want to ensure that there has been no negative impact on the system's behavior around such critical issues as bias. Prior methods of ben…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 14 pages, 0 figures, submitted to NLDB 2024, Turin, Italy

  2. arXiv:2406.08881  [pdf, other]

    cs.CL

    No perspective, no perception!! Perspective-aware Healthcare Answer Summarization

    Authors: Gauri Naik, Sharad Chandakacherla, Shweta Yadav, Md. Shad Akhtar

    Abstract: Healthcare Community Question Answering (CQA) forums offer an accessible platform for individuals seeking information on various healthcare-related topics. People find such platforms suitable for self-disclosure, seeking medical opinions, finding simplified explanations for their medical conditions, and answering others' questions. However, answers on these forums are typically diverse and prone t…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: ACL 2024 Findings

  3. arXiv:2406.03953  [pdf, other]

    cs.CL

    Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech

    Authors: Neemesh Yadav, Sarah Masud, Vikram Goyal, Md Shad Akhtar, Tanmoy Chakraborty

    Abstract: Employing language models to generate explanations for an incoming implicit hate post is an active area of research. The explanation is intended to make explicit the underlying stereotype and aid content moderators. The training often combines top-k relevant knowledge graph (KG) tuples to provide world knowledge and improve performance on standard metrics. Interestingly, our study presents conflic…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 17 Pages, 5 Figures, 13 Tables, ACL Findings 2024

  4. arXiv:2405.07765  [pdf, other]

    cs.CL

    TANQ: An open domain dataset of table answered questions

    Authors: Mubashara Akhtar, Chenxi Pang, Andreea Marzoca, Yasemin Altun, Julian Martin Eisenschlos

    Abstract: Language models, potentially augmented with tool usage such as retrieval, are becoming the go-to means of answering questions. Understanding and answering questions in real-world settings often requires retrieving information from different sources, processing and aggregating data to extract insights, and presenting complex findings in the form of structured artifacts such as novel tables, charts, or i…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 10 pages

  5. arXiv:2404.18101  [pdf, other]

    cs.LG

    Advancing Supervised Learning with the Wave Loss Function: A Robust and Smooth Approach

    Authors: Mushir Akhtar, M. Tanveer, Mohd. Arshad

    Abstract: The loss function plays a vital role in supervised learning frameworks. The selection of the appropriate loss function holds the potential to have a substantial impact on the proficiency attained by the acquired model. The training of supervised learning algorithms inherently adheres to predetermined loss functions during the optimization process. In this paper, we present a novel contribution to the…

    Submitted 28 April, 2024; originally announced April 2024.

  6. arXiv:2403.19546  [pdf, other]

    cs.LG cs.AI cs.DB cs.IR

    Croissant: A Metadata Format for ML-Ready Datasets

    Authors: Mubashara Akhtar, Omar Benjelloun, Costanza Conforti, Pieter Gijsbers, Joan Giner-Miguelez, Nitisha Jain, Michael Kuchnik, Quentin Lhoest, Pierre Marcenac, Manil Maskey, Peter Mattson, Luis Oala, Pierre Ruyssen, Rajat Shinde, Elena Simperl, Goeffry Thomas, Slava Tykhonov, Joaquin Vanschoren, Jos van der Velde, Steffen Vogler, Carole-Jean Wu

    Abstract: Data is a critical resource for Machine Learning (ML), yet working with data remains a key friction point. This paper introduces Croissant, a metadata format for datasets that simplifies how data is used by ML tools and frameworks. Croissant makes datasets more discoverable, portable and interoperable, thereby addressing significant challenges in ML data management and responsible AI. Croissant is…

    Submitted 30 May, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: Published in Proceedings of ACM SIGMOD/PODS'24 Data Management for End-to-End Machine Learning (DEEM) Workshop https://dl.acm.org/doi/10.1145/3650203.3663326

  7. arXiv:2403.16771  [pdf]

    cs.CL cs.LG

    Synthetic Data Generation and Joint Learning for Robust Code-Mixed Translation

    Authors: Kartik Kartik, Sanjana Soni, Anoop Kunchukuttan, Tanmoy Chakraborty, Md Shad Akhtar

    Abstract: The widespread online communication in a modern multilingual world has provided opportunities to blend more than one language (aka code-mixed language) in a single utterance. This has resulted in a formidable challenge for the computational models due to the scarcity of annotated data and presence of noise. A potential solution to mitigate the data scarcity problem in a low-resource setup is to leverag…

    Submitted 29 April, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: 9 pages, 2 figures, to be published in LREC-COLING 2024

  8. arXiv:2403.10279  [pdf, other]

    cs.CY

    Emotion-Aware Multimodal Fusion for Meme Emotion Detection

    Authors: Shivam Sharma, Ramaneswaran S, Md. Shad Akhtar, Tanmoy Chakraborty

    Abstract: The ever-evolving social media discourse has witnessed an overwhelming use of memes to express opinions or dissent. Besides being misused for spreading malcontent, they are mined by corporations and political parties to glean the public's opinion. Therefore, memes predominantly offer affect-enriched insights towards ascertaining the societal psyche. However, the current approaches are yet to model…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: Accepted to IEEE Transactions on Affective Computing

  9. arXiv:2403.10088  [pdf, other]

    cs.CL cs.AI

    Intent-conditioned and Non-toxic Counterspeech Generation using Multi-Task Instruction Tuning with RLAIF

    Authors: Amey Hengle, Aswini Kumar, Sahajpreet Singh, Anil Bandhakavi, Md Shad Akhtar, Tanmoy Chakraborty

    Abstract: Counterspeech, defined as a response to mitigate online hate speech, is increasingly used as a non-censorial solution. Addressing hate speech effectively involves dispelling the stereotypes, prejudices, and biases often subtly implied in brief, single-sentence statements or abuses. These implicit expressions challenge language models, especially in seq2seq tasks, as model performance typically exc…

    Submitted 15 March, 2024; originally announced March 2024.

  10. arXiv:2403.00141  [pdf, other]

    cs.CL cs.AI

    EROS: Entity-Driven Controlled Policy Document Summarization

    Authors: Joykirat Singh, Sehban Fazili, Rohan Jain, Md Shad Akhtar

    Abstract: Privacy policy documents have a crucial role in educating individuals about the collection, usage, and protection of users' personal data by organizations. However, they are notorious for their lengthy, complex, and convoluted language, especially involving privacy-related entities. Hence, they pose a significant challenge to users who attempt to comprehend an organization's data usage policy. In this…

    Submitted 29 February, 2024; originally announced March 2024.

    Comments: Accepted in LREC-COLING 2024

  11. arXiv:2402.18944  [pdf, other]

    cs.CL cs.AI

    SemEval 2024 -- Task 10: Emotion Discovery and Reasoning its Flip in Conversation (EDiReF)

    Authors: Shivani Kumar, Md Shad Akhtar, Erik Cambria, Tanmoy Chakraborty

    Abstract: We present SemEval-2024 Task 10, a shared task centred on identifying emotions and finding the rationale behind their flips within monolingual English and Hindi-English code-mixed dialogues. This task comprises three distinct subtasks - emotion recognition in conversation for code-mixed dialogues, emotion flip reasoning for code-mixed dialogues, and emotion flip reasoning for English dialogues. Pa…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 11 pages, 3 figures, 7 tables

  12. arXiv:2402.03740  [pdf, other]

    cs.SI cs.CY cs.LG

    BotSSCL: Social Bot Detection with Self-Supervised Contrastive Learning

    Authors: Mohammad Majid Akhtar, Navid Shadman Bhuiyan, Rahat Masood, Muhammad Ikram, Salil S. Kanhere

    Abstract: The detection of automated accounts, also known as "social bots", has been an increasingly important concern for online social networks (OSNs). While several methods have been proposed for detecting social bots, significant research gaps remain. First, current models exhibit limitations in detecting sophisticated bots that aim to mimic genuine OSN users. Second, these methods often rely on simplis…

    Submitted 6 February, 2024; originally announced February 2024.

  13. arXiv:2402.02144  [pdf, other]

    cs.CL

    Probing Critical Learning Dynamics of PLMs for Hate Speech Detection

    Authors: Sarah Masud, Mohammad Aflah Khan, Vikram Goyal, Md Shad Akhtar, Tanmoy Chakraborty

    Abstract: Despite the widespread adoption, there is a lack of research into how various critical aspects of pretrained language models (PLMs) affect their performance in hate speech detection. Through five research questions, our findings and recommendations lay the groundwork for empirically investigating different aspects of PLMs' use in hate speech detection. We deep dive into comparing different pretrai…

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: 20 pages, 9 figures, 14 tables. Accepted at EACL'24

  14. arXiv:2401.16785  [pdf, other]

    cs.LG

    HawkEye: Advancing Robust Regression with Bounded, Smooth, and Insensitive Loss Function

    Authors: Mushir Akhtar, M. Tanveer, Mohd. Arshad

    Abstract: Support vector regression (SVR) has garnered significant popularity over the past two decades owing to its wide range of applications across various fields. Despite its versatility, SVR encounters challenges when confronted with outliers and noise, primarily due to the use of the $\varepsilon$-insensitive loss function. To address this limitation, SVR with bounded loss functions has emerged as an…

    Submitted 15 February, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

  15. arXiv:2311.09834  [pdf, other]

    cs.CL

    Overview of the HASOC Subtrack at FIRE 2023: Identification of Tokens Contributing to Explicit Hate in English by Span Detection

    Authors: Sarah Masud, Mohammad Aflah Khan, Md. Shad Akhtar, Tanmoy Chakraborty

    Abstract: As hate speech continues to proliferate on the web, it is becoming increasingly important to develop computational methods to mitigate it. Reactively, using black-box models to identify hateful content can perplex users as to why their posts were automatically flagged as hateful. On the other hand, proactive mitigation can be achieved by suggesting rephrasing before a post is made public. However,…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: 8 pages, 1 figure, 4 Tables

  16. arXiv:2311.07453  [pdf, other]

    cs.CL cs.CV

    ChartCheck: Explainable Fact-Checking over Real-World Chart Images

    Authors: Mubashara Akhtar, Nikesh Subedi, Vivek Gupta, Sahar Tahmasebi, Oana Cocarascu, Elena Simperl

    Abstract: Whilst fact verification has attracted substantial interest in the natural language processing community, verifying misinforming statements against data visualizations such as charts has so far been overlooked. Charts are commonly used in the real world to summarize and communicate key information, but they can also be easily misused to spread misinformation and promote certain agendas. In this pa…

    Submitted 16 February, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

  17. arXiv:2311.02216  [pdf, other]

    cs.CL cs.LG

    Exploring the Numerical Reasoning Capabilities of Language Models: A Comprehensive Analysis on Tabular Data

    Authors: Mubashara Akhtar, Abhilash Shankarampeta, Vivek Gupta, Arpit Patil, Oana Cocarascu, Elena Simperl

    Abstract: Numbers are crucial for various real-world domains such as finance, economics, and science. Thus, understanding and reasoning with numbers are essential skills for language models to solve different tasks. While different numerical benchmarks have been introduced in recent years, they are mostly limited to specific numerical aspects. In this paper, we propose a hierarchical taxonomy for numerical…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: Accepted at EMNLP 2023 (Findings)

  18. arXiv:2310.19717  [pdf, other]

    cs.LG stat.ML

    Support matrix machine: A review

    Authors: Anuradha Kumari, Mushir Akhtar, Rupal Shah, M. Tanveer

    Abstract: Support vector machine (SVM) is one of the most studied paradigms in the realm of machine learning for classification and regression problems. It relies on vectorized input data. However, a significant portion of the real-world data exists in matrix format, which is given as input to SVM by reshaping the matrices into vectors. The process of reshaping disrupts the spatial correlations inherent in…

    Submitted 30 October, 2023; originally announced October 2023.

  19. arXiv:2310.19267  [pdf, other]

    cs.CL

    Overview of the CLAIMSCAN-2023: Uncovering Truth in Social Media through Claim Detection and Identification of Claim Spans

    Authors: Megha Sundriyal, Md Shad Akhtar, Tanmoy Chakraborty

    Abstract: A significant increase in content creation and information exchange has been made possible by the quick development of online social media platforms, which has been very advantageous. However, these platforms have also become a haven for those who disseminate false information, propaganda, and fake news. Claims are essential in forming our perceptions of the world, but sadly, they are frequently u…

    Submitted 30 October, 2023; originally announced October 2023.

  20. arXiv:2310.14206  [pdf, other]

    cs.CL cs.LG

    Manifold-Preserving Transformers are Effective for Short-Long Range Encoding

    Authors: Ayan Sengupta, Md Shad Akhtar, Tanmoy Chakraborty

    Abstract: Multi-head self-attention-based Transformers have shown promise in different learning tasks. Although these models exhibit significant improvement in understanding short-term and long-term contexts from sequences, encoders of Transformers and their variants fail to preserve layer-wise contextual information. Transformers usually project tokens onto sparse manifolds and fail to preserve mathematical…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: 17 pages, 7 figures, 5 tables, Findings of the Association for Computational Linguistics: EMNLP 2023

  21. arXiv:2310.13080  [pdf, other]

    cs.CL cs.AI

    From Multilingual Complexity to Emotional Clarity: Leveraging Commonsense to Unveil Emotions in Code-Mixed Dialogues

    Authors: Shivani Kumar, Ramaneswaran S, Md Shad Akhtar, Tanmoy Chakraborty

    Abstract: Understanding emotions during conversation is a fundamental aspect of human communication, driving NLP research for Emotion Recognition in Conversation (ERC). While considerable research has focused on discerning emotions of individual speakers in monolingual dialogues, understanding the emotional dynamics in code-mixed conversations has received relatively less attention. This motivates our under…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: Paper accepted in EMNLP 2023. 15 pages, 6 figures, 9 tables

  22. arXiv:2309.09274  [pdf, other]

    cs.CL

    Leveraging Social Discourse to Measure Check-worthiness of Claims for Fact-checking

    Authors: Megha Sundriyal, Md Shad Akhtar, Tanmoy Chakraborty

    Abstract: The expansion of online social media platforms has led to a surge in online content consumption. However, this has also paved the way for disseminating false claims and misinformation. As a result, there is an escalating demand for a substantial workforce to sift through and validate such unverified claims. Currently, these claims are manually verified by fact-checkers. Still, the volume of online…

    Submitted 17 September, 2023; originally announced September 2023.

    Comments: 28 pages, 2 figures, 8 tables

  23. arXiv:2309.02915  [pdf, other]

    cs.CL cs.LG

    Persona-aware Generative Model for Code-mixed Language

    Authors: Ayan Sengupta, Md Shad Akhtar, Tanmoy Chakraborty

    Abstract: Code-mixing and script-mixing are prevalent across online social networks and multilingual societies. However, a user's preference toward code-mixing depends on the socioeconomic status, demographics of the user, and the local context, which existing generative models mostly ignore while generating code-mixed texts. In this work, we make a pioneering attempt to develop a persona-aware generative m…

    Submitted 6 September, 2023; originally announced September 2023.

    Comments: 4 tables, 4 figures

  24. arXiv:2309.02250  [pdf, other]

    cs.LG

    RoBoSS: A Robust, Bounded, Sparse, and Smooth Loss Function for Supervised Learning

    Authors: Mushir Akhtar, M. Tanveer, Mohd. Arshad

    Abstract: In the domain of machine learning algorithms, the significance of the loss function is paramount, especially in supervised learning tasks. It serves as a fundamental pillar that profoundly influences the behavior and efficacy of supervised learning algorithms. Traditional loss functions, while widely used, often struggle to handle noisy and high-dimensional data, impede model interpretability, and…

    Submitted 5 September, 2023; originally announced September 2023.

  25. arXiv:2309.01618  [pdf, other]

    cs.CL

    Critical Behavioral Traits Foster Peer Engagement in Online Mental Health Communities

    Authors: Aseem Srivastava, Tanya Gupta, Alison Cerezo, Sarah Peregrine Lord, Md Shad Akhtar, Tanmoy Chakraborty

    Abstract: Online Mental Health Communities (OMHCs), such as Reddit, have witnessed a surge in popularity as go-to platforms for seeking information and support in managing mental health needs. Platforms like Reddit offer immediate interactions with peers, granting users a vital space for seeking mental health assistance. However, the largely unregulated nature of these platforms introduces intricate challen…

    Submitted 4 September, 2023; originally announced September 2023.

  26. arXiv:2308.12497  [pdf, other]

    cs.SI cs.CY cs.LG

    False Information, Bots and Malicious Campaigns: Demystifying Elements of Social Media Manipulations

    Authors: Mohammad Majid Akhtar, Rahat Masood, Muhammad Ikram, Salil S. Kanhere

    Abstract: The rapid spread of false information and persistent manipulation attacks on online social networks (OSNs), often for political, ideological, or financial gain, has affected the openness of OSNs. While researchers from various disciplines have investigated different manipulation-triggering elements of OSNs (such as understanding information diffusion on OSNs or detecting automated behavior of acco…

    Submitted 23 August, 2023; originally announced August 2023.

  27. arXiv:2306.13959  [pdf, other]

    cs.CL cs.AI

    Emotion Flip Reasoning in Multiparty Conversations

    Authors: Shivani Kumar, Shubham Dudeja, Md Shad Akhtar, Tanmoy Chakraborty

    Abstract: In a conversational dialogue, speakers may have different emotional states and their dynamics play an important role in understanding a dialogue's emotional discourse. However, simply detecting emotions is not sufficient to entirely comprehend the speaker-specific changes in emotion that occur during a conversation. To understand the emotional dynamics of speakers in an efficient manner, it is imper…

    Submitted 24 June, 2023; originally announced June 2023.

    Comments: Paper accepted in IEEE Transactions on AI. 12 pages, 5 figures, 11 tables

  28. arXiv:2305.15913  [pdf, other]

    cs.CL cs.CY cs.MM

    MEMEX: Detecting Explanatory Evidence for Memes via Knowledge-Enriched Contextualization

    Authors: Shivam Sharma, Ramaneswaran S, Udit Arora, Md. Shad Akhtar, Tanmoy Chakraborty

    Abstract: Memes are a powerful tool for communication over social media. Their affinity for evolving across politics, history, and sociocultural phenomena makes them an ideal communication vehicle. To comprehend the subtle message conveyed within a meme, one must understand the background that facilitates its holistic assimilation. Besides digital archiving of memes and their metadata by a few websites like…

    Submitted 27 May, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: 9 pages main + 1 ethics + 3 pages ref. + 4 pages app (Total: 17 pages)

  29. arXiv:2305.13776  [pdf, other]

    cs.CL cs.AI

    Counterspeeches up my sleeve! Intent Distribution Learning and Persistent Fusion for Intent-Conditioned Counterspeech Generation

    Authors: Rishabh Gupta, Shaily Desai, Manvi Goel, Anil Bandhakavi, Tanmoy Chakraborty, Md. Shad Akhtar

    Abstract: Counterspeech has been demonstrated to be an efficacious approach for combating hate speech. While various conventional and controlled approaches have been studied in recent years to generate counterspeech, a counterspeech with a certain intent may not be sufficient in every scenario. Due to the complex and multifaceted nature of hate speech, utilizing multiple forms of counter-narratives with var…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: ACL 2023

  30. arXiv:2305.13507  [pdf, other]

    cs.CL cs.AI cs.CV

    Multimodal Automated Fact-Checking: A Survey

    Authors: Mubashara Akhtar, Michael Schlichtkrull, Zhijiang Guo, Oana Cocarascu, Elena Simperl, Andreas Vlachos

    Abstract: Misinformation is often conveyed in multiple modalities, e.g. a miscaptioned image. Multimodal misinformation is perceived as more credible by humans, and spreads faster than its text-only counterparts. While an increasing body of research investigates automated fact-checking (AFC), previous surveys mostly focus on text. In this survey, we conceptualise a framework for AFC including subtasks uniqu…

    Submitted 25 October, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP): Findings

  31. arXiv:2304.08801  [pdf, other]

    cs.CL cs.AI

    Speaker Profiling in Multiparty Conversations

    Authors: Shivani Kumar, Rishabh Gupta, Md Shad Akhtar, Tanmoy Chakraborty

    Abstract: In conversational settings, individuals exhibit unique behaviors, rendering a one-size-fits-all approach insufficient for generating responses by dialogue agents. Although past studies have aimed to create personalized dialogue agents using speaker persona information, they have relied on the assumption that the speaker's persona is already provided. However, this assumption is not always valid, e…

    Submitted 19 April, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

    Comments: 10 pages, 3 figures, 12 tables

  32. arXiv:2301.12729  [pdf, other]

    cs.CL

    Response-act Guided Reinforced Dialogue Generation for Mental Health Counseling

    Authors: Aseem Srivastava, Ishan Pandey, Md. Shad Akhtar, Tanmoy Chakraborty

    Abstract: Virtual Mental Health Assistants (VMHAs) have become a prevalent method for receiving mental health counseling in the digital healthcare space. An assistive counseling conversation commences with natural open-ended topics to familiarize the client with the environment and later converges into more fine-grained domain-specific topics. Unlike other conversational systems, which are categorized as op…

    Submitted 30 January, 2023; originally announced January 2023.

    Comments: This paper has been accepted by The Web Conference (WWW) 2023

  33. arXiv:2301.11843  [pdf, other]

    cs.CL cs.CV

    Reading and Reasoning over Chart Images for Evidence-based Automated Fact-Checking

    Authors: Mubashara Akhtar, Oana Cocarascu, Elena Simperl

    Abstract: Evidence data for automated fact-checking (AFC) can be in multiple modalities such as text, tables, images, audio, or video. While there is increasing interest in using images for AFC, previous works mostly focus on detecting manipulated or fake images. We propose a novel task, chart-based fact-checking, and introduce ChartBERT as the first model for AFC against chart evidence. ChartBERT leverages…

    Submitted 27 January, 2023; originally announced January 2023.

    Comments: Accepted to EACL 2023 (Findings)

  34. arXiv:2301.11219  [pdf, other]

    cs.CL cs.CY

    Characterizing the Entities in Harmful Memes: Who is the Hero, the Villain, the Victim?

    Authors: Shivam Sharma, Atharva Kulkarni, Tharun Suresh, Himanshi Mathur, Preslav Nakov, Md. Shad Akhtar, Tanmoy Chakraborty

    Abstract: Memes can sway people's opinions over social media as they combine visual and textual information in an easy-to-consume manner. Since memes instantly turn viral, it becomes crucial to infer their intent and potentially associated harmfulness to take timely measures as needed. A common problem associated with meme comprehension lies in detecting the entities referenced and characterizing the role o…

    Submitted 10 April, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

    Comments: Accepted at EACL 2023 (Main Track). 9 Pages (main content), Limitations, Ethical Considerations + 4 Pages (Refs.) + Appendix; 8 Figures; 5 Tables; Paper ID: 804

  35. arXiv:2212.00715  [pdf, other]

    cs.CY cs.CL

    What do you MEME? Generating Explanations for Visual Semantic Role Labelling in Memes

    Authors: Shivam Sharma, Siddhant Agarwal, Tharun Suresh, Preslav Nakov, Md. Shad Akhtar, Tanmoy Chakraborty

    Abstract: Memes are powerful means for effective communication on social media. Their effortless amalgamation of viral visuals and compelling messages can have far-reaching implications with proper marketing. Previous research on memes has primarily focused on characterizing their affective spectrum and detecting whether the meme's message insinuates any intended harm, such as hate, offense, racism, etc. Ho…

    Submitted 20 December, 2022; v1 submitted 1 December, 2022; originally announced December 2022.

    Comments: Accepted at AAAI 2023 (Main Track). 7 Pages (main content) + 2 Pages (Refs.); 3 Figures; 6 Tables; Paper ID: 10326 (AAAI'23)

  36. arXiv:2211.11049  [pdf, other]

    cs.CL cs.AI

    Explaining (Sarcastic) Utterances to Enhance Affect Understanding in Multimodal Dialogues

    Authors: Shivani Kumar, Ishani Mondal, Md Shad Akhtar, Tanmoy Chakraborty

    Abstract: Conversations emerge as the primary media for exchanging ideas and conceptions. From the listener's perspective, identifying various affective qualities, such as sarcasm, humour, and emotions, is paramount for comprehending the true connotation of the emitted utterance. However, one of the major hurdles faced in learning these affect dimensions is the presence of figurative language, viz. irony, m…

    Submitted 22 November, 2022; v1 submitted 20 November, 2022; originally announced November 2022.

    Comments: Accepted at AAAI 2023. 11 Pages; 14 Tables; 3 Figures

  37. arXiv:2210.04710  [pdf, other]

    cs.CL

    Empowering the Fact-checkers! Automatic Identification of Claim Spans on Twitter

    Authors: Megha Sundriyal, Atharva Kulkarni, Vaibhav Pulastya, Md Shad Akhtar, Tanmoy Chakraborty

    Abstract: The widespread diffusion of medical and political claims in the wake of COVID-19 has led to a voluminous rise in misinformation and fake news. The current vogue is to employ manual fact-checkers to efficiently classify and verify such data to combat this avalanche of claim-ridden misinformation. However, the rate of information dissemination is such that it vastly outpaces the fact-checkers' stren…

    Submitted 11 October, 2022; v1 submitted 10 October, 2022; originally announced October 2022.

    Comments: Accepted at EMNLP22. 16 pages including Appendix

  38. arXiv:2209.14667  [pdf, other]

    cs.CL cs.AI cs.MM

    Domain-aware Self-supervised Pre-training for Label-Efficient Meme Analysis

    Authors: Shivam Sharma, Mohd Khizir Siddiqui, Md. Shad Akhtar, Tanmoy Chakraborty

    Abstract: Existing self-supervised learning strategies are constrained to either a limited set of objectives or generic downstream tasks that predominantly target uni-modal applications. This has isolated progress for imperative multi-modal applications that are diverse in terms of complexity and domain-affinity, such as meme analysis. Here, we introduce two self-supervised pre-training methods, namely Ext-…

    Submitted 29 September, 2022; originally announced September 2022.

    Comments: Accepted at AACL-IJCNLP 2022 main conference. 9 Pages (main content); 6 Figures; 5 Tables and an Appendix

  39. arXiv:2209.13017  [pdf, ps, other]

    cs.CL cs.LG cs.SI

    Public Wisdom Matters! Discourse-Aware Hyperbolic Fourier Co-Attention for Social-Text Classification

    Authors: Karish Grover, S. M. Phaneendra Angara, Md. Shad Akhtar, Tanmoy Chakraborty

    Abstract: Social media has become the fulcrum of all forms of communication. Classifying social texts such as fake news, rumour, sarcasm, etc. has gained significant attention. The surface-level signals expressed by a social-text itself may not be adequate for such tasks; therefore, recent methods attempted to incorporate other intrinsic signals such as user behavior and the underlying graph structure. Ofte…

    Submitted 11 October, 2022; v1 submitted 15 September, 2022; originally announced September 2022.

    Comments: NeurIPS 2022

  40. arXiv:2209.03162  [pdf, other]

    cs.SI cs.CY cs.LG

    Machine Learning-based Automatic Annotation and Detection of COVID-19 Fake News

    Authors: Mohammad Majid Akhtar, Bibhas Sharma, Ishan Karunanayake, Rahat Masood, Muhammad Ikram, Salil S. Kanhere

    Abstract: COVID-19 impacted every part of the world, although the misinformation about the outbreak traveled faster than the virus. Misinformation spread through online social networks (OSN) often misled people from following correct medical practices. In particular, OSN bots have been a primary source of disseminating false information and initiating cyber propaganda. Existing work neglects the presence of…

    Submitted 7 September, 2022; originally announced September 2022.

  41. arXiv:2207.01494  [pdf, other]

    cs.CY cs.CL

    Auxiliary Task Guided Interactive Attention Model for Question Difficulty Prediction

    Authors: Venktesh V, Md. Shad Akhtar, Mukesh Mohania, Vikram Goyal

    Abstract: Online learning platforms conduct exams to evaluate the learners in a monotonous way, where the questions in the database may be classified into Bloom's Taxonomy as varying levels in complexity from basic knowledge to advanced evaluation. The questions asked in these exams to all learners are very much static. It becomes important to ask new questions with different difficulty levels to each learn…

    Submitted 24 May, 2022; originally announced July 2022.

    Comments: Accepted to AIED 2022 as a full paper

  42. arXiv:2206.04007  [pdf, other

    cs.CL

    Proactively Reducing the Hate Intensity of Online Posts via Hate Speech Normalization

    Authors: Sarah Masud, Manjot Bedi, Mohammad Aflah Khan, Md Shad Akhtar, Tanmoy Chakraborty

    Abstract: Curbing online hate speech has become the need of the hour; however, a blanket ban on such activities is infeasible for several geopolitical and cultural reasons. To reduce the severity of the problem, in this paper, we introduce a novel task, hate speech normalization, that aims to weaken the intensity of hatred exhibited by an online post. The intention of hate speech normalization is not to sup…

    Submitted 8 June, 2022; originally announced June 2022.

    Comments: 11 pages, 4 figures, 12 tables. Accepted at KDD 2022 (ADS Track)

  43. arXiv:2206.03886  [pdf, other

    cs.CL

    Counseling Summarization using Mental Health Knowledge Guided Utterance Filtering

    Authors: Aseem Srivastava, Tharun Suresh, Sarah Peregrine Lord, Md. Shad Akhtar, Tanmoy Chakraborty

    Abstract: The psychotherapy intervention technique is a multifaceted conversation between a therapist and a patient. Unlike general clinical discussions, psychotherapy's core components (viz. symptoms) are hard to distinguish, thus becoming a complex problem to summarize later. A structured counseling conversation may contain discussions about symptoms, history of mental health issues, or the discovery of t…

    Submitted 8 June, 2022; originally announced June 2022.

    Comments: Full paper accepted at KDD 2022 -- ADS Track

  44. arXiv:2205.05738  [pdf, other

    cs.CL cs.AI cs.CV cs.CY cs.MM

    DISARM: Detecting the Victims Targeted by Harmful Memes

    Authors: Shivam Sharma, Md. Shad Akhtar, Preslav Nakov, Tanmoy Chakraborty

    Abstract: Internet memes have emerged as an increasingly popular means of communication on the Web. Although typically intended to elicit humour, they have been increasingly used to spread hatred, trolling, and cyberbullying, as well as to target specific individuals, communities, or society on political, socio-cultural, and psychological grounds. While previous work has focused on detecting harmful, hatefu…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: Accepted at NAACL 2022 (Findings)

  45. arXiv:2205.04274  [pdf, other

    cs.CL cs.AI cs.CV

    Detecting and Understanding Harmful Memes: A Survey

    Authors: Shivam Sharma, Firoj Alam, Md. Shad Akhtar, Dimitar Dimitrov, Giovanni Da San Martino, Hamed Firooz, Alon Halevy, Fabrizio Silvestri, Preslav Nakov, Tanmoy Chakraborty

    Abstract: The automatic identification of harmful content online is of major concern for social media platforms, policymakers, and society. Researchers have studied textual, visual, and audio content, but typically in isolation. Yet, harmful content often combines multiple modalities, as in the case of memes, which are of particular interest due to their viral nature. With this in mind, here we offer a comp…

    Submitted 29 May, 2022; v1 submitted 9 May, 2022; originally announced May 2022.

    Comments: Accepted at IJCAI-ECAI 2022 (Survey Track) - Editorial Feedback Revised, 9 pages (7 main + 2 reference pages)

  46. arXiv:2204.12753  [pdf, other

    cs.CL

    A Comprehensive Understanding of Code-mixed Language Semantics using Hierarchical Transformer

    Authors: Ayan Sengupta, Tharun Suresh, Md Shad Akhtar, Tanmoy Chakraborty

    Abstract: Being a popular mode of text-based communication in multilingual communities, code-mixing in online social media has become an important subject to study. Learning the semantics and morphology of code-mixed language remains a key challenge, due to scarcity of data and unavailability of robust and language-invariant representation learning technique. Any morphologically-rich language can benefit fr…

    Submitted 27 April, 2022; originally announced April 2022.

    Comments: 12 pages, 1 figure, 11 tables

  47. arXiv:2204.02155  [pdf, other

    cs.CL

    Detecting Anchors' Opinion in Hinglish News Delivery

    Authors: Siddharth Sadhwani, Nishant Grover, Md Akhtar, Tanmoy Chakraborty

    Abstract: Humans like to express their opinions and crave the opinions of others. Mining and detecting opinions from various sources are beneficial to individuals, organisations, and even governments. One such organisation is news media, where a general norm is not to showcase opinions from their side. Anchors are the face of the digital media, and it is required for them not to be opinionated. However, at…

    Submitted 5 April, 2022; originally announced April 2022.

  48. arXiv:2203.06419  [pdf, other

    cs.CL cs.AI

    When did you become so smart, oh wise one?! Sarcasm Explanation in Multi-modal Multi-party Dialogues

    Authors: Shivani Kumar, Atharva Kulkarni, Md Shad Akhtar, Tanmoy Chakraborty

    Abstract: Indirect speech such as sarcasm achieves a constellation of discourse goals in human communication. While the indirectness of figurative language warrants speakers to achieve certain pragmatic goals, it is challenging for AI agents to comprehend such idiosyncrasies of human communication. Though sarcasm identification has been a well-explored topic in dialogue analysis, for conversational systems…

    Submitted 12 March, 2022; originally announced March 2022.

    Comments: Accepted in ACL 2022. 13 pages, 4 figures, 12 tables

  49. arXiv:2112.10339  [pdf

    cs.CR cs.HC cs.MM cs.NI

    Smart Home: Application using HTTP and MQTT as Communication Protocols

    Authors: Muneeb Ahmed, Mohd Majid Akhtar

    Abstract: This study discloses the development of a solution for realizing a smart home in the post COVID-19 era using the Internet of Things domain knowledge. The COVID-19 outbreak has been catastrophic and impacted everyone's lives due to its rapid transmission from one body to another. This study aims to reduce virus transmission by eliminating the need to touch any common-point surface in a home, such as sw…

    Submitted 20 December, 2021; originally announced December 2021.

  50. arXiv:2112.04873  [pdf, other

    cs.CL

    Nice perfume. How long did you marinate in it? Multimodal Sarcasm Explanation

    Authors: Poorav Desai, Tanmoy Chakraborty, Md Shad Akhtar

    Abstract: Sarcasm is a pervading linguistic phenomenon and highly challenging to explain due to its subjectivity, lack of context and deeply-felt opinion. In the multimodal setup, sarcasm is conveyed through the incongruity between the text and visual entities. Although recent approaches deal with sarcasm as a classification problem, it is unclear why an online post is identified as sarcastic. Without prope…

    Submitted 9 December, 2021; originally announced December 2021.

    Comments: Accepted for publication in AAAI-2022