Improving Contextual Congruence Across Modalities for Effective Multimodal Marketing using Knowledge-infused Learning

Authors: Trilok Padhi, Ugur Kursuncu, Yaman Kumar, Valerie L. Shalin, Lane Peterson Fronczek

Abstract: The prevalence of smart devices with the ability to capture moments in multiple modalities has enabled users to experience multimodal information online. However, large Language (LLMs) and Vision models (LVMs) are still limited in capturing holistic meaning with cross-modal semantic relationships. Without explicit, common sense knowledge (e.g., as a knowledge graph), Visual Language Models (VLMs) only learn implicit representations by capturing high-level patterns in vast corpora, missing essential contextual cross-modal cues. In this work, we design a framework to couple explicit commonsense knowledge in the form of knowledge graphs with large VLMs to improve the performance of a downstream task, predicting the effectiveness of multi-modal marketing campaigns. While the marketing application provides a compelling metric for assessing our methods, our approach enables the early detection of likely persuasive multi-modal campaigns and the assessment and augmentation of marketing theory.

Submitted 5 February, 2024; originally announced February 2024.

ACM Class: I.2.7; I.2.10; I.2.4; I.2.1

arXiv:2104.10788 [pdf, other]

doi 10.1016/j.neucom.2021.11.095

Defining and Detecting Toxicity on Social Media: Context and Knowledge are Key

Authors: Amit Sheth, Valerie L. Shalin, Ugur Kursuncu

Abstract: Online platforms have become an increasingly prominent means of communication. Despite the obvious benefits to the expanded distribution of content, the last decade has resulted in disturbing toxic communication, such as cyberbullying and harassment. Nevertheless, detecting online toxicity is challenging due to its multi-dimensional, context sensitive nature. As exposure to online toxicity can have serious social consequences, reliable models and algorithms are required for detecting and analyzing such communication across the vast and growing space of social media. In this paper, we draw on psychological and social theory to define toxicity. Then, we provide an approach that identifies multiple dimensions of toxicity and incorporates explicit knowledge in a statistical learning algorithm to resolve ambiguity across such dimensions.

Submitted 3 October, 2021; v1 submitted 21 April, 2021; originally announced April 2021.

Journal ref: Neurocomputing. 490 (2022) 312-318

arXiv:2104.04140 [pdf, other]

doi 10.1371/journal.pone.0250448

Characterization of Time-variant and Time-invariant Assessment of Suicidality on Reddit using C-SSRS

Authors: Manas Gaur, Vamsi Aribandi, Amanuel Alambo, Ugur Kursuncu, Krishnaprasad Thirunarayan, Jonanthan Beich, Jyotishman Pathak, Amit Sheth

Abstract: Suicide is the 10th leading cause of death in the U.S (1999-2019). However, predicting when someone will attempt suicide has been nearly impossible. In the modern world, many individuals suffering from mental illness seek emotional support and advice on well-known and easily-accessible social media platforms such as Reddit. While prior artificial intelligence research has demonstrated the ability to extract valuable information from social media on suicidal thoughts and behaviors, these efforts have not considered both severity and temporality of risk. The insights made possible by access to such data have enormous clinical potential - most dramatically envisioned as a trigger to employ timely and targeted interventions (i.e., voluntary and involuntary psychiatric hospitalization) to save lives. In this work, we address this knowledge gap by developing deep learning algorithms to assess suicide risk in terms of severity and temporality from Reddit data based on the Columbia Suicide Severity Rating Scale (C-SSRS). In particular, we employ two deep learning approaches: time-variant and time-invariant modeling, for user-level suicide risk assessment, and evaluate their performance against a clinician-adjudicated gold standard Reddit corpus annotated based on the C-SSRS. Our results suggest that the time-variant approach outperforms the time-invariant method in the assessment of suicide-related ideations and supportive behaviors (AUC:0.78), while the time-invariant model performed better in predicting suicide-related behaviors and suicide attempt (AUC:0.64). The proposed approach can be integrated with clinical diagnostic interviews for improving suicide risk assessments.

Submitted 8 April, 2021; originally announced April 2021.

Comments: 24 Pages, 8 Tables, 6 Figures; Accepted by PLoS One ; One of the two mentioned Datasets in the manuscript has Closed Access. We will make it public after PLoS One produces the manuscript

ACM Class: H.4; I.2; J.3; J.4

arXiv:2008.06465 [pdf, other]

doi 10.1007/978-3-030-60975-7_31

ALONE: A Dataset for Toxic Behavior among Adolescents on Twitter

Authors: Thilini Wijesiriwardene, Hale Inan, Ugur Kursuncu, Manas Gaur, Valerie L. Shalin, Krishnaprasad Thirunarayan, Amit Sheth, I. Budak Arpinar

Abstract: The convenience of social media has also enabled its misuse, potentially resulting in toxic behavior. Nearly 66% of internet users have observed online harassment, and 41% claim personal experience, with 18% facing severe forms of online harassment. This toxic communication has a significant impact on the well-being of young individuals, affecting mental health and, in some cases, resulting in suicide. These communications exhibit complex linguistic and contextual characteristics, making recognition of such narratives challenging. In this paper, we provide a multimodal dataset of toxic social media interactions between confirmed high school students, called ALONE (AdoLescents ON twittEr), along with descriptive explanation. Each instance of interaction includes tweets, images, emoji and related metadata. Our observations show that individual tweets do not provide sufficient evidence for toxic behavior, and meaningful use of context in interactions can enable highlighting or exonerating tweets with purported toxicity.

Submitted 14 August, 2020; originally announced August 2020.

Comments: Accepted: Social Informatics 2020

Journal ref: International Conference on Social Informatics. 12467 (2020) 427-439

arXiv:1912.00512 [pdf, other]

Knowledge Infused Learning (K-IL): Towards Deep Incorporation of Knowledge in Deep Learning

Authors: Ugur Kursuncu, Manas Gaur, Amit Sheth

Abstract: Learning the underlying patterns in data goes beyond instance-based generalization to external knowledge represented in structured graphs or networks. Deep learning that primarily constitutes neural computing stream in AI has shown significant advances in probabilistically learning latent patterns using a multi-layered network of computational nodes (i.e., neurons/hidden units). Structured knowledge that underlies symbolic computing approaches and often supports reasoning, has also seen significant growth in recent years, in the form of broad-based (e.g., DBPedia, Yago) and domain, industry or application specific knowledge graphs. A common substrate with careful integration of the two will raise opportunities to develop neuro-symbolic learning approaches for AI, where conceptual and probabilistic representations are combined. As the incorporation of external knowledge will aid in supervising the learning of features for the model, deep infusion of representational knowledge from knowledge graphs within hidden layers will further enhance the learning process. Although much work remains, we believe that knowledge graphs will play an increasing role in developing hybrid neuro-symbolic intelligent systems (bottom-up deep learning with top-down symbolic computing) as well as in building explainable AI systems for which knowledge graphs will provide scaffolding for punctuating neural computing. In this position paper, we describe our motivation for such a neuro-symbolic approach and framework that combines knowledge graph and neural networks.

Submitted 29 February, 2020; v1 submitted 1 December, 2019; originally announced December 2019.

Journal ref: AAAI Spring Symposium on Combining Machine Learning and Knowledge Engineering in Practice. 1 (2020)

arXiv:1908.06520 [pdf, other]

doi 10.1145/3359253

Modeling Islamist Extremist Communications on Social Media using Contextual Dimensions: Religion, Ideology, and Hate

Authors: Ugur Kursuncu, Manas Gaur, Carlos Castillo, Amanuel Alambo, K. Thirunarayan, Valerie Shalin, Dilshod Achilov, I. Budak Arpinar, Amit Sheth

Abstract: Terror attacks have been linked in part to online extremist content. Although tens of thousands of Islamist extremism supporters consume such content, they are a small fraction relative to peaceful Muslims. The efforts to contain the ever-evolving extremism on social media platforms have remained inadequate and mostly ineffective. Divergent extremist and mainstream contexts challenge machine interpretation, with a particular threat to the precision of classification algorithms. Our context-aware computational approach to the analysis of extremist content on Twitter breaks down this persuasion process into building blocks that acknowledge inherent ambiguity and sparsity that likely challenge both manual and automated classification. We model this process using a combination of three contextual dimensions -- religion, ideology, and hate -- each elucidating a degree of radicalization and highlighting independent features to render them computationally accessible. We utilize domain-specific knowledge resources for each of these contextual dimensions such as Qur'an for religion, the books of extremist ideologues and preachers for political ideology and a social media hate speech corpus for hate. Our study makes three contributions to reliable analysis: (i) Development of a computational approach rooted in the contextual dimensions of religion, ideology, and hate that reflects strategies employed by online Islamist extremist groups, (ii) An in-depth analysis of relevant tweet datasets with respect to these dimensions to exclude likely mislabeled users, and (iii) A framework for understanding online radicalization as a process to assist counter-programming. Given the potentially significant social impact, we evaluate the performance of our algorithms to minimize mislabeling, where our approach outperforms a competitive baseline by 10.2% in precision.

Submitted 5 October, 2020; v1 submitted 18 August, 2019; originally announced August 2019.

Comments: 22 pages

Journal ref: Proceedings of the ACM on Human-Computer Interaction. 3 (2019)

arXiv:1806.06813 [pdf, other]

doi 10.1109/WI.2018.00-50

"What's ur type?" Contextualized Classification of User Types in Marijuana-related Communications using Compositional Multiview Embedding

Authors: Ugur Kursuncu, Manas Gaur, Usha Lokala, Anurag Illendula, Krishnaprasad Thirunarayan, Raminta Daniulaityte, Amit Sheth, I. Budak Arpinar

Abstract: With 93% of pro-marijuana population in US favoring legalization of medical marijuana, high expectations of a greater return for Marijuana stocks, and public actively sharing information about medical, recreational and business aspects related to marijuana, it is no surprise that marijuana culture is thriving on Twitter. After the legalization of marijuana for recreational and medical purposes in 29 states, there has been a dramatic increase in the volume of drug-related communication on Twitter. Specifically, Twitter accounts have been established for promotional and informational purposes, some prominent among them being American Ganja, Medical Marijuana Exchange, and Cannabis Now. Identification and characterization of different user types can allow us to conduct more fine-grained spatiotemporal analysis to identify dominant or emerging topics in the echo chambers of marijuana-related communities on Twitter. In this research, we mainly focus on classifying Twitter accounts created and run by ordinary users, retailers, and informed agencies. Classifying user accounts by type can enable better capturing and highlighting of aspects such as trending topics, business profiling of marijuana companies, and state-specific marijuana policymaking. Furthermore, type-based analysis can provide more profound understanding and reliable assessment of the implications of marijuana-related communications. We developed a comprehensive approach to classifying users by their types on Twitter through contextualization of their marijuana-related conversations. We accomplished this using compositional multiview embedding synthesized from People, Content, and Network views achieving 8% improvement over the empirical baseline.

Submitted 18 June, 2018; originally announced June 2018.

Journal ref: IEEE/WIC/ACM International Conference on Web Intelligence. (2018)

arXiv:1806.02377 [pdf, other]

doi 10.1007/978-3-319-94105-9_4

Predictive Analysis on Twitter: Techniques and Applications

Authors: Ugur Kursuncu, Manas Gaur, Usha Lokala, Krishnaprasad Thirunarayan, Amit Sheth, I. Budak Arpinar

Abstract: Predictive analysis of social media data has attracted considerable attention from the research community as well as the business world because of the essential and actionable information it can provide. Over the years, extensive experimentation and analysis for insights have been carried out using Twitter data in various domains such as healthcare, public health, politics, social sciences, and demographics. In this chapter, we discuss techniques, approaches and state-of-the-art applications of predictive analysis of Twitter data. Specifically, we present fine-grained analysis involving aspects such as sentiment, emotion, and the use of domain knowledge in the coarse-grained analysis of Twitter data for making decisions and taking actions, and relate a few success stories.

Submitted 6 June, 2018; originally announced June 2018.

Journal ref: Emerging Research Challenges and Opportunities in Computational Social Network Analysis and Mining. (2019) 67-104

Showing 1–8 of 8 results for author: Kursuncu, U