Showing 1–50 of 173 results for author: Ferrara, E

  1. arXiv:2407.07196  [pdf, ps, other]

    cs.HC

    Large Language Models for Wearable Sensor-Based Human Activity Recognition, Health Monitoring, and Behavioral Modeling: A Survey of Early Trends, Datasets, and Challenges

    Authors: Emilio Ferrara

    Abstract: The proliferation of wearable technology enables the generation of vast amounts of sensor data, offering significant opportunities for advancements in health monitoring, activity recognition, and personalized medicine. However, the complexity and volume of this data present substantial challenges in data modeling and analysis, which have been tamed with approaches spanning time series modeling to…

    Submitted 9 July, 2024; originally announced July 2024.

  2. arXiv:2407.01471  [pdf, other]

    cs.SI

    Tracking the 2024 US Presidential Election Chatter on Tiktok: A Public Multimodal Dataset

    Authors: Gabriela Pinto, Charles Bickham, Tanishq Salkar, Luca Luceri, Emilio Ferrara

    Abstract: This paper documents our release of a large-scale data collection of TikTok posts related to the upcoming 2024 U.S. Presidential Election. Our current data comprises 1.8 million videos published between November 1, 2023, and May 26, 2024. Its exploratory analysis identifies the most common keywords, hashtags, and bigrams in both Spanish and English posts, focusing on the election and the two main…

    Submitted 2 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: The 2024 Election Integrity Initiative

    Report number: HUMANS Lab -- Working Paper No. 2024.3

  3. Tracing the Unseen: Uncovering Human Trafficking Patterns in Job Listings

    Authors: Siyi Zhou, Jiankun Peng, Emilio Ferrara

    Abstract: In the shadow of the digital revolution, the insidious issue of human trafficking has found new breeding grounds within the realms of social media and online job boards. Previous research efforts have predominantly centered on identifying victims via the analysis of escort advertisements. However, our work shifts the focus towards enabling a proactive approach: pinpointing potential traffickers be…

    Submitted 18 June, 2024; originally announced June 2024.

  4. arXiv:2406.11553  [pdf, other]

    cs.SI

    The Susceptibility Paradox in Online Social Influence

    Authors: Luca Luceri, Jinyi Ye, Julie Jiang, Emilio Ferrara

    Abstract: Understanding susceptibility to online influence is crucial for mitigating the spread of misinformation and protecting vulnerable audiences. This paper investigates susceptibility to influence within social networks, focusing on the differential effects of influence-driven versus spontaneous behaviors on user content adoption. Our analysis reveals that influence-driven adoption exhibits high homop…

    Submitted 17 June, 2024; originally announced June 2024.

  5. arXiv:2406.01862  [pdf, other]

    cs.CY

    Charting the Landscape of Nefarious Uses of Generative Artificial Intelligence for Online Election Interference

    Authors: Emilio Ferrara

    Abstract: Generative Artificial Intelligence (GenAI) and Large Language Models (LLMs) pose significant risks, particularly in the realm of online election interference. This paper explores the nefarious applications of GenAI, highlighting their potential to disrupt democratic processes through deepfakes, botnets, targeted misinformation campaigns, and synthetic identities.

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: The 2024 Election Integrity Initiative: HUMANS Lab - Working Paper No. 2024.1

  6. arXiv:2404.15457  [pdf, other]

    cs.SI

    Hidden in Plain Sight: Exploring the Intersections of Mental Health, Eating Disorders, and Content Moderation on TikTok

    Authors: Charles Bickham, Kia Kazemi-Nia, Luca Luceri, Kristina Lerman, Emilio Ferrara

    Abstract: Social media platforms actively moderate content glorifying harmful behaviors like eating disorders, which include anorexia and bulimia. However, users have adapted to evade moderation by using coded hashtags. Our study investigates the prevalence of moderation evaders on the popular social media platform TikTok and contrasts their use and emotional valence with mainstream hashtags. We notice that…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 10 pages, 5 figures, 2 tables

  7. arXiv:2402.05904  [pdf, other]

    cs.CL cs.CY cs.HC cs.SI

    FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs

    Authors: Eun Cheol Choi, Emilio Ferrara

    Abstract: Our society is facing rampant misinformation harming public health and trust. To address the societal challenge, we introduce FACT-GPT, a system leveraging Large Language Models (LLMs) to automate the claim matching stage of fact-checking. FACT-GPT, trained on a synthetic dataset, identifies social media content that aligns with, contradicts, or is irrelevant to previously debunked claims. Our eva…

    Submitted 8 February, 2024; originally announced February 2024.

  8. arXiv:2402.05882  [pdf, other]

    cs.SI cs.CY cs.HC

    GET-Tok: A GenAI-Enriched Multimodal TikTok Dataset Documenting the 2022 Attempted Coup in Peru

    Authors: Gabriela Pinto, Keith Burghardt, Kristina Lerman, Emilio Ferrara

    Abstract: TikTok is one of the largest and fastest-growing social media sites in the world. TikTok features, however, such as voice transcripts, are often missing and other important features, such as OCR or video descriptions, do not exist. We introduce the Generative AI Enriched TikTok (GET-Tok) data, a pipeline for collecting TikTok videos and enriched data by augmenting the TikTok Research API with gene…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: Github repository: https://github.com/gabbypinto/GET-Tok-Peru

  9. arXiv:2402.05873  [pdf, other]

    cs.SI cs.CY

    Coordinated Activity Modulates the Behavior and Emotions of Organic Users: A Case Study on Tweets about the Gaza Conflict

    Authors: Priyanka Dey, Luca Luceri, Emilio Ferrara

    Abstract: Social media has become a crucial conduit for the swift dissemination of information during global crises. However, this also paves the way for the manipulation of narratives by malicious actors. This research delves into the interaction dynamics between coordinated (malicious) entities and organic (regular) users on Twitter amidst the Gaza conflict. Through the analysis of approximately 3.5 milli…

    Submitted 8 February, 2024; originally announced February 2024.

  10. arXiv:2402.05865  [pdf, other]

    cs.HC cs.CY

    "Can You Play Anything Else?" Understanding Play Style Flexibility in League of Legends

    Authors: Emily Chen, Alexander Bisberg, Emilio Ferrara

    Abstract: This study investigates the concept of flexibility within League of Legends, a popular online multiplayer game, focusing on the relationship between user adaptability and team success. Utilizing a dataset encompassing players of varying skill levels and play styles, we calculate two measures of flexibility for each player: overall flexibility and temporal flexibility. Our findings suggest that the…

    Submitted 10 July, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  11. arXiv:2401.08789  [pdf, other]

    cs.SI

    Moral Values Underpinning COVID-19 Online Communication Patterns

    Authors: Julie Jiang, Luca Luceri, Emilio Ferrara

    Abstract: The COVID-19 pandemic has triggered profound societal changes, extending beyond its health impacts to the moralization of behaviors. Leveraging insights from moral psychology, this study delves into the moral fabric shaping online discussions surrounding COVID-19 over a span of nearly two years. Our investigation identifies four distinct user groups characterized by differences in morality, politi…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: 11 pages, 8 figures, 2 tables

  12. arXiv:2401.00893  [pdf, other]

    cs.SI cs.AI

    Social-LLM: Modeling User Behavior at Scale using Language Models and Social Network Data

    Authors: Julie Jiang, Emilio Ferrara

    Abstract: The proliferation of social network data has unlocked unprecedented opportunities for extensive, data-driven exploration of human behavior. The structural intricacies of social networks offer insights into various computational social science issues, particularly concerning social influence and information diffusion. However, modeling large-scale social network data comes with computational challe…

    Submitted 31 December, 2023; originally announced January 2024.

    Comments: 10 pages, 5 figures, 2 tables

  13. arXiv:2312.17423  [pdf, other]

    cs.SI

    Social Bots: Detection and Challenges

    Authors: Kai-Cheng Yang, Onur Varol, Alexander C. Nwala, Mohsen Sayyadiharikandeh, Emilio Ferrara, Alessandro Flammini, Filippo Menczer

    Abstract: While social media are a key source of data for computational social science, their ease of manipulation by malicious actors threatens the integrity of online information exchanges and their analysis. In this Chapter, we focus on malicious social bots, a prominent vehicle for such manipulation. We start by discussing recent studies about the presence and actions of social bots in various online di…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: This is a draft of the chapter. The final version will be available in the Handbook of Computational Social Science edited by Taha Yasseri, forthcoming 2024, Edward Elgar Publishing Ltd. The material cannot be used for any other purpose without further permission of the publisher and is for private use only

  14. arXiv:2311.10781  [pdf, other]

    cs.CL cs.AI

    Can Language Model Moderators Improve the Health of Online Discourse?

    Authors: Hyundong Cho, Shuai Liu, Taiwei Shi, Darpan Jain, Basem Rizk, Yuyang Huang, Zixun Lu, Nuan Wen, Jonathan Gratch, Emilio Ferrara, Jonathan May

    Abstract: Conversational moderation of online communities is crucial to maintaining civility for a constructive environment, but it is challenging to scale and harmful to moderators. The inclusion of sophisticated natural language generation modules as a force multiplier to aid human moderators is a tantalizing prospect, but adequate evaluation approaches have so far been elusive. In this paper, we establis…

    Submitted 6 May, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: 9 pages, NAACL 2024 Main

  15. arXiv:2311.09734  [pdf, other]

    cs.CL

    Tracking the Newsworthiness of Public Documents

    Authors: Alexander Spangher, Emilio Ferrara, Ben Welsh, Nanyun Peng, Serdar Tumgoren, Jonathan May

    Abstract: Journalists must find stories in huge amounts of textual data (e.g. leaks, bills, press releases) as part of their jobs: determining when and why text becomes news can help us understand coverage patterns and help us build assistive tools. Yet, this is challenging because very few labelled links exist, language use between corpora is very different, and text may be covered for a variety of reasons…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: 9 pages, 7 pages appendix

  16. arXiv:2311.07816  [pdf, other]

    cs.SI cs.AI

    Leveraging Large Language Models to Detect Influence Campaigns in Social Media

    Authors: Luca Luceri, Eric Boniardi, Emilio Ferrara

    Abstract: Social media influence campaigns pose significant challenges to public discourse and democracy. Traditional detection methods fall short due to the complexity and dynamic nature of social media. Addressing this, we propose a novel detection method using Large Language Models (LLMs) that incorporates both user metadata and network structures. By converting these elements into a text format, our app…

    Submitted 13 November, 2023; originally announced November 2023.

  17. arXiv:2311.05724  [pdf, other]

    cs.SI cs.CY

    Susceptibility to Unreliable Information Sources: Swift Adoption with Minimal Exposure

    Authors: Jinyi Ye, Luca Luceri, Julie Jiang, Emilio Ferrara

    Abstract: Misinformation proliferation on social media platforms is a pervasive threat to the integrity of online public discourse. Genuine users, susceptible to others' influence, often unknowingly engage with, endorse, and re-share questionable pieces of information, collectively amplifying the spread of misinformation. In this study, we introduce an empirical framework to investigate users' susceptibilit…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: Accepted at the 2024 ACM Web Conference

  18. arXiv:2310.09884  [pdf, other]

    cs.SI

    Unmasking the Web of Deceit: Uncovering Coordinated Activity to Expose Information Operations on Twitter

    Authors: Luca Luceri, Valeria Pantè, Keith Burghardt, Emilio Ferrara

    Abstract: Social media platforms, particularly Twitter, have become pivotal arenas for influence campaigns, often orchestrated by state-sponsored information operations (IOs). This paper delves into the detection of key players driving IOs by employing similarity graphs constructed from behavioral pattern data. We unveil that well-known, yet underutilized network properties can help accurately identify coor…

    Submitted 15 October, 2023; originally announced October 2023.

    Comments: Accepted at the 2024 ACM Web Conference

  19. arXiv:2310.09223  [pdf, other]

    cs.CL cs.CY cs.HC

    Automated Claim Matching with Large Language Models: Empowering Fact-Checkers in the Fight Against Misinformation

    Authors: Eun Cheol Choi, Emilio Ferrara

    Abstract: In today's digital era, the rapid spread of misinformation poses threats to public well-being and societal trust. As online misinformation proliferates, manual verification by fact checkers becomes increasingly challenging. We introduce FACT-GPT (Fact-checking Augmentation with Claim matching Task-oriented Generative Pre-trained Transformer), a framework designed to automate the claim matching pha…

    Submitted 13 October, 2023; originally announced October 2023.

  20. arXiv:2310.07779  [pdf, other]

    cs.SI

    Social Approval and Network Homophily as Motivators of Online Toxicity

    Authors: Julie Jiang, Luca Luceri, Joseph B. Walther, Emilio Ferrara

    Abstract: Online hate messaging is a pervasive issue plaguing the well-being of social media users. This research empirically investigates a novel theory positing that online hate may be driven primarily by the pursuit of social approval rather than a direct desire to harm the targets. Results show that toxicity is homophilous in users' social networks and that a user's propensity for hostility can be predi…

    Submitted 29 February, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

  21. arXiv:2310.05189  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Factuality Challenges in the Era of Large Language Models

    Authors: Isabelle Augenstein, Timothy Baldwin, Meeyoung Cha, Tanmoy Chakraborty, Giovanni Luca Ciampaglia, David Corney, Renee DiResta, Emilio Ferrara, Scott Hale, Alon Halevy, Eduard Hovy, Heng Ji, Filippo Menczer, Ruben Miguez, Preslav Nakov, Dietram Scheufele, Shivam Sharma, Giovanni Zagni

    Abstract: The emergence of tools based on Large Language Models (LLMs), such as OpenAI's ChatGPT, Microsoft's Bing Chat, and Google's Bard, has garnered immense public attention. These incredibly useful, natural-sounding tools mark significant advances in natural language generation, yet they exhibit a propensity to generate false, erroneous, or misleading content -- commonly referred to as "hallucinations.…

    Submitted 9 October, 2023; v1 submitted 8 October, 2023; originally announced October 2023.

    Comments: Our article offers a comprehensive examination of the challenges and risks associated with Large Language Models (LLMs), focusing on their potential impact on the veracity of information in today's digital landscape

  22. arXiv:2310.00737  [pdf, other]

    cs.CY cs.AI cs.CL cs.HC

    GenAI Against Humanity: Nefarious Applications of Generative Artificial Intelligence and Large Language Models

    Authors: Emilio Ferrara

    Abstract: Generative Artificial Intelligence (GenAI) and Large Language Models (LLMs) are marvels of technology; celebrated for their prowess in natural language processing and multimodal content generation, they promise a transformative future. But as with all powerful tools, they come with their shadows. Picture living in a world where deepfakes are indistinguishable from reality, where synthetic identiti…

    Submitted 22 January, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

    Comments: Accepted in: Journal of Computational Social Science

    Journal ref: J Comput Soc Sc (2024)

  23. The Butterfly Effect in Artificial Intelligence Systems: Implications for AI Bias and Fairness

    Authors: Emilio Ferrara

    Abstract: The Butterfly Effect, a concept originating from chaos theory, underscores how small changes can have significant and unpredictable impacts on complex systems. In the context of AI fairness and bias, the Butterfly Effect can stem from a variety of sources, such as small biases or skewed data inputs during algorithm development, saddle points in training, or distribution shifts in data between trai…

    Submitted 2 February, 2024; v1 submitted 11 July, 2023; originally announced July 2023.

    Comments: Cite as: Machine Learning with Applications, Volume 15, 2024, 100525 10.1016/j.mlwa.2024.100525

    Journal ref: Machine Learning with Applications, Volume 15, 2024, 100525

  24. arXiv:2305.19230  [pdf, other]

    cs.CL cs.AI

    Controlled Text Generation with Hidden Representation Transformations

    Authors: Vaibhav Kumar, Hana Koorehdavoudi, Masud Moshtaghi, Amita Misra, Ankit Chadha, Emilio Ferrara

    Abstract: We propose CHRT (Control Hidden Representation Transformation) - a controlled language generation framework that steers large language models to generate text pertaining to certain attributes (such as toxicity). CHRT gains attribute control by modifying the hidden representation of the base model through learned transformations. We employ a contrastive-learning framework to learn these transformat…

    Submitted 31 May, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023 as a long paper (Findings)

  25. arXiv:2305.14904  [pdf, other]

    cs.CL cs.AI cs.CY

    Identifying Informational Sources in News Articles

    Authors: Alexander Spangher, Nanyun Peng, Jonathan May, Emilio Ferrara

    Abstract: News articles are driven by the informational sources journalists use in reporting. Modeling when, how and why sources get used together in stories can help us better understand the information we consume and even help journalists with the task of producing it. In this work, we take steps toward this goal by constructing the largest and widest-ranging annotated dataset, to date, of informational s…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: 13 pages

  26. Fairness And Bias in Artificial Intelligence: A Brief Survey of Sources, Impacts, And Mitigation Strategies

    Authors: Emilio Ferrara

    Abstract: The significant advancements in applying Artificial Intelligence (AI) to healthcare decision-making, medical diagnosis, and other domains have simultaneously raised concerns about the fairness and bias of AI systems. This is particularly critical in areas like healthcare, employment, criminal justice, credit scoring, and increasingly, in generative AI models (GenAI) that produce synthetic media. S…

    Submitted 7 December, 2023; v1 submitted 15 April, 2023; originally announced April 2023.

    Journal ref: Sci 2024, 6(1), 3

  27. Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models

    Authors: Emilio Ferrara

    Abstract: As the capabilities of generative language models continue to advance, the implications of biases ingrained within these models have garnered increasing attention from researchers, practitioners, and the broader public. This article investigates the challenges and risks associated with biases in large-scale language models like ChatGPT. We discuss the origins of biases, stemming from, among others…

    Submitted 13 November, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

    Comments: Published on First Monday https://firstmonday.org/ojs/index.php/fm/article/view/13346/11365

    Journal ref: First Monday, Volume 28, Number 11 - 6 November 2023

  28. arXiv:2304.02983  [pdf, other]

    cs.CL cs.SI

    Leveraging Social Interactions to Detect Misinformation on Social Media

    Authors: Tommaso Fornaciari, Luca Luceri, Emilio Ferrara, Dirk Hovy

    Abstract: Detecting misinformation threads is crucial to guarantee a healthy environment on social media. We address the problem using the data set created during the COVID-19 pandemic. It contains cascades of tweets discussing information weakly labeled as reliable or unreliable, based on a previous evaluation of the information source. The models identifying unreliable threads usually rely on textual feat…

    Submitted 6 April, 2023; originally announced April 2023.

  29. arXiv:2304.02800  [pdf, other]

    cs.SI physics.soc-ph

    Unveiling the Dynamics of Censorship, COVID-19 Regulations, and Protest: An Empirical Study of Chinese Subreddit r/china_irl

    Authors: Siyi Zhou, Luca Luceri, Emilio Ferrara

    Abstract: The COVID-19 pandemic has intensified numerous social issues that warrant academic investigation. Although information dissemination has been extensively studied, the silenced voices and censored content also merit attention due to their role in mobilizing social movements. In this paper, we provide empirical evidence to explore the relationships among COVID-19 regulations, censorship, and protest…

    Submitted 5 April, 2023; originally announced April 2023.

  30. arXiv:2304.01371  [pdf, other]

    cs.SI

    The Interconnected Nature of Online Harm and Moderation: Investigating the Cross-Platform Spread of Harmful Content between YouTube and Twitter

    Authors: Valerio La Gatta, Luca Luceri, Francesco Fabbri, Emilio Ferrara

    Abstract: The proliferation of harmful content shared online poses a threat to online information integrity and the integrity of discussion across platforms. Despite various moderation interventions adopted by social media platforms, researchers and policymakers are calling for holistic solutions. This study explores how a target platform could leverage content that has been deemed harmful on a source platf…

    Submitted 6 April, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

    Comments: 14 pages, 8 figures

    ACM Class: H.3.3

  31. Retrieving false claims on Twitter during the Russia-Ukraine conflict

    Authors: Valerio La Gatta, Chiyu Wei, Luca Luceri, Francesco Pierri, Emilio Ferrara

    Abstract: Nowadays, false and unverified information on social media sway individuals' perceptions during major geo-political events and threaten the quality of the whole digital information ecosystem. Since the Russian invasion of Ukraine, several fact-checking organizations have been actively involved in verifying stories related to the conflict that circulated online. In this paper, we leverage a public…

    Submitted 17 March, 2023; originally announced March 2023.

    Comments: 7 pages, 2 figures, WWW23 Companion Proceedings

    ACM Class: H.3.3

  32. Propaganda and Misinformation on Facebook and Twitter during the Russian Invasion of Ukraine

    Authors: Francesco Pierri, Luca Luceri, Nikhil Jindal, Emilio Ferrara

    Abstract: Online social media represent an oftentimes unique source of information, and having access to reliable and unbiased content is crucial, especially during crises and contentious events. We study the spread of propaganda and misinformation that circulated on Facebook and Twitter during the first few months of the Russia-Ukraine conflict. By leveraging two large datasets of millions of social media…

    Submitted 20 February, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

    Comments: Accepted at WebSci'2023

    Journal ref: WebSci '23: Proceedings of the 15th ACM Web Science Conference 2023, April 2023, Pages 65-74

  33. arXiv:2211.11113  [pdf, other]

    cs.SI cs.IR

    From Fake News to #FakeNews: Mining Direct and Indirect Relationships among Hashtags for Fake News Detection

    Authors: Xinyi Zhou, Reza Zafarani, Emilio Ferrara

    Abstract: The COVID-19 pandemic has gained worldwide attention and allowed fake news, such as "COVID-19 is the flu," to spread quickly and widely on social media. Combating this coronavirus infodemic demands effective methods to detect fake news. To this end, we propose a method to infer news credibility from hashtags involved in news dissemination on social media, motivated by the tight connection betwee…

    Submitted 20 November, 2022; originally announced November 2022.

  34. Twitter Spam and False Accounts Prevalence, Detection and Characterization: A Survey

    Authors: Emilio Ferrara

    Abstract: The issue of quantifying and characterizing various forms of social media manipulation and abuse has been at the forefront of the computational social science research community for over a decade. In this paper, I provide a (non-comprehensive) survey of research efforts aimed at estimating the prevalence of spam and false accounts on Twitter, as well as characterizing their use, activity, and beha…

    Submitted 7 February, 2023; v1 submitted 10 November, 2022; originally announced November 2022.

    Comments: Published on First Monday, 27(12), 2022 https://firstmonday.org/ojs/index.php/fm/article/view/12872/10749

  35. Exposing Influence Campaigns in the Age of LLMs: A Behavioral-Based AI Approach to Detecting State-Sponsored Trolls

    Authors: Fatima Ezzeddine, Luca Luceri, Omran Ayoub, Ihab Sbeity, Gianluca Nogara, Emilio Ferrara, Silvia Giordano

    Abstract: The detection of state-sponsored trolls operating in influence campaigns on social media is a critical and unsolved challenge for the research community, which has significant implications beyond the online realm. To address this challenge, we propose a new AI-based solution that identifies troll accounts solely through behavioral cues associated with their sequences of sharing activity, encompass…

    Submitted 11 October, 2023; v1 submitted 17 October, 2022; originally announced October 2022.

    Comments: 22

    Journal ref: EPJ Data Sci. 12, 46 (2023)

  36. arXiv:2209.09339  [pdf, other]

    cs.SI

    Identifying and Characterizing Behavioral Classes of Radicalization within the QAnon Conspiracy on Twitter

    Authors: Emily L. Wang, Luca Luceri, Francesco Pierri, Emilio Ferrara

    Abstract: Social media provide a fertile ground where conspiracy theories and radical ideas can flourish, reach broad audiences, and sometimes lead to hate or violence beyond the online world itself. QAnon represents a notable example of a political conspiracy that started out on social media but turned mainstream, in part due to public endorsement by influential political figures. Nowadays, QAnon conspirac…

    Submitted 6 April, 2023; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: 12 pages, 11 figures, 2 tables

    Journal ref: The 17th International AAAI Conference on Web and Social Media (ICWSM 2023)

  37. How does Twitter account moderation work? Dynamics of account creation and suspension on Twitter during major geopolitical events

    Authors: Francesco Pierri, Luca Luceri, Emily Chen, Emilio Ferrara

    Abstract: Social media moderation policies are often at the center of public debate, and their implementation and enactment are sometimes surrounded by a veil of mystery. Unsurprisingly, due to limited platform transparency and data access, relatively little research has been devoted to characterizing moderation dynamics, especially in the context of controversial events and the platform activity associated…

    Submitted 7 October, 2023; v1 submitted 15 September, 2022; originally announced September 2022.

    Comments: See published version at EPJ Data Science

    Journal ref: EPJ Data Science 2023

  38. arXiv:2208.02932  [pdf, other]

    cs.AI cs.HC cs.LG

    Human Decision Makings on Curriculum Reinforcement Learning with Difficulty Adjustment

    Authors: Yilei Zeng, Jiali Duan, Yang Li, Emilio Ferrara, Lerrel Pinto, C. -C. Jay Kuo, Stefanos Nikolaidis

    Abstract: Human-centered AI considers human experiences with AI performance. While abundant research has been helping AI achieve superhuman performance either by fully automatic or weak supervision learning, fewer endeavors are experimenting with how AI can tailor to humans' preferred skill level given fine-grained input. In this work, we guide the curriculum reinforcement learning results towards a preferr…

    Submitted 4 August, 2022; originally announced August 2022.

    Comments: 6 pages, 7 figures

    ACM Class: I.2.6

  39. GCN-WP -- Semi-Supervised Graph Convolutional Networks for Win Prediction in Esports

    Authors: Alexander J. Bisberg, Emilio Ferrara

    Abstract: Win prediction is crucial to understanding skill modeling, teamwork and matchmaking in esports. In this paper we propose GCN-WP, a semi-supervised win prediction model for esports based on graph convolutional networks. This model learns the structure of an esports league over the course of a season (1 year) and makes predictions on another similar league. This model integrates over 30 features abo…

    Submitted 26 July, 2022; originally announced July 2022.

  40. What are Your Pronouns? Examining Gender Pronoun Usage on Twitter

    Authors: Julie Jiang, Emily Chen, Luca Luceri, Goran Murić, Francesco Pierri, Ho-Chun Herbert Chang, Emilio Ferrara

    Abstract: Stating your gender pronouns, along with your name, is becoming the new norm of self-introductions at school, at the workplace, and online. The increasing prevalence and awareness of nonconforming gender identities put discussions of developing gender-inclusive language at the forefront. This work presents the first empirical research on gender pronoun usage on large-scale social media. Leveraging…

    Submitted 27 October, 2023; v1 submitted 22 July, 2022; originally announced July 2022.

    Comments: 11 pages, 7 figures, 1 table

    Journal ref: 2023 Workshop Proceedings of the 17th International AAAI Conference on Web and Social Media

  41. Geolocated Social Media Posts are Happier: Understanding the Characteristics of Check-in Posts on Twitter

    Authors: Julie Jiang, Jesse Thomason, Francesco Barbieri, Emilio Ferrara

    Abstract: The increasing prevalence of location-sharing features on social media has enabled researchers to ground computational social science research using geolocated data, affording opportunities to study human mobility, the impact of real-world events, and more. This paper analyzes what crucially separates posts with geotags from those without. We find that users who share location are not representati…

    Submitted 13 February, 2023; v1 submitted 22 July, 2022; originally announced July 2022.

    Comments: 11 pages, 10 figures, 2 tables

    Journal ref: 15th ACM Web Science Conference 2023 (WebSci '23)

  42. The Gift That Keeps on Giving: Generosity is Contagious in Multiplayer Online Games

    Authors: Alexander J. Bisberg, Julie Jiang, Yilei Zeng, Emily Chen, Emilio Ferrara

    Abstract: Understanding social interactions and generous behaviors have long been of considerable interest in the social science community. While the contagion of generosity is documented in the real world, less is known about such phenomenon in virtual worlds and whether it has an actionable impact on user behavior and retention. In this work, we analyze social dynamics in the virtual world of the popular…

    Submitted 12 October, 2022; v1 submitted 21 July, 2022; originally announced July 2022.

    Comments: 22 pages, 6 figures, 6 tables. To appear in the Proceedings of the ACM on Human-Computer Interaction (PACM HCI), CSCW 2022

    Journal ref: PACM on Human-Computer Interaction 6, CSCW2, Article 395 (November 2022)

  43. arXiv:2207.08349  [pdf, other]

    cs.SI cs.LG physics.soc-ph

    Retweet-BERT: Political Leaning Detection Using Language Features and Information Diffusion on Social Networks

    Authors: Julie Jiang, Xiang Ren, Emilio Ferrara

    Abstract: Estimating the political leanings of social media users is a challenging and ever more pressing problem given the increase in social media consumption. We introduce Retweet-BERT, a simple and scalable model to estimate the political leanings of Twitter users. Retweet-BERT leverages the retweet network structure and the language used in users' profile descriptions. Our assumptions stem from pattern…

    Submitted 6 April, 2023; v1 submitted 17 July, 2022; originally announced July 2022.

    Comments: 11 pages, 3 figures, 4 tables. arXiv admin note: text overlap with arXiv:2103.10979

    Journal ref: The 17th International AAAI Conference on Web and Social Media (ICWSM 2023)

  44. arXiv:2207.03086  [pdf, other]

    cs.AI

    Word Embedding for Social Sciences: An Interdisciplinary Survey

    Authors: Akira Matsui, Emilio Ferrara

    Abstract: To extract essential information from complex data, computer scientists have been developing machine learning models that learn low-dimensional representation mode. From such advances in machine learning research, not only computer scientists but also social scientists have benefited and advanced their research because human behavior or social phenomena lies in complex data. However, this emerging…

    Submitted 15 June, 2024; v1 submitted 7 July, 2022; originally announced July 2022.

  45. arXiv:2206.09535  [pdf, other]

    cs.HC cs.AI

    Extracting Fast and Slow: User-Action Embedding with Inter-temporal Information

    Authors: Akira Matsui, Emilio Ferrara

    Abstract: With the recent development of technology, data on detailed human temporal behaviors has become available. Many methods have been proposed to mine those human dynamic behavior data and revealed valuable insights for research and businesses. However, most methods analyze only sequence of actions and do not study the inter-temporal information such as the time intervals between actions in a holistic…

    Submitted 19 June, 2022; originally announced June 2022.

  46. arXiv:2205.09693  [pdf, other]

    cs.HC cs.CY

    Individual and Collective Performance Deteriorate in a New Team: A Case Study of CS:GO Tournaments

    Authors: Weiwei Zhang, Goran Muric, Emilio Ferrara

    Abstract: How does the team formation relates to team performance in professional video game playing? This study examined one aspect of group dynamics - team switching - and aims to answer how changing a team affects individual and collective performance in eSports tournaments. In this study we test the hypothesis that switching teams can be detrimental to individual and team performance both in short term…

    Submitted 19 May, 2022; originally announced May 2022.

  47. arXiv:2203.16309  [pdf, other]

    cs.LG cs.AI

    Zero-shot meta-learning for small-scale data from human subjects

    Authors: Julie Jiang, Kristina Lerman, Emilio Ferrara

    Abstract: While developments in machine learning led to impressive performance gains on big data, many human subjects data are, in actuality, small and sparsely labeled. Existing methods applied to such data often do not easily generalize to out-of-sample subjects. Instead, models must make predictions on test data that may be drawn from a different distribution, a problem known as \textit{zero-shot learnin…

    Submitted 1 April, 2023; v1 submitted 29 March, 2022; originally announced March 2022.

    Comments: 10 pages, 7 figures

    Journal ref: The 11th IEEE International Conference on Healthcare Informatics (ICHI 2023)

  48. arXiv:2203.07488  [pdf, other]

    cs.SI cs.CY cs.DL

    Tweets in Time of Conflict: A Public Dataset Tracking the Twitter Discourse on the War Between Ukraine and Russia

    Authors: Emily Chen, Emilio Ferrara

    Abstract: On February 24, 2022, Russia invaded Ukraine. In the days that followed, reports kept flooding in from layman to news anchors of a conflict quickly escalating into war. Russia faced immediate backlash and condemnation from the world at large. While the war continues to contribute to an ongoing humanitarian and refugee crisis in Ukraine, a second battlefield has emerged in the online space, both in…

    Submitted 10 April, 2023; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: Dataset at https://github.com/echen102/ukraine-russia

  49. arXiv:2202.12413  [pdf, other]

    cs.SI cs.CL cs.LG

    Construction of Large-Scale Misinformation Labeled Datasets from Social Media Discourse using Label Refinement

    Authors: Karishma Sharma, Emilio Ferrara, Yan Liu

    Abstract: Malicious accounts spreading misinformation has led to widespread false and misleading narratives in recent times, especially during the COVID-19 pandemic, and social media platforms struggle to eliminate these contents rapidly. This is because adapting to new domains requires human intensive fact-checking that is slow and difficult to scale. To address this challenge, we propose to leverage news-…

    Submitted 24 February, 2022; originally announced February 2022.

    Journal ref: WWW (2022)

  50. Botometer 101: Social bot practicum for computational social scientists

    Authors: Kai-Cheng Yang, Emilio Ferrara, Filippo Menczer

    Abstract: Social bots have become an important component of online social media. Deceptive bots, in particular, can manipulate online discussions of important issues ranging from elections to public health, threatening the constructive exchange of information. Their ubiquity makes them an interesting research subject and requires researchers to properly handle them when conducting studies using social media…

    Submitted 21 August, 2022; v1 submitted 5 January, 2022; originally announced January 2022.

    Comments: 16 pages, 5 figures

    Journal ref: Journal of Computational Social Science (2022)