Social and Information Networks
- [1] arXiv:2407.12864 [pdf, html, other]
Title: Clustering Time-Evolving Networks Using the Dynamic Graph Laplacian
Subjects: Social and Information Networks (cs.SI); Machine Learning (cs.LG); Dynamical Systems (math.DS); Machine Learning (stat.ML)
Time-evolving graphs arise frequently when modeling complex dynamical systems such as social networks, traffic flow, and biological processes. Developing techniques to identify and analyze communities in these time-varying graph structures is an important challenge. In this work, we generalize existing spectral clustering algorithms from static to dynamic graphs using canonical correlation analysis (CCA) to capture the temporal evolution of clusters. Based on this extended canonical correlation framework, we define the dynamic graph Laplacian and investigate its spectral properties. We connect these concepts to dynamical systems theory via transfer operators, and illustrate the advantages of our method on benchmark graphs by comparison with existing methods. We show that the dynamic graph Laplacian allows for a clear interpretation of cluster structure evolution over time for directed and undirected graphs.
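For orientation, the static spectral clustering that this abstract takes as its starting point can be sketched as follows. This is not the paper's dynamic graph Laplacian or its CCA-based construction. It is a minimal pure-Python illustration on a hypothetical six-node graph (two triangles joined by one edge), and the names `build_adjacency`, `fiedler_vector`, and `two_way_split` are assumptions made for this example. The sketch approximates the eigenvector of the second-smallest eigenvalue of L = D - A by shifted power iteration and splits the nodes by sign.

```python
import math

def build_adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph."""
    adj = [[0] * n for _ in range(n)]
    for i, j in edges:
        adj[i][j] = 1
        adj[j][i] = 1
    return adj

def fiedler_vector(adj, iters=500):
    """Approximate the Fiedler vector of L = D - A (connected graph assumed)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    # 2 * max degree bounds the largest eigenvalue of L, so shift*I - L is
    # positive semidefinite and its top eigenvectors are L's bottom ones.
    shift = 2 * max(deg)
    # Deterministic, zero-mean start (assumed not orthogonal to the target).
    x = [i - (n - 1) / 2 for i in range(n)]
    for _ in range(iters):
        # y = (shift*I - L) x = shift*x - D x + A x
        y = [(shift - deg[i]) * x[i] + sum(adj[i][j] * x[j] for j in range(n))
             for i in range(n)]
        # Project out the constant vector (eigenvalue 0 of L).
        mean = sum(y) / n
        y = [v - mean for v in y]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return x

def two_way_split(adj):
    """Assign each node to cluster 0 or 1 by the sign of its Fiedler entry."""
    return [1 if v > 0 else 0 for v in fiedler_vector(adj)]

# Hypothetical toy graph: triangle {0,1,2}, triangle {3,4,5}, bridge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
labels = two_way_split(build_adjacency(6, edges))
print(labels)
```

The sign split recovers the two triangles because the single bridging edge is the sparsest cut. The paper's contribution concerns how such a split evolves over time, which this static sketch does not attempt.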
- [2] arXiv:2407.12876 [pdf, html, other]
Title: Exploring the Use of Abusive Generative AI Models on Civitai
Comments: Accepted to ACM Multimedia 2024
Subjects: Social and Information Networks (cs.SI); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
The rise of generative AI is transforming the landscape of digital imagery, and exerting a significant influence on online creative communities. This has led to the emergence of AI-Generated Content (AIGC) social platforms, such as Civitai. These distinctive social platforms allow users to build and share their own generative AI models, thereby enhancing the potential for more diverse artistic expression. Designed in the vein of social networks, they also provide artists with the means to showcase their creations (generated from the models), engage in discussions, and obtain feedback, thus nurturing a sense of community. Yet, this openness also raises concerns about the abuse of such platforms, e.g., using models to disseminate deceptive deepfakes or infringe upon copyrights. To explore this, we conduct the first comprehensive empirical study of an AIGC social platform, focusing on its use for generating abusive content. As an exemplar, we construct a comprehensive dataset covering Civitai, the largest available AIGC social platform. Based on this dataset of 87K models and 2M images, we explore the characteristics of content and discuss strategies for moderation to better govern these platforms.
- [3] arXiv:2407.12968 [pdf, html, other]
Title: Multi-Platform Framing Analysis: A Case Study of Kristiansand Quran Burning
Subjects: Social and Information Networks (cs.SI)
The framing of events in various media and discourse spaces is crucial in the era of misinformation and polarization. Many studies, however, are limited to specific media or networks, disregarding the importance of cross-platform diffusion. This study overcomes that limitation by conducting a multi-platform framing analysis of the 2019 Quran burning in Kristiansand, Norway, across Twitter, YouTube, and traditional media. It examines media and policy frames and uncovers network connections through shared URLs. The findings show that online news emphasizes the incident's legality, while social media focuses on its morality, with harsh hate speech prevalent in YouTube comments. Additionally, YouTube is identified as the most self-contained community, whereas Twitter is the most open to external inputs.
- [4] arXiv:2407.13549 [pdf, html, other]
Title: Evaluating the effect of viral news on social media engagement
Authors: Emanuele Sangiorgio, Niccolò Di Marco, Gabriele Etta, Matteo Cinelli, Roy Cerqueti, Walter Quattrociocchi
Subjects: Social and Information Networks (cs.SI)
This study examines Facebook and YouTube content from over a thousand news outlets in four European languages from 2018 to 2023, using a Bayesian structural time-series model to evaluate the impact of viral posts. Our results show that most viral events do not significantly increase engagement and rarely lead to sustained growth. The virality effect usually depends on the engagement trend preceding the viral post, typically reversing it. When news emerges unexpectedly, viral events enhance users' engagement, reactivating the collective response process. In contrast, when virality manifests after a sustained growth phase, it represents the final burst of that growth process, followed by a decline in attention. Moreover, quick viral effects fade faster, while slower processes lead to more persistent growth. These findings highlight the transient effect of viral events and underscore the importance of consistent, steady attention-building strategies to establish a solid connection with the user base rather than relying on sudden visibility spikes.
New submissions for Friday, 19 July 2024 (showing 4 of 4 entries)
- [5] arXiv:2407.13071 (cross-list from cs.CY) [pdf, other]
Title: Analysing the Public Discourse around OpenAI's Text-To-Video Model 'Sora' using Topic Modeling
Subjects: Computers and Society (cs.CY); Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (cs.LG); Social and Information Networks (cs.SI)
The recent introduction of OpenAI's text-to-video model Sora has sparked widespread public discourse across online communities. This study aims to uncover the dominant themes and narratives surrounding Sora by conducting topic modeling analysis on a corpus of 1,827 Reddit comments from five relevant subreddits (r/OpenAI, r/technology, r/singularity, r/vfx, and r/ChatGPT). The comments were collected over a two-month period following Sora's announcement in February 2024. After preprocessing the data, Latent Dirichlet Allocation (LDA) was employed to extract four key topics: 1) AI Impact and Trends in Sora Discussions, 2) Public Opinion and Concerns about Sora, 3) Artistic Expression and Video Creation with Sora, and 4) Sora's Applications in Media and Entertainment. Visualizations including word clouds, bar charts, and t-SNE clustering provided insights into the importance of topic keywords and the distribution of comments across topics. The results highlight prominent narratives around Sora's potential impact on industries and employment, public sentiment and ethical concerns, creative applications, and use cases in the media and entertainment sectors. While limited to Reddit data within a specific timeframe, this study offers a framework for understanding public perceptions of emerging generative AI technologies through online discourse analysis.
- [6] arXiv:2407.13251 (cross-list from cs.LG) [pdf, html, other]
Title: Motif-Consistent Counterfactuals with Adversarial Refinement for Graph-Level Anomaly Detection
Comments: Accepted by KDD 2024
Subjects: Machine Learning (cs.LG); Social and Information Networks (cs.SI)
Graph-level anomaly detection is significant in diverse domains. To improve detection performance, counterfactual graphs have been exploited to benefit the generalization capacity by learning causal relations. Most existing studies directly introduce perturbations (e.g., flipping edges) to generate counterfactual graphs, which tends to alter the semantics of the generated examples and push them off the data manifold, resulting in sub-optimal performance. To address these issues, we propose a novel approach, Motif-consistent Counterfactuals with Adversarial Refinement (MotifCAR), for graph-level anomaly detection. The model combines the motif of one graph, the core subgraph containing the identification (category) information, and the contextual subgraph (non-motif) of another graph to produce a raw counterfactual graph. However, the raw graph produced might be distorted and fail to satisfy the important counterfactual properties: Realism, Validity, Proximity and Sparsity. To this end, we present a Generative Adversarial Network (GAN)-based graph optimizer to refine the raw counterfactual graphs. It uses the discriminator to guide the generator to generate graphs close to realistic data, i.e., to satisfy the Realism property. Further, we design the motif consistency to force the motif of the generated graphs to be consistent with the realistic graphs, satisfying the Validity property. We also devise the contextual loss and connection loss to control the contextual subgraph and the newly added links, satisfying the Proximity and Sparsity properties. As a result, the model can generate high-quality counterfactual graphs. Experiments demonstrate the superiority of MotifCAR.
- [7] arXiv:2407.13566 (cross-list from cs.CY) [pdf, other]
Title: Decentralised Governance for Autonomous Cyber-Physical Systems
Authors: Kelsie Nabben (1), Hongyang Wang (2), Michael Zargham (3) ((1) European University Institute, (2) ETH Zurich, (3) Block Science)
Subjects: Computers and Society (cs.CY); Social and Information Networks (cs.SI); Systems and Control (eess.SY)
This paper examines the potential for Cyber-Physical Systems (CPS) to be governed in a decentralised manner, whereby blockchain-based infrastructure facilitates the communication between digital and physical domains through self-governing and self-organising principles. Decentralised governance paradigms that integrate computation in physical domains (such as 'Decentralised Autonomous Organisations' (DAOs)) represent a novel approach to autonomous governance and operations. These have been described as akin to cybernetic systems. Through the lens of a case study of an autonomous cabin called "no1s1" which demonstrates self-ownership via blockchain-based control and feedback loops, this research explores the potential for blockchain infrastructure to be utilised in the management of physical systems. By highlighting the considerations and challenges of decentralised governance in managing autonomous physical spaces, the study reveals that autonomy in the governance of autonomous CPS is not merely a technological feat but also involves a complex mesh of functional and social dynamics. These findings underscore the importance of developing continuous feedback loops and adaptive governance frameworks within decentralised CPS to address both expected and emergent challenges. This investigation contributes to the fields of infrastructure studies and Cyber-Physical Systems engineering. It also contributes to the discourse on decentralised governance and autonomous management of physical spaces by offering practical insights and providing a framework for future research.
Cross submissions for Friday, 19 July 2024 (showing 3 of 3 entries)
- [8] arXiv:2403.14951 (replaced) [pdf, html, other]
Title: Simple Graph Condensation
Comments: ECML-PKDD 2024
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Social and Information Networks (cs.SI)
The burdensome training costs on large-scale graphs have aroused significant interest in graph condensation, which involves tuning Graph Neural Networks (GNNs) on a small condensed graph for use on the large-scale original graph. Existing methods primarily focus on aligning key metrics between the condensed and original graphs, such as gradients, output distribution and trajectories of GNNs, yielding satisfactory performance on downstream tasks. However, these complex metrics necessitate intricate external parameters and can potentially disrupt the optimization process of the condensation graph, making the condensation process highly demanding and unstable. Motivated by the recent success of simplified models across various domains, we propose a simplified approach to metric alignment in graph condensation, aiming to reduce unnecessary complexity inherited from intricate metrics. We introduce the Simple Graph Condensation (SimGC) framework, which aligns the condensed graph with the original graph from the input layer to the prediction layer, guided by a pre-trained Simple Graph Convolution (SGC) model on the original graph. Importantly, SimGC eliminates external parameters and exclusively retains the target condensed graph during the condensation process. This straightforward yet effective strategy achieves a significant speedup of up to 10 times compared to existing graph condensation methods while performing on par with state-of-the-art baselines. Comprehensive experiments conducted on seven benchmark datasets demonstrate the effectiveness of SimGC in prediction accuracy, condensation time, and generalization capability. Our code is available at this https URL.
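The Simple Graph Convolution (SGC) model mentioned above replaces stacked GNN layers with a fixed feature propagation step followed by a linear classifier. In its usual formulation, the propagated features are S^K X with S = D̃^(-1/2) (A + I) D̃^(-1/2), where D̃ is the degree matrix of A + I. The sketch below illustrates only that propagation step in pure Python. It is not the SimGC condensation procedure, and the function name `sgc_propagate` and the toy triangle graph are assumptions made for this example.

```python
import math

def sgc_propagate(adj, features, k):
    """Return S^k X, with S the symmetrically normalized adjacency of A + I."""
    n = len(adj)
    # Add self-loops so that each node keeps part of its own features.
    a_tilde = [[adj[i][j] + (1 if i == j else 0) for j in range(n)]
               for i in range(n)]
    deg = [sum(row) for row in a_tilde]
    # S[i][j] = A~[i][j] / sqrt(d_i * d_j)
    s = [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    h = [row[:] for row in features]
    n_cols = len(features[0])
    for _ in range(k):
        h = [[sum(s[i][m] * h[m][c] for m in range(n)) for c in range(n_cols)]
             for i in range(n)]
    return h

# Hypothetical toy input: a triangle graph with one scalar feature per node.
triangle = [[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]]
x = [[3.0], [0.0], [0.0]]
smoothed = sgc_propagate(triangle, x, k=1)
print(smoothed)
```

On the triangle every node has the same degree, so one propagation step simply averages the feature over all three nodes. A linear classifier trained on S^K X completes an SGC model; that part is omitted here.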
- [9] arXiv:2405.07764 (replaced) [pdf, html, other]
Title: LGDE: Local Graph-based Dictionary Expansion
Comments: Python code available at: this https URL
Subjects: Computation and Language (cs.CL); Social and Information Networks (cs.SI); Physics and Society (physics.soc-ph)
We present Local Graph-based Dictionary Expansion (LGDE), a method for data-driven discovery of the semantic neighbourhood of words using tools from manifold learning and network science. At the heart of LGDE lies the creation of a word similarity graph from the geometry of word embeddings followed by local community detection based on graph diffusion. The diffusion in the local graph manifold allows the exploration of the complex nonlinear geometry of word embeddings to capture word similarities based on paths of semantic association, over and above direct pairwise similarities. Exploiting such semantic neighbourhoods enables the expansion of dictionaries of pre-selected keywords, an important step for tasks in information retrieval, such as database queries and online data collection. We validate LGDE on a corpus of English-language hate speech-related posts from Reddit and Gab and show that LGDE enriches the list of keywords with significantly better performance than threshold methods based on direct word similarities. We further demonstrate our method through a real-world use case from communication science, where LGDE is evaluated quantitatively on the expansion of a conspiracy-related dictionary from online data collected and analysed by domain experts. Our empirical results and expert user assessment indicate that LGDE expands the seed dictionary with more useful keywords due to the manifold-learning-based similarity network.
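The contrast drawn above between direct pairwise similarity and paths of semantic association can be illustrated with a small sketch. This is not the LGDE implementation: the abstract does not specify the graph construction or the community detection method, so the example swaps in a cosine-threshold graph and a lazy random walk as stand-ins. The words, the two-dimensional embeddings, the thresholds, and the function names are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def similarity_graph(embeddings, threshold):
    """Weighted graph keeping only word pairs with similarity >= threshold."""
    words = list(embeddings)
    graph = {w: {} for w in words}
    for i, w in enumerate(words):
        for other in words[i + 1:]:
            sim = cosine(embeddings[w], embeddings[other])
            if sim >= threshold:
                graph[w][other] = sim
                graph[other][w] = sim
    return graph

def diffuse(graph, seed, steps):
    """Lazy random walk from the seed: half the mass stays, half spreads."""
    mass = {w: 0.0 for w in graph}
    mass[seed] = 1.0
    for _ in range(steps):
        nxt = {w: 0.5 * m for w, m in mass.items()}
        for w, m in mass.items():
            total = sum(graph[w].values())
            if total == 0:
                nxt[w] += 0.5 * m  # isolated node keeps all of its mass
                continue
            for other, sim in graph[w].items():
                nxt[other] += 0.5 * m * sim / total
        mass = nxt
    return mass

def unit(angle_deg):
    """Unit vector at the given angle, used as a toy embedding."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))

# Hypothetical embeddings: "far" is similar to "near" but not to "seed".
embeddings = {"seed": unit(0), "near": unit(40),
              "far": unit(80), "unrelated": unit(180)}
graph = similarity_graph(embeddings, threshold=0.7)

direct = set(graph["seed"])
mass = diffuse(graph, "seed", steps=3)
diffusion = {w for w, m in mass.items() if w != "seed" and m >= 0.05}
print(sorted(direct), sorted(diffusion))
```

In this toy setting the direct threshold finds only "near", while the diffusion also reaches "far" through the intermediate word and still excludes "unrelated".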
- [10] arXiv:2406.07693 (replaced) [pdf, other]
Title: A Labelled Dataset for Sentiment Analysis of Videos on YouTube, TikTok, and Other Sources about the 2024 Outbreak of Measles
Authors: Nirmalya Thakur, Vanessa Su, Mingchen Shao, Kesha A. Patel, Hongseok Jeong, Victoria Knieling, Andrew Bian
Comments: 19 pages
Subjects: Computers and Society (cs.CY); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Social and Information Networks (cs.SI)
This paper presents a dataset containing data on 4011 videos about the ongoing outbreak of measles, published on 264 websites between January 1, 2024, and May 31, 2024. The dataset is available at this https URL. These websites primarily include YouTube and TikTok, which account for 48.6% and 15.2% of the videos, respectively. The remaining websites include Instagram and Facebook as well as the websites of various global and local news organizations. For each video, the URL of the video, the title of the post, the description of the post, and the date of publication of the video are presented as separate attributes in the dataset. After developing this dataset, sentiment analysis (using VADER), subjectivity analysis (using TextBlob), and fine-grain sentiment analysis (using DistilRoBERTa-base) of the video titles and video descriptions were performed. This included classifying each video title and video description into (i) one of the sentiment classes, i.e., positive, negative, or neutral, (ii) one of the subjectivity classes, i.e., highly opinionated, neutral opinionated, or least opinionated, and (iii) one of the fine-grain sentiment classes, i.e., fear, surprise, joy, sadness, anger, disgust, or neutral. These results are presented as separate attributes in the dataset for the training and testing of machine learning algorithms for performing sentiment analysis or subjectivity analysis in this field as well as for other applications. Finally, this paper also presents a list of open research questions that may be investigated using this dataset.