-
A Large-scale Universal Evaluation Benchmark For Face Forgery Detection
Authors:
Yijun Bei,
Hengrui Lou,
Jinsong Geng,
Erteng Liu,
Lechao Cheng,
Jie Song,
Mingli Song,
Zunlei Feng
Abstract:
With the rapid development of AI-generated content (AIGC) technology, the production of realistic fake facial images and videos that deceive human visual perception has become possible. Consequently, various face forgery detection techniques have been proposed to identify such fake facial content. However, evaluating the effectiveness and generalizability of these detection techniques remains a significant challenge. To address this, we have constructed a large-scale evaluation benchmark called DeepFaceGen, aimed at quantitatively assessing the effectiveness of face forgery detection and facilitating the iterative development of forgery detection technology. DeepFaceGen consists of 776,990 real face image/video samples and 773,812 face forgery image/video samples, generated using 34 mainstream face generation techniques. During the construction process, we carefully consider important factors such as content diversity, fairness across ethnicities, and availability of comprehensive labels, in order to ensure the versatility and convenience of DeepFaceGen. Subsequently, DeepFaceGen is employed in this study to evaluate and analyze the performance of 13 mainstream face forgery detection techniques from various perspectives. Through extensive experimental analysis, we derive significant findings and propose potential directions for future research. The code and dataset for DeepFaceGen are available at https://github.com/HengruiLou/DeepFaceGen.
Submitted 13 June, 2024; v1 submitted 13 June, 2024;
originally announced June 2024.
-
Better Late Than Never: Formulating and Benchmarking Recommendation Editing
Authors:
Chengyu Lai,
Sheng Zhou,
Zhimeng Jiang,
Qiaoyu Tan,
Yuanchen Bei,
Jiawei Chen,
Ningyu Zhang,
Jiajun Bu
Abstract:
Recommendation systems play a pivotal role in suggesting items to users based on their preferences. However, in online platforms, these systems inevitably offer unsuitable recommendations due to limited model capacity, poor data quality, or evolving user interests. Enhancing user experience necessitates efficiently rectifying such unsuitable recommendation behaviors. This paper introduces a novel and significant task termed recommendation editing, which focuses on modifying known and unsuitable recommendation behaviors. Specifically, this task aims to adjust the recommendation model to eliminate known unsuitable items without accessing training data or retraining the model. We formally define the problem of recommendation editing with three primary objectives: strict rectification, collaborative rectification, and concentrated rectification. Three evaluation metrics are developed to quantitatively assess the achievement of each objective. We present a straightforward yet effective benchmark for recommendation editing using a novel Editing Bayesian Personalized Ranking Loss. To demonstrate the effectiveness of the proposed method, we establish a comprehensive benchmark that incorporates various methods from related fields. The codebase is available at https://github.com/cycl2018/Recommendation-Editing.
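For background on the loss family named in this abstract, the following is a minimal sketch of the standard pairwise Bayesian Personalized Ranking (BPR) loss only. The abstract does not specify how the Editing BPR loss modifies it, so nothing here should be read as the authors' method; the function name `bpr_loss` and the scalar-score interface are illustrative assumptions.

```python
import math


def bpr_loss(pos_score: float, neg_score: float) -> float:
    """Standard BPR pairwise loss: -log(sigmoid(pos_score - neg_score)).

    pos_score: model score of the item that should rank higher.
    neg_score: model score of the item that should rank lower.
    (Hypothetical interface; real systems compute this over batches.)
    """
    diff = pos_score - neg_score
    # -log(sigmoid(d)) equals log(1 + exp(-d)); branch to avoid overflow.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


# The loss is small when the preferred item already outranks the other one,
# and large when the ordering is wrong.
well_ordered = bpr_loss(2.0, 0.0)
mis_ordered = bpr_loss(0.0, 2.0)
```

In an editing setting, one plausible reading is that a known unsuitable item plays the role of the lower-ranked item in such a pair, but that pairing is an assumption rather than something the abstract states.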
Submitted 6 June, 2024;
originally announced June 2024.
-
Revisiting the Message Passing in Heterophilous Graph Neural Networks
Authors:
Zhuonan Zheng,
Yuanchen Bei,
Sheng Zhou,
Yao Ma,
Ming Gu,
HongJia XU,
Chengyu Lai,
Jiawei Chen,
Jiajun Bu
Abstract:
Graph Neural Networks (GNNs) have demonstrated strong performance in graph mining tasks due to their message-passing mechanism, which is aligned with the homophily assumption that adjacent nodes exhibit similar behaviors. However, in many real-world graphs, connected nodes may display contrasting behaviors, termed heterophilous patterns, which have attracted increased interest in heterophilous GNNs (HTGNNs). Although the message-passing mechanism seems unsuitable for heterophilous graphs due to the propagation of class-irrelevant information, it is still widely used in many existing HTGNNs and consistently achieves notable success. This raises the question: why does message passing remain effective on heterophilous graphs? To answer this question, in this paper, we revisit the message-passing mechanisms in heterophilous graph neural networks and reformulate them into a unified heterophilous message-passing (HTMP) mechanism. Based on HTMP and empirical analysis, we reveal that the success of message passing in existing HTGNNs is attributed to implicitly enhancing the compatibility matrix among classes. Moreover, we argue that the full potential of the compatibility matrix is not completely achieved due to the existence of incomplete and noisy semantic neighborhoods in real-world heterophilous graphs. To bridge this gap, we introduce a new approach named CMGNN, which operates within the HTMP mechanism to explicitly leverage and improve the compatibility matrix. A thorough evaluation involving 10 benchmark datasets and comparative analysis against 13 well-established baselines highlights the superior performance of the HTMP mechanism and CMGNN method.
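To make the term "compatibility matrix among classes" concrete, here is a small sketch of one common empirical definition: entry (i, j) is the fraction of edge endpoints leaving class-i nodes that land on class-j nodes. This definition, the function name `compatibility_matrix`, and the toy graph are assumptions for illustration; the abstract does not say which definition the paper uses.

```python
def compatibility_matrix(edges, labels, num_classes):
    """Row-normalised class-to-class edge statistics of an undirected graph.

    edges: iterable of (u, v) node-index pairs, each undirected edge listed once.
    labels: list where labels[n] is the class index of node n.
    Returns a num_classes x num_classes list of lists; row i sums to 1
    unless class i has no incident edges, in which case it is all zeros.
    """
    counts = [[0] * num_classes for _ in range(num_classes)]
    for u, v in edges:
        # Count each undirected edge in both directions.
        counts[labels[u]][labels[v]] += 1
        counts[labels[v]][labels[u]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Toy path graph 0-1-2-3 with alternating classes: every edge joins
# nodes of different classes, so the matrix is purely off-diagonal.
toy_edges = [(0, 1), (1, 2), (2, 3)]
toy_labels = [0, 1, 0, 1]
toy_matrix = compatibility_matrix(toy_edges, toy_labels, num_classes=2)
```

Under this definition a strongly homophilous graph concentrates mass on the diagonal, while a heterophilous graph such as the toy example places it off the diagonal.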
Submitted 27 May, 2024;
originally announced May 2024.
-
Guarding Graph Neural Networks for Unsupervised Graph Anomaly Detection
Authors:
Yuanchen Bei,
Sheng Zhou,
Jinke Shi,
Yao Ma,
Haishuai Wang,
Jiajun Bu
Abstract:
Unsupervised graph anomaly detection aims at identifying rare patterns that deviate from the majority in a graph without the aid of labels, which is important for a variety of real-world applications. Recent advances have utilized Graph Neural Networks (GNNs) to learn effective node representations by aggregating information from neighborhoods. This is motivated by the hypothesis that nodes in the graph tend to exhibit consistent behaviors with their neighborhoods. However, such consistency can be disrupted by graph anomalies in multiple ways. Most existing methods directly employ GNNs to learn representations, disregarding the negative impact of graph anomalies on GNNs, resulting in sub-optimal node representations and anomaly detection performance. While a few recent approaches have redesigned GNNs for graph anomaly detection under semi-supervised label guidance, how to address the adverse effects of graph anomalies on GNNs in unsupervised scenarios and learn effective representations for anomaly detection are still under-explored. To bridge this gap, in this paper, we propose a simple yet effective framework for Guarding Graph Neural Networks for Unsupervised Graph Anomaly Detection (G3AD). Specifically, G3AD introduces two auxiliary networks along with correlation constraints to guard the GNNs from inconsistent information encoding. Furthermore, G3AD introduces an adaptive caching module to guard the GNNs from solely reconstructing the observed data that contains anomalies. Extensive experiments demonstrate that our proposed G3AD can outperform seventeen state-of-the-art methods on both synthetic and real-world datasets.
Submitted 25 April, 2024;
originally announced April 2024.
-
Metamorpheus: Interactive, Affective, and Creative Dream Narration Through Metaphorical Visual Storytelling
Authors:
Qian Wan,
Xin Feng,
Yining Bei,
Zhiqi Gao,
Zhicong Lu
Abstract:
Human emotions are essentially molded by lived experiences, from which we construct personalised meaning. Engagement in such meaning-making processes has been practiced as an intervention in various psychotherapies to promote wellness. Nevertheless, supporting the recollection and recounting of lived experiences in everyday life remains underexplored in HCI. It also remains unknown how technologies such as generative AI models can facilitate the meaning-making process, and ultimately support affective mindfulness. In this paper we present Metamorpheus, an affective interface that engages users in a creative visual storytelling of emotional experiences during dreams. Metamorpheus arranges the storyline based on a dream's emotional arc, and provokes self-reflection through the creation of metaphorical images and text depictions. The system provides metaphor suggestions, and generates visual metaphors and text depictions using generative AI models, while users can apply generations to recolour and re-arrange the interface to be visually affective. Our experience-centred evaluation shows that, by interacting with Metamorpheus, users can recall their dreams in vivid detail, through which they relive and reflect upon their experiences in a meaningful way.
Submitted 1 March, 2024;
originally announced March 2024.
-
Large Language Model Interaction Simulator for Cold-Start Item Recommendation
Authors:
Feiran Huang,
Zhenghang Yang,
Junyi Jiang,
Yuanchen Bei,
Yijie Zhang,
Hao Chen
Abstract:
Recommending cold items is a long-standing challenge for collaborative filtering models because these cold items lack historical user interactions to model their collaborative features. The gap between the content of cold items and their behavior patterns makes it difficult to generate accurate behavioral embeddings for cold items. Existing cold-start models use mapping functions to generate fake behavioral embeddings based on the content features of cold items. However, these generated embeddings differ significantly from the real behavioral embeddings, which harms cold recommendation performance. To address this challenge, we propose an LLM Interaction Simulator (LLM-InS) to model users' behavior patterns based on the content aspect. This simulator allows recommender systems to simulate vivid interactions for each cold item and transform them from cold to warm items directly. Specifically, we outline the design and training process of a tailored LLM-simulator that can simulate the behavioral patterns of users and items. Additionally, we introduce an efficient "filtering-and-refining" approach to take full advantage of the simulation power of the LLMs. Finally, we propose an updating method to update the embeddings of the items. We train both cold and warm items in a unified manner within a recommender model based on the simulated and real interactions. Extensive experiments using real behavioral embeddings demonstrate that our proposed model, LLM-InS, outperforms nine state-of-the-art cold-start methods and three LLM models in cold-start item recommendations.
Submitted 14 February, 2024;
originally announced February 2024.
-
Multi-Behavior Collaborative Filtering with Partial Order Graph Convolutional Networks
Authors:
Yijie Zhang,
Yuanchen Bei,
Hao Chen,
Qijie Shen,
Zheng Yuan,
Huan Gong,
Senzhang Wang,
Feiran Huang,
Xiao Huang
Abstract:
Representing information of multiple behaviors in the single graph collaborative filtering (CF) vector has been a long-standing challenge. This is because different behaviors naturally form separate behavior graphs and learn separate CF embeddings. Existing models merge the separate embeddings by appointing the CF embeddings for some behaviors as the primary embedding and utilizing other auxiliaries to enhance the primary embedding. However, this approach often results in the joint embedding performing well on the main tasks but poorly on the auxiliary ones. To address the problem arising from the separate behavior graphs, we propose the concept of Partial Order Recommendation Graphs (POG). POG defines the partial order relation of multiple behaviors and models behavior combinations as weighted edges to merge separate behavior graphs into a joint POG. Theoretical proof verifies that POG can be generalized to any given set of multiple behaviors. Based on POG, we propose the tailored Partial Order Graph Convolutional Networks (POGCN) that convolute neighbors' information while considering the behavior relations between users and items. POGCN also introduces a partial-order BPR sampling strategy for efficient and effective multiple-behavior CF training. POGCN has been successfully deployed on the homepage of Alibaba for two months, providing recommendation services for over one billion users. Extensive offline experiments conducted on three public benchmark datasets demonstrate that POGCN outperforms state-of-the-art multi-behavior baselines across all types of behaviors. Furthermore, online A/B tests confirm the superiority of POGCN in billion-scale recommender systems.
Submitted 20 June, 2024; v1 submitted 12 February, 2024;
originally announced February 2024.
-
Macro Graph Neural Networks for Online Billion-Scale Recommender Systems
Authors:
Hao Chen,
Yuanchen Bei,
Qijie Shen,
Yue Xu,
Sheng Zhou,
Wenbing Huang,
Feiran Huang,
Senzhang Wang,
Xiao Huang
Abstract:
Predicting Click-Through Rate (CTR) in billion-scale recommender systems poses a long-standing challenge for Graph Neural Networks (GNNs) due to the overwhelming computational complexity involved in aggregating billions of neighbors. To tackle this, GNN-based CTR models usually sample hundreds of neighbors out of the billions to facilitate efficient online recommendations. However, sampling only a small portion of neighbors results in a severe sampling bias and the failure to encompass the full spectrum of user or item behavioral patterns. To address this challenge, we refer to the conventional user-item recommendation graph as the "micro recommendation graph" and introduce a more suitable MAcro Recommendation Graph (MAG) for billion-scale recommendations. MAG resolves the computational complexity problems in the infrastructure by reducing the node count from billions to hundreds. Specifically, MAG groups micro nodes (users and items) with similar behavior patterns to form macro nodes. Subsequently, we introduce tailored Macro Graph Neural Networks (MacGNN) to aggregate information on a macro level and revise the embeddings of macro nodes. MacGNN has already served Taobao's homepage feed for two months, providing recommendations for over one billion users. Extensive offline experiments on three public benchmark datasets and an industrial dataset show that MacGNN significantly outperforms twelve CTR baselines while remaining computationally efficient. In addition, online A/B tests confirm MacGNN's superiority in billion-scale recommender systems.
Submitted 8 May, 2024; v1 submitted 26 January, 2024;
originally announced January 2024.
-
Acoustic Three-dimensional Chern Insulators with Arbitrary Chern Vectors
Authors:
Yang Linyun,
Xi Xiang,
Meng Yan,
Zhu Zhenxiao,
Wu Ying,
Chen Jingming,
Cheng Minqi,
Xiang Kexin,
Shum Perry Ping,
Yang Yihao,
Chen Hongsheng,
Li Jian,
Yan Bei,
Liu Gui-Geng,
Zhang Baile,
Gao Zhen
Abstract:
The Chern vector is a vectorial generalization of the scalar Chern number, being able to characterize the topological phase of three-dimensional (3D) Chern insulators. Such a vectorial generalization extends the applicability of Chern-type bulk-boundary correspondence from one-dimensional (1D) edge states to two-dimensional (2D) surface states, whose unique features, such as forming nontrivial torus knots or links in the surface Brillouin zone, have been demonstrated recently in 3D photonic crystals. However, since it is still unclear how to achieve an arbitrary Chern vector, so far the surface-state torus knots or links can emerge, not on the surface of a single crystal as in other 3D topological phases, but only along an internal domain wall between two crystals with perpendicular Chern vectors. Here, we extend the 3D Chern insulator phase to acoustic crystals for sound waves, and propose a scheme to construct an arbitrary Chern vector that allows the emergence of surface-state torus knots or links on the surface of a single crystal. These results provide a complete picture of bulk-boundary correspondence for Chern vectors, and may find use in novel applications in topological acoustics.
Submitted 13 January, 2024;
originally announced January 2024.
-
Reinforcement Neighborhood Selection for Unsupervised Graph Anomaly Detection
Authors:
Yuanchen Bei,
Sheng Zhou,
Qiaoyu Tan,
Hao Xu,
Hao Chen,
Zhao Li,
Jiajun Bu
Abstract:
Unsupervised graph anomaly detection is crucial for various practical applications as it aims to identify anomalies in a graph that exhibit rare patterns deviating significantly from the majority of nodes. Recent advancements have utilized Graph Neural Networks (GNNs) to learn high-quality node representations for anomaly detection by aggregating information from neighborhoods. However, the presence of anomalies may render the observed neighborhood unreliable and result in misleading information aggregation for node representation learning. Selecting the proper neighborhood is critical for graph anomaly detection but also challenging due to the absence of anomaly-oriented guidance and the interdependence with representation learning. To address these issues, we utilize the advantages of reinforcement learning in adaptively learning in complex environments and propose a novel method that incorporates Reinforcement neighborhood selection for unsupervised graph ANomaly Detection (RAND). RAND begins by enriching the candidate neighbor pool of the given central node with multiple types of indirect neighbors. Next, RAND designs a tailored reinforcement anomaly evaluation module to assess the reliability and reward of considering the given neighbor. Finally, RAND selects the most reliable subset of neighbors based on these rewards and introduces an anomaly-aware aggregator to amplify messages from reliable neighbors while diminishing messages from unreliable ones. Extensive experiments on three synthetic and two real-world datasets demonstrate that RAND outperforms the state-of-the-art methods.
Submitted 9 December, 2023;
originally announced December 2023.
-
Alleviating Behavior Data Imbalance for Multi-Behavior Graph Collaborative Filtering
Authors:
Yijie Zhang,
Yuanchen Bei,
Shiqi Yang,
Hao Chen,
Zhiqing Li,
Lijia Chen,
Feiran Huang
Abstract:
Graph collaborative filtering, which learns user and item representations through message propagation over the user-item interaction graph, has been shown to effectively enhance recommendation performance. However, most current graph collaborative filtering models mainly construct the interaction graph on a single behavior domain (e.g. click), even though users exhibit various types of behaviors on real-world platforms, including actions like click, cart, and purchase. Furthermore, due to variations in user engagement, there exists an imbalance in the scale of different types of behaviors. For instance, users may click and view multiple items but only make selective purchases from a small subset of them. How to alleviate the behavior imbalance problem and utilize information from the multiple behavior graphs concurrently to improve the target behavior conversion (e.g. purchase) remains underexplored. To this end, we propose IMGCF, a simple but effective model to alleviate behavior data imbalance for multi-behavior graph collaborative filtering. Specifically, IMGCF utilizes a multi-task learning framework for collaborative filtering on multi-behavior graphs. Then, to mitigate the data imbalance issue, IMGCF improves representation learning on the sparse behavior by leveraging representations learned from the behavior domain with abundant data volumes. Experiments on two widely-used multi-behavior datasets demonstrate the effectiveness of IMGCF.
Submitted 12 November, 2023;
originally announced November 2023.
-
Modeling Spatiotemporal Periodicity and Collaborative Signal for Local-Life Service Recommendation
Authors:
Huixuan Chi,
Hao Xu,
Mengya Liu,
Yuanchen Bei,
Sheng Zhou,
Danyang Liu,
Mengdi Zhang
Abstract:
Online local-life service platforms provide services like nearby daily essentials and food delivery for hundreds of millions of users. Different from other types of recommender systems, local-life service recommendation has the following characteristics: (1) spatiotemporal periodicity, which means a user's preferences for items vary across locations and times; and (2) spatiotemporal collaborative signal, which indicates that similar users have similar preferences at specific locations and times. However, most existing methods either focus merely on the spatiotemporal contexts in sequences, or model the user-item interactions without spatiotemporal contexts in graphs. To address this issue, we design a new method named SPCS in this paper. Specifically, we propose a novel spatiotemporal graph transformer (SGT) layer, which explicitly encodes relative spatiotemporal contexts, and aggregates the information from multi-hop neighbors to unify spatiotemporal periodicity and collaborative signal. With extensive experiments on both public and industrial datasets, this paper validates the state-of-the-art performance of SPCS.
Submitted 21 September, 2023;
originally announced September 2023.
-
Co-movement Pattern Mining from Videos
Authors:
Dongxiang Zhang,
Teng Ma,
Junnan Hu,
Yijun Bei,
Kian-Lee Tan,
Gang Chen
Abstract:
Co-movement pattern mining from GPS trajectories has been an intriguing subject in spatial-temporal data mining. In this paper, we extend this research line by migrating the data source from GPS sensors to surveillance cameras, and presenting the first investigation into co-movement pattern mining from videos. We formulate the new problem, re-define the spatial-temporal proximity constraints from cameras deployed in a road network, and theoretically prove its hardness. Due to the lack of readily applicable solutions, we adapt existing techniques and propose two competitive baselines using an Apriori-based enumerator and the CMC algorithm, respectively.
As the principal technical contributions, we introduce a novel index called temporal-cluster suffix tree (TCS-tree), which performs two-level temporal clustering within each camera and constructs a suffix tree from the resulting clusters. Moreover, we present a sequence-ahead pruning framework based on TCS-tree, which allows for the simultaneous leverage of all pattern constraints to filter candidate paths. Finally, to reduce verification cost on the candidate paths, we propose a sliding-window based co-movement pattern enumeration strategy and a hashing-based dominance eliminator, both of which are effective in avoiding redundant operations.
We conduct extensive experiments for scalability and effectiveness analysis. Our results validate the efficiency of the proposed index and mining algorithm, which runs remarkably faster than the two baseline methods. Additionally, we construct a video database with 1169 cameras and perform an end-to-end pipeline analysis to study the performance gap between GPS-driven and video-driven methods. Our results demonstrate that the derived patterns from the video-driven approach are similar to those derived from groundtruth trajectories, providing evidence of its effectiveness.
Submitted 10 October, 2023; v1 submitted 10 August, 2023;
originally announced August 2023.
-
CPDG: A Contrastive Pre-Training Method for Dynamic Graph Neural Networks
Authors:
Yuanchen Bei,
Hao Xu,
Sheng Zhou,
Huixuan Chi,
Haishuai Wang,
Mengdi Zhang,
Zhao Li,
Jiajun Bu
Abstract:
Dynamic graph data mining has gained popularity in recent years due to the rich information contained in dynamic graphs and their widespread use in the real world. Despite the advances in dynamic graph neural networks (DGNNs), the rich information and diverse downstream tasks have posed significant difficulties for the practical application of DGNNs in industrial scenarios. To this end, in this paper, we propose to address them by pre-training and present the Contrastive Pre-Training Method for Dynamic Graph Neural Networks (CPDG). CPDG tackles the challenges of pre-training for DGNNs, including generalization capability and long-short term modeling capability, through a flexible structural-temporal subgraph sampler along with structural-temporal contrastive pre-training schemes. Extensive experiments conducted on both large-scale research and industrial dynamic graph datasets show that CPDG outperforms existing methods in dynamic graph pre-training for various downstream tasks under three transfer settings.
Submitted 24 December, 2023; v1 submitted 6 July, 2023;
originally announced July 2023.
-
Estimation of control area in badminton doubles with pose information from top and back view drone videos
Authors:
Ning Ding,
Kazuya Takeda,
Wenhui Jin,
Yingjiu Bei,
Keisuke Fujii
Abstract:
The application of visual tracking to the performance analysis of sports players in dynamic competitions is vital for effective coaching. In doubles matches, coordinated positioning is crucial for maintaining control of the court and minimizing opponents' scoring opportunities. The analysis of such teamwork plays a vital role in understanding the dynamics of the game. However, previous studies have primarily focused on analyzing and assessing singles players without considering occlusion in broadcast videos. These studies have relied on discrete representations, which involve the analysis and representation of specific actions (e.g., strokes) or events that occur during the game while overlooking the meaningful spatial distribution. In this work, we present the first annotated drone dataset from top and back views in badminton doubles and propose a framework to estimate the control area probability map, which can be used to evaluate teamwork performance. We present an efficient framework of deep neural networks that enables the calculation of full probability surfaces. This framework utilizes the embedding of a Gaussian mixture map of players' positions and employs graph convolution on their poses. In the experiment, we verify our approach by comparing it with various baselines and discovering the correlations between the score and control area. Additionally, we propose a practical application for assessing optimal positioning to provide instructions during a game. Our approach offers both visual and quantitative evaluations of players' movements, thereby providing valuable insights into doubles teamwork. The dataset and related project code are available at https://github.com/Ning-D/Drone_BD_ControlArea
Submitted 26 October, 2023; v1 submitted 7 May, 2023;
originally announced May 2023.
-
Flattened Graph Convolutional Networks For Recommendation
Authors:
Yue Xu,
Hao Chen,
Zengde Deng,
Yuanchen Bei,
Feiran Huang
Abstract:
Graph Convolutional Networks (GCNs) and their variants have achieved strong performance on various recommendation tasks. However, many existing GCN models tend to perform recursive aggregations among all related nodes, which can impose a severe computational burden that hinders their application to large-scale recommendation tasks. To this end, this paper proposes the flattened GCN (FlatGCN) model, which is able to achieve superior performance with remarkably less complexity compared with existing models. Our main contribution is three-fold. First, we propose a simplified but powerful GCN architecture which aggregates the neighborhood information using one flattened GCN layer, instead of aggregating recursively. The aggregation step in FlatGCN is parameter-free, so it can be pre-computed with parallel computation to save memory and computational cost. Second, we propose an informative neighbor-infomax sampling method to select the most valuable neighbors by measuring the correlation among neighboring nodes based on a principled metric. Third, we propose a layer ensemble technique which improves the expressiveness of the learned representations by assembling the layer-wise neighborhood representations at the final layer. Extensive experiments on three datasets verify that our proposed model outperforms existing GCN models considerably and yields a training speedup of up to a few orders of magnitude.
Submitted 25 September, 2022;
originally announced October 2022.
-
Concept Whitening for Interpretable Image Recognition
Authors:
Zhi Chen,
Yijie Bei,
Cynthia Rudin
Abstract:
What does a neural network encode about a concept as we traverse through the layers? Interpretability in machine learning is undoubtedly important, but the calculations of neural networks are very challenging to understand. Attempts to see inside their hidden layers can be misleading or unusable, or can rely on the latent space possessing properties that it may not have. In this work, rather than attempting to analyze a neural network post hoc, we introduce a mechanism, called concept whitening (CW), to alter a given layer of the network to allow us to better understand the computation leading up to that layer. When a concept whitening module is added to a CNN, the axes of the latent space are aligned with known concepts of interest. Experimentally, we show that CW can give us a much clearer understanding of how the network gradually learns concepts over layers. CW is an alternative to a batch normalization layer in that it normalizes, and also decorrelates (whitens) the latent space. CW can be used in any layer of the network without hurting predictive performance.
Submitted 7 December, 2020; v1 submitted 5 February, 2020;
originally announced February 2020.
-
New Techniques for Preserving Global Structure and Denoising with Low Information Loss in Single-Image Super-Resolution
Authors:
Yijie Bei,
Alex Damian,
Shijia Hu,
Sachit Menon,
Nikhil Ravi,
Cynthia Rudin
Abstract:
This work identifies and addresses two important technical challenges in single-image super-resolution: (1) how to upsample an image without magnifying noise and (2) how to preserve large-scale structure when upsampling. We summarize the techniques we developed for our second place entry in Track 1 (Bicubic Downsampling), seventh place entry in Track 2 (Realistic Adverse Conditions), and seventh place entry in Track 3 (Realistic difficult) in the 2018 NTIRE Super-Resolution Challenge. Furthermore, we present new neural network architectures that specifically address the two challenges listed above: denoising and preservation of large-scale structure.
Submitted 15 June, 2018; v1 submitted 9 May, 2018;
originally announced May 2018.