-
Can Current Explainability Help Provide References in Clinical Notes to Support Humans Annotate Medical Codes?
Authors:
Byung-Hak Kim,
Zhongfen Deng,
Philip S. Yu,
Varun Ganapathi
Abstract:
The medical codes prediction problem from clinical notes has received substantial interest in the NLP community, and several recent studies have shown the state-of-the-art (SOTA) code prediction results of full-fledged deep learning-based methods. However, most previous SOTA works based on deep learning are still in early stages in terms of providing textual references and explanations of the pred…
▽ More
The medical codes prediction problem from clinical notes has received substantial interest in the NLP community, and several recent studies have shown the state-of-the-art (SOTA) code prediction results of full-fledged deep learning-based methods. However, most previous SOTA works based on deep learning are still in early stages in terms of providing textual references and explanations of the predicted codes, despite the fact that this level of explainability of the prediction outcomes is critical to gaining trust from professional medical coders. This raises the important question of how well current explainability methods apply to advanced neural network models such as transformers to predict correct codes and present references in clinical notes that support code prediction. First, we present an explainable Read, Attend, and Code (xRAC) framework and assess two approaches, attention score-based xRAC-ATTN and model-agnostic knowledge-distillation-based xRAC-KD, through simplified but thorough human-grounded evaluations with SOTA transformer-based model, RAC. We find that the supporting evidence text highlighted by xRAC-ATTN is of higher quality than xRAC-KD whereas xRAC-KD has potential advantages in production deployment scenarios. More importantly, we show for the first time that, given the current state of explainability methodologies, using the SOTA medical codes prediction system still requires the expertise and competencies of professional coders, even though its prediction accuracy is superior to that of human coders. This, we believe, is a very meaningful step toward developing explainable and accurate machine learning systems for fully autonomous medical code prediction from clinical notes.
△ Less
Submitted 28 October, 2022;
originally announced October 2022.
-
Sequential Recommendation with Auxiliary Item Relationships via Multi-Relational Transformer
Authors:
Ziwei Fan,
Zhiwei Liu,
Chen Wang,
Peijie Huang,
Hao Peng,
Philip S. Yu
Abstract:
Sequential Recommendation (SR) models user dynamics and predicts the next preferred items based on the user history. Existing SR methods model the 'was interacted before' item-item transitions observed in sequences, which can be viewed as an item relationship. However, there are multiple auxiliary item relationships, e.g., items from similar brands and with similar contents in real-world scenarios…
▽ More
Sequential Recommendation (SR) models user dynamics and predicts the next preferred items based on the user history. Existing SR methods model the 'was interacted before' item-item transitions observed in sequences, which can be viewed as an item relationship. However, there are multiple auxiliary item relationships, e.g., items from similar brands and with similar contents in real-world scenarios. Auxiliary item relationships describe item-item affinities in multiple different semantics and alleviate the long-lasting cold start problem in the recommendation. However, it remains a significant challenge to model auxiliary item relationships in SR.
To simultaneously model high-order item-item transitions in sequences and auxiliary item relationships, we propose a Multi-relational Transformer capable of modeling auxiliary item relationships for SR (MT4SR). Specifically, we propose a novel self-attention module, which incorporates arbitrary item relationships and weights item relationships accordingly. Second, we regularize intra-sequence item relationships with a novel regularization module to supervise attentions computations. Third, for inter-sequence item relationship pairs, we introduce a novel inter-sequence related items modeling module. Finally, we conduct experiments on four benchmark datasets and demonstrate the effectiveness of MT4SR over state-of-the-art methods and the improvements on the cold start problem. The code is available at https://github.com/zfan20/MT4SR.
△ Less
Submitted 28 October, 2022; v1 submitted 24 October, 2022;
originally announced October 2022.
-
Entity-to-Text based Data Augmentation for various Named Entity Recognition Tasks
Authors:
Xuming Hu,
Yong Jiang,
Aiwei Liu,
Zhongqiang Huang,
Pengjun Xie,
Fei Huang,
Lijie Wen,
Philip S. Yu
Abstract:
Data augmentation techniques have been used to alleviate the problem of scarce labeled data in various NER tasks (flat, nested, and discontinuous NER tasks). Existing augmentation techniques either manipulate the words in the original text that break the semantic coherence of the text, or exploit generative models that ignore preserving entities in the original text, which impedes the use of augme…
▽ More
Data augmentation techniques have been used to alleviate the problem of scarce labeled data in various NER tasks (flat, nested, and discontinuous NER tasks). Existing augmentation techniques either manipulate the words in the original text that break the semantic coherence of the text, or exploit generative models that ignore preserving entities in the original text, which impedes the use of augmentation techniques on nested and discontinuous NER tasks. In this work, we propose a novel Entity-to-Text based data augmentation technique named EnTDA to add, delete, replace or swap entities in the entity list of the original texts, and adopt these augmented entity lists to generate semantically coherent and entity preserving texts for various NER tasks. Furthermore, we introduce a diversity beam search to increase the diversity during the text generation process. Experiments on thirteen NER datasets across three tasks (flat, nested, and discontinuous NER tasks) and two settings (full data and low resource settings) show that EnTDA could bring more performance improvements compared to the baseline augmentation techniques.
△ Less
Submitted 26 May, 2023; v1 submitted 19 October, 2022;
originally announced October 2022.
-
CLARE: A Semi-supervised Community Detection Algorithm
Authors:
Xixi Wu,
Yun Xiong,
Yao Zhang,
Yizhu Jiao,
Caihua Shan,
Yiheng Sun,
Yangyong Zhu,
Philip S. Yu
Abstract:
Community detection refers to the task of discovering closely related subgraphs to understand the networks. However, traditional community detection algorithms fail to pinpoint a particular kind of community. This limits its applicability in real-world networks, e.g., distinguishing fraud groups from normal ones in transaction networks. Recently, semi-supervised community detection emerges as a so…
▽ More
Community detection refers to the task of discovering closely related subgraphs to understand the networks. However, traditional community detection algorithms fail to pinpoint a particular kind of community. This limits its applicability in real-world networks, e.g., distinguishing fraud groups from normal ones in transaction networks. Recently, semi-supervised community detection emerges as a solution. It aims to seek other similar communities in the network with few labeled communities as training data. Existing works can be regarded as seed-based: locate seed nodes and then develop communities around seeds. However, these methods are quite sensitive to the quality of selected seeds since communities generated around a mis-detected seed may be irrelevant. Besides, they have individual issues, e.g., inflexibility and high computational overhead. To address these issues, we propose CLARE, which consists of two key components, Community Locator and Community Rewriter. Our idea is that we can locate potential communities and then refine them. Therefore, the community locator is proposed for quickly locating potential communities by seeking subgraphs that are similar to training ones in the network. To further adjust these located communities, we devise the community rewriter. Enhanced by deep reinforcement learning, it suggests intelligent decisions, such as adding or dropping nodes, to refine community structures flexibly. Extensive experiments verify both the effectiveness and efficiency of our work compared with prior state-of-the-art approaches on multiple real-world datasets.
△ Less
Submitted 15 October, 2022;
originally announced October 2022.
-
Metaverse: Survey, Applications, Security, and Opportunities
Authors:
Jiayi Sun,
Wensheng Gan,
Han-Chieh Chao,
Philip S. Yu
Abstract:
As a fusion of various emerging digital technologies, the Metaverse aims to build a virtual shared digital space. It is closely related to extended reality, digital twin, blockchain, and other technologies. Its goal is to build a digital space based on the real world, form a virtual economic system, and expand the space of human activities, which injects new vitality into the social, economic, and…
▽ More
As a fusion of various emerging digital technologies, the Metaverse aims to build a virtual shared digital space. It is closely related to extended reality, digital twin, blockchain, and other technologies. Its goal is to build a digital space based on the real world, form a virtual economic system, and expand the space of human activities, which injects new vitality into the social, economic, and other fields. In this article, we make the following contributions. We first introduce the basic concepts such as the development process, definition, and characteristics of the Metaverse. After that, we analyze the overall framework and supporting technologies of the Metaverse and summarize the status of common fields. Finally, we focus on the security and privacy issues that exist in the Metaverse, and give corresponding solutions or directions. We also discuss the challenges and possible research directions of the Metaverse. We believe this comprehensive review can provide an overview of the Metaverse and research directions for some multidisciplinary studies.
△ Less
Submitted 14 October, 2022;
originally announced October 2022.
-
Variational Graph Generator for Multi-View Graph Clustering
Authors:
Jianpeng Chen,
Yawen Ling,
Jie Xu,
Yazhou Ren,
Shudong Huang,
Xiaorong Pu,
Zhifeng Hao,
Philip S. Yu,
Lifang He
Abstract:
Multi-view graph clustering (MGC) methods are increasingly being studied due to the explosion of multi-view data with graph structural information. The critical point of MGC is to better utilize the view-specific and view-common information in features and graphs of multiple views. However, existing works have an inherent limitation that they are unable to concurrently utilize the consensus graph…
▽ More
Multi-view graph clustering (MGC) methods are increasingly being studied due to the explosion of multi-view data with graph structural information. The critical point of MGC is to better utilize the view-specific and view-common information in features and graphs of multiple views. However, existing works have an inherent limitation that they are unable to concurrently utilize the consensus graph information across multiple graphs and the view-specific feature information. To address this issue, we propose Variational Graph Generator for Multi-View Graph Clustering (VGMGC). Specifically, a novel variational graph generator is proposed to extract common information among multiple graphs. This generator infers a reliable variational consensus graph based on a priori assumption over multiple graphs. Then a simple yet effective graph encoder in conjunction with the multi-view clustering objective is presented to learn the desired graph embeddings for clustering, which embeds the inferred view-common graph and view-specific graphs together with features. Finally, theoretical results illustrate the rationality of VGMGC by analyzing the uncertainty of the inferred consensus graph with information bottleneck principle. Extensive experiments demonstrate the superior performance of our VGMGC over SOTAs.
△ Less
Submitted 16 December, 2022; v1 submitted 13 October, 2022;
originally announced October 2022.
-
Deep Clustering: A Comprehensive Survey
Authors:
Yazhou Ren,
Jingyu Pu,
Zhimeng Yang,
Jie Xu,
Guofeng Li,
Xiaorong Pu,
Philip S. Yu,
Lifang He
Abstract:
Cluster analysis plays an indispensable role in machine learning and data mining. Learning a good data representation is crucial for clustering algorithms. Recently, deep clustering, which can learn clustering-friendly representations using deep neural networks, has been broadly applied in a wide range of clustering tasks. Existing surveys for deep clustering mainly focus on the single-view fields…
▽ More
Cluster analysis plays an indispensable role in machine learning and data mining. Learning a good data representation is crucial for clustering algorithms. Recently, deep clustering, which can learn clustering-friendly representations using deep neural networks, has been broadly applied in a wide range of clustering tasks. Existing surveys for deep clustering mainly focus on the single-view fields and the network architectures, ignoring the complex application scenarios of clustering. To address this issue, in this paper we provide a comprehensive survey for deep clustering in views of data sources. With different data sources and initial conditions, we systematically distinguish the clustering methods in terms of methodology, prior knowledge, and architecture. Concretely, deep clustering methods are introduced according to four categories, i.e., traditional single-view deep clustering, semi-supervised deep clustering, deep multi-view clustering, and deep transfer clustering. Finally, we discuss the open challenges and potential future opportunities in different fields of deep clustering.
△ Less
Submitted 8 October, 2022;
originally announced October 2022.
-
Contrast Pattern Mining: A Survey
Authors:
Yao Chen,
Wensheng Gan,
Yongdong Wu,
Philip S. Yu
Abstract:
Contrast pattern mining (CPM) is an important and popular subfield of data mining. Traditional sequential patterns cannot describe the contrast information between different classes of data, while contrast patterns involving the concept of contrast can describe the significant differences between datasets under different contrast conditions. Based on the number of papers published in this field, w…
▽ More
Contrast pattern mining (CPM) is an important and popular subfield of data mining. Traditional sequential patterns cannot describe the contrast information between different classes of data, while contrast patterns involving the concept of contrast can describe the significant differences between datasets under different contrast conditions. Based on the number of papers published in this field, we find that researchers' interest in CPM is still active. Since CPM has many research questions and research methods. It is difficult for new researchers in the field to understand the general situation of the field in a short period of time. Therefore, the purpose of this article is to provide an up-to-date comprehensive and structured overview of the research direction of contrast pattern mining. First, we present an in-depth understanding of CPM, including basic concepts, types, mining strategies, and metrics for assessing discriminative ability. Then we classify CPM methods according to their characteristics into boundary-based algorithms, tree-based algorithms, evolutionary fuzzy system-based algorithms, decision tree-based algorithms, and other algorithms. In addition, we list the classical algorithms of these methods and discuss their advantages and disadvantages. Advanced topics in CPM are presented. Finally, we conclude our survey with a discussion of the challenges and opportunities in this field.
△ Less
Submitted 27 September, 2022;
originally announced September 2022.
-
Totally-ordered Sequential Rules for Utility Maximization
Authors:
Chunkai Zhang,
Maohua Lyu,
Wensheng Gan,
Philip S. Yu
Abstract:
High utility sequential pattern mining (HUSPM) is a significant and valuable activity in knowledge discovery and data analytics with many real-world applications. In some cases, HUSPM can not provide an excellent measure to predict what will happen. High utility sequential rule mining (HUSRM) discovers high utility and high confidence sequential rules, allowing it to solve the problem in HUSPM. Al…
▽ More
High utility sequential pattern mining (HUSPM) is a significant and valuable activity in knowledge discovery and data analytics with many real-world applications. In some cases, HUSPM can not provide an excellent measure to predict what will happen. High utility sequential rule mining (HUSRM) discovers high utility and high confidence sequential rules, allowing it to solve the problem in HUSPM. All existing HUSRM algorithms aim to find high-utility partially-ordered sequential rules (HUSRs), which are not consistent with reality and may generate fake HUSRs. Therefore, in this paper, we formulate the problem of high utility totally-ordered sequential rule mining and propose two novel algorithms, called TotalSR and TotalSR+, which aim to identify all high utility totally-ordered sequential rules (HTSRs). TotalSR creates a utility table that can efficiently calculate antecedent support and a utility prefix sum list that can compute the remaining utility in O(1) time for a sequence. We also introduce a left-first expansion strategy that can utilize the anti-monotonic property to use a confidence pruning strategy. TotalSR can also drastically reduce the search space with the help of utility upper bounds pruning strategies, avoiding much more meaningless computation. In addition, TotalSR+ uses an auxiliary antecedent record table to more efficiently discover HTSRs. Finally, there are numerous experimental results on both real and synthetic datasets demonstrating that TotalSR is significantly more efficient than algorithms with fewer pruning strategies, and TotalSR+ is significantly more efficient than TotalSR in terms of running time and scalability.
△ Less
Submitted 27 September, 2022;
originally announced September 2022.
-
Scene Graph Modification as Incremental Structure Expanding
Authors:
Xuming Hu,
Zhijiang Guo,
Yu Fu,
Lijie Wen,
Philip S. Yu
Abstract:
A scene graph is a semantic representation that expresses the objects, attributes, and relationships between objects in a scene. Scene graphs play an important role in many cross modality tasks, as they are able to capture the interactions between images and texts. In this paper, we focus on scene graph modification (SGM), where the system is required to learn how to update an existing scene graph…
▽ More
A scene graph is a semantic representation that expresses the objects, attributes, and relationships between objects in a scene. Scene graphs play an important role in many cross modality tasks, as they are able to capture the interactions between images and texts. In this paper, we focus on scene graph modification (SGM), where the system is required to learn how to update an existing scene graph based on a natural language query. Unlike previous approaches that rebuilt the entire scene graph, we frame SGM as a graph expansion task by introducing the incremental structure expanding (ISE). ISE constructs the target graph by incrementally expanding the source graph without changing the unmodified structure. Based on ISE, we further propose a model that iterates between nodes prediction and edges prediction, inferring more accurate and harmonious expansion decisions progressively. In addition, we construct a challenging dataset that contains more complicated queries and larger scene graphs than existing datasets. Experiments on four benchmarks demonstrate the effectiveness of our approach, which surpasses the previous state-of-the-art model by large margins.
△ Less
Submitted 15 September, 2022;
originally announced September 2022.
-
PERFECT: A Hyperbolic Embedding for Joint User and Community Alignment
Authors:
Li Sun,
Zhongbao Zhang,
Jiawei Zhang,
Feiyang Wang,
Yang Du,
Sen Su,
Philip S. Yu
Abstract:
Social network alignment shows fundamental importance in a wide spectrum of applications. To the best of our knowledge, existing studies mainly focus on network alignment at the individual user level, requiring abundant common information between shared individual users. For the networks that cannot meet such requirements, social community structures actually provide complementary and critical inf…
▽ More
Social network alignment shows fundamental importance in a wide spectrum of applications. To the best of our knowledge, existing studies mainly focus on network alignment at the individual user level, requiring abundant common information between shared individual users. For the networks that cannot meet such requirements, social community structures actually provide complementary and critical information at a slightly coarse-grained level, alignment of which will provide additional information for user alignment. In turn, user alignment also reveals more clues for community alignment. Hence, in this paper, we introduce the problem of joint social network alignment, which aims to align users and communities across social networks simultaneously. Key challenges lie in that 1) how to learn the representations of both users and communities, and 2) how to make user alignment and community alignment benefit from each other. To address these challenges, we first elaborate on the characteristics of real-world networks with the notion of delta-hyperbolicity, and show the superiority of hyperbolic space for representing social networks. Then, we present a novel hyperbolic embedding approach for the joint social network alignment, referred to as PERFECT, in a unified optimization. Extensive experiments on real-world datasets show the superiority of PERFECT in both user alignment and community alignment.
△ Less
Submitted 6 September, 2022;
originally announced September 2022.
-
Cross-Network Social User Embedding with Hybrid Differential Privacy Guarantees
Authors:
Jiaqian Ren,
Lei Jiang,
Hao Peng,
Lingjuan Lyu,
Zhiwei Liu,
Chaochao Chen,
Jia Wu,
Xu Bai,
Philip S. Yu
Abstract:
Integrating multiple online social networks (OSNs) has important implications for many downstream social mining tasks, such as user preference modelling, recommendation, and link prediction. However, it is unfortunately accompanied by growing privacy concerns about leaking sensitive user information. How to fully utilize the data from different online social networks while preserving user privacy…
▽ More
Integrating multiple online social networks (OSNs) has important implications for many downstream social mining tasks, such as user preference modelling, recommendation, and link prediction. However, it is unfortunately accompanied by growing privacy concerns about leaking sensitive user information. How to fully utilize the data from different online social networks while preserving user privacy remains largely unsolved. To this end, we propose a Cross-network Social User Embedding framework, namely DP-CroSUE, to learn the comprehensive representations of users in a privacy-preserving way. We jointly consider information from partially aligned social networks with differential privacy guarantees. In particular, for each heterogeneous social network, we first introduce a hybrid differential privacy notion to capture the variation of privacy expectations for heterogeneous data types. Next, to find user linkages across social networks, we make unsupervised user embedding-based alignment in which the user embeddings are achieved by the heterogeneous network embedding technology. To further enhance user embeddings, a novel cross-network GCN embedding model is designed to transfer knowledge across networks through those aligned users. Extensive experiments on three real-world datasets demonstrate that our approach makes a significant improvement on user interest prediction tasks as well as defending user attribute inference attacks from embedding.
△ Less
Submitted 4 September, 2022;
originally announced September 2022.
-
ContrastVAE: Contrastive Variational AutoEncoder for Sequential Recommendation
Authors:
Yu Wang,
Hengrui Zhang,
Zhiwei Liu,
Liangwei Yang,
Philip S. Yu
Abstract:
Aiming at exploiting the rich information in user behaviour sequences, sequential recommendation has been widely adopted in real-world recommender systems. However, current methods suffer from the following issues: 1) sparsity of user-item interactions, 2) uncertainty of sequential records, 3) long-tail items. In this paper, we propose to incorporate contrastive learning into the framework of Vari…
▽ More
Aiming at exploiting the rich information in user behaviour sequences, sequential recommendation has been widely adopted in real-world recommender systems. However, current methods suffer from the following issues: 1) sparsity of user-item interactions, 2) uncertainty of sequential records, 3) long-tail items. In this paper, we propose to incorporate contrastive learning into the framework of Variational AutoEncoders to address these challenges simultaneously. Firstly, we introduce ContrastELBO, a novel training objective that extends the conventional single-view ELBO to two-view case and theoretically builds a connection between VAE and contrastive learning from a two-view perspective. Then we propose Contrastive Variational AutoEncoder (ContrastVAE in short), a two-branched VAE model with contrastive regularization as an embodiment of ContrastELBO for sequential recommendation. We further introduce two simple yet effective augmentation strategies named model augmentation and variational augmentation to create a second view of a sequence and thus making contrastive learning possible. Experiments on four benchmark datasets demonstrate the effectiveness of ContrastVAE and the proposed augmentation methods. Codes are available at https://github.com/YuWang-1024/ContrastVAE
△ Less
Submitted 5 December, 2022; v1 submitted 26 August, 2022;
originally announced September 2022.
-
A Generic Algorithm for Top-K On-Shelf Utility Mining
Authors:
Jiahui Chen,
Xu Guo,
Wensheng Gan,
Shichen Wan,
Philip S. Yu
Abstract:
On-shelf utility mining (OSUM) is an emerging research direction in data mining. It aims to discover itemsets that have high relative utility in their selling time period. Compared with traditional utility mining, OSUM can find more practical and meaningful patterns in real-life applications. However, there is a major drawback to traditional OSUM. For normal users, it is hard to define a minimum t…
▽ More
On-shelf utility mining (OSUM) is an emerging research direction in data mining. It aims to discover itemsets that have high relative utility in their selling time period. Compared with traditional utility mining, OSUM can find more practical and meaningful patterns in real-life applications. However, there is a major drawback to traditional OSUM. For normal users, it is hard to define a minimum threshold minutil for mining the right amount of on-shelf high utility itemsets. On one hand, if the threshold is set too high, the number of patterns would not be enough. On the other hand, if the threshold is set too low, too many patterns will be discovered and cause an unnecessary waste of time and memory consumption. To address this issue, the user usually directly specifies a parameter k, where only the top-k high relative utility itemsets would be considered. Therefore, in this paper, we propose a generic algorithm named TOIT for mining Top-k On-shelf hIgh-utility paTterns to solve this problem. TOIT applies a novel strategy to raise the minutil based on the on-shelf datasets. Besides, two novel upper-bound strategies named subtree utility and local utility are applied to prune the search space. By adopting the strategies mentioned above, the TOIT algorithm can narrow the search space as early as possible, improve the mining efficiency, and reduce the memory consumption, so it can obtain better performance than other algorithms. A series of experiments have been conducted on real datasets with different styles to compare the effects with the state-of-the-art KOSHU algorithm. The experimental results showed that TOIT outperforms KOSHU in both running time and memory consumption.
△ Less
Submitted 26 August, 2022;
originally announced August 2022.
-
A Self-supervised Riemannian GNN with Time Varying Curvature for Temporal Graph Learning
Authors:
Li Sun,
Junda Ye,
Hao Peng,
Philip S. Yu
Abstract:
Representation learning on temporal graphs has drawn considerable research attention owing to its fundamental importance in a wide spectrum of real-world applications. Though a number of studies succeed in obtaining time-dependent representations, it still faces significant challenges. On the one hand, most of the existing methods restrict the embedding space with a certain curvature. However, the…
▽ More
Representation learning on temporal graphs has drawn considerable research attention owing to its fundamental importance in a wide spectrum of real-world applications. Though a number of studies succeed in obtaining time-dependent representations, it still faces significant challenges. On the one hand, most of the existing methods restrict the embedding space with a certain curvature. However, the underlying geometry in fact shifts among the positive curvature hyperspherical, zero curvature Euclidean and negative curvature hyperbolic spaces in the evolvement over time. On the other hand, these methods usually require abundant labels to learn temporal representations, and thereby notably limit their wide use in the unlabeled graphs of the real applications. To bridge this gap, we make the first attempt to study the problem of self-supervised temporal graph representation learning in the general Riemannian space, supporting the time-varying curvature to shift among hyperspherical, Euclidean and hyperbolic spaces. In this paper, we present a novel self-supervised Riemannian graph neural network (SelfRGNN). Specifically, we design a curvature-varying Riemannian GNN with a theoretically grounded time encoding, and formulate a functional curvature over time to model the evolvement shifting among the positive, zero and negative curvature spaces. To enable the self-supervised learning, we propose a novel reweighting self-contrastive approach, exploring the Riemannian space itself without augmentation, and propose an edge-based self-supervised curvature learning with the Ricci curvature. Extensive experiments show the superiority of SelfRGNN, and moreover, the case study shows the time-varying curvature of temporal graph in reality.
△ Less
Submitted 30 August, 2022;
originally announced August 2022.
-
Position-aware Structure Learning for Graph Topology-imbalance by Relieving Under-reaching and Over-squashing
Authors:
Qingyun Sun,
Jianxin Li,
Haonan Yuan,
Xingcheng Fu,
Hao Peng,
Cheng Ji,
Qian Li,
Philip S. Yu
Abstract:
Topology-imbalance is a graph-specific imbalance problem caused by the uneven topology positions of labeled nodes, which significantly damages the performance of GNNs. What topology-imbalance means and how to measure its impact on graph learning remain under-explored. In this paper, we provide a new understanding of topology-imbalance from a global view of the supervision information distribution…
▽ More
Topology-imbalance is a graph-specific imbalance problem caused by the uneven topology positions of labeled nodes, which significantly damages the performance of GNNs. What topology-imbalance means and how to measure its impact on graph learning remain under-explored. In this paper, we provide a new understanding of topology-imbalance from a global view of the supervision information distribution in terms of under-reaching and over-squashing, which motivates two quantitative metrics as measurements. In light of our analysis, we propose a novel position-aware graph structure learning framework named PASTEL, which directly optimizes the information propagation path and solves the topology-imbalance issue in essence. Our key insight is to enhance the connectivity of nodes within the same class for more supervision information, thereby relieving the under-reaching and over-squashing phenomena. Specifically, we design an anchor-based position encoding mechanism, which better incorporates relative topology position and enhances the intra-class inductive bias by maximizing the label influence. We further propose a class-wise conflict measure as the edge weights, which benefits the separation of different node classes. Extensive experiments demonstrate the superior potential and adaptability of PASTEL in enhancing GNNs' power in different data annotation scenarios.
△ Less
Submitted 17 August, 2022;
originally announced August 2022.
-
From Known to Unknown: Quality-aware Self-improving Graph Neural Network for Open Set Social Event Detection
Authors:
Jiaqian Ren,
Lei Jiang,
Hao Peng,
Yuwei Cao,
Jia Wu,
Philip S. Yu,
Lifang He
Abstract:
State-of-the-art Graph Neural Networks (GNNs) have achieved tremendous success in social event detection tasks when restricted to a closed set of events. However, considering the large amount of data needed for training a neural network and the limited ability of a neural network in handling previously unknown data, it remains a challenge for existing GNN-based methods to operate in an open set se…
▽ More
State-of-the-art Graph Neural Networks (GNNs) have achieved tremendous success in social event detection tasks when restricted to a closed set of events. However, considering the large amount of data needed for training a neural network and the limited ability of a neural network in handling previously unknown data, it remains a challenge for existing GNN-based methods to operate in an open set setting. To address this problem, we design a Quality-aware Self-improving Graph Neural Network (QSGNN) which extends the knowledge from known to unknown by leveraging the best of known samples and reliable knowledge transfer. Specifically, to fully exploit the labeled data, we propose a novel supervised pairwise loss with an additional orthogonal inter-class relation constraint to train the backbone GNN encoder. The learnt, already-known events further serve as strong reference bases for the unknown ones, which greatly prompts knowledge acquisition and transfer. When the model is generalized to unknown data, to ensure the effectiveness and reliability, we further leverage the reference similarity distribution vectors for pseudo pairwise label generation, selection and quality assessment. Following the diversity principle of active learning, our method selects diverse pair samples with the generated pseudo labels to fine-tune the GNN encoder. Besides, we propose a novel quality-guided optimization in which the contributions of pseudo labels are weighted based on consistency. We thoroughly evaluate our model on two large real-world social event datasets. Experiments demonstrate that our model achieves state-of-the-art results and extends well to unknown events.
△ Less
Submitted 14 August, 2022;
originally announced August 2022.
-
Time Lag Aware Sequential Recommendation
Authors:
Lihua Chen,
Ning Yang,
Philip S Yu
Abstract:
Although a variety of methods have been proposed for sequential recommendation, it is still far from being well solved partly due to two challenges. First, the existing methods often lack the simultaneous consideration of the global stability and local fluctuation of user preference, which might degrade the learning of a user's current preference. Second, the existing methods often use a scalar ba…
▽ More
Although a variety of methods have been proposed for sequential recommendation, it is still far from being well solved partly due to two challenges. First, the existing methods often lack the simultaneous consideration of the global stability and local fluctuation of user preference, which might degrade the learning of a user's current preference. Second, the existing methods often use a scalar based weighting schema to fuse the long-term and short-term preferences, which is too coarse to learn an expressive embedding of current preference. To address the two challenges, we propose a novel model called Time Lag aware Sequential Recommendation (TLSRec), which integrates a hierarchical modeling of user preference and a time lag sensitive fine-grained fusion of the long-term and short-term preferences. TLSRec employs a hierarchical self-attention network to learn users' preference at both global and local time scales, and a neural time gate to adaptively regulate the contributions of the long-term and short-term preferences for the learning of a user's current preference at the aspect level and based on the lag between the current time and the time of the last behavior of a user. The extensive experiments conducted on real datasets verify the effectiveness of TLSRec.
△ Less
Submitted 9 August, 2022;
originally announced August 2022.
-
Automating DBSCAN via Deep Reinforcement Learning
Authors:
Ruitong Zhang,
Hao Peng,
Yingtong Dou,
Jia Wu,
Qingyun Sun,
Jingyi Zhang,
Philip S. Yu
Abstract:
DBSCAN is widely used in many scientific and engineering fields because of its simplicity and practicality. However, due to its high sensitivity parameters, the accuracy of the clustering result depends heavily on practical experience. In this paper, we first propose a novel Deep Reinforcement Learning guided automatic DBSCAN parameters search framework, namely DRL-DBSCAN. The framework models the…
▽ More
DBSCAN is widely used in many scientific and engineering fields because of its simplicity and practicality. However, due to its high sensitivity parameters, the accuracy of the clustering result depends heavily on practical experience. In this paper, we first propose a novel Deep Reinforcement Learning guided automatic DBSCAN parameters search framework, namely DRL-DBSCAN. The framework models the process of adjusting the parameter search direction by perceiving the clustering environment as a Markov decision process, which aims to find the best clustering parameters without manual assistance. DRL-DBSCAN learns the optimal clustering parameter search policy for different feature distributions via interacting with the clusters, using a weakly-supervised reward training policy network. In addition, we also present a recursive search mechanism driven by the scale of the data to efficiently and controllably process large parameter spaces. Extensive experiments are conducted on five artificial and real-world datasets based on the proposed four working modes. The results of offline and online tasks show that the DRL-DBSCAN not only consistently improves DBSCAN clustering accuracy by up to 26% and 25% respectively, but also can stably find the dominant parameters with high computational efficiency. The code is available at https://github.com/RingBDStack/DRL-DBSCAN.
△ Less
Submitted 9 August, 2022;
originally announced August 2022.
-
CHEF: A Pilot Chinese Dataset for Evidence-Based Fact-Checking
Authors:
Xuming Hu,
Zhijiang Guo,
Guanyu Wu,
Aiwei Liu,
Lijie Wen,
Philip S. Yu
Abstract:
The explosion of misinformation spreading in the media ecosystem urges for automated fact-checking. While misinformation spans both geographic and linguistic boundaries, most work in the field has focused on English. Datasets and tools available in other languages, such as Chinese, are limited. In order to bridge this gap, we construct CHEF, the first CHinese Evidence-based Fact-checking dataset o…
▽ More
The explosion of misinformation spreading in the media ecosystem urges for automated fact-checking. While misinformation spans both geographic and linguistic boundaries, most work in the field has focused on English. Datasets and tools available in other languages, such as Chinese, are limited. In order to bridge this gap, we construct CHEF, the first CHinese Evidence-based Fact-checking dataset of 10K real-world claims. The dataset covers multiple domains, ranging from politics to public health, and provides annotated evidence retrieved from the Internet. Further, we develop established baselines and a novel approach that is able to model the evidence retrieval as a latent variable, allowing jointly training with the veracity prediction model in an end-to-end fashion. Extensive experiments show that CHEF will provide a challenging testbed for the development of fact-checking systems designed to retrieve and reason over non-English claims.
△ Less
Submitted 6 June, 2022;
originally announced June 2022.
-
BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs
Authors:
Kay Liu,
Yingtong Dou,
Yue Zhao,
Xueying Ding,
Xiyang Hu,
Ruitong Zhang,
Kaize Ding,
Canyu Chen,
Hao Peng,
Kai Shu,
Lichao Sun,
Jundong Li,
George H. Chen,
Zhihao Jia,
Philip S. Yu
Abstract:
Detecting which nodes in graphs are outliers is a relatively new machine learning task with numerous applications. Despite the proliferation of algorithms developed in recent years for this task, there has been no standard comprehensive setting for performance evaluation. Consequently, it has been difficult to understand which methods work well and when under a broad range of settings. To bridge t…
▽ More
Detecting which nodes in graphs are outliers is a relatively new machine learning task with numerous applications. Despite the proliferation of algorithms developed in recent years for this task, there has been no standard comprehensive setting for performance evaluation. Consequently, it has been difficult to understand which methods work well and when under a broad range of settings. To bridge this gap, we present--to the best of our knowledge--the first comprehensive benchmark for unsupervised outlier node detection on static attributed graphs called BOND, with the following highlights. (1) We benchmark the outlier detection performance of 14 methods ranging from classical matrix factorization to the latest graph neural networks. (2) Using nine real datasets, our benchmark assesses how the different detection methods respond to two major types of synthetic outliers and separately to "organic" (real non-synthetic) outliers. (3) Using an existing random graph generation technique, we produce a family of synthetically generated datasets of different graph sizes that enable us to compare the running time and memory usage of the different outlier detection algorithms. Based on our experimental results, we discuss the pros and cons of existing graph outlier detection algorithms, and we highlight opportunities for future research. Importantly, our code is freely available and meant to be easily extendable: https://github.com/pygod-team/pygod/tree/main/benchmark
△ Less
Submitted 15 October, 2022; v1 submitted 20 June, 2022;
originally announced June 2022.
-
Collaborative Knowledge Graph Fusion by Exploiting the Open Corpus
Authors:
Yue Wang,
Yao Wan,
Lu Bai,
Lixin Cui,
Zhuo Xu,
Ming Li,
Philip S. Yu,
Edwin R Hancock
Abstract:
To alleviate the challenges of building Knowledge Graphs (KG) from scratch, a more general task is to enrich a KG using triples from an open corpus, where the obtained triples contain noisy entities and relations. It is challenging to enrich a KG with newly harvested triples while maintaining the quality of the knowledge representation. This paper proposes a system to refine a KG using information…
▽ More
To alleviate the challenges of building Knowledge Graphs (KG) from scratch, a more general task is to enrich a KG using triples from an open corpus, where the obtained triples contain noisy entities and relations. It is challenging to enrich a KG with newly harvested triples while maintaining the quality of the knowledge representation. This paper proposes a system to refine a KG using information harvested from an additional corpus. To this end, we formulate our task as two coupled sub-tasks, namely join event extraction (JEE) and knowledge graph fusion (KGF). We then propose a Collaborative Knowledge Graph Fusion Framework to allow our sub-tasks to mutually assist one another in an alternating manner. More concretely, the explorer carries out the JEE supervised by both the ground-truth annotation and an existing KG provided by the supervisor. The supervisor then evaluates the triples extracted by the explorer and enriches the KG with those that are highly ranked. To implement this evaluation, we further propose a Translated Relation Alignment Scoring Mechanism to align and translate the extracted triples to the prior KG. Experiments verify that this collaboration can both improve the performance of the JEE and the KGF.
△ Less
Submitted 15 June, 2022;
originally announced June 2022.
-
Towards Target Sequential Rules
Authors:
Wensheng Gan,
Gengsen Huang,
Jian Weng,
Tianlong Gu,
Philip S. Yu
Abstract:
In many real-world applications, sequential rule mining (SRM) can provide prediction and recommendation functions for a variety of services. It is an important technique of pattern mining to discover all valuable rules that belong to high-frequency and high-confidence sequential rules. Although several algorithms of SRM are proposed to solve various practical problems, there are no studies on targ…
▽ More
In many real-world applications, sequential rule mining (SRM) can provide prediction and recommendation functions for a variety of services. It is an important technique of pattern mining to discover all valuable rules that belong to high-frequency and high-confidence sequential rules. Although several algorithms of SRM are proposed to solve various practical problems, there are no studies on target sequential rules. Targeted sequential rule mining aims at mining the interesting sequential rules that users focus on, thus avoiding the generation of other invalid and unnecessary rules. This approach can further improve the efficiency of users in analyzing rules and reduce the consumption of data resources. In this paper, we provide the relevant definitions of target sequential rule and formulate the problem of targeted sequential rule mining. Furthermore, we propose an efficient algorithm, called targeted sequential rule mining (TaSRM). Several pruning strategies and an optimization are introduced to improve the efficiency of TaSRM. Finally, a large number of experiments are conducted on different benchmarks, and we analyze the results in terms of their running time, memory consumption, and scalability, as well as query cases with different query rules. It is shown that the novel algorithm TaSRM and its variants can achieve better experimental performance compared to the existing baseline algorithm.
△ Less
Submitted 9 June, 2022;
originally announced June 2022.
-
Rethinking and Scaling Up Graph Contrastive Learning: An Extremely Efficient Approach with Group Discrimination
Authors:
Yizhen Zheng,
Shirui Pan,
Vincent Cs Lee,
Yu Zheng,
Philip S. Yu
Abstract:
Graph contrastive learning (GCL) alleviates the heavy reliance on label information for graph representation learning (GRL) via self-supervised learning schemes. The core idea is to learn by maximising mutual information for similar instances, which requires similarity computation between two node instances. However, GCL is inefficient in both time and memory consumption. In addition, GCL normally…
▽ More
Graph contrastive learning (GCL) alleviates the heavy reliance on label information for graph representation learning (GRL) via self-supervised learning schemes. The core idea is to learn by maximising mutual information for similar instances, which requires similarity computation between two node instances. However, GCL is inefficient in both time and memory consumption. In addition, GCL normally requires a large number of training epochs to be well-trained on large-scale datasets. Inspired by an observation of a technical defect (i.e., inappropriate usage of Sigmoid function) commonly used in two representative GCL works, DGI and MVGRL, we revisit GCL and introduce a new learning paradigm for self-supervised graph representation learning, namely, Group Discrimination (GD), and propose a novel GD-based method called Graph Group Discrimination (GGD). Instead of similarity computation, GGD directly discriminates two groups of node samples with a very simple binary cross-entropy loss. In addition, GGD requires much fewer training epochs to obtain competitive performance compared with GCL methods on large-scale datasets. These two advantages endow GGD with very efficient property. Extensive experiments show that GGD outperforms state-of-the-art self-supervised methods on eight datasets. In particular, GGD can be trained in 0.18 seconds (6.44 seconds including data preprocessing) on ogbn-arxiv, which is orders of magnitude (10,000+) faster than GCL baselines while consuming much less memory. Trained with 9 hours on ogbn-papers100M with billion edges, GGD outperforms its GCL counterparts in both accuracy and efficiency.
△ Less
Submitted 16 October, 2022; v1 submitted 3 June, 2022;
originally announced June 2022.
-
A Multi-level Supervised Contrastive Learning Framework for Low-Resource Natural Language Inference
Authors:
Shu'ang Li,
Xuming Hu,
Li Lin,
Aiwei Liu,
Lijie Wen,
Philip S. Yu
Abstract:
Natural Language Inference (NLI) is a growingly essential task in natural language understanding, which requires inferring the relationship between the sentence pairs (premise and hypothesis). Recently, low-resource natural language inference has gained increasing attention, due to significant savings in manual annotation costs and a better fit with real-world scenarios. Existing works fail to cha…
▽ More
Natural Language Inference (NLI) is a growingly essential task in natural language understanding, which requires inferring the relationship between the sentence pairs (premise and hypothesis). Recently, low-resource natural language inference has gained increasing attention, due to significant savings in manual annotation costs and a better fit with real-world scenarios. Existing works fail to characterize discriminative representations between different classes with limited training data, which may cause faults in label prediction. Here we propose a multi-level supervised contrastive learning framework named MultiSCL for low-resource natural language inference. MultiSCL leverages a sentence-level and pair-level contrastive learning objective to discriminate between different classes of sentence pairs by bringing those in one class together and pushing away those in different classes. MultiSCL adopts a data augmentation module that generates different views for input samples to better learn the latent representation. The pair-level representation is obtained from a cross attention module. We conduct extensive experiments on two public NLI datasets in low-resource settings, and the accuracy of MultiSCL exceeds other models by 3.1% on average. Moreover, our method outperforms the previous state-of-the-art method on cross-domain tasks of text classification.
△ Less
Submitted 31 May, 2022;
originally announced May 2022.
-
Evidential Temporal-aware Graph-based Social Event Detection via Dempster-Shafer Theory
Authors:
Jiaqian Ren,
Lei Jiang,
Hao Peng,
Zhiwei Liu,
Jia Wu,
Philip S. Yu
Abstract:
The rising popularity of online social network services has attracted lots of research on mining social media data, especially on mining social events. Social event detection, due to its wide applications, has now become a trivial task. State-of-the-art approaches exploiting Graph Neural Networks (GNNs) usually follow a two-step strategy: 1) constructing text graphs based on various views (\textit…
▽ More
The rising popularity of online social network services has attracted lots of research on mining social media data, especially on mining social events. Social event detection, due to its wide applications, has now become a trivial task. State-of-the-art approaches exploiting Graph Neural Networks (GNNs) usually follow a two-step strategy: 1) constructing text graphs based on various views (\textit{co-user}, \textit{co-entities} and \textit{co-hashtags}); and 2) learning a unified text representation by a specific GNN model. Generally, the results heavily rely on the quality of the constructed graphs and the specific message passing scheme. However, existing methods have deficiencies in both aspects: 1) They fail to recognize the noisy information induced by unreliable views. 2) Temporal information which works as a vital indicator of events is neglected in most works. To this end, we propose ETGNN, a novel Evidential Temporal-aware Graph Neural Network. Specifically, we construct view-specific graphs whose nodes are the texts and edges are determined by several types of shared elements respectively. To incorporate temporal information into the message passing scheme, we introduce a novel temporal-aware aggregator which assigns weights to neighbours according to an adaptive time exponential decay formula. Considering the view-specific uncertainty, the representations of all views are converted into mass functions through evidential deep learning (EDL) neural networks, and further combined via Dempster-Shafer theory (DST) to make the final detection. Experimental results on three real-world datasets demonstrate the effectiveness of ETGNN in accuracy, reliability and robustness in social event detection.
△ Less
Submitted 24 May, 2022;
originally announced May 2022.
-
HiURE: Hierarchical Exemplar Contrastive Learning for Unsupervised Relation Extraction
Authors:
Xuming Hu,
Shuliang Liu,
Chenwei Zhang,
Shu`ang Li,
Lijie Wen,
Philip S. Yu
Abstract:
Unsupervised relation extraction aims to extract the relationship between entities from natural language sentences without prior information on relational scope or distribution. Existing works either utilize self-supervised schemes to refine relational feature signals by iteratively leveraging adaptive clustering and classification that provoke gradual drift problems, or adopt instance-wise contra…
▽ More
Unsupervised relation extraction aims to extract the relationship between entities from natural language sentences without prior information on relational scope or distribution. Existing works either utilize self-supervised schemes to refine relational feature signals by iteratively leveraging adaptive clustering and classification that provoke gradual drift problems, or adopt instance-wise contrastive learning which unreasonably pushes apart those sentence pairs that are semantically similar. To overcome these defects, we propose a novel contrastive learning framework named HiURE, which has the capability to derive hierarchical signals from relational feature space using cross hierarchy attention and effectively optimize relation representation of sentences under exemplar-wise contrastive learning. Experimental results on two public datasets demonstrate the advanced effectiveness and robustness of HiURE on unsupervised relation extraction when compared with state-of-the-art models.
△ Less
Submitted 20 February, 2023; v1 submitted 4 May, 2022;
originally announced May 2022.
-
XLTime: A Cross-Lingual Knowledge Transfer Framework for Temporal Expression Extraction
Authors:
Yuwei Cao,
William Groves,
Tanay Kumar Saha,
Joel R. Tetreault,
Alex Jaimes,
Hao Peng,
Philip S. Yu
Abstract:
Temporal Expression Extraction (TEE) is essential for understanding time in natural language. It has applications in Natural Language Processing (NLP) tasks such as question answering, information retrieval, and causal inference. To date, work in this area has mostly focused on English as there is a scarcity of labeled data for other languages. We propose XLTime, a novel framework for multilingual…
▽ More
Temporal Expression Extraction (TEE) is essential for understanding time in natural language. It has applications in Natural Language Processing (NLP) tasks such as question answering, information retrieval, and causal inference. To date, work in this area has mostly focused on English as there is a scarcity of labeled data for other languages. We propose XLTime, a novel framework for multilingual TEE. XLTime works on top of pre-trained language models and leverages multi-task learning to prompt cross-language knowledge transfer both from English and within the non-English languages. XLTime alleviates problems caused by a shortage of data in the target language. We apply XLTime with different language models and show that it outperforms the previous automatic SOTA methods on French, Spanish, Portuguese, and Basque, by large margins. XLTime also closes the gap considerably on the handcrafted HeidelTime method.
△ Less
Submitted 3 May, 2022;
originally announced May 2022.
-
PyGOD: A Python Library for Graph Outlier Detection
Authors:
Kay Liu,
Yingtong Dou,
Xueying Ding,
Xiyang Hu,
Ruitong Zhang,
Hao Peng,
Lichao Sun,
Philip S. Yu
Abstract:
PyGOD is an open-source Python library for detecting outliers in graph data. As the first comprehensive library of its kind, PyGOD supports a wide array of leading graph-based methods for outlier detection under an easy-to-use, well-documented API designed for use by both researchers and practitioners. PyGOD provides modularized components of the different detectors implemented so that users can e…
▽ More
PyGOD is an open-source Python library for detecting outliers in graph data. As the first comprehensive library of its kind, PyGOD supports a wide array of leading graph-based methods for outlier detection under an easy-to-use, well-documented API designed for use by both researchers and practitioners. PyGOD provides modularized components of the different detectors implemented so that users can easily customize each detector for their purposes. To ease the construction of detection workflows, PyGOD offers numerous commonly used utility functions. To scale computation to large graphs, PyGOD supports functionalities for deep models such as sampling and mini-batch processing. PyGOD uses best practices in fostering code reliability and maintainability, including unit testing, continuous integration, and code coverage. To facilitate accessibility, PyGOD is released under a BSD 2-Clause license at https://pygod.org and at the Python Package Index (PyPI).
△ Less
Submitted 2 June, 2024; v1 submitted 26 April, 2022;
originally announced April 2022.
-
A Survey on Location-Driven Influence Maximization
Authors:
Taotao Cai,
Quan Z. Sheng,
Xiangyu Song,
Jian Yang,
Shuang Wang,
Wei Emma Zhang,
Jia Wu,
Philip S. Yu
Abstract:
Influence Maximization (IM), which aims to select a set of users from a social network to maximize the expected number of influenced users, is an evergreen hot research topic. Its research outcomes significantly impact real-world applications such as business marketing. The booming location-based network platforms of the last decade appeal to the researchers embedding the location information into…
▽ More
Influence Maximization (IM), which aims to select a set of users from a social network to maximize the expected number of influenced users, is an evergreen hot research topic. Its research outcomes significantly impact real-world applications such as business marketing. The booming location-based network platforms of the last decade appeal to the researchers embedding the location information into traditional IM research. In this survey, we provide a comprehensive review of the existing location-driven IM studies from the perspective of the following key aspects: (1) a review of the application scenarios of these works, (2) the diffusion models to evaluate the influence propagation, and (3) a comprehensive study of the approaches to deal with the location-driven IM problems together with a particular focus on the accelerating techniques. In the end, we draw prospects into the research directions in future IM research.
△ Less
Submitted 14 September, 2022; v1 submitted 17 April, 2022;
originally announced April 2022.
-
Multifaceted Improvements for Conversational Open-Domain Question Answering
Authors:
Tingting Liang,
Yixuan Jiang,
Congying Xia,
Ziqiang Zhao,
Yuyu Yin,
Philip S. Yu
Abstract:
Open-domain question answering (OpenQA) is an important branch of textual QA which discovers answers for the given questions based on a large number of unstructured documents. Effectively mining correct answers from the open-domain sources still has a fair way to go. Existing OpenQA systems might suffer from the issues of question complexity and ambiguity, as well as insufficient background knowle…
▽ More
Open-domain question answering (OpenQA) is an important branch of textual QA which discovers answers for the given questions based on a large number of unstructured documents. Effectively mining correct answers from the open-domain sources still has a fair way to go. Existing OpenQA systems might suffer from the issues of question complexity and ambiguity, as well as insufficient background knowledge. Recently, conversational OpenQA is proposed to address these issues with the abundant contextual information in the conversation. Promising as it might be, there exist several fundamental limitations including the inaccurate question understanding, the coarse ranking for passage selection, and the inconsistent usage of golden passage in the training and inference phases. To alleviate these limitations, in this paper, we propose a framework with Multifaceted Improvements for Conversational open-domain Question Answering (MICQA). Specifically, MICQA has three significant advantages. First, the proposed KL-divergence based regularization is able to lead to a better question understanding for retrieval and answer reading. Second, the added post-ranker module can push more relevant passages to the top placements and be selected for reader with a two-aspect constrains. Third, the well designed curriculum learning strategy effectively narrows the gap between the golden passage settings of training and inference, and encourages the reader to find true answer without the golden passage assistance. Extensive experiments conducted on the publicly available dataset OR-QuAC demonstrate the superiority of MICQA over the state-of-the-art model in conversational OpenQA task.
△ Less
Submitted 1 April, 2022;
originally announced April 2022.
-
Improving Contrastive Learning with Model Augmentation
Authors:
Zhiwei Liu,
Yongjun Chen,
Jia Li,
Man Luo,
Philip S. Yu,
Caiming Xiong
Abstract:
The sequential recommendation aims at predicting the next items in user behaviors, which can be solved by characterizing item relationships in sequences. Due to the data sparsity and noise issues in sequences, a new self-supervised learning (SSL) paradigm is proposed to improve the performance, which employs contrastive learning between positive and negative views of sequences.
However, existing…
▽ More
The sequential recommendation aims at predicting the next items in user behaviors, which can be solved by characterizing item relationships in sequences. Due to the data sparsity and noise issues in sequences, a new self-supervised learning (SSL) paradigm is proposed to improve the performance, which employs contrastive learning between positive and negative views of sequences.
However, existing methods all construct views by adopting augmentation from data perspectives, while we argue that 1) optimal data augmentation methods are hard to devise, 2) data augmentation methods destroy sequential correlations, and 3) data augmentation fails to incorporate comprehensive self-supervised signals.
Therefore, we investigate the possibility of model augmentation to construct view pairs. We propose three levels of model augmentation methods: neuron masking, layer dropping, and encoder complementing.
This work opens up a novel direction in constructing views for contrastive SSL. Experiments verify the efficacy of model augmentation for the SSL in the sequential recommendation. Code is available\footnote{\url{https://github.com/salesforce/SRMA}}.
△ Less
Submitted 25 March, 2022;
originally announced March 2022.
-
Deep reinforcement learning guided graph neural networks for brain network analysis
Authors:
Xusheng Zhao,
Jia Wu,
Hao Peng,
Amin Beheshti,
Jessica J. M. Monaghan,
David McAlpine,
Heivet Hernandez-Perez,
Mark Dras,
Qiong Dai,
Yangyang Li,
Philip S. Yu,
Lifang He
Abstract:
Modern neuroimaging techniques, such as diffusion tensor imaging (DTI) and functional magnetic resonance imaging (fMRI), enable us to model the human brain as a brain network or connectome. Capturing brain networks' structural information and hierarchical patterns is essential for understanding brain functions and disease states. Recently, the promising network representation learning capability o…
▽ More
Modern neuroimaging techniques, such as diffusion tensor imaging (DTI) and functional magnetic resonance imaging (fMRI), enable us to model the human brain as a brain network or connectome. Capturing brain networks' structural information and hierarchical patterns is essential for understanding brain functions and disease states. Recently, the promising network representation learning capability of graph neural networks (GNNs) has prompted many GNN-based methods for brain network analysis to be proposed. Specifically, these methods apply feature aggregation and global pooling to convert brain network instances into meaningful low-dimensional representations used for downstream brain network analysis tasks. However, existing GNN-based methods often neglect that brain networks of different subjects may require various aggregation iterations and use GNN with a fixed number of layers to learn all brain networks. Therefore, how to fully release the potential of GNNs to promote brain network analysis is still non-trivial. To solve this problem, we propose a novel brain network representation framework, namely BN-GNN, which searches for the optimal GNN architecture for each brain network. Concretely, BN-GNN employs deep reinforcement learning (DRL) to train a meta-policy to automatically determine the optimal number of feature aggregations (reflected in the number of GNN layers) required for a given brain network. Extensive experiments on eight real-world brain network datasets demonstrate that our proposed BN-GNN improves the performance of traditional GNNs on different brain network analysis tasks.
△ Less
Submitted 24 July, 2022; v1 submitted 18 March, 2022;
originally announced March 2022.
-
G$^3$SR: Global Graph Guided Session-based Recommendation
Authors:
Zhi-Hong Deng,
Chang-Dong Wang,
Ling Huang,
Jian-Huang Lai,
Philip S. Yu
Abstract:
Session-based recommendation tries to make use of anonymous session data to deliver high-quality recommendation under the condition that user-profiles and the complete historical behavioral data of a target user are unavailable. Previous works consider each session individually and try to capture user interests within a session. Despite their encouraging results, these models can only perceive int…
▽ More
Session-based recommendation tries to make use of anonymous session data to deliver high-quality recommendation under the condition that user-profiles and the complete historical behavioral data of a target user are unavailable. Previous works consider each session individually and try to capture user interests within a session. Despite their encouraging results, these models can only perceive intra-session items and cannot draw upon the massive historical relational information. To solve this problem, we propose a novel method named G$^3$SR (Global Graph Guided Session-based Recommendation). G$^3$SR decomposes the session-based recommendation workflow into two steps. First, a global graph is built upon all session data, from which the global item representations are learned in an unsupervised manner. Then, these representations are refined on session graphs under the graph networks, and a readout function is used to generate session representations for each session. Extensive experiments on two real-world benchmark datasets show remarkable and consistent improvements of the G$^3$SR method over the state-of-the-art methods, especially for cold items.
△ Less
Submitted 12 March, 2022;
originally announced March 2022.
-
Attend, Memorize and Generate: Towards Faithful Table-to-Text Generation in Few Shots
Authors:
Wenting Zhao,
Ye Liu,
Yao Wan,
Philip S. Yu
Abstract:
Few-shot table-to-text generation is a task of composing fluent and faithful sentences to convey table content using limited data. Despite many efforts having been made towards generating impressive fluent sentences by fine-tuning powerful pre-trained language models, the faithfulness of generated content still needs to be improved. To this end, this paper proposes a novel approach Attend, Memoriz…
▽ More
Few-shot table-to-text generation is a task of composing fluent and faithful sentences to convey table content using limited data. Despite many efforts having been made towards generating impressive fluent sentences by fine-tuning powerful pre-trained language models, the faithfulness of generated content still needs to be improved. To this end, this paper proposes a novel approach Attend, Memorize and Generate (called AMG), inspired by the text generation process of humans. In particular, AMG (1) attends over the multi-granularity of context using a novel strategy based on table slot level and traditional token-by-token level attention to exploit both the table structure and natural linguistic information; (2) dynamically memorizes the table slot allocation states; and (3) generates faithful sentences according to both the context and memory allocation states. Comprehensive experiments with human evaluation on three domains (i.e., humans, songs, and books) of the Wiki dataset show that our model can generate higher qualified texts when compared with several state-of-the-art baselines, in both fluency and faithfulness.
△ Less
Submitted 1 March, 2022;
originally announced March 2022.
-
TaSPM: Targeted Sequential Pattern Mining
Authors:
Gengsen Huang,
Wensheng Gan,
Philip S. Yu
Abstract:
Sequential pattern mining (SPM) is an important technique of pattern mining, which has many applications in reality. Although many efficient sequential pattern mining algorithms have been proposed, there are few studies can focus on target sequences. Targeted querying sequential patterns can not only reduce the number of sequences generated by SPM, but also improve the efficiency of users in perfo…
▽ More
Sequential pattern mining (SPM) is an important technique of pattern mining, which has many applications in reality. Although many efficient sequential pattern mining algorithms have been proposed, there are few studies can focus on target sequences. Targeted querying sequential patterns can not only reduce the number of sequences generated by SPM, but also improve the efficiency of users in performing pattern analysis. The current algorithms available on targeted sequence querying are based on specific scenarios and cannot be generalized to other applications. In this paper, we formulate the problem of targeted sequential pattern mining and propose a generic framework namely TaSPM, based on the fast CM-SPAM algorithm. What's more, to improve the efficiency of TaSPM on large-scale datasets and multiple-items-based sequence datasets, we propose several pruning strategies to reduce meaningless operations in mining processes. Totally four pruning strategies are designed in TaSPM, and hence it can terminate unnecessary pattern extensions quickly and achieve better performance. Finally, we conduct extensive experiments on different datasets to compare the existing SPM algorithms with TaSPM. Experiments show that the novel targeted mining algorithm TaSPM can achieve faster running time and less memory consumption.
△ Less
Submitted 26 February, 2022;
originally announced February 2022.
-
Towards Revenue Maximization with Popular and Profitable Products
Authors:
Wensheng Gan,
Guoting Chen,
Hongzhi Yin,
Philippe Fournier-Viger,
Chien-Ming Chen,
Philip S. Yu
Abstract:
Economic-wise, a common goal for companies conducting marketing is to maximize the return revenue/profit by utilizing the various effective marketing strategies. Consumer behavior is crucially important in economy and targeted marketing, in which behavioral economics can provide valuable insights to identify the biases and profit from customers. Finding credible and reliable information on product…
▽ More
Economic-wise, a common goal for companies conducting marketing is to maximize the return revenue/profit by utilizing the various effective marketing strategies. Consumer behavior is crucially important in economy and targeted marketing, in which behavioral economics can provide valuable insights to identify the biases and profit from customers. Finding credible and reliable information on products' profitability is, however, quite difficult since most products tends to peak at certain times w.r.t. seasonal sales cycle in a year. On-Shelf Availability (OSA) plays a key factor for performance evaluation. Besides, staying ahead of hot product trends means we can increase marketing efforts without selling out the inventory. To fulfill this gap, in this paper, we first propose a general profit-oriented framework to address the problem of revenue maximization based on economic behavior, and compute the 0n-shelf Popular and most Profitable Products (OPPPs) for the targeted marketing. To tackle the revenue maximization problem, we model the k-satisfiable product concept and propose an algorithmic framework for searching OPPP and its variants. Extensive experiments are conducted on several real-world datasets to evaluate the effectiveness and efficiency of the proposed algorithm.
△ Less
Submitted 25 February, 2022;
originally announced February 2022.
-
Graph Masked Autoencoders with Transformers
Authors:
Sixiao Zhang,
Hongxu Chen,
Haoran Yang,
Xiangguo Sun,
Philip S. Yu,
Guandong Xu
Abstract:
Recently, transformers have shown promising performance in learning graph representations. However, there are still some challenges when applying transformers to real-world scenarios due to the fact that deep transformers are hard to train from scratch and the quadratic memory consumption w.r.t. the number of nodes. In this paper, we propose Graph Masked Autoencoders (GMAEs), a self-supervised tra…
▽ More
Recently, transformers have shown promising performance in learning graph representations. However, there are still some challenges when applying transformers to real-world scenarios due to the fact that deep transformers are hard to train from scratch and the quadratic memory consumption w.r.t. the number of nodes. In this paper, we propose Graph Masked Autoencoders (GMAEs), a self-supervised transformer-based model for learning graph representations. To address the above two challenges, we adopt the masking mechanism and the asymmetric encoder-decoder design. Specifically, GMAE takes partially masked graphs as input, and reconstructs the features of the masked nodes. The encoder and decoder are asymmetric, where the encoder is a deep transformer and the decoder is a shallow transformer. The masking mechanism and the asymmetric design make GMAE a memory-efficient model compared with conventional transformers. We show that, when serving as a conventional self-supervised graph representation model, GMAE achieves state-of-the-art performance on both the graph classification task and the node classification task under common downstream evaluation protocols. We also show that, compared with training in an end-to-end manner from scratch, we can achieve comparable performance after pre-training and fine-tuning using GMAE while simplifying the training process.
△ Less
Submitted 12 May, 2022; v1 submitted 16 February, 2022;
originally announced February 2022.
-
Graph Neural Networks for Graphs with Heterophily: A Survey
Authors:
Xin Zheng,
Yi Wang,
Yixin Liu,
Ming Li,
Miao Zhang,
Di Jin,
Philip S. Yu,
Shirui Pan
Abstract:
Recent years have witnessed fast developments of graph neural networks (GNNs) that have benefited myriads of graph analytic tasks and applications. In general, most GNNs depend on the homophily assumption that nodes belonging to the same class are more likely to be connected. However, as a ubiquitous graph property in numerous real-world scenarios, heterophily, i.e., nodes with different labels te…
▽ More
Recent years have witnessed fast developments of graph neural networks (GNNs) that have benefited myriads of graph analytic tasks and applications. In general, most GNNs depend on the homophily assumption that nodes belonging to the same class are more likely to be connected. However, as a ubiquitous graph property in numerous real-world scenarios, heterophily, i.e., nodes with different labels tend to be linked, significantly limits the performance of tailor-made homophilic GNNs. Hence, GNNs for heterophilic graphs are gaining increasing research attention to enhance graph learning with heterophily. In this paper, we provide a comprehensive review of GNNs for heterophilic graphs. Specifically, we propose a systematic taxonomy that essentially governs existing heterophilic GNN models, along with a general summary and detailed analysis. Furthermore, we discuss the correlation between graph heterophily and various graph research domains, aiming to facilitate the development of more effective GNNs across a spectrum of practical applications and learning tasks in the graph research community. In the end, we point out the potential directions to advance and stimulate more future research and applications on heterophilic graph learning with GNNs.
△ Less
Submitted 24 February, 2024; v1 submitted 14 February, 2022;
originally announced February 2022.
-
Deep learning for drug repurposing: methods, databases, and applications
Authors:
Xiaoqin Pan,
Xuan Lin,
Dongsheng Cao,
Xiangxiang Zeng,
Philip S. Yu,
Lifang He,
Ruth Nussinov,
Feixiong Cheng
Abstract:
Drug development is time-consuming and expensive. Repurposing existing drugs for new therapies is an attractive solution that accelerates drug development at reduced experimental costs, specifically for Coronavirus Disease 2019 (COVID-19), an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). However, comprehensively obtaining and productively integrating av…
▽ More
Drug development is time-consuming and expensive. Repurposing existing drugs for new therapies is an attractive solution that accelerates drug development at reduced experimental costs, specifically for Coronavirus Disease 2019 (COVID-19), an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). However, comprehensively obtaining and productively integrating available knowledge and big biomedical data to effectively advance deep learning models is still challenging for drug repurposing in other complex diseases. In this review, we introduce guidelines on how to utilize deep learning methodologies and tools for drug repurposing. We first summarized the commonly used bioinformatics and pharmacogenomics databases for drug repurposing. Next, we discuss recently developed sequence-based and graph-based representation approaches as well as state-of-the-art deep learning-based methods. Finally, we present applications of drug repurposing to fight the COVID-19 pandemic, and outline its future challenges.
△ Less
Submitted 8 February, 2022;
originally announced February 2022.
-
Large-scale Personalized Video Game Recommendation via Social-aware Contextualized Graph Neural Network
Authors:
Liangwei Yang,
Zhiwei Liu,
Yu Wang,
Chen Wang,
Ziwei Fan,
Philip S. Yu
Abstract:
Because of the large number of online games available nowadays, online game recommender systems are necessary for users and online game platforms. The former can discover more potential online games of their interests, and the latter can attract users to dwell longer in the platform. This paper investigates the characteristics of user behaviors with respect to the online games on the Steam platfor…
▽ More
Because of the large number of online games available nowadays, online game recommender systems are necessary for users and online game platforms. The former can discover more potential online games of their interests, and the latter can attract users to dwell longer in the platform. This paper investigates the characteristics of user behaviors with respect to the online games on the Steam platform. Based on the observations, we argue that a satisfying recommender system for online games is able to characterize: personalization, game contextualization and social connection. However, simultaneously solving all is rather challenging for game recommendation. Firstly, personalization for game recommendation requires the incorporation of the dwelling time of engaged games, which are ignored in existing methods. Secondly, game contextualization should reflect the complex and high-order properties of those relations. Last but not least, it is problematic to use social connections directly for game recommendations due to the massive noise within social connections. To this end, we propose a Social-aware Contextualized Graph Neural Recommender System (SCGRec), which harnesses three perspectives to improve game recommendation. We conduct a comprehensive analysis of users' online game behaviors, which motivates the necessity of handling those three characteristics in the online game recommendation.
△ Less
Submitted 10 February, 2022; v1 submitted 7 February, 2022;
originally announced February 2022.
-
Link Prediction with Contextualized Self-Supervision
Authors:
Daokun Zhang,
Jie Yin,
Philip S. Yu
Abstract:
Link prediction aims to infer the link existence between pairs of nodes in networks/graphs. Despite their wide application, the success of traditional link prediction algorithms is hindered by three major challenges -- link sparsity, node attribute noise and dynamic changes -- that are faced by many real-world networks. To address these challenges, we propose a Contextualized Self-Supervised Learn…
▽ More
Link prediction aims to infer the link existence between pairs of nodes in networks/graphs. Despite their wide application, the success of traditional link prediction algorithms is hindered by three major challenges -- link sparsity, node attribute noise and dynamic changes -- that are faced by many real-world networks. To address these challenges, we propose a Contextualized Self-Supervised Learning (CSSL) framework that fully exploits structural context prediction for link prediction. The proposed CSSL framework learns a link encoder to infer the link existence probability from paired node embeddings, which are constructed via a transformation on node attributes. To generate informative node embeddings for link prediction, structural context prediction is leveraged as a self-supervised learning task to boost the link prediction performance. Two types of structural context are investigated, i.e., context nodes collected from random walks vs. context subgraphs. The CSSL framework can be trained in an end-to-end manner, with the learning of model parameters supervised by both the link prediction and self-supervised learning tasks. The proposed CSSL is a generic and flexible framework in the sense that it can handle both attributed and non-attributed networks, and operate under both transductive and inductive link prediction settings. Extensive experiments and ablation studies on seven real-world benchmark networks demonstrate the superior performance of the proposed self-supervision based link prediction algorithm over state-of-the-art baselines, on different types of networks under both transductive and inductive settings. The proposed CSSL also yields competitive performance in terms of its robustness to node attribute noise and scalability over large-scale networks.
△ Less
Submitted 7 September, 2022; v1 submitted 24 January, 2022;
originally announced January 2022.
-
Dual Space Graph Contrastive Learning
Authors:
Haoran Yang,
Hongxu Chen,
Shirui Pan,
Lin Li,
Philip S. Yu,
Guandong Xu
Abstract:
Unsupervised graph representation learning has emerged as a powerful tool to address real-world problems and achieves huge success in the graph learning domain. Graph contrastive learning is one of the unsupervised graph representation learning methods, which recently attracts attention from researchers and has achieved state-of-the-art performances on various tasks. The key to the success of grap…
▽ More
Unsupervised graph representation learning has emerged as a powerful tool to address real-world problems and achieves huge success in the graph learning domain. Graph contrastive learning is one of the unsupervised graph representation learning methods, which recently attracts attention from researchers and has achieved state-of-the-art performances on various tasks. The key to the success of graph contrastive learning is to construct proper contrasting pairs to acquire the underlying structural semantics of the graph. However, this key part is not fully explored currently, most of the ways generating contrasting pairs focus on augmenting or perturbating graph structures to obtain different views of the input graph. But such strategies could degrade the performances via adding noise into the graph, which may narrow down the field of the applications of graph contrastive learning. In this paper, we propose a novel graph contrastive learning method, namely \textbf{D}ual \textbf{S}pace \textbf{G}raph \textbf{C}ontrastive (DSGC) Learning, to conduct graph contrastive learning among views generated in different spaces including the hyperbolic space and the Euclidean space. Since both spaces have their own advantages to represent graph data in the embedding spaces, we hope to utilize graph contrastive learning to bridge the spaces and leverage advantages from both sides. The comparison experiment results show that DSGC achieves competitive or better performances among all the datasets. In addition, we conduct extensive experiments to analyze the impact of different graph encoders on DSGC, giving insights about how to better leverage the advantages of contrastive learning between different spaces.
△ Less
Submitted 4 March, 2022; v1 submitted 18 January, 2022;
originally announced January 2022.
-
Sequential Recommendation via Stochastic Self-Attention
Authors:
Ziwei Fan,
Zhiwei Liu,
Alice Wang,
Zahra Nazari,
Lei Zheng,
Hao Peng,
Philip S. Yu
Abstract:
Sequential recommendation models the dynamics of a user's previous behaviors in order to forecast the next item, and has drawn a lot of attention. Transformer-based approaches, which embed items as vectors and use dot-product self-attention to measure the relationship between items, demonstrate superior capabilities among existing sequential methods. However, users' real-world sequential behaviors…
▽ More
Sequential recommendation models the dynamics of a user's previous behaviors in order to forecast the next item, and has drawn a lot of attention. Transformer-based approaches, which embed items as vectors and use dot-product self-attention to measure the relationship between items, demonstrate superior capabilities among existing sequential methods. However, users' real-world sequential behaviors are \textit{\textbf{uncertain}} rather than deterministic, posing a significant challenge to present techniques. We further suggest that dot-product-based approaches cannot fully capture \textit{\textbf{collaborative transitivity}}, which can be derived in item-item transitions inside sequences and is beneficial for cold start items. We further argue that BPR loss has no constraint on positive and sampled negative items, which misleads the optimization. We propose a novel \textbf{STO}chastic \textbf{S}elf-\textbf{A}ttention~(STOSA) to overcome these issues. STOSA, in particular, embeds each item as a stochastic Gaussian distribution, the covariance of which encodes the uncertainty. We devise a novel Wasserstein Self-Attention module to characterize item-item position-wise relationships in sequences, which effectively incorporates uncertainty into model training. Wasserstein attentions also enlighten the collaborative transitivity learning as it satisfies triangle inequality. Moreover, we introduce a novel regularization term to the ranking loss, which assures the dissimilarity between positive and the negative items. Extensive experiments on five real-world benchmark datasets demonstrate the superiority of the proposed model over state-of-the-art baselines, especially on cold start items. The code is available in \url{https://github.com/zfan20/STOSA}.
△ Less
Submitted 5 March, 2022; v1 submitted 16 January, 2022;
originally announced January 2022.
-
Multi-Sparse-Domain Collaborative Recommendation via Enhanced Comprehensive Aspect Preference Learning
Authors:
Xiaoyun Zhao,
Ning Yang,
Philip S. Yu
Abstract:
Cross-domain recommendation (CDR) has been attracting increasing attention of researchers for its ability to alleviate the data sparsity problem in recommender systems. However, the existing single-target or dual-target CDR methods often suffer from two drawbacks, the assumption of at least one rich domain and the heavy dependence on domain-invariant preference, which are impractical in real world…
▽ More
Cross-domain recommendation (CDR) has been attracting increasing attention of researchers for its ability to alleviate the data sparsity problem in recommender systems. However, the existing single-target or dual-target CDR methods often suffer from two drawbacks, the assumption of at least one rich domain and the heavy dependence on domain-invariant preference, which are impractical in real world where sparsity is ubiquitous and might degrade the user preference learning. To overcome these issues, we propose a Multi-Sparse-Domain Collaborative Recommendation (MSDCR) model for multi-target cross-domain recommendation. Unlike traditional CDR methods, MSDCR treats the multiple relevant domains as all sparse and can simultaneously improve the recommendation performance in each domain. We propose a Multi-Domain Separation Network (MDSN) and a Gated Aspect Preference Enhancement (GAPE) module for MSDCR to enhance a user's domain-specific aspect preferences in a domain by transferring the complementary aspect preferences in other domains, during which the uniqueness of the domain-specific preference can be preserved through the adversarial training offered by MDSN and the complementarity can be adaptively determined by GAPE. Meanwhile, we propose a Multi-Domain Adaptation Network (MDAN) for MSDCR to capture a user's domain-invariant aspect preference. With the integration of the enhanced domain-specific aspect preference and the domain-invariant aspect preference, MSDCR can reach a comprehensive understanding of a user's preference in each sparse domain. At last, the extensive experiments conducted on real datasets demonstrate the remarkable superiority of MSDCR over the state-of-the-art single-domain recommendation models and CDR models.
△ Less
Submitted 16 January, 2022;
originally announced January 2022.
-
Learning from Atypical Behavior: Temporary Interest Aware Recommendation Based on Reinforcement Learning
Authors:
Ziwen Du,
Ning Yang,
Zhonghua Yu,
Philip S. Yu
Abstract:
Traditional robust recommendation methods view atypical user-item interactions as noise and aim to reduce their impact with some kind of noise filtering technique, which often suffers from two challenges. First, in real world, atypical interactions may signal users' temporary interest different from their general preference. Therefore, simply filtering out the atypical interactions as noise may be…
▽ More
Traditional robust recommendation methods view atypical user-item interactions as noise and aim to reduce their impact with some kind of noise filtering technique, which often suffers from two challenges. First, in real world, atypical interactions may signal users' temporary interest different from their general preference. Therefore, simply filtering out the atypical interactions as noise may be inappropriate and degrade the personalization of recommendations. Second, it is hard to acquire the temporary interest since there are no explicit supervision signals to indicate whether an interaction is atypical or not. To address this challenges, we propose a novel model called Temporary Interest Aware Recommendation (TIARec), which can distinguish atypical interactions from normal ones without supervision and capture the temporary interest as well as the general preference of users. Particularly, we propose a reinforcement learning framework containing a recommender agent and an auxiliary classifier agent, which are jointly trained with the objective of maximizing the cumulative return of the recommendations made by the recommender agent. During the joint training process, the classifier agent can judge whether the interaction with an item recommended by the recommender agent is atypical, and the knowledge about learning temporary interest from atypical interactions can be transferred to the recommender agent, which makes the recommender agent able to alone make recommendations that balance the general preference and temporary interest of users. At last, the experiments conducted on real world datasets verify the effectiveness of TIARec.
△ Less
Submitted 16 January, 2022;
originally announced January 2022.
-
Interpretable and Effective Reinforcement Learning for Attacking against Graph-based Rumor Detection
Authors:
Yuefei Lyu,
Xiaoyu Yang,
Jiaxin Liu,
Philip S. Yu,
Sihong Xie,
Xi Zhang
Abstract:
Social networks are frequently polluted by rumors, which can be detected by advanced models such as graph neural networks. However, the models are vulnerable to attacks and understanding the vulnerabilities is critical to rumor detection in practice. To discover subtle vulnerabilities, we design a powerful attacking algorithm to camouflage rumors in social networks based on reinforcement learning…
▽ More
Social networks are frequently polluted by rumors, which can be detected by advanced models such as graph neural networks. However, the models are vulnerable to attacks and understanding the vulnerabilities is critical to rumor detection in practice. To discover subtle vulnerabilities, we design a powerful attacking algorithm to camouflage rumors in social networks based on reinforcement learning that can interact with and attack any black-box detectors. The environment has exponentially large state spaces, high-order graph dependencies, and delayed noisy rewards, making the state-of-the-art end-to-end approaches difficult to learn features as large learning costs and expressive limitation of graph deep models. Instead, we design domain-specific features to avoid learning features and produce interpretable attack policies. To further speed up policy optimization, we devise: (i) a credit assignment method that decomposes delayed rewards to atomic attacking actions proportional to the their camouflage effects on target rumors; (ii) a time-dependent control variate to reduce reward variance due to large graphs and many attacking steps, supported by the reward variance analysis and a Bayesian analysis of the prediction distribution. On three real world datasets of rumor detection tasks, we demonstrate: (i) the effectiveness of the learned attacking policy compared to rule-based attacks and current end-to-end approaches; (ii) the usefulness of the proposed credit assignment strategy and variance reduction components; (iii) the interpretability of the policy when generating strong attacks via the case study.
△ Less
Submitted 14 October, 2022; v1 submitted 15 January, 2022;
originally announced January 2022.
-
Translational Concept Embedding for Generalized Compositional Zero-shot Learning
Authors:
He Huang,
Wei Tang,
Jiawei Zhang,
Philip S. Yu
Abstract:
Generalized compositional zero-shot learning means to learn composed concepts of attribute-object pairs in a zero-shot fashion, where a model is trained on a set of seen concepts and tested on a combined set of seen and unseen concepts. This task is very challenging because of not only the gap between seen and unseen concepts but also the contextual dependency between attributes and objects. This…
▽ More
Generalized compositional zero-shot learning means to learn composed concepts of attribute-object pairs in a zero-shot fashion, where a model is trained on a set of seen concepts and tested on a combined set of seen and unseen concepts. This task is very challenging because of not only the gap between seen and unseen concepts but also the contextual dependency between attributes and objects. This paper introduces a new approach, termed translational concept embedding, to solve these two difficulties in a unified framework. It models the effect of applying an attribute to an object as adding a translational attribute feature to an object prototype. We explicitly take into account of the contextual dependency between attributes and objects by generating translational attribute features conditionally dependent on the object prototypes. Furthermore, we design a ratio variance constraint loss to promote the model's generalization ability on unseen concepts. It regularizes the distances between concepts by utilizing knowledge from their pretrained word embeddings. We evaluate the performance of our model under both the unbiased and biased concept classification tasks, and show that our model is able to achieve good balance in predicting unseen and seen concepts.
△ Less
Submitted 20 December, 2021;
originally announced December 2021.
-
Graph Structure Learning with Variational Information Bottleneck
Authors:
Qingyun Sun,
Jianxin Li,
Hao Peng,
Jia Wu,
Xingcheng Fu,
Cheng Ji,
Philip S. Yu
Abstract:
Graph Neural Networks (GNNs) have shown promising results on a broad spectrum of applications. Most empirical studies of GNNs directly take the observed graph as input, assuming the observed structure perfectly depicts the accurate and complete relations between nodes. However, graphs in the real world are inevitably noisy or incomplete, which could even exacerbate the quality of graph representat…
▽ More
Graph Neural Networks (GNNs) have shown promising results on a broad spectrum of applications. Most empirical studies of GNNs directly take the observed graph as input, assuming the observed structure perfectly depicts the accurate and complete relations between nodes. However, graphs in the real world are inevitably noisy or incomplete, which could even exacerbate the quality of graph representations. In this work, we propose a novel Variational Information Bottleneck guided Graph Structure Learning framework, namely VIB-GSL, in the perspective of information theory. VIB-GSL advances the Information Bottleneck (IB) principle for graph structure learning, providing a more elegant and universal framework for mining underlying task-relevant relations. VIB-GSL learns an informative and compressive graph structure to distill the actionable information for specific downstream tasks. VIB-GSL deduces a variational approximation for irregular graph data to form a tractable IB objective function, which facilitates training stability. Extensive experimental results demonstrate that the superior effectiveness and robustness of VIB-GSL.
△ Less
Submitted 16 December, 2021;
originally announced December 2021.
-
Semi-Supervised Variational User Identity Linkage via Noise-Aware Self-Learning
Authors:
Chaozhuo Li,
Senzhang Wang,
Zheng Liu,
Xing Xie,
Lei Chen,
Philip S. Yu
Abstract:
User identity linkage, which aims to link identities of a natural person across different social platforms, has attracted increasing research interest recently. Existing approaches usually first embed the identities as deterministic vectors in a shared latent space, and then learn a classifier based on the available annotations. However, the formation and characteristics of real-world social platf…
▽ More
User identity linkage, which aims to link identities of a natural person across different social platforms, has attracted increasing research interest recently. Existing approaches usually first embed the identities as deterministic vectors in a shared latent space, and then learn a classifier based on the available annotations. However, the formation and characteristics of real-world social platforms are full of uncertainties, which makes these deterministic embedding based methods sub-optimal. In addition, it is intractable to collect sufficient linkage annotations due to the tremendous gaps between different platforms. Semi-supervised models utilize the unlabeled data to help capture the intrinsic data distribution, which are more promising in practical usage. However, the existing semi-supervised linkage methods heavily rely on the heuristically defined similarity measurements to incorporate the innate closeness between labeled and unlabeled samples. Such manually designed assumptions may not be consistent with the actual linkage signals and further introduce the noises. To address the mentioned limitations, in this paper we propose a novel Noise-aware Semi-supervised Variational User Identity Linkage (NSVUIL) model. Specifically, we first propose a novel supervised linkage module to incorporate the available annotations. Each social identity is represented by a Gaussian distribution in the Wasserstein space to simultaneously preserve the fine-grained social profiles and model the uncertainty of identities. Then, a noise-aware self-learning module is designed to faithfully augment the few available annotations, which is capable of filtering noises from the pseudo-labels generated by the supervised module.
△ Less
Submitted 14 December, 2021;
originally announced December 2021.