subscribe to arXiv mailings

TEG-DB: A Comprehensive Dataset and Benchmark of Textual-Edge Graphs

Authors: Zhuofeng Li, Zixing Gou, Xiangnan Zhang, Zhongyuan Liu, Sirui Li, Yuntong Hu, Chen Ling, Zheng Zhang, Liang Zhao

Abstract: Text-Attributed Graphs (TAGs) augment graph structures with natural language descriptions, facilitating detailed depictions of data and their interconnections across various real-world settings. However, existing TAG datasets predominantly feature textual information only at the nodes, with edges typically represented by mere binary or categorical attributes. This lack of rich textual edge annotat… ▽ More Text-Attributed Graphs (TAGs) augment graph structures with natural language descriptions, facilitating detailed depictions of data and their interconnections across various real-world settings. However, existing TAG datasets predominantly feature textual information only at the nodes, with edges typically represented by mere binary or categorical attributes. This lack of rich textual edge annotations significantly limits the exploration of contextual relationships between entities, hindering deeper insights into graph-structured data. To address this gap, we introduce Textual-Edge Graphs Datasets and Benchmark (TEG-DB), a comprehensive and diverse collection of benchmark textual-edge datasets featuring rich textual descriptions on nodes and edges. The TEG-DB datasets are large-scale and encompass a wide range of domains, from citation networks to social networks. In addition, we conduct extensive benchmark experiments on TEG-DB to assess the extent to which current techniques, including pre-trained language models, graph neural networks, and their combinations, can utilize textual node and edge information. Our goal is to elicit advancements in textual-edge graph research, specifically in developing methodologies that exploit rich textual node and edge descriptions to enhance graph analysis and provide deeper insights into complex real-world networks. The entire TEG-DB project is publicly accessible as an open-source repository on Github, accessible at https://github.com/Zhuofeng-Li/TEG-Benchmark. △ Less

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.08231 [pdf, other]

Using Deep Convolutional Neural Networks to Detect Rendered Glitches in Video Games

Authors: Carlos Garcia Ling, Konrad Tollmar, Linus Gisslen

Abstract: In this paper, we present a method using Deep Convolutional Neural Networks (DCNNs) to detect common glitches in video games. The problem setting consists of an image (800x800 RGB) as input to be classified into one of five defined classes, normal image, or one of four different kinds of glitches (stretched, low resolution, missing and placeholder textures). Using a supervised approach, we train a… ▽ More In this paper, we present a method using Deep Convolutional Neural Networks (DCNNs) to detect common glitches in video games. The problem setting consists of an image (800x800 RGB) as input to be classified into one of five defined classes, normal image, or one of four different kinds of glitches (stretched, low resolution, missing and placeholder textures). Using a supervised approach, we train a ShuffleNetV2 using generated data. This work focuses on detecting texture graphical anomalies achieving arguably good performance with an accuracy of 86.8\%, detecting 88\% of the glitches with a false positive rate of 8.7\%, and with the models being able to generalize and detect glitches even in unseen objects. We apply a confidence measure as well to tackle the issue with false positives as well as an effective way of aggregating images to achieve better detection in production. The main use of this work is the partial automatization of graphical testing in the final stages of video game development. △ Less

Submitted 12 June, 2024; originally announced June 2024.

Comments: 8 pages, 6 figures, AAIDE conference

arXiv:2405.20790 [pdf, other]

Intersectional Unfairness Discovery

Authors: Gezheng Xu, Qi Chen, Charles Ling, Boyu Wang, Changjian Shui

Abstract: AI systems have been shown to produce unfair results for certain subgroups of population, highlighting the need to understand bias on certain sensitive attributes. Current research often falls short, primarily focusing on the subgroups characterized by a single sensitive attribute, while neglecting the nature of intersectional fairness of multiple sensitive attributes. This paper focuses on its on… ▽ More AI systems have been shown to produce unfair results for certain subgroups of population, highlighting the need to understand bias on certain sensitive attributes. Current research often falls short, primarily focusing on the subgroups characterized by a single sensitive attribute, while neglecting the nature of intersectional fairness of multiple sensitive attributes. This paper focuses on its one fundamental aspect by discovering diverse high-bias subgroups under intersectional sensitive attributes. Specifically, we propose a Bias-Guided Generative Network (BGGN). By treating each bias value as a reward, BGGN efficiently generates high-bias intersectional sensitive attributes. Experiments on real-world text and image datasets demonstrate a diverse and efficient discovery of BGGN. To further evaluate the generated unseen but possible unfair intersectional sensitive attributes, we formulate them as prompts and use modern generative AI to produce new texts and images. The results of frequently generating biased data provides new insights of discovering potential unfairness in popular modern generative AI systems. Warning: This paper contains generative examples that are offensive in nature. △ Less

Submitted 6 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

Comments: ICML-2024 camera-ready

arXiv:2405.16800 [pdf, other]

TAGA: Text-Attributed Graph Self-Supervised Learning by Synergizing Graph and Text Mutual Transformations

Authors: Zheng Zhang, Yuntong Hu, Bo Pan, Chen Ling, Liang Zhao

Abstract: Text-Attributed Graphs (TAGs) enhance graph structures with natural language descriptions, enabling detailed representation of data and their relationships across a broad spectrum of real-world scenarios. Despite the potential for deeper insights, existing TAG representation learning primarily relies on supervised methods, necessitating extensive labeled data and limiting applicability across dive… ▽ More Text-Attributed Graphs (TAGs) enhance graph structures with natural language descriptions, enabling detailed representation of data and their relationships across a broad spectrum of real-world scenarios. Despite the potential for deeper insights, existing TAG representation learning primarily relies on supervised methods, necessitating extensive labeled data and limiting applicability across diverse contexts. This paper introduces a new self-supervised learning framework, Text-And-Graph Multi-View Alignment (TAGA), which overcomes these constraints by integrating TAGs' structural and semantic dimensions. TAGA constructs two complementary views: Text-of-Graph view, which organizes node texts into structured documents based on graph topology, and the Graph-of-Text view, which converts textual nodes and connections into graph data. By aligning representations from both views, TAGA captures joint textual and structural information. In addition, a novel structure-preserving random walk algorithm is proposed for efficient training on large-sized TAGs. Our framework demonstrates strong performance in zero-shot and few-shot scenarios across eight real-world datasets. △ Less

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.16606 [pdf, other]

Link Prediction on Textual Edge Graphs

Authors: Chen Ling, Zhuofeng Li, Yuntong Hu, Zheng Zhang, Zhongyuan Liu, Shuang Zheng, Liang Zhao

Abstract: Textual-edge Graphs (TEGs), characterized by rich text annotations on edges, are increasingly significant in network science due to their ability to capture rich contextual information among entities. Existing works have proposed various edge-aware graph neural networks (GNNs) or let language models directly make predictions. However, they often fall short of fully capturing the contextualized sem… ▽ More Textual-edge Graphs (TEGs), characterized by rich text annotations on edges, are increasingly significant in network science due to their ability to capture rich contextual information among entities. Existing works have proposed various edge-aware graph neural networks (GNNs) or let language models directly make predictions. However, they often fall short of fully capturing the contextualized semantics on edges and graph topology, respectively. This inadequacy is particularly evident in link prediction tasks that require a comprehensive understanding of graph topology and semantics between nodes. In this paper, we present a novel framework - Link2Doc, designed especially for link prediction on textual-edge graphs. Specifically, we propose to summarize neighborhood information between node pairs as a human-written document to preserve both semantic and topology information. A self-supervised learning model is then utilized to enhance GNN's text-understanding ability from language models. Empirical evaluations, including link prediction, edge classification, parameter analysis, runtime comparison, and ablation studies, on four real-world datasets demonstrate that Link2Doc achieves generally better performance against existing edge-aware GNNs and pre-trained language models in predicting links on TEGs. △ Less

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.16506 [pdf, other]

GRAG: Graph Retrieval-Augmented Generation

Authors: Yuntong Hu, Zhihan Lei, Zheng Zhang, Bo Pan, Chen Ling, Liang Zhao

Abstract: While Retrieval-Augmented Generation (RAG) enhances the accuracy and relevance of responses by generative language models, it falls short in graph-based contexts where both textual and topological information are important. Naive RAG approaches inherently neglect the structural intricacies of textual graphs, resulting in a critical gap in the generation process. To address this challenge, we intro… ▽ More While Retrieval-Augmented Generation (RAG) enhances the accuracy and relevance of responses by generative language models, it falls short in graph-based contexts where both textual and topological information are important. Naive RAG approaches inherently neglect the structural intricacies of textual graphs, resulting in a critical gap in the generation process. To address this challenge, we introduce $\textbf{Graph Retrieval-Augmented Generation (GRAG)}$, which significantly enhances both the retrieval and generation processes by emphasizing the importance of subgraph structures. Unlike RAG approaches that focus solely on text-based entity retrieval, GRAG maintains an acute awareness of graph topology, which is crucial for generating contextually and factually coherent responses. Our GRAG approach consists of four main stages: indexing of $k$-hop ego-graphs, graph retrieval, soft pruning to mitigate the impact of irrelevant entities, and generation with pruned textual subgraphs. GRAG's core workflow-retrieving textual subgraphs followed by soft pruning-efficiently identifies relevant subgraph structures while avoiding the computational infeasibility typical of exhaustive subgraph searches, which are NP-hard. Moreover, we propose a novel prompting strategy that achieves lossless conversion from textual subgraphs to hierarchical text descriptions. Extensive experiments on graph multi-hop reasoning benchmarks demonstrate that in scenarios requiring multi-hop reasoning on textual graphs, our GRAG approach significantly outperforms current state-of-the-art RAG methods while effectively mitigating hallucinations. △ Less

Submitted 26 May, 2024; originally announced May 2024.

Comments: 14 pages, 4 figures

arXiv:2405.11764 [pdf, other]

doi 10.1145/3626772.3657802

Modeling User Fatigue for Sequential Recommendation

Authors: Nian Li, Xin Ban, Cheng Ling, Chen Gao, Lantao Hu, Peng Jiang, Kun Gai, Yong Li, Qingmin Liao

Abstract: Recommender systems filter out information that meets user interests. However, users may be tired of the recommendations that are too similar to the content they have been exposed to in a short historical period, which is the so-called user fatigue. Despite the significance for a better user experience, user fatigue is seldom explored by existing recommenders. In fact, there are three main challen… ▽ More Recommender systems filter out information that meets user interests. However, users may be tired of the recommendations that are too similar to the content they have been exposed to in a short historical period, which is the so-called user fatigue. Despite the significance for a better user experience, user fatigue is seldom explored by existing recommenders. In fact, there are three main challenges to be addressed for modeling user fatigue, including what features support it, how it influences user interests, and how its explicit signals are obtained. In this paper, we propose to model user Fatigue in interest learning for sequential Recommendations (FRec). To address the first challenge, based on a multi-interest framework, we connect the target item with historical items and construct an interest-aware similarity matrix as features to support fatigue modeling. Regarding the second challenge, built upon feature cross, we propose a fatigue-enhanced multi-interest fusion to capture long-term interest. In addition, we develop a fatigue-gated recurrent unit for short-term interest learning, with temporal fatigue representations as important inputs for constructing update and reset gates. For the last challenge, we propose a novel sequence augmentation to obtain explicit fatigue signals for contrastive learning. We conduct extensive experiments on real-world datasets, including two public datasets and one large-scale industrial dataset. Experimental results show that FRec can improve AUC and GAUC up to 0.026 and 0.019 compared with state-of-the-art models, respectively. Moreover, large-scale online experiments demonstrate the effectiveness of FRec for fatigue reduction. Our codes are released at https://github.com/tsinghua-fib-lab/SIGIR24-FRec. △ Less

Submitted 22 May, 2024; v1 submitted 19 May, 2024; originally announced May 2024.

Comments: SIGIR 2024

arXiv:2405.10124 [pdf, ps, other]

Smoothing Linear Codes by Rényi Divergence and Applications to Security Reduction

Authors: Hao Yan, Cong Ling

Abstract: The concept of the smoothing parameter plays a crucial role in both lattice-based and code-based cryptography, primarily due to its effectiveness in achieving nearly uniform distributions through the addition of noise. Recent research by Pathegama and Barg has determined the optimal smoothing bound for random codes under Rényi Divergence for any order $α\in (1, \infty)$ \cite{pathegama2024r}. Cons… ▽ More The concept of the smoothing parameter plays a crucial role in both lattice-based and code-based cryptography, primarily due to its effectiveness in achieving nearly uniform distributions through the addition of noise. Recent research by Pathegama and Barg has determined the optimal smoothing bound for random codes under Rényi Divergence for any order $α\in (1, \infty)$ \cite{pathegama2024r}. Considering the inherent complexity of encoding/decoding algorithms in random codes, our research introduces enhanced structural elements into these coding schemes. Specifically, this paper presents a novel derivation of the smoothing bound for random linear codes, maintaining the same order of Rényi Divergence and achieving optimality for any $α\in (1,\infty)$. We extend this framework under KL Divergence by transitioning from random linear codes to random self-dual codes, and subsequently to random quasi-cyclic codes, incorporating progressively more structures. As an application, we derive an average-case to average-case reduction from the Learning Parity with Noise (LPN) problem to the average-case decoding problem. This reduction aligns with the parameter regime in \cite{debris2022worst}, but uniquely employs Rényi divergence and directly considers Bernoulli noise, instead of combining ball noise and Bernoulli noise. △ Less

Submitted 16 May, 2024; originally announced May 2024.

arXiv:2405.09784 [pdf, other]

Online bipartite matching with imperfect advice

Authors: Davin Choo, Themis Gouleakis, Chun Kai Ling, Arnab Bhattacharyya

Abstract: We study the problem of online unweighted bipartite matching with $n$ offline vertices and $n$ online vertices where one wishes to be competitive against the optimal offline algorithm. While the classic RANKING algorithm of Karp et al. [1990] provably attains competitive ratio of $1-1/e > 1/2$, we show that no learning-augmented method can be both 1-consistent and strictly better than $1/2$-robust… ▽ More We study the problem of online unweighted bipartite matching with $n$ offline vertices and $n$ online vertices where one wishes to be competitive against the optimal offline algorithm. While the classic RANKING algorithm of Karp et al. [1990] provably attains competitive ratio of $1-1/e > 1/2$, we show that no learning-augmented method can be both 1-consistent and strictly better than $1/2$-robust under the adversarial arrival model. Meanwhile, under the random arrival model, we show how one can utilize methods from distribution testing to design an algorithm that takes in external advice about the online vertices and provably achieves competitive ratio interpolating between any ratio attainable by advice-free methods and the optimal ratio of 1, depending on the advice quality. △ Less

Submitted 23 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

Comments: Accepted into ICML 2024

arXiv:2405.04051 [pdf, ps, other]

On the quantization goodness of polar lattices

Authors: Ling Liu, Shanxiang Lyu, Cong Ling, Baoming Bai

Abstract: In this work, we prove that polar lattices, when tailored for lossy compression, are quantization-good in the sense that their normalized second moments approach $\frac{1}{2πe}$ as the dimension of lattices increases. It has been predicted by Zamir et al. \cite{ZamirQZ96} that the Entropy Coded Dithered Quantization (ECDQ) system using quantization-good lattices can achieve the rate-distortion bou… ▽ More In this work, we prove that polar lattices, when tailored for lossy compression, are quantization-good in the sense that their normalized second moments approach $\frac{1}{2πe}$ as the dimension of lattices increases. It has been predicted by Zamir et al. \cite{ZamirQZ96} that the Entropy Coded Dithered Quantization (ECDQ) system using quantization-good lattices can achieve the rate-distortion bound of i.i.d. Gaussian sources. In our previous work \cite{LingQZ}, we established that polar lattices are indeed capable of attaining the same objective. It is reasonable to conjecture that polar lattices also demonstrate quantization goodness in the context of lossy compression. This study confirms this hypothesis. △ Less

Submitted 13 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

Comments: 12 pages, 5 figures, submitted to IEEE for possible publication

arXiv:2405.03070 [pdf, other]

Layered Graph Security Games

Authors: Jakub Černý, Chun Kai Ling, Christian Kroer, Garud Iyengar

Abstract: Security games model strategic interactions in adversarial real-world applications. Such applications often involve extremely large but highly structured strategy sets (e.g., selecting a distribution over all patrol routes in a given graph). In this paper, we represent each player's strategy space using a layered graph whose paths represent an exponentially large strategy space. Our formulation en… ▽ More Security games model strategic interactions in adversarial real-world applications. Such applications often involve extremely large but highly structured strategy sets (e.g., selecting a distribution over all patrol routes in a given graph). In this paper, we represent each player's strategy space using a layered graph whose paths represent an exponentially large strategy space. Our formulation entails not only classic pursuit-evasion games, but also other security games, such as those modeling anti-terrorism and logistical interdiction. We study two-player zero-sum games under two distinct utility models: linear and binary utilities. We show that under linear utilities, Nash equilibrium can be computed in polynomial time, while binary utilities may lead to situations where even computing a best-response is computationally intractable. To this end, we propose a practical algorithm based on incremental strategy generation and mixed integer linear programs. We show through extensive experiments that our algorithm efficiently computes $ε$-equilibrium for many games of interest. We find that target values and graph structure often have a larger influence on running times as compared to the size of the graph per se. △ Less

Submitted 9 May, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

Comments: In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence. IJCAI Press, 2024

arXiv:2405.01680 [pdf, other]

Physics-Informed Neural Networks: Minimizing Residual Loss with Wide Networks and Effective Activations

Authors: Nima Hosseini Dashtbayaz, Ghazal Farhani, Boyu Wang, Charles X. Ling

Abstract: The residual loss in Physics-Informed Neural Networks (PINNs) alters the simple recursive relation of layers in a feed-forward neural network by applying a differential operator, resulting in a loss landscape that is inherently different from those of common supervised problems. Therefore, relying on the existing theory leads to unjustified design choices and suboptimal performance. In this work,… ▽ More The residual loss in Physics-Informed Neural Networks (PINNs) alters the simple recursive relation of layers in a feed-forward neural network by applying a differential operator, resulting in a loss landscape that is inherently different from those of common supervised problems. Therefore, relying on the existing theory leads to unjustified design choices and suboptimal performance. In this work, we analyze the residual loss by studying its characteristics at critical points to find the conditions that result in effective training of PINNs. Specifically, we first show that under certain conditions, the residual loss of PINNs can be globally minimized by a wide neural network. Furthermore, our analysis also reveals that an activation function with well-behaved high-order derivatives plays a crucial role in minimizing the residual loss. In particular, to solve a $k$-th order PDE, the $k$-th derivative of the activation function should be bijective. The established theory paves the way for designing and choosing effective activation functions for PINNs and explains why periodic activations have shown promising performance in certain cases. Finally, we verify our findings by conducting a set of experiments on several PDEs. Our code is publicly available at https://github.com/nimahsn/pinns_tf2. △ Less

Submitted 12 June, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

Comments: Accepted at IJCAI 2024. V2: Corrected typos

arXiv:2404.14668 [pdf, other]

Source Localization for Cross Network Information Diffusion

Authors: Chen Ling, Tanmoy Chowdhury, Jie Ji, Sirui Li, Andreas Züfle, Liang Zhao

Abstract: Source localization aims to locate information diffusion sources only given the diffusion observation, which has attracted extensive attention in the past few years. Existing methods are mostly tailored for single networks and may not be generalized to handle more complex networks like cross-networks. Cross-network is defined as two interconnected networks, where one network's functionality depend… ▽ More Source localization aims to locate information diffusion sources only given the diffusion observation, which has attracted extensive attention in the past few years. Existing methods are mostly tailored for single networks and may not be generalized to handle more complex networks like cross-networks. Cross-network is defined as two interconnected networks, where one network's functionality depends on the other. Source localization on cross-networks entails locating diffusion sources on the source network by only giving the diffused observation in the target network. The task is challenging due to challenges including: 1) diffusion sources distribution modeling; 2) jointly considering both static and dynamic node features; and 3) heterogeneous diffusion patterns learning. In this work, we propose a novel method, namely CNSL, to handle the three primary challenges. Specifically, we propose to learn the distribution of diffusion sources through Bayesian inference and leverage disentangled encoders to separately learn static and dynamic node features. The learning objective is coupled with the cross-network information propagation estimation model to make the inference of diffusion sources considering the overall diffusion process. Additionally, we also provide two novel cross-network datasets collected by ourselves. Extensive experiments are conducted on both datasets to demonstrate the effectiveness of \textit{CNSL} in handling the source localization on cross-networks. △ Less

Submitted 22 April, 2024; originally announced April 2024.

Comments: Code and data are available at: https://github.com/tanmoysr/CNSL/

arXiv:2403.13574 [pdf, other]

A Large Language Model Enhanced Sequential Recommender for Joint Video and Comment Recommendation

Authors: Bowen Zheng, Zihan Lin, Enze Liu, Chen Yang, Enyang Bai, Cheng Ling, Wayne Xin Zhao, Ji-Rong Wen

Abstract: In online video platforms, reading or writing comments on interesting videos has become an essential part of the video watching experience. However, existing video recommender systems mainly model users' interaction behaviors with videos, lacking consideration of comments in user behavior modeling. In this paper, we propose a novel recommendation approach called LSVCR by leveraging user interactio… ▽ More In online video platforms, reading or writing comments on interesting videos has become an essential part of the video watching experience. However, existing video recommender systems mainly model users' interaction behaviors with videos, lacking consideration of comments in user behavior modeling. In this paper, we propose a novel recommendation approach called LSVCR by leveraging user interaction histories with both videos and comments, so as to jointly conduct personalized video and comment recommendation. Specifically, our approach consists of two key components, namely sequential recommendation (SR) model and supplemental large language model (LLM) recommender. The SR model serves as the primary recommendation backbone (retained in deployment) of our approach, allowing for efficient user preference modeling. Meanwhile, we leverage the LLM recommender as a supplemental component (discarded in deployment) to better capture underlying user preferences from heterogeneous interaction behaviors. In order to integrate the merits of the SR model and the supplemental LLM recommender, we design a twostage training paradigm. The first stage is personalized preference alignment, which aims to align the preference representations from both components, thereby enhancing the semantics of the SR model. The second stage is recommendation-oriented fine-tuning, in which the alignment-enhanced SR model is fine-tuned according to specific objectives. Extensive experiments in both video and comment recommendation tasks demonstrate the effectiveness of LSVCR. Additionally, online A/B testing on the KuaiShou platform verifies the actual benefits brought by our approach. In particular, we achieve a significant overall gain of 4.13% in comment watch time. △ Less

Submitted 20 March, 2024; originally announced March 2024.

arXiv:2403.11440 [pdf, ps, other]

Boosting Continuous Emotion Recognition with Self-Pretraining using Masked Autoencoders, Temporal Convolutional Networks, and Transformers

Authors: Weiwei Zhou, Jiada Lu, Chenkun Ling, Weifeng Wang, Shaowei Liu

Abstract: Human emotion recognition holds a pivotal role in facilitating seamless human-computer interaction. This paper delineates our methodology in tackling the Valence-Arousal (VA) Estimation Challenge, Expression (Expr) Classification Challenge, and Action Unit (AU) Detection Challenge within the ambit of the 6th Workshop and Competition on Affective Behavior Analysis in-the-wild (ABAW). Our study advo… ▽ More Human emotion recognition holds a pivotal role in facilitating seamless human-computer interaction. This paper delineates our methodology in tackling the Valence-Arousal (VA) Estimation Challenge, Expression (Expr) Classification Challenge, and Action Unit (AU) Detection Challenge within the ambit of the 6th Workshop and Competition on Affective Behavior Analysis in-the-wild (ABAW). Our study advocates a novel approach aimed at refining continuous emotion recognition. We achieve this by initially harnessing pre-training with Masked Autoencoders (MAE) on facial datasets, followed by fine-tuning on the aff-wild2 dataset annotated with expression (Expr) labels. The pre-trained model serves as an adept visual feature extractor, thereby enhancing the model's robustness. Furthermore, we bolster the performance of continuous emotion recognition by integrating Temporal Convolutional Network (TCN) modules and Transformer Encoder modules into our framework. △ Less

Submitted 17 March, 2024; originally announced March 2024.

arXiv:2402.17946 [pdf, other]

SparseLLM: Towards Global Pruning for Pre-trained Language Models

Authors: Guangji Bai, Yijiang Li, Chen Ling, Kibaek Kim, Liang Zhao

Abstract: The transformative impact of large language models (LLMs) like LLaMA and GPT on natural language processing is countered by their prohibitive computational demands. Pruning has emerged as a pivotal compression strategy, introducing sparsity to enhance both memory and computational efficiency. Yet, traditional global pruning is impractical for LLMs due to scalability issues, while local pruning, de… ▽ More The transformative impact of large language models (LLMs) like LLaMA and GPT on natural language processing is countered by their prohibitive computational demands. Pruning has emerged as a pivotal compression strategy, introducing sparsity to enhance both memory and computational efficiency. Yet, traditional global pruning is impractical for LLMs due to scalability issues, while local pruning, despite its efficiency, leads to suboptimal solutions. Addressing these challenges, we propose SparseLLM, a novel framework that redefines the global pruning process into manageable, coordinated subproblems, allowing for resource-efficient optimization with global optimality. SparseLLM's approach, which conceptualizes LLMs as a chain of modular functions and leverages auxiliary variables for problem decomposition, not only facilitates a pragmatic application on LLMs but also demonstrates significant performance improvements, particularly in high-sparsity regimes where it surpasses current state-of-the-art methods. △ Less

Submitted 23 May, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

Comments: Preprint. Under review

arXiv:2402.16898 [pdf, other]

MIM-Reasoner: Learning with Theoretical Guarantees for Multiplex Influence Maximization

Authors: Nguyen Do, Tanmoy Chowdhury, Chen Ling, Liang Zhao, My T. Thai

Abstract: Multiplex influence maximization (MIM) asks us to identify a set of seed users such as to maximize the expected number of influenced users in a multiplex network. MIM has been one of central research topics, especially in nowadays social networking landscape where users participate in multiple online social networks (OSNs) and their influences can propagate among several OSNs simultaneously. Altho… ▽ More Multiplex influence maximization (MIM) asks us to identify a set of seed users such as to maximize the expected number of influenced users in a multiplex network. MIM has been one of central research topics, especially in nowadays social networking landscape where users participate in multiple online social networks (OSNs) and their influences can propagate among several OSNs simultaneously. Although there exist a couple combinatorial algorithms to MIM, learning-based solutions have been desired due to its generalization ability to heterogeneous networks and their diversified propagation characteristics. In this paper, we introduce MIM-Reasoner, coupling reinforcement learning with probabilistic graphical model, which effectively captures the complex propagation process within and between layers of a given multiplex network, thereby tackling the most challenging problem in MIM. We establish a theoretical guarantee for MIM-Reasoner as well as conduct extensive analyses on both synthetic and real-world datasets to validate our MIM-Reasoner's performance. △ Less

Submitted 10 March, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

Journal ref: International Conference on Artificial Intelligence and Statistics (AISTATS) 2024

arXiv:2402.13098 [pdf, other]

ELAD: Explanation-Guided Large Language Models Active Distillation

Authors: Yifei Zhang, Bo Pan, Chen Ling, Yuntong Hu, Liang Zhao

Abstract: The deployment and application of Large Language Models (LLMs) is hindered by their memory inefficiency, computational demands, and the high costs of API inferences. Traditional distillation methods, which transfer the capabilities of LLMs to smaller models, often fail to determine whether the knowledge has been sufficiently transferred, potentially resulting in high costs or incomplete distillati… ▽ More The deployment and application of Large Language Models (LLMs) is hindered by their memory inefficiency, computational demands, and the high costs of API inferences. Traditional distillation methods, which transfer the capabilities of LLMs to smaller models, often fail to determine whether the knowledge has been sufficiently transferred, potentially resulting in high costs or incomplete distillation. In this paper, we propose an Explanation-Guided LLMs Active Distillation (ELAD) framework that employs an active learning strategy to optimize the balance between annotation costs and model performance. To improve efficient sample selection, we introduce an explanation-guided sample selection method that identifies samples challenging its reasoning by exploiting uncertainties in explanation steps. Additionally, we present a customized LLM-annotated explanation revision technique where the teacher model detects and corrects flaws in the student model's reasoning. Our experiments across various reasoning datasets demonstrate that our framework significantly enhances the efficiency of LLM knowledge distillation. △ Less

Submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.10779 [pdf, other]

A Condensed Transition Graph Framework for Zero-shot Link Prediction with Large Language Models

Authors: Mingchen Li, Chen Ling, Rui Zhang, Liang Zhao

Abstract: Zero-shot link prediction (ZSLP) on knowledge graphs aims at automatically identifying relations between given entities. Existing methods primarily employ auxiliary information to predict tail entity given head entity and its relation, yet face challenges due to the occasional unavailability of such detailed information and the inherent simplicity of predicting tail entities based on semantic simi… ▽ More Zero-shot link prediction (ZSLP) on knowledge graphs aims at automatically identifying relations between given entities. Existing methods primarily employ auxiliary information to predict tail entity given head entity and its relation, yet face challenges due to the occasional unavailability of such detailed information and the inherent simplicity of predicting tail entities based on semantic similarities. Even though Large Language Models (LLMs) offer a promising solution to predict unobserved relations between the head and tail entity in a zero-shot manner, their performance is still restricted due to the inability to leverage all the (exponentially many) paths' information between two entities, which are critical in collectively indicating their relation types. To address this, in this work, we introduce a Condensed Transition Graph Framework for Zero-Shot Link Prediction (CTLP), which encodes all the paths' information in linear time complexity to predict unseen relations between entities, attaining both efficiency and information preservation. Specifically, we design a condensed transition graph encoder with theoretical guarantees on its coverage, expressiveness, and efficiency. It is learned by a transition graph contrastive learning strategy. Subsequently, we design a soft instruction tuning to learn and map the all-path embedding to the input of LLMs. Experimental results show that our proposed CTLP method achieves state-of-the-art performance on three standard ZSLP datasets △ Less

Submitted 16 February, 2024; originally announced February 2024.

arXiv:2402.10189 [pdf, other]

Uncertainty Quantification for In-Context Learning of Large Language Models

Authors: Chen Ling, Xujiang Zhao, Xuchao Zhang, Wei Cheng, Yanchi Liu, Yiyou Sun, Mika Oishi, Takao Osaki, Katsushi Matsuda, Jie Ji, Guangji Bai, Liang Zhao, Haifeng Chen

Abstract: In-context learning has emerged as a groundbreaking ability of Large Language Models (LLMs) and revolutionized various fields by providing a few task-relevant demonstrations in the prompt. However, trustworthy issues with LLM's response, such as hallucination, have also been actively discussed. Existing works have been devoted to quantifying the uncertainty in LLM's response, but they often overlo… ▽ More In-context learning has emerged as a groundbreaking ability of Large Language Models (LLMs) and revolutionized various fields by providing a few task-relevant demonstrations in the prompt. However, trustworthy issues with LLM's response, such as hallucination, have also been actively discussed. Existing works have been devoted to quantifying the uncertainty in LLM's response, but they often overlook the complex nature of LLMs and the uniqueness of in-context learning. In this work, we delve into the predictive uncertainty of LLMs associated with in-context learning, highlighting that such uncertainties may stem from both the provided demonstrations (aleatoric uncertainty) and ambiguities tied to the model's configurations (epistemic uncertainty). We propose a novel formulation and corresponding estimation method to quantify both types of uncertainties. The proposed method offers an unsupervised way to understand the prediction of in-context learning in a plug-and-play fashion. Extensive experiments are conducted to demonstrate the effectiveness of the decomposition. The code and data are available at: https://github.com/lingchen0331/UQ_ICL. △ Less

Submitted 28 March, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

Comments: Accepted to the main conference of NAACL 2024

arXiv:2402.07834 [pdf, other]

Generalizing across Temporal Domains with Koopman Operators

Authors: Qiuhao Zeng, Wei Wang, Fan Zhou, Gezheng Xu, Ruizhi Pu, Changjian Shui, Christian Gagne, Shichun Yang, Boyu Wang, Charles X. Ling

Abstract: In the field of domain generalization, the task of constructing a predictive model capable of generalizing to a target domain without access to target data remains challenging. This problem becomes further complicated when considering evolving dynamics between domains. While various approaches have been proposed to address this issue, a comprehensive understanding of the underlying generalization… ▽ More In the field of domain generalization, the task of constructing a predictive model capable of generalizing to a target domain without access to target data remains challenging. This problem becomes further complicated when considering evolving dynamics between domains. While various approaches have been proposed to address this issue, a comprehensive understanding of the underlying generalization theory is still lacking. In this study, we contribute novel theoretic results that aligning conditional distribution leads to the reduction of generalization bounds. Our analysis serves as a key motivation for solving the Temporal Domain Generalization (TDG) problem through the application of Koopman Neural Operators, resulting in Temporal Koopman Networks (TKNets). By employing Koopman Operators, we effectively address the time-evolving distributions encountered in TDG using the principles of Koopman theory, where measurement functions are sought to establish linear transition relations between evolving domains. Through empirical evaluations conducted on synthetic and real-world datasets, we validate the effectiveness of our proposed approach. △ Less

Submitted 15 February, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

Comments: 15 pages, 7 figures, Accepted by AAAI 2024. arXiv admin note: text overlap with arXiv:2206.00047

arXiv:2402.03030 [pdf, other]

Rejection-Sampled Universal Quantization for Smaller Quantization Errors

Authors: Chih Wei Ling, Cheuk Ting Li

Abstract: We construct a randomized vector quantizer which has a smaller maximum error compared to all known lattice quantizers with the same entropy for dimensions 5, 6, ..., 48, and also has a smaller mean squared error compared to known lattice quantizers with the same entropy for dimensions 35, ..., 48, in the high resolution limit. Moreover, our randomized quantizer has a desirable property that the qu… ▽ More We construct a randomized vector quantizer which has a smaller maximum error compared to all known lattice quantizers with the same entropy for dimensions 5, 6, ..., 48, and also has a smaller mean squared error compared to known lattice quantizers with the same entropy for dimensions 35, ..., 48, in the high resolution limit. Moreover, our randomized quantizer has a desirable property that the quantization error is always uniform over the ball and independent of the input. Our construction is based on applying rejection sampling on universal quantization, which allows us to shape the error distribution to be any continuous distribution, not only uniform distributions over basic cells of a lattice as in conventional dithered quantization. We also characterize the high SNR limit of one-shot channel simulation for any additive noise channel under a mild assumption (e.g., the AWGN channel), up to an additive constant of 1.45 bits. △ Less

Submitted 5 February, 2024; originally announced February 2024.

Comments: 15 pages, 2 figures

arXiv:2401.10773 [pdf, other]

Multilevel lattice codes from Hurwitz quaternion integers

Authors: Juliana G. F. Souza, Sueli I. R. Costa, Cong Ling

Abstract: This work presents an extension of the Construction $π_A$ lattices proposed in \cite{huang2017construction}, to Hurwitz quaternion integers. This construction is provided by using an isomorphism from a version of the Chinese remainder theorem applied to maximal orders in contrast to natural orders in prior works. Exploiting this map, we analyze the performance of the resulting multilevel lattice c… ▽ More This work presents an extension of the Construction $π_A$ lattices proposed in \cite{huang2017construction}, to Hurwitz quaternion integers. This construction is provided by using an isomorphism from a version of the Chinese remainder theorem applied to maximal orders in contrast to natural orders in prior works. Exploiting this map, we analyze the performance of the resulting multilevel lattice codes, highlight via computer simulations their notably reduced computational complexity provided by the multistage decoding. Moreover it is shown that this construction effectively attain the Poltyrev-limit. △ Less

Submitted 27 February, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

arXiv:2401.09490 [pdf, other]

Gene-associated Disease Discovery Powered by Large Language Models

Authors: Jiayu Chang, Shiyu Wang, Chen Ling, Zhaohui Qin, Liang Zhao

Abstract: The intricate relationship between genetic variation and human diseases has been a focal point of medical research, evidenced by the identification of risk genes regarding specific diseases. The advent of advanced genome sequencing techniques has significantly improved the efficiency and cost-effectiveness of detecting these genetic markers, playing a crucial role in disease diagnosis and forming… ▽ More The intricate relationship between genetic variation and human diseases has been a focal point of medical research, evidenced by the identification of risk genes regarding specific diseases. The advent of advanced genome sequencing techniques has significantly improved the efficiency and cost-effectiveness of detecting these genetic markers, playing a crucial role in disease diagnosis and forming the basis for clinical decision-making and early risk assessment. To overcome the limitations of existing databases that record disease-gene associations from existing literature, which often lack real-time updates, we propose a novel framework employing Large Language Models (LLMs) for the discovery of diseases associated with specific genes. This framework aims to automate the labor-intensive process of sifting through medical literature for evidence linking genetic variations to diseases, thereby enhancing the efficiency of disease identification. Our approach involves using LLMs to conduct literature searches, summarize relevant findings, and pinpoint diseases related to specific genes. This paper details the development and application of our LLM-powered framework, demonstrating its potential in streamlining the complex process of literature retrieval and summarization to identify diseases associated with specific genetic variations. △ Less

Submitted 16 January, 2024; originally announced January 2024.

Comments: This is the official paper accepted by AAAI 2024 Workshop on Large Language Models for Biological Discoveries

arXiv:2401.00625 [pdf, ps, other]

Beyond Efficiency: A Systematic Survey of Resource-Efficient Large Language Models

Authors: Guangji Bai, Zheng Chai, Chen Ling, Shiyu Wang, Jiaying Lu, Nan Zhang, Tingwei Shi, Ziyang Yu, Mengdan Zhu, Yifei Zhang, Carl Yang, Yue Cheng, Liang Zhao

Abstract: The burgeoning field of Large Language Models (LLMs), exemplified by sophisticated models like OpenAI's ChatGPT, represents a significant advancement in artificial intelligence. These models, however, bring forth substantial challenges in the high consumption of computational, memory, energy, and financial resources, especially in environments with limited resource capabilities. This survey aims t… ▽ More The burgeoning field of Large Language Models (LLMs), exemplified by sophisticated models like OpenAI's ChatGPT, represents a significant advancement in artificial intelligence. These models, however, bring forth substantial challenges in the high consumption of computational, memory, energy, and financial resources, especially in environments with limited resource capabilities. This survey aims to systematically address these challenges by reviewing a broad spectrum of techniques designed to enhance the resource efficiency of LLMs. We categorize methods based on their optimization focus: computational, memory, energy, financial, and network resources and their applicability across various stages of an LLM's lifecycle, including architecture design, pretraining, finetuning, and system design. Additionally, the survey introduces a nuanced categorization of resource efficiency techniques by their specific resource types, which uncovers the intricate relationships and mappings between various resources and corresponding optimization techniques. A standardized set of evaluation metrics and datasets is also presented to facilitate consistent and fair comparisons across different models and techniques. By offering a comprehensive overview of the current sota and identifying open research avenues, this survey serves as a foundational reference for researchers and practitioners, aiding them in developing more sustainable and efficient LLMs in a rapidly evolving landscape. △ Less

Submitted 3 January, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

Comments: Preprint. GitHub repo: https://github.com/tiingweii-shii/Awesome-Resource-Efficient-LLM-Papers

arXiv:2312.15566 [pdf, other]

Deep Copula-Based Survival Analysis for Dependent Censoring with Identifiability Guarantees

Authors: Weijia Zhang, Chun Kai Ling, Xuanhui Zhang

Abstract: Censoring is the central problem in survival analysis where either the time-to-event (for instance, death), or the time-tocensoring (such as loss of follow-up) is observed for each sample. The majority of existing machine learning-based survival analysis methods assume that survival is conditionally independent of censoring given a set of covariates; an assumption that cannot be verified since onl… ▽ More Censoring is the central problem in survival analysis where either the time-to-event (for instance, death), or the time-tocensoring (such as loss of follow-up) is observed for each sample. The majority of existing machine learning-based survival analysis methods assume that survival is conditionally independent of censoring given a set of covariates; an assumption that cannot be verified since only marginal distributions is available from the data. The existence of dependent censoring, along with the inherent bias in current estimators has been demonstrated in a variety of applications, accentuating the need for a more nuanced approach. However, existing methods that adjust for dependent censoring require practitioners to specify the ground truth copula. This requirement poses a significant challenge for practical applications, as model misspecification can lead to substantial bias. In this work, we propose a flexible deep learning-based survival analysis method that simultaneously accommodate for dependent censoring and eliminates the requirement for specifying the ground truth copula. We theoretically prove the identifiability of our model under a broad family of copulas and survival distributions. Experiments results from a wide range of datasets demonstrate that our approach successfully discerns the underlying dependency structure and significantly reduces survival estimation bias when compared to existing methods. △ Less

Submitted 4 July, 2024; v1 submitted 24 December, 2023; originally announced December 2023.

Comments: To appear in AAAI 2024

arXiv:2312.09058 [pdf, other]

Learning Coalition Structures with Games

Authors: Yixuan Even Xu, Chun Kai Ling, Fei Fang

Abstract: Coalitions naturally exist in many real-world systems involving multiple decision makers such as ridesharing, security, and online ad auctions, but the coalition structure among the agents is often unknown. We propose and study an important yet previously overseen problem -- Coalition Structure Learning (CSL), where we aim to carefully design a series of games for the agents and infer the underlyi… ▽ More Coalitions naturally exist in many real-world systems involving multiple decision makers such as ridesharing, security, and online ad auctions, but the coalition structure among the agents is often unknown. We propose and study an important yet previously overseen problem -- Coalition Structure Learning (CSL), where we aim to carefully design a series of games for the agents and infer the underlying coalition structure by observing their interactions in those games. We establish a lower bound on the sample complexity -- defined as the number of games needed to learn the structure -- of any algorithms for CSL and propose the Iterative Grouping (IG) algorithm for designing normal-form games to achieve the lower bound. We show that IG can be extended to other succinct games such as congestion games and graphical games. Moreover, we solve CSL in a more restrictive and practical setting: auctions. We show a variant of IG to solve CSL in the auction setting even if we cannot design the bidder valuations. Finally, we conduct experiments to evaluate IG in the auction setting and the results align with our theoretical analysis. △ Less

Submitted 18 December, 2023; v1 submitted 14 December, 2023; originally announced December 2023.

Comments: 13 pages, 4 figures, 3 tables, aaai 2024

arXiv:2312.05822 [pdf, other]

Toward Open-ended Embodied Tasks Solving

Authors: William Wei Wang, Dongqi Han, Xufang Luo, Yifei Shen, Charles Ling, Boyu Wang, Dongsheng Li

Abstract: Empowering embodied agents, such as robots, with Artificial Intelligence (AI) has become increasingly important in recent years. A major challenge is task open-endedness. In practice, robots often need to perform tasks with novel goals that are multifaceted, dynamic, lack a definitive "end-state", and were not encountered during training. To tackle this problem, this paper introduces \textit{Diffu… ▽ More Empowering embodied agents, such as robots, with Artificial Intelligence (AI) has become increasingly important in recent years. A major challenge is task open-endedness. In practice, robots often need to perform tasks with novel goals that are multifaceted, dynamic, lack a definitive "end-state", and were not encountered during training. To tackle this problem, this paper introduces \textit{Diffusion for Open-ended Goals} (DOG), a novel framework designed to enable embodied AI to plan and act flexibly and dynamically for open-ended task goals. DOG synergizes the generative prowess of diffusion models with state-of-the-art, training-free guidance techniques to adaptively perform online planning and control. Our evaluations demonstrate that DOG can handle various kinds of novel task goals not seen during training, in both maze navigation and robot control problems. Our work sheds light on enhancing embodied AI's adaptability and competency in tackling open-ended goals. △ Less

Submitted 10 December, 2023; originally announced December 2023.

arXiv:2311.16392 [pdf, other]

Multi-defender Security Games with Schedules

Authors: Zimeng Song, Chun Kai Ling, Fei Fang

Abstract: Stackelberg Security Games are often used to model strategic interactions in high-stakes security settings. The majority of existing models focus on single-defender settings where a single entity assumes command of all security assets. However, many realistic scenarios feature multiple heterogeneous defenders with their own interests and priorities embedded in a more complex system. Furthermore, d… ▽ More Stackelberg Security Games are often used to model strategic interactions in high-stakes security settings. The majority of existing models focus on single-defender settings where a single entity assumes command of all security assets. However, many realistic scenarios feature multiple heterogeneous defenders with their own interests and priorities embedded in a more complex system. Furthermore, defenders rarely choose targets to protect. Instead, they have a multitude of defensive resources or schedules at its disposal, each with different protective capabilities. In this paper, we study security games featuring multiple defenders and schedules simultaneously. We show that unlike prior work on multi-defender security games, the introduction of schedules can cause non-existence of equilibrium even under rather restricted environments. We prove that under the mild restriction that any subset of a schedule is also a schedule, non-existence of equilibrium is not only avoided, but can be computed in polynomial time in games with two defenders. Under additional assumptions, our algorithm can be extended to games with more than two defenders and its computation scaled up in special classes of games with compactly represented schedules such as those used in patrolling applications. Experimental results suggest that our methods scale gracefully with game size, making our algorithms amongst the few that can tackle multiple heterogeneous defenders. △ Less

Submitted 27 November, 2023; originally announced November 2023.

Comments: Extended version of the paper accepted to GameSec 2023

arXiv:2311.15161 [pdf, other]

doi 10.1109/TKDE.2024.3419449

Hessian Aware Low-Rank Perturbation for Order-Robust Continual Learning

Authors: Jiaqi Li, Yuanhao Lai, Rui Wang, Changjian Shui, Sabyasachi Sahoo, Charles X. Ling, Shichun Yang, Boyu Wang, Christian Gagné, Fan Zhou

Abstract: Continual learning aims to learn a series of tasks sequentially without forgetting the knowledge acquired from the previous ones. In this work, we propose the Hessian Aware Low-Rank Perturbation algorithm for continual learning. By modeling the parameter transitions along the sequential tasks with the weight matrix transformation, we propose to apply the low-rank approximation on the task-adaptive… ▽ More Continual learning aims to learn a series of tasks sequentially without forgetting the knowledge acquired from the previous ones. In this work, we propose the Hessian Aware Low-Rank Perturbation algorithm for continual learning. By modeling the parameter transitions along the sequential tasks with the weight matrix transformation, we propose to apply the low-rank approximation on the task-adaptive parameters in each layer of the neural networks. Specifically, we theoretically demonstrate the quantitative relationship between the Hessian and the proposed low-rank approximation. The approximation ranks are then globally determined according to the marginal increment of the empirical loss estimated by the layer-specific gradient and low-rank approximation error. Furthermore, we control the model capacity by pruning less important parameters to diminish the parameter growth. We conduct extensive experiments on various benchmarks, including a dataset with large-scale tasks, and compare our method against some recent state-of-the-art methods to demonstrate the effectiveness and scalability of our proposed method. Empirical results show that our method performs better on different benchmarks, especially in achieving task order robustness and handling the forgetting issue. The source code is at https://github.com/lijiaqi/HALRP. △ Less

Submitted 7 July, 2024; v1 submitted 25 November, 2023; originally announced November 2023.

Comments: Accepted by IEEE Transactions on Knowledge and Data Engineering (TKDE)

arXiv:2310.11672 [pdf, other]

Open-ended Commonsense Reasoning with Unrestricted Answer Scope

Authors: Chen Ling, Xuchao Zhang, Xujiang Zhao, Yanchi Liu, Wei Cheng, Mika Oishi, Takao Osaki, Katsushi Matsuda, Haifeng Chen, Liang Zhao

Abstract: Open-ended Commonsense Reasoning is defined as solving a commonsense question without providing 1) a short list of answer candidates and 2) a pre-defined answer scope. Conventional ways of formulating the commonsense question into a question-answering form or utilizing external knowledge to learn retrieval-based methods are less applicable in the open-ended setting due to an inherent challenge. Wi… ▽ More Open-ended Commonsense Reasoning is defined as solving a commonsense question without providing 1) a short list of answer candidates and 2) a pre-defined answer scope. Conventional ways of formulating the commonsense question into a question-answering form or utilizing external knowledge to learn retrieval-based methods are less applicable in the open-ended setting due to an inherent challenge. Without pre-defining an answer scope or a few candidates, open-ended commonsense reasoning entails predicting answers by searching over an extremely large searching space. Moreover, most questions require implicit multi-hop reasoning, which presents even more challenges to our problem. In this work, we leverage pre-trained language models to iteratively retrieve reasoning paths on the external knowledge base, which does not require task-specific supervision. The reasoning paths can help to identify the most precise answer to the commonsense question. We conduct experiments on two commonsense benchmark datasets. Compared to other approaches, our proposed method achieves better performance both quantitatively and qualitatively. △ Less

Submitted 27 October, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

Comments: Findings of EMNLP 2023

arXiv:2310.10408 [pdf, other]

A cross Transformer for image denoising

Authors: Chunwei Tian, Menghua Zheng, Wangmeng Zuo, Shichao Zhang, Yanning Zhang, Chia-Wen Ling

Abstract: Deep convolutional neural networks (CNNs) depend on feedforward and feedback ways to obtain good performance in image denoising. However, how to obtain effective structural information via CNNs to efficiently represent given noisy images is key for complex scenes. In this paper, we propose a cross Transformer denoising CNN (CTNet) with a serial block (SB), a parallel block (PB), and a residual blo… ▽ More Deep convolutional neural networks (CNNs) depend on feedforward and feedback ways to obtain good performance in image denoising. However, how to obtain effective structural information via CNNs to efficiently represent given noisy images is key for complex scenes. In this paper, we propose a cross Transformer denoising CNN (CTNet) with a serial block (SB), a parallel block (PB), and a residual block (RB) to obtain clean images for complex scenes. A SB uses an enhanced residual architecture to deeply search structural information for image denoising. To avoid loss of key information, PB uses three heterogeneous networks to implement multiple interactions of multi-level features to broadly search for extra information for improving the adaptability of an obtained denoiser for complex scenes. Also, to improve denoising performance, Transformer mechanisms are embedded into the SB and PB to extract complementary salient features for effectively removing noise in terms of pixel relations. Finally, a RB is applied to acquire clean images. Experiments illustrate that our CTNet is superior to some popular denoising methods in terms of real and synthetic image denoising. It is suitable to mobile digital devices, i.e., phones. Codes can be obtained at https://github.com/hellloxiaotian/CTNet. △ Less

Submitted 16 October, 2023; originally announced October 2023.

arXiv:2309.08301 [pdf, other]

RaSpectLoc: RAman SPECTroscopy-dependent robot LOCalisation

Authors: Christopher Thomas Thirgood, Oscar Alejandro Mendez Maldonado, Chao Ling, Jonathan Storey, Simon J Hadfield

Abstract: This paper presents a new information source for supporting robot localisation: material composition. The proposed method complements the existing visual, structural, and semantic cues utilized in the literature. However, it has a distinct advantage in its ability to differentiate structurally, visually or categorically similar objects such as different doors, by using Raman spectrometers. Such de… ▽ More This paper presents a new information source for supporting robot localisation: material composition. The proposed method complements the existing visual, structural, and semantic cues utilized in the literature. However, it has a distinct advantage in its ability to differentiate structurally, visually or categorically similar objects such as different doors, by using Raman spectrometers. Such devices can identify the material of objects it probes through the bonds between the material's molecules. Unlike similar sensors, such as mass spectroscopy, it does so without damaging the material or environment. In addition to introducing the first material-based localisation algorithm, this paper supports the future growth of the field by presenting a gazebo plugin for Raman spectrometers, material sensing demonstrations, as well as the first-ever localisation data-set with benchmarks for material-based localisation. This benchmarking shows that the proposed technique results in a significant improvement over current state-of-the-art localisation techniques, achieving 16\% more accurate localisation than the leading baseline. △ Less

Submitted 21 September, 2023; v1 submitted 15 September, 2023; originally announced September 2023.

Comments: 8 pages, 5 figures. This work will be presented at IROS 2023

arXiv:2309.06982 [pdf, other]

Communication-Efficient Laplace Mechanism for Differential Privacy via Random Quantization

Authors: Ali Moradi Shahmiri, Chih Wei Ling, Cheuk Ting Li

Abstract: We propose the first method that realizes the Laplace mechanism exactly (i.e., a Laplace noise is added to the data) that requires only a finite amount of communication (whereas the original Laplace mechanism requires the transmission of a real number) while guaranteeing privacy against the server and database. Our mechanism can serve as a drop-in replacement for local or centralized differential… ▽ More We propose the first method that realizes the Laplace mechanism exactly (i.e., a Laplace noise is added to the data) that requires only a finite amount of communication (whereas the original Laplace mechanism requires the transmission of a real number) while guaranteeing privacy against the server and database. Our mechanism can serve as a drop-in replacement for local or centralized differential privacy applications where the Laplace mechanism is used. Our mechanism is constructed using a random quantization technique. Unlike the simple and prevalent Laplace-mechanism-then-quantize approach, the quantization in our mechanism does not result in any distortion or degradation of utility. Unlike existing dithered quantization and channel simulation schemes for simulating additive Laplacian noise, our mechanism guarantees privacy not only against the database and downstream, but also against the honest but curious server which attempts to decode the data using the dither signals. △ Less

Submitted 13 September, 2023; originally announced September 2023.

Comments: 11 pages, 3 figures, short version to be submitted at 2024 IEEE International Conference on Acoustics, Speech and Signal Processing

arXiv:2309.03433 [pdf, other]

Improving Open Information Extraction with Large Language Models: A Study on Demonstration Uncertainty

Authors: Chen Ling, Xujiang Zhao, Xuchao Zhang, Yanchi Liu, Wei Cheng, Haoyu Wang, Zhengzhang Chen, Takao Osaki, Katsushi Matsuda, Haifeng Chen, Liang Zhao

Abstract: Open Information Extraction (OIE) task aims at extracting structured facts from unstructured text, typically in the form of (subject, relation, object) triples. Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks due to two key issues. First, LLMs struggle to distinguish irrelevant context f… ▽ More Open Information Extraction (OIE) task aims at extracting structured facts from unstructured text, typically in the form of (subject, relation, object) triples. Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks due to two key issues. First, LLMs struggle to distinguish irrelevant context from relevant relations and generate structured output due to the restrictions on fine-tuning the model. Second, LLMs generates responses autoregressively based on probability, which makes the predicted relations lack confidence. In this paper, we assess the capabilities of LLMs in improving the OIE task. Particularly, we propose various in-context learning strategies to enhance LLM's instruction-following ability and a demonstration uncertainty quantification module to enhance the confidence of the generated relations. Our experiments on three OIE benchmark datasets show that our approach holds its own against established supervised methods, both quantitatively and qualitatively. △ Less

Submitted 6 September, 2023; originally announced September 2023.

arXiv:2309.02978 [pdf, other]

Helper Recommendation with seniority control in Online Health Community

Authors: Junruo Gao, Chen Ling, Carl Yang, Liang Zhao

Abstract: Online health communities (OHCs) are forums where patients with similar conditions communicate their experiences and provide moral support. Social support in OHCs plays a crucial role in easing and rehabilitating patients. However, many time-sensitive questions from patients often remain unanswered due to the multitude of threads and the random nature of patient visits in OHCs. To address this iss… ▽ More Online health communities (OHCs) are forums where patients with similar conditions communicate their experiences and provide moral support. Social support in OHCs plays a crucial role in easing and rehabilitating patients. However, many time-sensitive questions from patients often remain unanswered due to the multitude of threads and the random nature of patient visits in OHCs. To address this issue, it is imperative to propose a recommender system that assists solution seekers in finding appropriate problem helpers. Nevertheless, developing a recommendation algorithm to enhance social support in OHCs remains an under-explored area. Traditional recommender systems cannot be directly adapted due to the following obstacles. First, unlike user-item links in traditional recommender systems, it is hard to model the social support behind helper-seeker links in OHCs since they are formed based on various heterogeneous reasons. Second, it is difficult to distinguish the impact of historical activities in characterizing patients. Third, it is significantly challenging to ensure that the recommended helpers possess sufficient expertise to assist the seekers. To tackle the aforementioned challenges, we develop a Monotonically regularIzed diseNTangled Variational Autoencoders (MINT) model to strengthen social support in OHCs. △ Less

Submitted 6 September, 2023; originally announced September 2023.

arXiv:2308.05472 [pdf, other]

PAC Codes for Source and Joint Source-Channel Coding

Authors: Mengfan Zheng, Cong Ling

Abstract: Polarization-adjusted convolutional (PAC) codes, as a concatenated coding scheme based on polar codes, is able to approach the finite-length bound of binary-input AWGN channel at short blocklengths. In this paper, we extend PAC codes to the fields of source coding and joint source-channel coding and show that they can also approach the corresponding finite-length bounds at short blocklengths. Polarization-adjusted convolutional (PAC) codes, as a concatenated coding scheme based on polar codes, is able to approach the finite-length bound of binary-input AWGN channel at short blocklengths. In this paper, we extend PAC codes to the fields of source coding and joint source-channel coding and show that they can also approach the corresponding finite-length bounds at short blocklengths. △ Less

Submitted 10 August, 2023; originally announced August 2023.

Comments: 6 pages, 6 figures. Submitted to GC 2023 Workshop - Channel Coding Beyond 5G

arXiv:2306.16077 [pdf, other]

Secure and Fast Asynchronous Vertical Federated Learning via Cascaded Hybrid Optimization

Authors: Ganyu Wang, Qingsong Zhang, Li Xiang, Boyu Wang, Bin Gu, Charles Ling

Abstract: Vertical Federated Learning (VFL) attracts increasing attention because it empowers multiple parties to jointly train a privacy-preserving model over vertically partitioned data. Recent research has shown that applying zeroth-order optimization (ZOO) has many advantages in building a practical VFL algorithm. However, a vital problem with the ZOO-based VFL is its slow convergence rate, which limits… ▽ More Vertical Federated Learning (VFL) attracts increasing attention because it empowers multiple parties to jointly train a privacy-preserving model over vertically partitioned data. Recent research has shown that applying zeroth-order optimization (ZOO) has many advantages in building a practical VFL algorithm. However, a vital problem with the ZOO-based VFL is its slow convergence rate, which limits its application in handling modern large models. To address this problem, we propose a cascaded hybrid optimization method in VFL. In this method, the downstream models (clients) are trained with ZOO to protect privacy and ensure that no internal information is shared. Meanwhile, the upstream model (server) is updated with first-order optimization (FOO) locally, which significantly improves the convergence rate, making it feasible to train the large models without compromising privacy and security. We theoretically prove that our VFL framework converges faster than the ZOO-based VFL, as the convergence of our framework is not limited by the size of the server model, making it effective for training large models with the major part on the server. Extensive experiments demonstrate that our method achieves faster convergence than the ZOO-based VFL framework, while maintaining an equivalent level of privacy protection. Moreover, we show that the convergence of our VFL is comparable to the unsafe FOO-based VFL baseline. Additionally, we demonstrate that our method makes the training of a large model feasible. △ Less

Submitted 29 June, 2023; v1 submitted 28 June, 2023; originally announced June 2023.

Comments: Under Review

arXiv:2306.04802 [pdf, other]

A Review on Knowledge Graphs for Healthcare: Resources, Applications, and Promises

Authors: Hejie Cui, Jiaying Lu, Shiyu Wang, Ran Xu, Wenjing Ma, Shaojun Yu, Yue Yu, Xuan Kan, Chen Ling, Tianfan Fu, Liang Zhao, Joyce Ho, Fei Wang, Carl Yang

Abstract: Healthcare knowledge graphs (HKGs) are valuable tools for organizing biomedical concepts and their relationships with interpretable structures. The recent advent of large language models (LLMs) has paved the way for building more comprehensive and accurate HKGs. This, in turn, can improve the reliability of generated content and enable better evaluation of LLMs. However, the challenges of HKGs suc… ▽ More Healthcare knowledge graphs (HKGs) are valuable tools for organizing biomedical concepts and their relationships with interpretable structures. The recent advent of large language models (LLMs) has paved the way for building more comprehensive and accurate HKGs. This, in turn, can improve the reliability of generated content and enable better evaluation of LLMs. However, the challenges of HKGs such as regarding data heterogeneity and limited coverage are not fully understood, highlighting the need for detailed reviews. This work provides the first comprehensive review of HKGs. It summarizes the pipeline and key techniques for HKG construction, as well as the common utilization approaches, i.e., model-free and model-based. The existing HKG resources are also organized based on the data types they capture and application domains they cover, along with relevant statistical information (Resource available at https://github.com/lujiaying/Awesome-HealthCare-KnowledgeBase). At the application level, we delve into the successful integration of HKGs across various health domains, ranging from fine-grained basic science research to high-level clinical decision support and public health. Lastly, the paper highlights the opportunities for HKGs in the era of LLMs. This work aims to serve as a valuable resource for understanding the potential and opportunities of HKG in health research. △ Less

Submitted 19 February, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

arXiv:2306.04539 [pdf, other]

Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications

Authors: Paul Pu Liang, Chun Kai Ling, Yun Cheng, Alex Obolenskiy, Yudong Liu, Rohan Pandey, Alex Wilf, Louis-Philippe Morency, Ruslan Salakhutdinov

Abstract: In many machine learning systems that jointly learn from multiple modalities, a core research question is to understand the nature of multimodal interactions: how modalities combine to provide new task-relevant information that was not present in either alone. We study this challenge of interaction quantification in a semi-supervised setting with only labeled unimodal data and naturally co-occurri… ▽ More In many machine learning systems that jointly learn from multiple modalities, a core research question is to understand the nature of multimodal interactions: how modalities combine to provide new task-relevant information that was not present in either alone. We study this challenge of interaction quantification in a semi-supervised setting with only labeled unimodal data and naturally co-occurring multimodal data (e.g., unlabeled images and captions, video and corresponding audio) but when labeling them is time-consuming. Using a precise information-theoretic definition of interactions, our key contribution is the derivation of lower and upper bounds to quantify the amount of multimodal interactions in this semi-supervised setting. We propose two lower bounds: one based on the shared information between modalities and the other based on disagreement between separately trained unimodal classifiers, and derive an upper bound through connections to approximate algorithms for min-entropy couplings. We validate these estimated bounds and show how they accurately track true interactions. Finally, we show how these theoretical results can be used to estimate multimodal model performance, guide data collection, and select appropriate multimodal models for various tasks. △ Less

Submitted 13 June, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

Comments: ICLR 2024, Code available at: https://github.com/pliang279/PID

arXiv:2305.18703 [pdf, other]

Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey

Authors: Chen Ling, Xujiang Zhao, Jiaying Lu, Chengyuan Deng, Can Zheng, Junxiang Wang, Tanmoy Chowdhury, Yun Li, Hejie Cui, Xuchao Zhang, Tianjiao Zhao, Amit Panalkar, Dhagash Mehta, Stefano Pasquali, Wei Cheng, Haoyu Wang, Yanchi Liu, Zhengzhang Chen, Haifeng Chen, Chris White, Quanquan Gu, Jian Pei, Carl Yang, Liang Zhao

Abstract: Large language models (LLMs) have significantly advanced the field of natural language processing (NLP), providing a highly useful, task-agnostic foundation for a wide range of applications. However, directly applying LLMs to solve sophisticated problems in specific domains meets many hurdles, caused by the heterogeneity of domain data, the sophistication of domain knowledge, the uniqueness of dom… ▽ More Large language models (LLMs) have significantly advanced the field of natural language processing (NLP), providing a highly useful, task-agnostic foundation for a wide range of applications. However, directly applying LLMs to solve sophisticated problems in specific domains meets many hurdles, caused by the heterogeneity of domain data, the sophistication of domain knowledge, the uniqueness of domain objectives, and the diversity of the constraints (e.g., various social norms, cultural conformity, religious beliefs, and ethical standards in the domain applications). Domain specification techniques are key to make large language models disruptive in many applications. Specifically, to solve these hurdles, there has been a notable increase in research and practices conducted in recent years on the domain specialization of LLMs. This emerging field of study, with its substantial potential for impact, necessitates a comprehensive and systematic review to better summarize and guide ongoing work in this area. In this article, we present a comprehensive survey on domain specification techniques for large language models, an emerging direction critical for large language model applications. First, we propose a systematic taxonomy that categorizes the LLM domain-specialization techniques based on the accessibility to LLMs and summarizes the framework for all the subcategories as well as their relations and differences to each other. Second, we present an extensive taxonomy of critical application domains that can benefit dramatically from specialized LLMs, discussing their practical significance and open challenges. Last, we offer our insights into the current research status and future trends in this area. △ Less

Submitted 29 March, 2024; v1 submitted 29 May, 2023; originally announced May 2023.

arXiv:2305.06788 [pdf, other]

Vector Quantization with Error Uniformly Distributed over an Arbitrary Set

Authors: Chih Wei Ling, Cheuk Ting Li

Abstract: For uniform scalar quantization, the error distribution is approximately a uniform distribution over an interval (which is also a 1-dimensional ball). Nevertheless, for lattice vector quantization, the error distribution is uniform not over a ball, but over the basic cell of the quantization lattice. In this paper, we construct vector quantizers with periodic properties, where the error is uniform… ▽ More For uniform scalar quantization, the error distribution is approximately a uniform distribution over an interval (which is also a 1-dimensional ball). Nevertheless, for lattice vector quantization, the error distribution is uniform not over a ball, but over the basic cell of the quantization lattice. In this paper, we construct vector quantizers with periodic properties, where the error is uniformly distributed over the n-ball, or any other prescribed set. We then prove upper and lower bounds on the entropy of the quantized signals. We also discuss how our construction can be applied to give a randomized quantization scheme with a nonuniform error distribution. △ Less

Submitted 24 January, 2024; v1 submitted 11 May, 2023; originally announced May 2023.

Comments: 22 pages, 3 figures. Short version presented at 2023 IEEE International Symposium on Information Theory

arXiv:2305.02200 [pdf, other]

Deep Graph Representation Learning and Optimization for Influence Maximization

Authors: Chen Ling, Junji Jiang, Junxiang Wang, My Thai, Lukas Xue, James Song, Meikang Qiu, Liang Zhao

Abstract: Influence maximization (IM) is formulated as selecting a set of initial users from a social network to maximize the expected number of influenced users. Researchers have made great progress in designing various traditional methods, and their theoretical design and performance gain are close to a limit. In the past few years, learning-based IM methods have emerged to achieve stronger generalization… ▽ More Influence maximization (IM) is formulated as selecting a set of initial users from a social network to maximize the expected number of influenced users. Researchers have made great progress in designing various traditional methods, and their theoretical design and performance gain are close to a limit. In the past few years, learning-based IM methods have emerged to achieve stronger generalization ability to unknown graphs than traditional ones. However, the development of learning-based IM methods is still limited by fundamental obstacles, including 1) the difficulty of effectively solving the objective function; 2) the difficulty of characterizing the diversified underlying diffusion patterns; and 3) the difficulty of adapting the solution under various node-centrality-constrained IM variants. To cope with the above challenges, we design a novel framework DeepIM to generatively characterize the latent representation of seed sets, and we propose to learn the diversified information diffusion pattern in a data-driven and end-to-end manner. Finally, we design a novel objective function to infer optimal seed sets under flexible node-centrality-based budget constraints. Extensive analyses are conducted over both synthetic and real-world datasets to demonstrate the overall performance of DeepIM. The code and data are available at: https://github.com/triplej0079/DeepIM. △ Less

Submitted 6 May, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

Comments: In Proceedings of the 40th International Conference on Machine Learning (ICML 2023), Honolulu, Hawaii, USA. PMLR 202, 2023

arXiv:2303.07099 [pdf, other]

Beyond Fish and Bicycles: Exploring the Varieties of Online Women's Ideological Spaces

Authors: Utkucan Balci, Chen Ling, Emiliano De Cristofaro, Megan Squire, Gianluca Stringhini, Jeremy Blackburn

Abstract: The Internet has been instrumental in connecting under-represented and vulnerable groups of people. Platforms built to foster social interaction and engagement have enabled historically disenfranchised groups to have a voice. One such vulnerable group is women. In this paper, we explore the diversity in online women's ideological spaces using a multi-dimensional approach. We perform a large-scale,… ▽ More The Internet has been instrumental in connecting under-represented and vulnerable groups of people. Platforms built to foster social interaction and engagement have enabled historically disenfranchised groups to have a voice. One such vulnerable group is women. In this paper, we explore the diversity in online women's ideological spaces using a multi-dimensional approach. We perform a large-scale, data-driven analysis of over 6M Reddit comments and submissions from 14 subreddits. We elicit a diverse taxonomy of online women's ideological spaces, ranging from counterparts to the so-called Manosphere to Gender-Critical Feminism. We then perform content analysis, finding meaningful differences across topics and communities. Finally, we shed light on two platforms, ovarit.com and thepinkpill.co, where two toxic communities of online women's ideological spaces (Gender-Critical Feminism and Femcels) migrated after their ban on Reddit. △ Less

Submitted 13 March, 2023; originally announced March 2023.

Journal ref: Published in the Proceedings of the 15th ACM Web Science Conference 2023 (ACM WebSci 2023). Please cite the WebSci version

arXiv:2302.12247 [pdf, other]

Quantifying & Modeling Multimodal Interactions: An Information Decomposition Framework

Authors: Paul Pu Liang, Yun Cheng, Xiang Fan, Chun Kai Ling, Suzanne Nie, Richard Chen, Zihao Deng, Nicholas Allen, Randy Auerbach, Faisal Mahmood, Ruslan Salakhutdinov, Louis-Philippe Morency

Abstract: The recent explosion of interest in multimodal applications has resulted in a wide selection of datasets and methods for representing and integrating information from different modalities. Despite these empirical advances, there remain fundamental research questions: How can we quantify the interactions that are necessary to solve a multimodal task? Subsequently, what are the most suitable multimo… ▽ More The recent explosion of interest in multimodal applications has resulted in a wide selection of datasets and methods for representing and integrating information from different modalities. Despite these empirical advances, there remain fundamental research questions: How can we quantify the interactions that are necessary to solve a multimodal task? Subsequently, what are the most suitable multimodal models to capture these interactions? To answer these questions, we propose an information-theoretic approach to quantify the degree of redundancy, uniqueness, and synergy relating input modalities with an output task. We term these three measures as the PID statistics of a multimodal distribution (or PID for short), and introduce two new estimators for these PID statistics that scale to high-dimensional distributions. To validate PID estimation, we conduct extensive experiments on both synthetic datasets where the PID is known and on large-scale multimodal benchmarks where PID estimations are compared with human annotations. Finally, we demonstrate their usefulness in (1) quantifying interactions within multimodal datasets, (2) quantifying interactions captured by multimodal models, (3) principled approaches for model selection, and (4) three real-world case studies engaging with domain experts in pathology, mood prediction, and robotic perception where our framework helps to recommend strong multimodal models for each application. △ Less

Submitted 10 December, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

Comments: NeurIPS 2023. Code available at: https://github.com/pliang279/PID

arXiv:2302.02093 [pdf]

Knowledge-enhanced Neural Machine Reasoning: A Review

Authors: Tanmoy Chowdhury, Chen Ling, Xuchao Zhang, Xujiang Zhao, Guangji Bai, Jian Pei, Haifeng Chen, Liang Zhao

Abstract: Knowledge-enhanced neural machine reasoning has garnered significant attention as a cutting-edge yet challenging research area with numerous practical applications. Over the past few years, plenty of studies have leveraged various forms of external knowledge to augment the reasoning capabilities of deep models, tackling challenges such as effective knowledge integration, implicit knowledge mining,… ▽ More Knowledge-enhanced neural machine reasoning has garnered significant attention as a cutting-edge yet challenging research area with numerous practical applications. Over the past few years, plenty of studies have leveraged various forms of external knowledge to augment the reasoning capabilities of deep models, tackling challenges such as effective knowledge integration, implicit knowledge mining, and problems of tractability and optimization. However, there is a dearth of a comprehensive technical review of the existing knowledge-enhanced reasoning techniques across the diverse range of application domains. This survey provides an in-depth examination of recent advancements in the field, introducing a novel taxonomy that categorizes existing knowledge-enhanced methods into two primary categories and four subcategories. We systematically discuss these methods and highlight their correlations, strengths, and limitations. Finally, we elucidate the current application domains and provide insight into promising prospects for future research. △ Less

Submitted 6 February, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

Comments: 8 pages, 3 figures

arXiv:2302.01516 [pdf, other]

Class Overwhelms: Mutual Conditional Blended-Target Domain Adaptation

Authors: Pengcheng Xu, Boyu Wang, Charles Ling

Abstract: Current methods of blended targets domain adaptation (BTDA) usually infer or consider domain label information but underemphasize hybrid categorical feature structures of targets, which yields limited performance, especially under the label distribution shift. We demonstrate that domain labels are not directly necessary for BTDA if categorical distributions of various domains are sufficiently alig… ▽ More Current methods of blended targets domain adaptation (BTDA) usually infer or consider domain label information but underemphasize hybrid categorical feature structures of targets, which yields limited performance, especially under the label distribution shift. We demonstrate that domain labels are not directly necessary for BTDA if categorical distributions of various domains are sufficiently aligned even facing the imbalance of domains and the label distribution shift of classes. However, we observe that the cluster assumption in BTDA does not comprehensively hold. The hybrid categorical feature space hinders the modeling of categorical distributions and the generation of reliable pseudo labels for categorical alignment. To address these, we propose a categorical domain discriminator guided by uncertainty to explicitly model and directly align categorical distributions $P(Z|Y)$. Simultaneously, we utilize the low-level features to augment the single source features with diverse target styles to rectify the biased classifier $P(Y|Z)$ among diverse targets. Such a mutual conditional alignment of $P(Z|Y)$ and $P(Y|Z)$ forms a mutual reinforced mechanism. Our approach outperforms the state-of-the-art in BTDA even compared with methods utilizing domain labels, especially under the label distribution shift, and in single target DA on DomainNet. Source codes are available at \url{https://github.com/Pengchengpcx/Class-overwhelms-Mutual-Conditional-Blended-Target-Domain-Adaptation}. △ Less

Submitted 8 March, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

Journal ref: AAAI2023 Oral

arXiv:2301.13381 [pdf, other]

When Source-Free Domain Adaptation Meets Learning with Noisy Labels

Authors: Li Yi, Gezheng Xu, Pengcheng Xu, Jiaqi Li, Ruizhi Pu, Charles Ling, A. Ian McLeod, Boyu Wang

Abstract: Recent state-of-the-art source-free domain adaptation (SFDA) methods have focused on learning meaningful cluster structures in the feature space, which have succeeded in adapting the knowledge from source domain to unlabeled target domain without accessing the private source data. However, existing methods rely on the pseudo-labels generated by source models that can be noisy due to domain shift.… ▽ More Recent state-of-the-art source-free domain adaptation (SFDA) methods have focused on learning meaningful cluster structures in the feature space, which have succeeded in adapting the knowledge from source domain to unlabeled target domain without accessing the private source data. However, existing methods rely on the pseudo-labels generated by source models that can be noisy due to domain shift. In this paper, we study SFDA from the perspective of learning with label noise (LLN). Unlike the label noise in the conventional LLN scenario, we prove that the label noise in SFDA follows a different distribution assumption. We also prove that such a difference makes existing LLN methods that rely on their distribution assumptions unable to address the label noise in SFDA. Empirical evidence suggests that only marginal improvements are achieved when applying the existing LLN methods to solve the SFDA problem. On the other hand, although there exists a fundamental difference between the label noise in the two scenarios, we demonstrate theoretically that the early-time training phenomenon (ETP), which has been previously observed in conventional label noise settings, can also be observed in the SFDA problem. Extensive experiments demonstrate significant improvements to existing SFDA algorithms by leveraging ETP to address the label noise in SFDA. △ Less

Submitted 24 February, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

Comments: ICLR 2023 camera-ready

arXiv:2301.09159 [pdf, other]

Abstracting Imperfect Information Away from Two-Player Zero-Sum Games

Authors: Samuel Sokota, Ryan D'Orazio, Chun Kai Ling, David J. Wu, J. Zico Kolter, Noam Brown

Abstract: In their seminal work, Nayyar et al. (2013) showed that imperfect information can be abstracted away from common-payoff games by having players publicly announce their policies as they play. This insight underpins sound solvers and decision-time planning algorithms for common-payoff games. Unfortunately, a naive application of the same insight to two-player zero-sum games fails because Nash equili… ▽ More In their seminal work, Nayyar et al. (2013) showed that imperfect information can be abstracted away from common-payoff games by having players publicly announce their policies as they play. This insight underpins sound solvers and decision-time planning algorithms for common-payoff games. Unfortunately, a naive application of the same insight to two-player zero-sum games fails because Nash equilibria of the game with public policy announcements may not correspond to Nash equilibria of the original game. As a consequence, existing sound decision-time planning algorithms require complicated additional mechanisms that have unappealing properties. The main contribution of this work is showing that certain regularized equilibria do not possess the aforementioned non-correspondence problem -- thus, computing them can be treated as perfect-information problems. Because these regularized equilibria can be made arbitrarily close to Nash equilibria, our result opens the door to a new perspective to solving two-player zero-sum games and yields a simplified framework for decision-time planning in two-player zero-sum games, void of the unappealing properties that plague existing decision-time planning approaches. △ Less

Submitted 31 July, 2023; v1 submitted 22 January, 2023; originally announced January 2023.

arXiv:2301.07845 [pdf, other]

Foresee What You Will Learn: Data Augmentation for Domain Generalization in Non-stationary Environment

Authors: Qiuhao Zeng, Wei Wang, Fan Zhou, Charles Ling, Boyu Wang

Abstract: Existing domain generalization aims to learn a generalizable model to perform well even on unseen domains. For many real-world machine learning applications, the data distribution often shifts gradually along domain indices. For example, a self-driving car with a vision system drives from dawn to dusk, with the sky darkening gradually. Therefore, the system must be able to adapt to changes in ambi… ▽ More Existing domain generalization aims to learn a generalizable model to perform well even on unseen domains. For many real-world machine learning applications, the data distribution often shifts gradually along domain indices. For example, a self-driving car with a vision system drives from dawn to dusk, with the sky darkening gradually. Therefore, the system must be able to adapt to changes in ambient illumination and continue to drive safely on the road. In this paper, we formulate such problems as Evolving Domain Generalization, where a model aims to generalize well on a target domain by discovering and leveraging the evolving pattern of the environment. We then propose Directional Domain Augmentation (DDA), which simulates the unseen target features by mapping source data as augmentations through a domain transformer. Specifically, we formulate DDA as a bi-level optimization problem and solve it through a novel meta-learning approach in the representation space. We evaluate the proposed method on both synthetic datasets and realworld datasets, and empirical results show that our approach can outperform other existing methods. △ Less

Submitted 8 March, 2023; v1 submitted 18 January, 2023; originally announced January 2023.

Comments: 12 pages, 6 figures, accepted by AAAI 2023

Showing 1–50 of 184 results for author: Ling, C