-
Mitigating Entity-Level Hallucination in Large Language Models
Authors:
Weihang Su,
Yichen Tang,
Qingyao Ai,
Changyue Wang,
Zhijing Wu,
Yiqun Liu
Abstract:
The emergence of Large Language Models (LLMs) has revolutionized how users access information, shifting from traditional search engines to direct question-and-answer interactions with LLMs. However, the widespread adoption of LLMs has revealed a significant challenge known as hallucination, wherein LLMs generate coherent yet factually inaccurate responses. This hallucination phenomenon has led to users' distrust in information retrieval systems based on LLMs. To tackle this challenge, this paper proposes Dynamic Retrieval Augmentation based on hallucination Detection (DRAD) as a novel method to detect and mitigate hallucinations in LLMs. DRAD improves upon traditional retrieval augmentation by dynamically adapting the retrieval process based on real-time hallucination detection. It features two main components: Real-time Hallucination Detection (RHD) for identifying potential hallucinations without external models, and Self-correction based on External Knowledge (SEK) for correcting these errors using external knowledge. Experiment results show that DRAD demonstrates superior performance in both detecting and mitigating hallucinations in LLMs. All of our code and data are open-sourced at https://github.com/oneal2000/EntityHallucination.
Submitted 12 July, 2024;
originally announced July 2024.
-
Redefinition of Digital Twin and its Situation Awareness Framework Designing Towards Fourth Paradigm for Energy Internet of Things
Authors:
Xing He,
Yuezhong Tang,
Shuyan Ma,
Qian Ai,
Fei Tao,
Robert Qiu
Abstract:
Traditional knowledge-based situation awareness (SA) modes struggle to adapt to the escalating complexity of today's Energy Internet of Things (EIoT), necessitating a pivotal paradigm shift. In response, this work introduces a pioneering data-driven SA framework, termed digital twin-based situation awareness (DT-SA), aiming to bridge existing gaps between data and demands, and further to enhance SA capabilities within the complex EIoT landscape. First, we redefine the concept of digital twin (DT) within the EIoT context, aligning it with data-intensive scientific discovery paradigm (the Fourth Paradigm) so as to waken EIoT's sleeping data; this contextual redefinition lays the cornerstone of our DT-SA framework for EIoT. Then, the framework is comprehensively explored through its four fundamental steps: digitalization, simulation, informatization, and intellectualization. These steps initiate a virtual ecosystem conducive to a continuously self-adaptive, self-learning, and self-evolving big model (BM), further contributing to the evolution and effectiveness of DT-SA in engineering. Our framework is characterized by the incorporation of system theory and Fourth Paradigm as guiding ideologies, DT as data engine, and BM as intelligence engine. This unique combination forms the backbone of our approach. This work extends beyond engineering, stepping into the domain of data science -- DT-SA not only enhances management practices for EIoT users/operators, but also propels advancements in pattern analysis and machine intelligence (PAMI) within the intricate fabric of a complex system. Numerous real-world cases validate our DT-SA framework.
Submitted 11 July, 2024;
originally announced July 2024.
-
Prompt Refinement with Image Pivot for Text-to-Image Generation
Authors:
Jingtao Zhan,
Qingyao Ai,
Yiqun Liu,
Yingwei Pan,
Ting Yao,
Jiaxin Mao,
Shaoping Ma,
Tao Mei
Abstract:
For text-to-image generation, automatically refining user-provided natural language prompts into the keyword-enriched prompts favored by systems is essential for the user experience. Such a prompt refinement process is analogous to translating the prompt from "user languages" into "system languages". However, the scarcity of such parallel corpora makes it difficult to train a prompt refinement model. Inspired by zero-shot machine translation techniques, we introduce Prompt Refinement with Image Pivot (PRIP). PRIP innovatively uses the latent representation of a user-preferred image as an intermediary "pivot" between the user and system languages. It decomposes the refinement process into two data-rich tasks: inferring representations of user-preferred images from user languages and subsequently translating image representations into system languages. Thus, it can leverage abundant data for training. Extensive experiments show that PRIP substantially outperforms a wide range of baselines and effectively transfers to unseen systems in a zero-shot manner.
Submitted 28 June, 2024;
originally announced July 2024.
-
STARD: A Chinese Statute Retrieval Dataset with Real Queries Issued by Non-professionals
Authors:
Weihang Su,
Yiran Hu,
Anzhe Xie,
Qingyao Ai,
Zibing Que,
Ning Zheng,
Yun Liu,
Weixing Shen,
Yiqun Liu
Abstract:
Statute retrieval aims to find relevant statutory articles for specific queries. This process is the basis of a wide range of legal applications such as legal advice, automated judicial decisions, legal document drafting, etc. Existing statute retrieval benchmarks focus on formal and professional queries from sources like bar exams and legal case documents, thereby neglecting non-professional queries from the general public, which often lack precise legal terminology and references. To address this gap, we introduce the STAtute Retrieval Dataset (STARD), a Chinese dataset comprising 1,543 query cases collected from real-world legal consultations and 55,348 candidate statutory articles. Unlike existing statute retrieval datasets, which primarily focus on professional legal queries, STARD captures the complexity and diversity of real queries from the general public. Through a comprehensive evaluation of various retrieval baselines, we reveal that existing retrieval approaches all fall short of these real queries issued by non-professional users. The best method only achieves a Recall@100 of 0.907, suggesting the necessity for further exploration and additional research in this area.
All code and datasets are available at: https://github.com/oneal2000/STARD/tree/main
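The Recall@100 figure quoted in the abstract is a standard ranking metric. As a minimal sketch, assuming recall is computed per query and then averaged (the abstract does not state the aggregation), it could be computed as follows. The function names and the toy IDs are hypothetical and are not taken from the STARD repository.

```python
def recall_at_k(ranked_ids, relevant_ids, k=100):
    """Fraction of the relevant items that appear in the top-k of one ranking."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must be non-empty")
    hits = relevant.intersection(ranked_ids[:k])
    return len(hits) / len(relevant)


def mean_recall_at_k(runs, k=100):
    """Average recall@k over (ranked_ids, relevant_ids) pairs, one pair per query.

    Averaging per query is an assumption made for this sketch.
    """
    return sum(recall_at_k(ranked, gold, k) for ranked, gold in runs) / len(runs)


# Toy data for illustration only (not drawn from STARD).
toy_runs = [
    (["a3", "a7", "a1", "a9"], ["a1", "a2"]),  # one of two relevant articles in the top 3
    (["b5", "b2", "b8", "b4"], ["b2"]),        # the single relevant article is in the top 3
]
print(mean_recall_at_k(toy_runs, k=3))
```

With `k=3` the first toy query scores 0.5 and the second 1.0, so the mean is 0.75.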
Submitted 21 June, 2024;
originally announced June 2024.
-
EEG-ImageNet: An Electroencephalogram Dataset and Benchmarks with Image Visual Stimuli of Multi-Granularity Labels
Authors:
Shuqi Zhu,
Ziyi Ye,
Qingyao Ai,
Yiqun Liu
Abstract:
Identifying and reconstructing what we see from brain activity gives us a special insight into investigating how the biological visual system represents the world. While recent efforts have achieved high-performance image classification and high-quality image reconstruction from brain signals collected by Functional Magnetic Resonance Imaging (fMRI) or magnetoencephalogram (MEG), the cost and bulk of these devices make such methods difficult to carry over to practical applications. On the other hand, Electroencephalography (EEG), despite its advantages of ease of use, cost-efficiency, high temporal resolution, and non-invasive nature, has not been fully explored in relevant studies due to the lack of comprehensive datasets. To address this gap, we introduce EEG-ImageNet, a novel EEG dataset comprising recordings from 16 subjects exposed to 4000 images selected from the ImageNet dataset. EEG-ImageNet contains five times as many EEG-image pairs as existing similar EEG benchmarks. EEG-ImageNet is collected with image stimuli of multi-granularity labels, i.e., 40 images with coarse-grained labels and 40 with fine-grained labels. Based on it, we establish benchmarks for object classification and image reconstruction. Experiments with several commonly used models show that the best models can achieve object classification with accuracy around 60% and image reconstruction with two-way identification around 64%. These results demonstrate the dataset's potential to advance EEG-based visual brain-computer interfaces, understand the visual perception of biological systems, and provide potential applications in improving machine visual models.
Submitted 11 June, 2024;
originally announced June 2024.
-
Anomalously reduced homogeneous broadening of two-dimensional electronic spectroscopy at high temperature by detailed balance
Authors:
Ru-Qiong Deng,
Cheng-Ge Liu,
Yi-Xuan Yao,
Jing-Yi-Ran Jin,
Hao-Yue Zhang,
Yin Song,
Qing Ai
Abstract:
Dissipation and decoherence of quantum systems in thermal environments are important to various spectroscopies. It is generally believed that dissipation can broaden the line shape of spectroscopies, and thus stronger system-bath interaction can result in more significant homogeneous broadening of two-dimensional electronic spectroscopy (2DES). Here we show that the case can be the opposite in the regime of electromagnetically induced transparency (EIT). We predict that assisted by EIT, the homogeneous broadening of the 2DES at a higher temperature can be significantly reduced due to the detailed balance. This anomalous effect is due to the long-lasting off-diagonal peaks in 2DES.
Submitted 3 May, 2024;
originally announced May 2024.
-
Investigating the Robustness of Counterfactual Learning to Rank Models: A Reproducibility Study
Authors:
Zechun Niu,
Jiaxin Mao,
Qingyao Ai,
Ji-Rong Wen
Abstract:
Counterfactual learning to rank (CLTR) has attracted extensive attention in the IR community for its ability to leverage massive logged user interaction data to train ranking models. While the CLTR models can be theoretically unbiased when the user behavior assumption is correct and the propensity estimation is accurate, their effectiveness is usually empirically evaluated via simulation-based experiments due to a lack of widely-available, large-scale, real click logs. However, the mainstream simulation-based experiments are somewhat limited as they often feature a single, deterministic production ranker and simplified user simulation models to generate the synthetic click logs. As a result, the robustness of CLTR models in complex and diverse situations is largely unknown and needs further investigation.
To address this problem, in this paper, we aim to investigate the robustness of existing CLTR models in a reproducibility study with extensive simulation-based experiments that (1) use both deterministic and stochastic production rankers, each with different ranking performance, and (2) leverage multiple user simulation models with different user behavior assumptions. We find that the DLA models and IPS-DCM show better robustness under various simulation settings than IPS-PBM and PRS with offline propensity estimation. Besides, the existing CLTR models often fail to outperform the naive click baselines when the production ranker has relatively high ranking performance or certain randomness, which suggests an urgent need for developing new CLTR algorithms that work for these settings.
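The "IPS" in the model names above refers to inverse propensity scoring, and "PBM" to a position-based user model in which a click happens only if the position is examined and the document is relevant. As a minimal sketch of that general idea, and not of the study's own code, the snippet below reweights logged clicks by the examination probability of the rank where they occurred. The function name, the log layout, and the propensity values are all hypothetical.

```python
def ips_click_estimates(click_log, propensity):
    """Estimate per-document relevance from logged clicks with inverse propensity scoring.

    click_log:  iterable of (doc_id, rank, clicked) with clicked in {0, 1}
    propensity: dict mapping rank -> assumed probability that the rank is examined
    """
    weighted_clicks = {}
    impressions = {}
    for doc_id, rank, clicked in click_log:
        # A click at a rarely examined rank counts for more than one at rank 1.
        weighted_clicks[doc_id] = weighted_clicks.get(doc_id, 0.0) + clicked / propensity[rank]
        impressions[doc_id] = impressions.get(doc_id, 0) + 1
    return {d: weighted_clicks[d] / impressions[d] for d in weighted_clicks}


# Synthetic log for illustration: rank 2 is assumed to be examined half as often as rank 1.
toy_propensity = {1: 1.0, 2: 0.5}
toy_log = [
    ("x", 2, 1), ("x", 2, 0), ("x", 2, 1), ("x", 2, 0),  # raw click rate 0.5 at rank 2
    ("y", 1, 1), ("y", 1, 0), ("y", 1, 1), ("y", 1, 0),  # raw click rate 0.5 at rank 1
]
print(ips_click_estimates(toy_log, toy_propensity))
```

Both toy documents have the same raw click rate, but the reweighted estimate for "x" is twice that of "y" because "x" was only ever shown at the less-examined rank. The estimate is only as good as the propensities fed into it, which is why the abstract stresses accurate propensity estimation and a correct user behavior assumption.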
Submitted 4 April, 2024;
originally announced April 2024.
-
EEG-SVRec: An EEG Dataset with User Multidimensional Affective Engagement Labels in Short Video Recommendation
Authors:
Shaorun Zhang,
Zhiyu He,
Ziyi Ye,
Peijie Sun,
Qingyao Ai,
Min Zhang,
Yiqun Liu
Abstract:
In recent years, short video platforms have gained widespread popularity, making the quality of video recommendations crucial for retaining users. Existing recommendation systems primarily rely on behavioral data, which faces limitations when inferring user preferences due to issues such as data sparsity and noise from accidental interactions or personal habits. To address these challenges and provide a more comprehensive understanding of user affective experience and cognitive activity, we propose EEG-SVRec, the first EEG dataset with User Multidimensional Affective Engagement Labels in Short Video Recommendation. The study involves 30 participants and collects 3,657 interactions, offering a rich dataset that can be used for a deeper exploration of user preference and cognitive activity. By incorporating self-assessment techniques and real-time, low-cost EEG signals, we offer a more detailed understanding of user affective experiences (valence, arousal, immersion, interest, visual and auditory) and the cognitive mechanisms behind their behavior. We establish benchmarks for rating prediction by the recommendation algorithm, showing significant improvement with the inclusion of EEG signals. Furthermore, we demonstrate the potential of this dataset in gaining insights into the affective experience and cognitive activity behind user behaviors in recommender systems. This work presents a novel perspective for enhancing short video recommendation by leveraging the rich information contained in EEG signals and multidimensional affective engagement scores, paving the way for future research in short video recommendation systems.
Submitted 1 April, 2024;
originally announced April 2024.
-
Towards an In-Depth Comprehension of Case Relevance for Better Legal Retrieval
Authors:
Haitao Li,
You Chen,
Zhekai Ge,
Qingyao Ai,
Yiqun Liu,
Quan Zhou,
Shuai Huo
Abstract:
Legal retrieval techniques play an important role in preserving the fairness and equality of the judicial system. As a well-known annual international competition, COLIEE aims to advance the development of state-of-the-art retrieval models for legal texts. This paper elaborates on the methodology employed by the TQM team in COLIEE2024. Specifically, we explored various lexical matching and semantic retrieval models, with a focus on enhancing the understanding of case relevance. Additionally, we endeavor to integrate various features using the learning-to-rank technique. Furthermore, fine heuristic pre-processing and post-processing methods have been proposed to mitigate irrelevant information. Consequently, our methodology achieved remarkable performance in COLIEE2024, securing first place in Task 1 and third place in Task 3. We anticipate that our proposed approach can contribute valuable insights to the advancement of legal retrieval technology.
Submitted 1 April, 2024;
originally announced April 2024.
-
Capability-aware Prompt Reformulation Learning for Text-to-Image Generation
Authors:
Jingtao Zhan,
Qingyao Ai,
Yiqun Liu,
Jia Chen,
Shaoping Ma
Abstract:
Text-to-image generation systems have emerged as revolutionary tools in the realm of artistic creation, offering unprecedented ease in transforming textual prompts into visual art. However, the efficacy of these systems is intricately linked to the quality of user-provided prompts, which often poses a challenge to users unfamiliar with prompt crafting. This paper addresses this challenge by leveraging user reformulation data from interaction logs to develop an automatic prompt reformulation model. Our in-depth analysis of these logs reveals that user prompt reformulation is heavily dependent on the individual user's capability, resulting in significant variance in the quality of reformulation pairs. To effectively use this data for training, we introduce the Capability-aware Prompt Reformulation (CAPR) framework. CAPR innovatively integrates user capability into the reformulation process through two key components: the Conditional Reformulation Model (CRM) and Configurable Capability Features (CCF). CRM reformulates prompts according to a specified user capability, as represented by CCF. The CCF, in turn, offers the flexibility to tune and guide the CRM's behavior. This enables CAPR to effectively learn diverse reformulation strategies across various user capacities and to simulate high-capability user reformulation during inference. Extensive experiments on standard text-to-image generation benchmarks showcase CAPR's superior performance over existing baselines and its remarkable robustness on unseen systems. Furthermore, comprehensive analyses validate the effectiveness of different components. CAPR can facilitate user-friendly interaction with text-to-image systems and make advanced artistic creation more achievable for a broader range of users.
Submitted 27 March, 2024;
originally announced March 2024.
-
Scaling Laws For Dense Retrieval
Authors:
Yan Fang,
Jingtao Zhan,
Qingyao Ai,
Jiaxin Mao,
Weihang Su,
Jia Chen,
Yiqun Liu
Abstract:
Scaling up neural models has yielded significant advancements in a wide array of tasks, particularly in language generation. Previous studies have found that the performance of neural models frequently adheres to predictable scaling laws, correlated with factors such as training set size and model size. This insight is invaluable, especially as large-scale experiments grow increasingly resource-intensive. Yet, such scaling law has not been fully explored in dense retrieval due to the discrete nature of retrieval metrics and complex relationships between training data and model sizes in retrieval tasks. In this study, we investigate whether the performance of dense retrieval models follows the scaling law as other neural models. We propose to use contrastive log-likelihood as the evaluation metric and conduct extensive experiments with dense retrieval models implemented with different numbers of parameters and trained with different amounts of annotated data. Results indicate that, under our settings, the performance of dense retrieval models follows a precise power-law scaling related to the model size and the number of annotations. Additionally, we examine scaling with prevalent data augmentation methods to assess the impact of annotation quality, and apply the scaling law to find the best resource allocation strategy under a budget constraint. We believe that these insights will significantly contribute to understanding the scaling effect of dense retrieval models and offer meaningful guidance for future research endeavors.
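A power-law relationship of the kind described above is a straight line in log-log space, so its parameters can be recovered with ordinary least squares on the logarithms. The sketch below assumes a pure power law of the form loss = a * size^(-b) with no constant offset; the exact functional form used in the paper is not given in the abstract, and the data points here are synthetic.

```python
import math


def fit_power_law(sizes, losses):
    """Fit loss = a * size**(-b) by least squares on (log size, log loss).

    Returns (a, b). The offset-free form is an assumption made for this sketch.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic points generated from a known law (a = 2.0, b = 0.1), not experimental results.
toy_sizes = [1e6, 1e7, 1e8, 1e9]
toy_losses = [2.0 * s ** -0.1 for s in toy_sizes]
a, b = fit_power_law(toy_sizes, toy_losses)
print(f"a = {a:.3f}, b = {b:.3f}")
```

Once fitted, such a curve can be extrapolated to estimate the loss at a model size or annotation count that has not been trained yet, which is the kind of use behind the budget-allocation analysis mentioned in the abstract.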
Submitted 15 July, 2024; v1 submitted 27 March, 2024;
originally announced March 2024.
-
DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment
Authors:
Haitao Li,
Qingyao Ai,
Xinyan Han,
Jia Chen,
Qian Dong,
Yiqun Liu,
Chong Chen,
Qi Tian
Abstract:
Recent research demonstrates the effectiveness of using pre-trained language models for legal case retrieval. Most of the existing works focus on improving the representation ability for the contextualized embedding of the [CLS] token and calculate relevance using textual semantic similarity. However, in the legal domain, textual semantic similarity does not always imply that the cases are relevant enough. Instead, relevance in legal cases primarily depends on the similarity of key facts that impact the final judgment. Without proper treatments, the discriminative ability of learned representations could be limited since legal cases are lengthy and contain numerous non-key facts. To this end, we introduce DELTA, a discriminative model designed for legal case retrieval. The basic idea involves pinpointing key facts in legal cases and pulling the contextualized embedding of the [CLS] token closer to the key facts while pushing away from the non-key facts, which can warm up the case embedding space in an unsupervised manner. To be specific, this study brings the word alignment mechanism to the contextual masked auto-encoder. First, we leverage shallow decoders to create information bottlenecks, aiming to enhance the representation ability. Second, we employ the deep decoder to enable translation between different structures, with the goal of pinpointing key facts to enhance discriminative ability. Comprehensive experiments conducted on publicly available legal benchmarks show that our approach can outperform existing state-of-the-art methods in legal case retrieval. It provides a new perspective on the in-depth understanding and processing of legal case documents.
Submitted 27 March, 2024;
originally announced March 2024.
-
BLADE: Enhancing Black-box Large Language Models with Small Domain-Specific Models
Authors:
Haitao Li,
Qingyao Ai,
Jia Chen,
Qian Dong,
Zhijing Wu,
Yiqun Liu,
Chong Chen,
Qi Tian
Abstract:
Large Language Models (LLMs) like ChatGPT and GPT-4 are versatile and capable of addressing a diverse range of tasks. However, general LLMs, which are developed on open-domain data, may lack the domain-specific knowledge essential for tasks in vertical domains, such as legal, medical, etc. To address this issue, previous approaches either conduct continuous pre-training with domain-specific data or employ retrieval augmentation to support general LLMs. Unfortunately, these strategies are either cost-intensive or unreliable in practical applications. To this end, we present a novel framework named BLADE, which enhances Black-box LArge language models with small Domain-spEcific models. BLADE consists of a black-box LLM and a small domain-specific LM. The small LM preserves domain-specific knowledge and offers specialized insights, while the general LLM contributes robust language comprehension and reasoning capabilities. Specifically, our method involves three steps: 1) pre-training the small LM with domain-specific data, 2) fine-tuning this model using knowledge instruction data, and 3) joint Bayesian optimization of the general LLM and the small LM. Extensive experiments conducted on public legal and medical benchmarks reveal that BLADE significantly outperforms existing approaches. This shows the potential of BLADE as an effective and cost-efficient solution in adapting general LLMs for vertical domains.
Submitted 27 March, 2024;
originally announced March 2024.
-
Sequential Recommendation with Latent Relations based on Large Language Model
Authors:
Shenghao Yang,
Weizhi Ma,
Peijie Sun,
Qingyao Ai,
Yiqun Liu,
Mingchen Cai,
Min Zhang
Abstract:
Sequential recommender systems predict items that may interest users by modeling their preferences based on historical interactions. Traditional sequential recommendation methods rely on capturing implicit collaborative filtering signals among items. Recent relation-aware sequential recommendation models have achieved promising performance by explicitly incorporating item relations into the modeling of user historical sequences, where most relations are extracted from knowledge graphs. However, existing methods rely on manually predefined relations and suffer from the sparsity issue, limiting the generalization ability in diverse scenarios with varied item relations. In this paper, we propose a novel relation-aware sequential recommendation framework with Latent Relation Discovery (LRD). Different from previous relation-aware models that rely on predefined rules, we propose to leverage the Large Language Model (LLM) to provide new types of relations and connections between items. The motivation is that the LLM contains abundant world knowledge, which can be adopted to mine latent relations of items for recommendation. Specifically, inspired by the fact that humans can describe relations between items using natural language, LRD harnesses the LLM that has demonstrated human-like knowledge to obtain language knowledge representations of items. These representations are fed into a latent relation discovery module based on the discrete state variational autoencoder (DVAE). Then the self-supervised relation discovery tasks and recommendation tasks are jointly optimized. Experimental results on multiple public datasets demonstrate that our proposed latent relation discovery method can be incorporated into existing relation-aware sequential recommendation models and significantly improve their performance. Further analysis experiments indicate the effectiveness and reliability of the discovered latent relations.
Submitted 27 March, 2024;
originally announced March 2024.
-
Common Sense Enhanced Knowledge-based Recommendation with Large Language Model
Authors:
Shenghao Yang,
Weizhi Ma,
Peijie Sun,
Min Zhang,
Qingyao Ai,
Yiqun Liu,
Mingchen Cai
Abstract:
Knowledge-based recommendation models effectively alleviate the data sparsity issue by leveraging the side information in the knowledge graph, and have achieved considerable performance. Nevertheless, the knowledge graphs used in previous work, namely metadata-based knowledge graphs, are usually constructed based on the attributes of items and co-occurring relations (e.g., also buy), in which the former provides limited information and the latter relies on sufficient interaction data and still suffers from the cold-start issue. Common sense, as a form of knowledge with generality and universality, can be used as a supplement to the metadata-based knowledge graph and provides a new perspective for modeling users' preferences. Recently, benefiting from the emergent world knowledge of the large language model, efficient acquisition of common sense has become possible. In this paper, we propose a novel knowledge-based recommendation framework incorporating common sense, CSRec, which can be flexibly coupled to existing knowledge-based methods. Considering the challenge of the knowledge gap between the common sense-based knowledge graph and the metadata-based knowledge graph, we propose a knowledge fusion approach based on mutual information maximization theory. Experimental results on public datasets demonstrate that our approach significantly improves the performance of existing knowledge-based recommendation models.
Submitted 27 March, 2024;
originally announced March 2024.
-
A Situation-aware Enhancer for Personalized Recommendation
Authors:
Jiayu Li,
Peijie Sun,
Chumeng Jiang,
Weizhi Ma,
Qingyao Ai,
Min Zhang
Abstract:
When users interact with Recommender Systems (RecSys), current situations, such as time, location, and environment, significantly influence their preferences. Situations serve as the background for interactions, where relationships between users and items evolve with situation changes. However, existing RecSys treat situations, users, and items on the same level. They can only model the relations between situations and users/items respectively, rather than the dynamic impact of situations on user-item associations (i.e., user preferences). In this paper, we provide a new perspective that takes situations as the preconditions for users' interactions. This perspective allows us to separate situations from user/item representations, and capture situations' influences over the user-item relationship, offering a more comprehensive understanding of situations. Based on it, we propose a novel Situation-Aware Recommender Enhancer (SARE), a pluggable module to integrate situations into various existing RecSys. Since users' perception of situations and situations' impact on preferences are both personalized, SARE includes a Personalized Situation Fusion (PSF) and a User-Conditioned Preference Encoder (UCPE) to model the perception and impact of situations, respectively. We conduct experiments of applying SARE on seven backbones in various settings on two real-world datasets. Experimental results indicate that SARE improves the recommendation performances significantly compared with backbones and SOTA situation-aware baselines.
Submitted 27 March, 2024;
originally announced March 2024.
-
Improving Legal Case Retrieval with Brain Signals
Authors:
Ruizhe Zhang,
Qingyao Ai,
Ziyi Ye,
Yueyue Wu,
Xiaohui Xie,
Yiqun Liu
Abstract:
The tasks of legal case retrieval have received growing attention from the IR community in the last decade. Relevance feedback techniques with implicit user feedback (e.g., clicks) have been demonstrated to be effective in traditional search tasks (e.g., Web search). In legal case retrieval, however, collecting relevance feedback faces a couple of challenges that are difficult to resolve under existing feedback paradigms. First, legal case retrieval is a complex task as users often need to understand the relationship between legal cases in detail to correctly judge their relevance. Traditional feedback signals such as clicks are too coarse to use as they do not reflect any fine-grained relevance information. Second, legal case documents are usually long; users often need tens of minutes to read and understand them. Simple behavior signals such as clicks and eye-tracking fixations can hardly be useful when users click and examine almost every part of the document. In this paper, we explore the possibility of solving the feedback problem in legal case retrieval with brain signals. Recent advances in brain signal processing have shown that human emotions can be collected at a fine granularity through Brain-Machine Interfaces (BMI) without interrupting the users in their tasks. Therefore, we propose a framework for legal case retrieval that uses EEG signals to optimize retrieval results. We collect and create a legal case retrieval dataset with users' EEG signals and propose several methods to extract effective EEG features for relevance feedback. Our proposed features achieve a 71% accuracy for feedback prediction with an SVM-RFE model, and our proposed ranking method that takes into account the diverse needs of users can significantly improve user satisfaction for legal case retrieval. Experimental results show that the re-ranked result lists make users more satisfied.
Submitted 19 March, 2024;
originally announced March 2024.
-
Evaluation Ethics of LLMs in Legal Domain
Authors:
Ruizhe Zhang,
Haitao Li,
Yueyue Wu,
Qingyao Ai,
Yiqun Liu,
Min Zhang,
Shaoping Ma
Abstract:
In recent years, the utilization of large language models for natural language dialogue has gained momentum, leading to their widespread adoption across various domains. However, their universal competence in addressing challenges specific to specialized fields such as law remains a subject of scrutiny. The incorporation of legal ethics into the model has been overlooked by researchers. We assert that rigorous ethics evaluation is essential to ensure the effective integration of large language models in legal domains, emphasizing the need to assess domain-specific proficiency and domain-specific ethics. To address this, we propose a novel evaluation methodology, utilizing authentic legal cases to evaluate the fundamental language abilities, specialized legal knowledge and legal robustness of large language models (LLMs). The findings from our comprehensive evaluation contribute significantly to the academic discourse surrounding the suitability and performance of large language models in legal domains.
Submitted 17 March, 2024;
originally announced March 2024.
-
DRAGIN: Dynamic Retrieval Augmented Generation based on the Information Needs of Large Language Models
Authors:
Weihang Su,
Yichen Tang,
Qingyao Ai,
Zhijing Wu,
Yiqun Liu
Abstract:
The dynamic retrieval augmented generation (RAG) paradigm actively decides when and what to retrieve during the text generation process of Large Language Models (LLMs). There are two key elements of this paradigm: identifying the optimal moment to activate the retrieval module (deciding when to retrieve) and crafting the appropriate query once retrieval is triggered (determining what to retrieve). However, current dynamic RAG methods fall short in both aspects. Firstly, the strategies for deciding when to retrieve often rely on static rules. Moreover, the strategies for deciding what to retrieve typically limit themselves to the LLM's most recent sentence or the last few tokens, while the LLM's real-time information needs may span across the entire context. To overcome these limitations, we introduce a new framework, DRAGIN, i.e., Dynamic Retrieval Augmented Generation based on the real-time Information Needs of LLMs. Our framework is specifically designed to make decisions on when and what to retrieve based on the LLM's real-time information needs during the text generation process. We evaluate DRAGIN along with existing methods comprehensively over 4 knowledge-intensive generation datasets. Experimental results show that DRAGIN achieves superior performance on all tasks, demonstrating the effectiveness of our method. We have open-sourced all the code, data, and models on GitHub: https://github.com/oneal2000/DRAGIN/tree/main
Submitted 5 June, 2024; v1 submitted 15 March, 2024;
originally announced March 2024.
-
Unsupervised Real-Time Hallucination Detection based on the Internal States of Large Language Models
Authors:
Weihang Su,
Changyue Wang,
Qingyao Ai,
Yiran HU,
Zhijing Wu,
Yujia Zhou,
Yiqun Liu
Abstract:
Hallucinations in large language models (LLMs) refer to the phenomenon of LLMs producing responses that are coherent yet factually inaccurate. This issue undermines the effectiveness of LLMs in practical applications, necessitating research into detecting and mitigating hallucinations of LLMs. Previous studies have mainly concentrated on post-processing techniques for hallucination detection, which tend to be computationally intensive and limited in effectiveness due to their separation from the LLM's inference process. To overcome these limitations, we introduce MIND, an unsupervised training framework that leverages the internal states of LLMs for real-time hallucination detection without requiring manual annotations. Additionally, we present HELM, a new benchmark for evaluating hallucination detection across multiple LLMs, featuring diverse LLM outputs and the internal states of LLMs during their inference process. Our experiments demonstrate that MIND outperforms existing state-of-the-art methods in hallucination detection.
Submitted 10 June, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
Exploring the Impact of Opinion Polarization on Short Video Consumption
Authors:
Bangde Du,
Ziyi Ye,
Zhijing Wu,
Qingyao Ai,
Yiqun Liu
Abstract:
Investigating the increasingly popular domain of short video consumption, this study focuses on the impact of Opinion Polarization (OP), a significant factor in the digital landscape influencing public opinions and social interactions. We analyze OP's effect on viewers' perceptions and behaviors, finding that traditional feedback metrics like likes and watch time fail to fully capture and measure OP. Addressing this gap, our research utilizes Electroencephalogram (EEG) signals to introduce a novel, non-invasive approach for evaluating neural responses to OP, affecting perception and cognition. Empirical analysis reveals OP's considerable impact on viewers' emotions, evidenced by changes in brain activity. Our findings also highlight the potential of EEG data in predicting exposure to polarized short video content, offering a new perspective on the dynamics of short video consumption and a unique method for quantifying OP's effects.
Submitted 6 March, 2024;
originally announced March 2024.
-
Gender Biased Legal Case Retrieval System on Users' Decision Process
Authors:
Ruizhe Zhang,
Qingyao Ai,
Yiqun Liu,
Yueyue Wu,
Beining Wang
Abstract:
In the last decade, legal case search has become an important part of a legal practitioner's work. During legal case search, search engines retrieve a number of relevant cases from huge amounts of data and serve them to users. However, it is uncertain whether these cases are gender-biased and whether such bias has an impact on user perceptions. We designed a new user experiment framework to simulate the judges' reading of relevant cases. 72 participants with backgrounds in legal affairs were invited to conduct the experiment. Participants were asked to simulate the role of the judge in conducting a legal case search on 3 assigned cases and determine the sentences of the defendants in these cases. Gender of the defendants in both the task and relevant cases was edited to statistically measure the effect of gender bias in the legal case search results on participants' perceptions. The results showed that gender bias in the legal case search results did not have a significant effect on judges' perceptions.
Submitted 25 February, 2024;
originally announced March 2024.
-
Query Augmentation by Decoding Semantics from Brain Signals
Authors:
Ziyi Ye,
Jingtao Zhan,
Qingyao Ai,
Yiqun Liu,
Maarten de Rijke,
Christina Lioma,
Tuukka Ruotsalo
Abstract:
Query augmentation is a crucial technique for refining semantically imprecise queries. Traditionally, query augmentation relies on extracting information from initially retrieved, potentially relevant documents. If the quality of the initially retrieved documents is low, then the effectiveness of query augmentation would be limited as well. We propose Brain-Aug, which enhances a query by incorporating semantic information decoded from brain signals. Brain-Aug generates the continuation of the original query with a prompt constructed with brain signal information and a ranking-oriented inference approach. Experimental results on fMRI (functional magnetic resonance imaging) datasets show that Brain-Aug produces semantically more accurate queries, leading to improved document ranking performance. Such improvement brought by brain signals is particularly notable for ambiguous queries.
Submitted 3 March, 2024; v1 submitted 23 February, 2024;
originally announced February 2024.
-
PRE: A Peer Review Based Large Language Model Evaluator
Authors:
Zhumin Chu,
Qingyao Ai,
Yiteng Tu,
Haitao Li,
Yiqun Liu
Abstract:
The impressive performance of large language models (LLMs) has attracted considerable attention from the academic and industrial communities. Besides how to construct and train LLMs, how to effectively evaluate and compare the capacity of LLMs has also been well recognized as an important yet difficult problem. Existing paradigms rely on either human annotators or model-based evaluators to evaluate the performance of LLMs on different tasks. However, these paradigms often suffer from high cost, low generalizability, and inherited biases in practice, which make them incapable of supporting the sustainable development of LLMs in the long term. In order to address these issues, inspired by the peer review systems widely used in the academic publication process, we propose a novel framework that can automatically evaluate LLMs through a peer-review process. Specifically, for the evaluation of a specific task, we first construct a small qualification exam to select "reviewers" from a couple of powerful LLMs. Then, to actually evaluate the "submissions" written by different candidate LLMs, i.e., the evaluatees, we use the reviewer LLMs to rate or compare the submissions. The final ranking of evaluatee LLMs is generated based on the results provided by all reviewers. We conducted extensive experiments on text summarization tasks with eleven LLMs including GPT-4. The results demonstrate the existence of bias when evaluating using a single LLM. Also, our PRE model outperforms all the baselines, illustrating the effectiveness of the peer review mechanism.
Submitted 3 June, 2024; v1 submitted 28 January, 2024;
originally announced January 2024.
-
Two-Dimensional Electronic Spectroscopy for Three-Level Atoms with Electromagnetically Induced Transparency
Authors:
Jing-Yi-Ran Jin,
Hao-Yue Zhang,
Yi-Xuan Yao,
Qing Ai
Abstract:
Two-dimensional electronic spectroscopy (2DES) has high spectral resolution and is a useful tool for studying atom dynamics. In this paper, we apply the electromagnetically induced transparency (EIT) technique to 2DES in a three-level atom, and find that the number of peaks (troughs) increases due to the introduction of EIT. Also, the height of the peaks (the depth of troughs) will change from constant to a damped oscillation. These findings may help us obtain more information about the dynamics of excited states.
Submitted 14 January, 2024;
originally announced January 2024.
-
Optically controllable localization of exciton polariton condensates in a potential lattice
Authors:
Qiang Ai,
Jan Wingenbach,
Xinmiao Yang,
Jing Wei,
Zaharias Hatzopoulos,
Pavlos G. Savvidis,
Stefan Schumacher,
Xuekai Ma,
Tingge Gao
Abstract:
Exciton polaritons are inherently non-Hermitian systems with adjustable gain and loss coefficients. In this work we show that exciton polariton condensates can be selectively localized in an optically-induced lattice with equal potential depth by judiciously controlling a second focused pump with a very small size. Specifically, the localized polariton condensate can be tuned among different potential traps by adjusting the relative distance between the small pump spot and the potential lattice. The adjustment of the excitation position of the smaller pump and its combination with the bigger pump for the potential creation induce a position-dependent loss distribution across the system. The localization of the exciton polariton condensate and its control are independent of the orientation of the potential lattice; thus, even in a slightly disordered system, one can selectively excite such localized polariton condensates. Our results illuminate a path to manipulate the non-Hermitian bosonic condensates in integrated photonic chips.
Submitted 7 January, 2024;
originally announced January 2024.
-
Wikiformer: Pre-training with Structured Information of Wikipedia for Ad-hoc Retrieval
Authors:
Weihang Su,
Qingyao Ai,
Xiangsheng Li,
Jia Chen,
Yiqun Liu,
Xiaolong Wu,
Shengluan Hou
Abstract:
With the development of deep learning and natural language processing techniques, pre-trained language models have been widely used to solve information retrieval (IR) problems. Benefiting from the pre-training and fine-tuning paradigm, these models achieve state-of-the-art performance. In previous works, plain texts in Wikipedia have been widely used in the pre-training stage. However, the rich structured information in Wikipedia, such as the titles, abstracts, hierarchical heading (multi-level title) structure, relationship between articles, references, hyperlink structures, and the writing organizations, has not been fully explored. In this paper, we devise four pre-training objectives tailored for IR tasks based on the structured knowledge of Wikipedia. Compared to existing pre-training methods, our approach can better capture the semantic knowledge in the training corpus by leveraging the human-edited structured data from Wikipedia. Experimental results on multiple IR benchmark datasets show the superior performance of our model in both zero-shot and fine-tuning settings compared to existing strong retrieval baselines. Besides, experimental results in biomedical and legal domains demonstrate that our approach achieves better performance in vertical domains compared to previous models, especially in scenarios where long text similarity matching is needed.
Submitted 1 January, 2024; v1 submitted 17 December, 2023;
originally announced December 2023.
-
When Graph Data Meets Multimodal: A New Paradigm for Graph Understanding and Reasoning
Authors:
Qihang Ai,
Jianwu Zhou,
Haiyun Jiang,
Lemao Liu,
Shuming Shi
Abstract:
Graph data is ubiquitous in the physical world, and it has always been a challenge to efficiently model graph structures using a unified paradigm for the understanding and reasoning on various graphs. Moreover, in the era of large language models, integrating complex graph information into text sequences has become exceptionally difficult, which hinders the ability to interact with graph data through natural language instructions. The paper presents a new paradigm for understanding and reasoning about graph data by integrating image encoding and multimodal technologies. This approach enables the comprehension of graph data through an instruction-response format, utilizing GPT-4V's advanced capabilities. The study evaluates this paradigm on various graph types, highlighting the model's strengths and weaknesses, particularly in Chinese OCR performance and complex reasoning tasks. The findings suggest new directions for enhancing graph data processing and natural language interaction.
Submitted 16 December, 2023;
originally announced December 2023.
-
Relevance Feedback with Brain Signals
Authors:
Ziyi Ye,
Xiaohui Xie,
Qingyao Ai,
Yiqun Liu,
Zhihong Wang,
Weihang Su,
Min Zhang
Abstract:
The Relevance Feedback (RF) process relies on accurate and real-time relevance estimation of feedback documents to improve retrieval performance. Since collecting explicit relevance annotations imposes an extra burden on the user, extensive studies have explored using pseudo-relevance signals and implicit feedback signals as substitutes. However, such signals are indirect indicators of relevance and suffer from complex search scenarios where user interactions are absent or biased.
Recently, advances in portable and high-precision brain-computer interface (BCI) devices have shown the possibility of monitoring users' brain activities during the search process. Brain signals can directly reflect users' psychological responses to search results and thus can act as additional and unbiased RF signals. To explore the effectiveness of brain signals in the context of RF, we propose a novel RF framework that combines BCI-based relevance feedback with pseudo-relevance signals and implicit signals to improve the performance of document re-ranking. The experimental results on the user study dataset show that incorporating brain signals leads to significant performance improvement in our RF framework. Besides, we observe that brain signals perform particularly well in several hard search scenarios, especially when implicit signals as feedback are missing or noisy. This reveals when and how to exploit brain signals in the context of RF.
Submitted 9 December, 2023;
originally announced December 2023.
-
Exploration of Superposition Theorem in Spectrum Space for Composite Event Analysis in an ADN
Authors:
Xing He,
Qian Ai,
Yuezhong Tang,
Robert Qiu,
Canbing Li
Abstract:
This study presents a formulation of the Superposition Theorem (ST) in the spectrum space, tailored for the analysis of composite events in an active distribution network (ADN). Our formulated ST enables a quantitative analysis on a composite event, uncovering the property of additivity among independent atom events in the spectrum space. This contribution is a significant addition to the existing literature and has profound implications in various application scenarios. To accomplish this, we leverage random matrix theory (RMT), specifically the asymptotic empirical spectral distribution, Stieltjes transform, and R transform. These mathematical tools establish a nonlinear, model-free, and unsupervised addition operation in the spectrum space. Comprehensive details, including a related roadmap, theorems, deductions, and proofs, are provided in this work. Case studies, utilizing field data, validate our newly derived ST formulation by demonstrating a remarkable performance. Our ST formulation is model-free, non-linear, non-supervised, theory-guided, and uncertainty-insensitive, making it a valuable asset in the realm of composite event analysis in ADN.
Submitted 9 December, 2023;
originally announced December 2023.
-
Quantum Simulation of Bound-State-Enhanced Quantum Metrology
Authors:
Cheng-Ge Liu,
Cong-Wei Lu,
Na-Na Zhang,
Qing Ai
Abstract:
Quantum metrology explores quantum effects to improve the measurement accuracy of some physical quantities beyond the classical limit. However, due to the interaction between the system and the environment, the decoherence can significantly reduce the accuracy of the measurement. Many methods have been proposed to restore the accuracy of the measurement in the long-time limit. Recently, it has been found that the bound state can assist the error-free measurement and recover the $t^{-1}$ scaling [K. Bai, Z. Peng, H. G. Luo, and J. H. An, Phys. Rev. Lett. 123, 040402 (2019)]. Here, by using $N$-qubits, we propose a method to simulate the open quantum dynamics of the hybrid system including one atom and coupled resonators. We find that the error of the measurement can vanish as the time increases due to the existence of the bound state. By both analytical and numerical simulations, we prove the $t^{-1}$ scaling of the measurement error can be recovered when there is a bound state in the hybrid system. Interestingly, we observe that there are perfect oscillations which can be used for the evaluation of the atomic transition frequency. For a finite-$N$, the duration of the perfect oscillations doubles as one more qubit is involved.
Submitted 2 May, 2024; v1 submitted 23 November, 2023;
originally announced November 2023.
-
Language Generation from Brain Recordings
Authors:
Ziyi Ye,
Qingyao Ai,
Yiqun Liu,
Maarten de Rijke,
Min Zhang,
Christina Lioma,
Tuukka Ruotsalo
Abstract:
Generating human language through non-invasive brain-computer interfaces (BCIs) has the potential to unlock many applications, such as serving disabled patients and improving communication. Currently, however, generating language via BCIs has been successful only within a classification setup for selecting pre-generated sentence continuation candidates with the most likely cortical semantic representation. Inspired by recent research that revealed associations between the brain and the large computational language models, we propose a generative language BCI that utilizes the capacity of a large language model (LLM) jointly with a semantic brain decoder to directly generate language from functional magnetic resonance imaging (fMRI) input. The proposed model can generate coherent language sequences aligned with the semantic content of visual or auditory language stimuli perceived, without prior knowledge of any pre-generated candidates. We compare the language generated from the presented model with a random control, a pre-generated language selection approach, and a standard LLM, which generates common coherent text solely based on the next word likelihood according to statistical language training data. The proposed model is found to generate language that is more aligned with the semantic stimulus in response to which brain input is sampled. Our findings demonstrate the potential and feasibility of employing BCIs in direct language generation.
Submitted 11 March, 2024; v1 submitted 16 November, 2023;
originally announced November 2023.
-
Collaborative planning and optimization for electric-thermal-hydrogen-coupled energy systems with portfolio selection of the complete hydrogen energy chain
Authors:
Xinning Yi,
Tianguang Lu,
Yixiao Li,
Qian Ai,
Ran Hao
Abstract:
Under the global low-carbon target, the uneven spatiotemporal distribution of renewable energy resources exacerbates the uncertainty and seasonal power imbalance. Additionally, the issue of an incomplete hydrogen energy chain is widely overlooked in planning models, which hinders the complete analysis of the role of hydrogen in energy systems. Therefore, this paper proposes a high-resolution collaborative planning model for electricity-thermal-hydrogen-coupled energy systems considering both the spatiotemporal distribution characteristics of renewable energy resources and the multi-scale bottom-to-top investment strategy for the complete hydrogen energy chain. Considering the high-resolution system operation flexibility, this paper proposes a hydrogen chain-based fast clustering optimization method that can handle high-dimensional data and multi-time scale operation characteristics. The model optimizes the geographical distribution and capacity configuration of the Northeast China energy system in 2050, with hourly operational characteristics. The planning optimization covers single-energy devices, multi-energy-coupled conversion devices, and electric-hydrogen transmission networks. Last but not least, this paper thoroughly examines the optimal portfolio selection of different hydrogen technologies based on the differences in cost, flexibility, and efficiency. In the Pareto analysis, the proposed model reduces CO2 emissions by 60% with a competitive cost. This paper provides a zero-carbon pathway for multi-energy systems with a cost 4% less than the social cost of carbon of $44.6/ton, and the integration of the complete hydrogen energy chain reduces the renewable energy curtailment by 97.0%. Besides, the portfolio selection results indicate that the system favors the SOEC with the highest energy efficiency and the PEMFC with the fastest dynamic response when achieving zero-carbon emissions.
Submitted 13 November, 2023;
originally announced November 2023.
-
Caseformer: Pre-training for Legal Case Retrieval Based on Inter-Case Distinctions
Authors:
Weihang Su,
Qingyao Ai,
Yueyue Wu,
Yixiao Ma,
Haitao Li,
Yiqun Liu,
Zhijing Wu,
Min Zhang
Abstract:
Legal case retrieval aims to help legal workers find relevant cases related to their cases at hand, which is important for the guarantee of fairness and justice in legal judgments. While recent advances in neural retrieval methods have significantly improved the performance of open-domain retrieval tasks (e.g., Web search), their advantages have not been observed in legal case retrieval due to their thirst for annotated data. As annotating large-scale training data in legal domains is prohibitive due to the need for domain expertise, traditional search techniques based on lexical matching such as TF-IDF, BM25, and Query Likelihood are still prevalent in legal case retrieval systems. While previous studies have designed several pre-training methods for IR models in open-domain tasks, these methods are usually suboptimal in legal case retrieval because they cannot understand and capture the key knowledge and data structures in the legal corpus. To this end, we propose a novel pre-training framework named Caseformer that enables the pre-trained models to learn legal knowledge and domain-specific relevance information in legal case retrieval without any human-labeled data. Through three unsupervised learning tasks, Caseformer is able to capture the special language, document structure, and relevance patterns of legal case documents, making it a strong backbone for downstream legal case retrieval tasks. Experimental results show that our model has achieved state-of-the-art performance in both zero-shot and full-data fine-tuning settings. Also, experiments on both Chinese and English legal datasets demonstrate that the effectiveness of Caseformer is language-independent in legal case retrieval.
Submitted 2 January, 2024; v1 submitted 1 November, 2023;
originally announced November 2023.
-
LeCaRDv2: A Large-Scale Chinese Legal Case Retrieval Dataset
Authors:
Haitao Li,
Yunqiu Shao,
Yueyue Wu,
Qingyao Ai,
Yixiao Ma,
Yiqun Liu
Abstract:
As an important component of intelligent legal systems, legal case retrieval plays a critical role in ensuring judicial justice and fairness. However, the development of legal case retrieval technologies in the Chinese legal system is restricted by three problems in existing datasets: limited data size, narrow definitions of legal relevance, and naive candidate pooling strategies used in data sampling. To alleviate these issues, we introduce LeCaRDv2, a large-scale Legal Case Retrieval Dataset (version 2). It consists of 800 queries and 55,192 candidates extracted from 4.3 million criminal case documents. To the best of our knowledge, LeCaRDv2 is one of the largest Chinese legal case retrieval datasets, providing extensive coverage of criminal charges. Additionally, we enrich the existing relevance criteria by considering three key aspects: characterization, penalty, and procedure. These comprehensive criteria enrich the dataset and may provide a more holistic perspective. Furthermore, we propose a two-level candidate set pooling strategy that effectively identifies potential candidates for each query case. It's important to note that all cases in the dataset have been annotated by multiple legal experts specializing in criminal law. Their expertise ensures the accuracy and reliability of the annotations. We evaluate several state-of-the-art retrieval models on LeCaRDv2, demonstrating that there is still significant room for improvement in legal case retrieval. The details of LeCaRDv2 can be found at the anonymous website https://github.com/anonymous1113243/LeCaRDv2.
Submitted 26 October, 2023;
originally announced October 2023.
-
Investigating the Influence of Legal Case Retrieval Systems on Users' Decision Process
Authors:
Beining Wang,
Ruizhe Zhang,
Yueyue Wu,
Qingyao Ai,
Min Zhang,
Yiqun Liu
Abstract:
Given a specific query case, legal case retrieval systems aim to retrieve a set of case documents relevant to the case at hand. Previous studies on user behavior analysis have shown that information retrieval (IR) systems can significantly influence users' decisions by presenting results in varying orders and formats. However, whether such influence exists in legal case retrieval remains largely unknown. This study presents the first investigation into the influence of legal case retrieval systems on the decision-making process of legal users. We conducted an online user study involving more than ninety participants, and our findings suggest that the result distribution of legal case retrieval systems indeed affects users' judgements on the sentences in cases. Notably, when users are presented with biased results that involve harsher sentences, they tend to impose harsher sentences on the current case as well. This research highlights the importance of optimizing the unbiasedness of legal case retrieval systems.
Submitted 7 October, 2023;
originally announced October 2023.
-
Unsupervised Large Language Model Alignment for Information Retrieval via Contrastive Feedback
Authors:
Qian Dong,
Yiding Liu,
Qingyao Ai,
Zhijing Wu,
Haitao Li,
Yiqun Liu,
Shuaiqiang Wang,
Dawei Yin,
Shaoping Ma
Abstract:
Large language models (LLMs) have demonstrated remarkable capabilities across various research domains, including the field of Information Retrieval (IR). However, the responses generated by off-the-shelf LLMs tend to be generic, i.e., they cannot capture the distinctiveness of each document among documents with similar content. This limits the performance of LLMs in IR because finding and distinguishing relevant documents from a substantial number of similar documents is a typical problem in many IR tasks. To address this issue, we propose an unsupervised alignment method, namely Reinforcement Learning from Contrastive Feedback (RLCF), empowering LLMs to generate both high-quality and context-specific responses. Our approach constructs unsupervised contrastive feedback signals based on similar document groups, and adopts a reward function, named group-wise reciprocal rank, to optimize LLMs within a standard Proximal Policy Optimization. We conduct extensive experiments to evaluate the effectiveness of RLCF on LLMs built with different languages and parameter sizes on multiple downstream IR applications. RLCF significantly outperforms existing alignment methods, and RLCF-optimized LLMs demonstrate considerable improvement in generating responses with distinctiveness.
Submitted 26 March, 2024; v1 submitted 29 September, 2023;
originally announced September 2023.
-
GNN4EEG: A Benchmark and Toolkit for Electroencephalography Classification with Graph Neural Network
Authors:
Kaiyuan Zhang,
Ziyi Ye,
Qingyao Ai,
Xiaohui Xie,
Yiqun Liu
Abstract:
Electroencephalography (EEG) classification is a crucial task in neuroscience, neural engineering, and several commercial applications. Traditional EEG classification models, however, have often overlooked or inadequately leveraged the brain's topological information. Recognizing this shortfall, there has been a burgeoning interest in recent years in harnessing the potential of Graph Neural Networks (GNNs) to exploit the topological information by modeling features selected from each EEG channel in a graph structure. To further facilitate research in this direction, we introduce GNN4EEG, a versatile and user-friendly toolkit for GNN-based modeling of EEG signals. GNN4EEG comprises three components: (i) A large benchmark constructed with four EEG classification tasks based on EEG data collected from 123 participants. (ii) Easy-to-use implementations of various state-of-the-art GNN-based EEG classification models, e.g., DGCNN, RGNN, etc. (iii) Implementations of comprehensive experimental settings and evaluation protocols, e.g., data splitting protocols and cross-validation protocols. GNN4EEG is publicly released at https://github.com/Miracle-2001/GNN4EEG.
Submitted 27 September, 2023;
originally announced September 2023.
-
An Intent Taxonomy of Legal Case Retrieval
Authors:
Yunqiu Shao,
Haitao Li,
Yueyue Wu,
Yiqun Liu,
Qingyao Ai,
Jiaxin Mao,
Yixiao Ma,
Shaoping Ma
Abstract:
Legal case retrieval is a special Information Retrieval (IR) task focusing on legal case documents. Depending on the downstream tasks of the retrieved case documents, users' information needs in legal case retrieval could be significantly different from those in Web search and traditional ad-hoc retrieval tasks. While there are several studies that retrieve legal cases based on text similarity, the underlying search intents of legal retrieval users, as shown in this paper, are more complicated than that, yet mostly unexplored. To this end, we present a novel hierarchical intent taxonomy of legal case retrieval. It consists of five intent types categorized by three criteria, i.e., search for Particular Case(s), Characterization, Penalty, Procedure, and Interest. The taxonomy was constructed transparently and evaluated extensively through interviews, editorial user studies, and query log analysis. Through a laboratory user study, we reveal significant differences in user behavior and satisfaction under different search intents in legal case retrieval. Furthermore, we apply the proposed taxonomy to various downstream legal retrieval tasks, e.g., result ranking and satisfaction prediction, and demonstrate its effectiveness. Our work provides important insights into the understanding of user intents in legal case retrieval and potentially leads to better retrieval techniques in the legal domain, such as intent-aware ranking strategies and evaluation methodologies.
Submitted 25 July, 2023;
originally announced July 2023.
-
Universal quantum gates by nonadiabatic holonomic evolution for the surface electron
Authors:
Jun Wang,
Wan-Ting He,
Hai-Bo Wang,
Qing Ai
Abstract:
The nonadiabatic holonomic quantum computation based on the geometric phase is robust against the built-in noise and decoherence. In this work, we theoretically propose a scheme to realize nonadiabatic holonomic quantum gates in a surface electron system, which is a promising two-dimensional platform for quantum computation. The holonomic gate is realized by a three-level structure that combines the Rydberg states and spin states via an inhomogeneous magnetic field. After a cyclic evolution, the computation bases pick up different geometric phases and thus perform a geometric gate. Only the electron with spin up experiences the geometric gate, while the electron with spin down is decoupled from the state-selective driving fields. The arbitrary controlled-U gate encoded on the Rydberg states and spin states can then be realized. The fidelity of the output state exceeds 0.99 with experimentally achievable parameters.
Submitted 29 October, 2023; v1 submitted 19 July, 2023;
originally announced July 2023.
-
Information Retrieval Meets Large Language Models: A Strategic Report from Chinese IR Community
Authors:
Qingyao Ai,
Ting Bai,
Zhao Cao,
Yi Chang,
Jiawei Chen,
Zhumin Chen,
Zhiyong Cheng,
Shoubin Dong,
Zhicheng Dou,
Fuli Feng,
Shen Gao,
Jiafeng Guo,
Xiangnan He,
Yanyan Lan,
Chenliang Li,
Yiqun Liu,
Ziyu Lyu,
Weizhi Ma,
Jun Ma,
Zhaochun Ren,
Pengjie Ren,
Zhiqiang Wang,
Mingwen Wang,
Ji-Rong Wen,
Le Wu
, et al. (8 additional authors not shown)
Abstract:
The research field of Information Retrieval (IR) has evolved significantly, expanding beyond traditional search to meet diverse user information needs. Recently, Large Language Models (LLMs) have demonstrated exceptional capabilities in text understanding, generation, and knowledge inference, opening up exciting avenues for IR research. LLMs not only facilitate generative retrieval but also offer improved solutions for user understanding, model evaluation, and user-system interactions. More importantly, the synergistic relationship among IR models, LLMs, and humans forms a new technical paradigm that is more powerful for information seeking. IR models provide real-time and relevant information, LLMs contribute internal knowledge, and humans play a central role as demanders and evaluators of the reliability of information services. Nevertheless, significant challenges exist, including computational costs, credibility concerns, domain-specific limitations, and ethical considerations. To thoroughly discuss the transformative impact of LLMs on IR research, the Chinese IR community conducted a strategic workshop in April 2023, yielding valuable insights. This paper provides a summary of the workshop's outcomes, including the rethinking of IR's core values, the mutual enhancement of LLMs and IR, the proposal of a novel IR technical paradigm, and open challenges.
Submitted 26 July, 2023; v1 submitted 19 July, 2023;
originally announced July 2023.
-
Quantum metrology in complex systems and experimental verification by quantum simulation
Authors:
Qing Ai,
Yang-Yang Wang,
Jing Qiu
Abstract:
Quantum metrology based on quantum entanglement and quantum coherence improves the accuracy of measurement. In this paper, we briefly review the schemes of quantum metrology in various complex systems, including non-Markovian noise, correlated noise, and quantum critical systems. On the other hand, the booming development of quantum information allows us to utilize quantum simulation experiments to test the feasibility of various theoretical schemes and demonstrate the rich physical phenomena in complex systems, such as bound states in one-dimensional coupled cavity arrays, single-photon switches, and routers.
Submitted 4 July, 2023;
originally announced July 2023.
-
14 Examples of How LLMs Can Transform Materials Science and Chemistry: A Reflection on a Large Language Model Hackathon
Authors:
Kevin Maik Jablonka,
Qianxiang Ai,
Alexander Al-Feghali,
Shruti Badhwar,
Joshua D. Bocarsly,
Andres M Bran,
Stefan Bringuier,
L. Catherine Brinson,
Kamal Choudhary,
Defne Circi,
Sam Cox,
Wibe A. de Jong,
Matthew L. Evans,
Nicolas Gastellu,
Jerome Genzling,
María Victoria Gil,
Ankur K. Gupta,
Zhi Hong,
Alishba Imran,
Sabine Kruschwitz,
Anne Labarre,
Jakub Lála,
Tao Liu,
Steven Ma,
Sauradeep Majumdar
, et al. (28 additional authors not shown)
Abstract:
Large-language models (LLMs) such as GPT-4 caught the interest of many scientists. Recent studies suggested that these models could be useful in chemistry and materials science. To explore these possibilities, we organized a hackathon.
This article chronicles the projects built as part of this hackathon. Participants employed LLMs for various applications, including predicting properties of molecules and materials, designing novel interfaces for tools, extracting knowledge from unstructured data, and developing new educational applications.
The diverse topics and the fact that working prototypes could be generated in less than two days highlight that LLMs will profoundly impact the future of our fields. The rich collection of ideas and projects also indicates that the applications of LLMs are not limited to materials science and chemistry but offer potential benefits to a wide range of scientific disciplines.
Submitted 14 July, 2023; v1 submitted 9 June, 2023;
originally announced June 2023.
-
I^3 Retriever: Incorporating Implicit Interaction in Pre-trained Language Models for Passage Retrieval
Authors:
Qian Dong,
Yiding Liu,
Qingyao Ai,
Haitao Li,
Shuaiqiang Wang,
Yiqun Liu,
Dawei Yin,
Shaoping Ma
Abstract:
Passage retrieval is a fundamental task in many information systems, such as web search and question answering, where both efficiency and effectiveness are critical concerns. In recent years, neural retrievers based on pre-trained language models (PLMs), such as dual-encoders, have achieved huge success. Yet, studies have found that the performance of dual-encoders is often limited because they neglect the interaction information between queries and candidate passages. Therefore, various interaction paradigms have been proposed to improve the performance of vanilla dual-encoders. Particularly, recent state-of-the-art methods often introduce late-interaction during the model inference process. However, such late-interaction based methods usually incur extensive computation and storage costs on large corpora. Despite their effectiveness, the concern of efficiency and space footprint is still an important factor that limits the application of interaction-based neural retrieval models. To tackle this issue, we incorporate implicit interaction into dual-encoders, and propose I^3 retriever. In particular, our implicit interaction paradigm leverages generated pseudo-queries to simulate query-passage interaction, which is jointly optimized with the query and passage encoders in an end-to-end manner. It can be fully pre-computed and cached, and its inference process only involves a simple dot product operation of the query vector and passage vector, which makes it as efficient as the vanilla dual encoders. We conduct comprehensive experiments on MSMARCO and TREC2019 Deep Learning Datasets, demonstrating the I^3 retriever's superiority in terms of both effectiveness and efficiency. Moreover, the proposed implicit interaction is compatible with special pre-training and knowledge distillation for passage retrieval, which brings a new state-of-the-art performance.
Submitted 19 March, 2024; v1 submitted 4 June, 2023;
originally announced June 2023.
-
FARA: Future-aware Ranking Algorithm for Fairness Optimization
Authors:
Tao Yang,
Zhichao Xu,
Zhenduo Wang,
Qingyao Ai
Abstract:
Ranking systems are the key components of modern Information Retrieval (IR) applications, such as search engines and recommender systems. Besides the ranking relevance to users, the exposure fairness to item providers has also been considered an important factor in ranking optimization. Many fair ranking algorithms have been proposed to jointly optimize both ranking relevance and fairness. However, we find that most existing fair ranking methods adopt greedy algorithms that only optimize rankings for the next immediate session or request. As shown in this paper, such a myopic paradigm could limit the upper bound of ranking optimization and lead to suboptimal performance in the long term.
To this end, we propose FARA, a novel Future-Aware Ranking Algorithm for ranking relevance and fairness optimization. Instead of greedily optimizing rankings for the next immediate session, FARA plans ahead by jointly optimizing multiple ranklists together and saving them for future sessions. Specifically, FARA first uses the Taylor expansion to investigate how future ranklists will influence the overall fairness of the system. Then, based on the analysis of the Taylor expansion, FARA adopts a two-phase optimization algorithm where we first solve an optimal future exposure planning problem and then construct the optimal ranklists according to the optimal future exposure planning. Theoretically, we show that FARA is optimal for ranking relevance and fairness joint optimization. Empirically, our extensive experiments on three semi-synthesized datasets show that FARA is efficient, effective, and can deliver significantly better ranking performance compared to state-of-the-art fair ranking methods. We make our implementation public at https://github.com/Taosheng-ty/QP_fairness/.
Submitted 18 August, 2023; v1 submitted 26 May, 2023;
originally announced May 2023.
-
Mitigating Exploitation Bias in Learning to Rank with an Uncertainty-aware Empirical Bayes Approach
Authors:
Tao Yang,
Cuize Han,
Chen Luo,
Parth Gupta,
Jeff M. Phillips,
Qingyao Ai
Abstract:
Ranking is at the core of many artificial intelligence (AI) applications, including search engines, recommender systems, etc. Modern ranking systems are often constructed with learning-to-rank (LTR) models built from user behavior signals. While previous studies have demonstrated the effectiveness of using user behavior signals (e.g., clicks) as both features and labels of LTR algorithms, we argue that existing LTR algorithms that indiscriminately treat behavior and non-behavior signals in input features could lead to suboptimal performance in practice. In particular, because user behavior signals often have strong correlations with the ranking objective and can only be collected on items that have already been shown to users, directly using behavior signals in LTR could create an exploitation bias that hurts the system performance in the long run.
To address the exploitation bias, we propose EBRank, an empirical Bayes-based uncertainty-aware ranking algorithm. Specifically, to overcome exploitation bias brought by behavior features in ranking models, EBRank uses a prior model based solely on non-behavior features to get a prior estimation of relevance. In the dynamic training and serving of ranking systems, EBRank uses the observed user behaviors to update posterior relevance estimation instead of concatenating behaviors as features in ranking models. In addition, EBRank applies an uncertainty-aware exploration strategy to explore actively, collect user behaviors for empirical Bayesian modeling, and improve ranking performance. Experiments on three public datasets show that EBRank is effective, practical, and significantly outperforms state-of-the-art ranking algorithms.
Submitted 25 May, 2023;
originally announced May 2023.
-
Unconfounded Propensity Estimation for Unbiased Ranking
Authors:
Dan Luo,
Lixin Zou,
Qingyao Ai,
Zhiyu Chen,
Chenliang Li,
Dawei Yin,
Brian D. Davison
Abstract:
The goal of unbiased learning to rank (ULTR) is to leverage implicit user feedback for optimizing learning-to-rank systems. Among existing solutions, automatic ULTR algorithms that jointly learn user bias models (i.e., propensity models) with unbiased rankers have received a lot of attention due to their superior performance and low deployment cost in practice. Despite their theoretical soundness, their effectiveness is usually justified under a weak logging policy, where the ranking model can barely rank documents according to their relevance to the query. However, when the logging policy is strong, e.g., an industry-deployed ranking policy, the reported effectiveness cannot be reproduced. In this paper, we first investigate ULTR from a causal perspective and uncover a negative result: existing ULTR algorithms fail to address the issue of propensity overestimation caused by the query-document relevance confounder. Then, we propose a new learning objective based on backdoor adjustment and highlight its differences from conventional propensity models, which reveal the prevalence of propensity overestimation. On top of that, we introduce a novel propensity model called the Logging-Policy-aware Propensity (LPP) model and its distinctive two-step optimization strategy, which allows for the joint learning of LPP and ranking models within the automatic ULTR framework, and actualizes unconfounded propensity estimation for ULTR. Extensive experiments on two benchmarks demonstrate the effectiveness and generalizability of the proposed method.
Submitted 8 July, 2023; v1 submitted 16 May, 2023;
originally announced May 2023.
-
THUIR@COLIEE 2023: More Parameters and Legal Knowledge for Legal Case Entailment
Authors:
Haitao Li,
Changyue Wang,
Weihang Su,
Yueyue Wu,
Qingyao Ai,
Yiqun Liu
Abstract:
This paper describes the approach of the THUIR team at the COLIEE 2023 Legal Case Entailment task. This task requires the participant to identify a specific paragraph from a given supporting case that entails the decision for the query case. We try traditional lexical matching methods and pre-trained language models with different sizes. Furthermore, learning-to-rank methods are employed to further improve performance. However, learning-to-rank is not very robust on this task, which suggests that answer passages cannot simply be determined with information retrieval techniques. Experimental results show that more parameters and legal knowledge contribute to the legal case entailment task. Finally, we achieved third place in COLIEE 2023. The implementation of our method can be found at https://github.com/CSHaitao/THUIR-COLIEE2023.
Submitted 11 May, 2023;
originally announced May 2023.
-
THUIR@COLIEE 2023: Incorporating Structural Knowledge into Pre-trained Language Models for Legal Case Retrieval
Authors:
Haitao Li,
Weihang Su,
Changyue Wang,
Yueyue Wu,
Qingyao Ai,
Yiqun Liu
Abstract:
Legal case retrieval techniques play an essential role in modern intelligent legal systems. As a well-known annual international competition, COLIEE aims to achieve the state-of-the-art retrieval model for legal texts. This paper summarizes the approach of the championship team THUIR in COLIEE 2023. To be specific, we design structure-aware pre-trained language models to enhance the understanding of legal cases. Furthermore, we propose heuristic pre-processing and post-processing approaches to reduce the influence of irrelevant messages. In the end, learning-to-rank methods are employed to merge features with different dimensions. Experimental results demonstrate the superiority of our proposal. Official results show that our run has the best performance among all submissions. The implementation of our method can be found at https://github.com/CSHaitao/THUIR-COLIEE2023.
Submitted 11 May, 2023;
originally announced May 2023.
-
CaseEncoder: A Knowledge-enhanced Pre-trained Model for Legal Case Encoding
Authors:
Yixiao Ma,
Yueyue Wu,
Weihang Su,
Qingyao Ai,
Yiqun Liu
Abstract:
Legal case retrieval is a critical process for modern legal information systems. While recent studies have utilized pre-trained language models (PLMs) based on the general domain self-supervised pre-training paradigm to build models for legal case retrieval, there are limitations in using general domain PLMs as backbones. Specifically, these models may not fully capture the underlying legal features in legal case documents. To address this issue, we propose CaseEncoder, a legal document encoder that leverages fine-grained legal knowledge in both the data sampling and pre-training phases. In the data sampling phase, we enhance the quality of the training data by utilizing fine-grained law article information to guide the selection of positive and negative examples. In the pre-training phase, we design legal-specific pre-training tasks that align with the judging criteria of relevant legal cases. Based on these tasks, we introduce an innovative loss function called Biased Circle Loss to enhance the model's ability to recognize case relevance at a fine-grained level. Experimental results on multiple benchmarks demonstrate that CaseEncoder significantly outperforms both existing general pre-training models and legal-specific pre-training models in zero-shot legal case retrieval.
Submitted 9 May, 2023;
originally announced May 2023.