Showing 1–50 of 184 results for author: Ai, Q

  1. arXiv:2407.09417  [pdf, other]

    cs.CL cs.IR

    Mitigating Entity-Level Hallucination in Large Language Models

    Authors: Weihang Su, Yichen Tang, Qingyao Ai, Changyue Wang, Zhijing Wu, Yiqun Liu

    Abstract: The emergence of Large Language Models (LLMs) has revolutionized how users access information, shifting from traditional search engines to direct question-and-answer interactions with LLMs. However, the widespread adoption of LLMs has revealed a significant challenge known as hallucination, wherein LLMs generate coherent yet factually inaccurate responses. This hallucination phenomenon has led to… ▽ More

    Submitted 12 July, 2024; originally announced July 2024.

  2. arXiv:2407.08919  [pdf, other]

    cs.NI cs.ET eess.SP

    Redefinition of Digital Twin and its Situation Awareness Framework Designing Towards Fourth Paradigm for Energy Internet of Things

    Authors: Xing He, Yuezhong Tang, Shuyan Ma, Qian Ai, Fei Tao, Robert Qiu

    Abstract: Traditional knowledge-based situation awareness (SA) modes struggle to adapt to the escalating complexity of today's Energy Internet of Things (EIoT), necessitating a pivotal paradigm shift. In response, this work introduces a pioneering data-driven SA framework, termed digital twin-based situation awareness (DT-SA), aiming to bridge existing gaps between data and demands, and further to enhance S… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 16 pages, 15 figures. Accepted by IEEE Transactions on Systems, Man and Cybernetics: Systems

  3. arXiv:2407.00247  [pdf, other]

    cs.CV

    Prompt Refinement with Image Pivot for Text-to-Image Generation

    Authors: Jingtao Zhan, Qingyao Ai, Yiqun Liu, Yingwei Pan, Ting Yao, Jiaxin Mao, Shaoping Ma, Tao Mei

    Abstract: For text-to-image generation, automatically refining user-provided natural language prompts into the keyword-enriched prompts favored by systems is essential for the user experience. Such a prompt refinement process is analogous to translating the prompt from "user languages" into "system languages". However, the scarcity of such parallel corpora makes it difficult to train a prompt refinement mod… ▽ More

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: Accepted by ACL 2024

  4. arXiv:2406.15313  [pdf, other]

    cs.IR cs.CL

    STARD: A Chinese Statute Retrieval Dataset with Real Queries Issued by Non-professionals

    Authors: Weihang Su, Yiran Hu, Anzhe Xie, Qingyao Ai, Zibing Que, Ning Zheng, Yun Liu, Weixing Shen, Yiqun Liu

    Abstract: Statute retrieval aims to find relevant statutory articles for specific queries. This process is the basis of a wide range of legal applications such as legal advice, automated judicial decisions, legal document drafting, etc. Existing statute retrieval benchmarks focus on formal and professional queries from sources like bar exams and legal case documents, thereby neglecting non-professional quer… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  5. arXiv:2406.07151  [pdf, other]

    cs.MM cs.AI cs.IR

    EEG-ImageNet: An Electroencephalogram Dataset and Benchmarks with Image Visual Stimuli of Multi-Granularity Labels

    Authors: Shuqi Zhu, Ziyi Ye, Qingyao Ai, Yiqun Liu

    Abstract: Identifying and reconstructing what we see from brain activity gives us a special insight into investigating how the biological visual system represents the world. While recent efforts have achieved high-performance image classification and high-quality image reconstruction from brain signals collected by Functional Magnetic Resonance Imaging (fMRI) or magnetoencephalogram (MEG), the expensiveness… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  6. Anomalously reduced homogeneous broadening of two-dimensional electronic spectroscopy at high temperature by detailed balance

    Authors: Ru-Qiong Deng, Cheng-Ge Liu, Yi-Xuan Yao, Jing-Yi-Ran Jin, Hao-Yue Zhang, Yin Song, Qing Ai

    Abstract: Dissipation and decoherence of quantum systems in thermal environments is important to various spectroscopies. It is generally believed that dissipation can broaden the line shape of spectroscopies, and thus stronger system-bath interaction can result in more significant homogeneous broadening of two-dimensional electronic spectroscopy (2DES). Here we show that the case can be the opposite in the… ▽ More

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: 10 pages, 6 figures

    Journal ref: Phys. Rev. A 109, 052801 (2024)

  7. arXiv:2404.03707  [pdf, other]

    cs.LG cs.AI cs.IR

    Investigating the Robustness of Counterfactual Learning to Rank Models: A Reproducibility Study

    Authors: Zechun Niu, Jiaxin Mao, Qingyao Ai, Ji-Rong Wen

    Abstract: Counterfactual learning to rank (CLTR) has attracted extensive attention in the IR community for its ability to leverage massive logged user interaction data to train ranking models. While the CLTR models can be theoretically unbiased when the user behavior assumption is correct and the propensity estimation is accurate, their effectiveness is usually empirically evaluated via simulation-based exp… ▽ More

    Submitted 4 April, 2024; originally announced April 2024.

  8. arXiv:2404.01008  [pdf, other]

    cs.IR

    EEG-SVRec: An EEG Dataset with User Multidimensional Affective Engagement Labels in Short Video Recommendation

    Authors: Shaorun Zhang, Zhiyu He, Ziyi Ye, Peijie Sun, Qingyao Ai, Min Zhang, Yiqun Liu

    Abstract: In recent years, short video platforms have gained widespread popularity, making the quality of video recommendations crucial for retaining users. Existing recommendation systems primarily rely on behavioral data, which faces limitations when inferring user preferences due to issues such as data sparsity and noise from accidental interactions or personal habits. To address these challenges and pro… ▽ More

    Submitted 1 April, 2024; originally announced April 2024.

  9. arXiv:2404.00947  [pdf, other]

    cs.IR

    Towards an In-Depth Comprehension of Case Relevance for Better Legal Retrieval

    Authors: Haitao Li, You Chen, Zhekai Ge, Qingyao Ai, Yiqun Liu, Quan Zhou, Shuai Huo

    Abstract: Legal retrieval techniques play an important role in preserving the fairness and equality of the judicial system. As a well-known annual international competition, COLIEE aims to advance the development of state-of-the-art retrieval models for legal texts. This paper elaborates on the methodology employed by the TQM team in COLIEE2024. Specifically, we explored various lexical matching and seman… ▽ More

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 16 pages

  10. arXiv:2403.19716  [pdf, other]

    cs.CL cs.AI cs.CV cs.IR

    Capability-aware Prompt Reformulation Learning for Text-to-Image Generation

    Authors: Jingtao Zhan, Qingyao Ai, Yiqun Liu, Jia Chen, Shaoping Ma

    Abstract: Text-to-image generation systems have emerged as revolutionary tools in the realm of artistic creation, offering unprecedented ease in transforming textual prompts into visual art. However, the efficacy of these systems is intricately linked to the quality of user-provided prompts, which often poses a challenge to users unfamiliar with prompt crafting. This paper addresses this challenge by levera… ▽ More

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted at SIGIR 2024

  11. arXiv:2403.18684  [pdf, other]

    cs.IR cs.CL

    Scaling Laws For Dense Retrieval

    Authors: Yan Fang, Jingtao Zhan, Qingyao Ai, Jiaxin Mao, Weihang Su, Jia Chen, Yiqun Liu

    Abstract: Scaling up neural models has yielded significant advancements in a wide array of tasks, particularly in language generation. Previous studies have found that the performance of neural models frequently adheres to predictable scaling laws, correlated with factors such as training set size and model size. This insight is invaluable, especially as large-scale experiments grow increasingly resource-in… ▽ More

    Submitted 15 July, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted at SIGIR 2024. V2 fixes a bug in the experiments

  12. arXiv:2403.18435  [pdf, other]

    cs.IR cs.CL

    DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment

    Authors: Haitao Li, Qingyao Ai, Xinyan Han, Jia Chen, Qian Dong, Yiqun Liu, Chong Chen, Qi Tian

    Abstract: Recent research demonstrates the effectiveness of using pre-trained language models for legal case retrieval. Most of the existing works focus on improving the representation ability for the contextualized embedding of the [CLS] token and calculate relevance using textual semantic similarity. However, in the legal domain, textual semantic similarity does not always imply that the cases are relevan… ▽ More

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: 11 pages

  13. arXiv:2403.18365  [pdf, other]

    cs.CL

    BLADE: Enhancing Black-box Large Language Models with Small Domain-Specific Models

    Authors: Haitao Li, Qingyao Ai, Jia Chen, Qian Dong, Zhijing Wu, Yiqun Liu, Chong Chen, Qi Tian

    Abstract: Large Language Models (LLMs) like ChatGPT and GPT-4 are versatile and capable of addressing a diverse range of tasks. However, general LLMs, which are developed on open-domain data, may lack the domain-specific knowledge essential for tasks in vertical domains, such as legal, medical, etc. To address this issue, previous approaches either conduct continuous pre-training with domain-specific data o… ▽ More

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: 11 pages

  14. arXiv:2403.18348  [pdf, other]

    cs.IR

    Sequential Recommendation with Latent Relations based on Large Language Model

    Authors: Shenghao Yang, Weizhi Ma, Peijie Sun, Qingyao Ai, Yiqun Liu, Mingchen Cai, Min Zhang

    Abstract: Sequential recommender systems predict items that may interest users by modeling their preferences based on historical interactions. Traditional sequential recommendation methods rely on capturing implicit collaborative filtering signals among items. Recent relation-aware sequential recommendation models have achieved promising performance by explicitly incorporating item relations into the modeli… ▽ More

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted by SIGIR 2024

  15. arXiv:2403.18325  [pdf, other]

    cs.IR

    Common Sense Enhanced Knowledge-based Recommendation with Large Language Model

    Authors: Shenghao Yang, Weizhi Ma, Peijie Sun, Min Zhang, Qingyao Ai, Yiqun Liu, Mingchen Cai

    Abstract: Knowledge-based recommendation models effectively alleviate the data sparsity issue by leveraging the side information in the knowledge graph, and have achieved considerable performance. Nevertheless, the knowledge graphs used in previous work, namely metadata-based knowledge graphs, are usually constructed based on the attributes of items and co-occurring relations (e.g., also buy), in which the for… ▽ More

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted by DASFAA 2024

  16. arXiv:2403.18317  [pdf, other]

    cs.IR

    A Situation-aware Enhancer for Personalized Recommendation

    Authors: Jiayu Li, Peijie Sun, Chumeng Jiang, Weizhi Ma, Qingyao Ai, Min Zhang

    Abstract: When users interact with Recommender Systems (RecSys), current situations, such as time, location, and environment, significantly influence their preferences. Situations serve as the background for interactions, where relationships between users and items evolve with situation changes. However, existing RecSys treat situations, users, and items on the same level. They can only model the relations… ▽ More

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted at the International Conference on Database Systems for Advanced Applications (DASFAA 2024)

  17. arXiv:2403.13242  [pdf, other]

    cs.IR

    Improving Legal Case Retrieval with Brain Signals

    Authors: Ruizhe Zhang, Qingyao Ai, Ziyi Ye, Yueyue Wu, Xiaohui Xie, Yiqun Liu

    Abstract: The tasks of legal case retrieval have received growing attention from the IR community in the last decade. Relevance feedback techniques with implicit user feedback (e.g., clicks) have been demonstrated to be effective in traditional search tasks (e.g., Web search). In legal case retrieval, however, collecting relevance feedback faces a couple of challenges that are difficult to resolve under exi… ▽ More

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 11 pages, 8 figures

  18. arXiv:2403.11152  [pdf, other]

    cs.CL cs.AI

    Evaluation Ethics of LLMs in Legal Domain

    Authors: Ruizhe Zhang, Haitao Li, Yueyue Wu, Qingyao Ai, Yiqun Liu, Min Zhang, Shaoping Ma

    Abstract: In recent years, the utilization of large language models for natural language dialogue has gained momentum, leading to their widespread adoption across various domains. However, their universal competence in addressing challenges specific to specialized fields such as law remains a subject of scrutiny. The incorporation of legal ethics into the model has been overlooked by researchers. We assert… ▽ More

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: 10 pages, in processing of ACL 2024

  19. arXiv:2403.10081  [pdf, other]

    cs.CL cs.IR

    DRAGIN: Dynamic Retrieval Augmented Generation based on the Information Needs of Large Language Models

    Authors: Weihang Su, Yichen Tang, Qingyao Ai, Zhijing Wu, Yiqun Liu

    Abstract: The dynamic retrieval augmented generation (RAG) paradigm actively decides when and what to retrieve during the text generation process of Large Language Models (LLMs). There are two key elements of this paradigm: identifying the optimal moment to activate the retrieval module (deciding when to retrieve) and crafting the appropriate query once retrieval is triggered (determining what to retrieve). How… ▽ More

    Submitted 5 June, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

  20. arXiv:2403.06448  [pdf, other]

    cs.CL cs.AI

    Unsupervised Real-Time Hallucination Detection based on the Internal States of Large Language Models

    Authors: Weihang Su, Changyue Wang, Qingyao Ai, Yiran Hu, Zhijing Wu, Yujia Zhou, Yiqun Liu

    Abstract: Hallucinations in large language models (LLMs) refer to the phenomenon of LLMs producing responses that are coherent yet factually inaccurate. This issue undermines the effectiveness of LLMs in practical applications, necessitating research into detecting and mitigating hallucinations of LLMs. Previous studies have mainly concentrated on post-processing techniques for hallucination detection, whic… ▽ More

    Submitted 10 June, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

  21. arXiv:2403.04184  [pdf, other]

    cs.SI cs.CY

    Exploring the Impact of Opinion Polarization on Short Video Consumption

    Authors: Bangde Du, Ziyi Ye, Zhijing Wu, Qingyao Ai, Yiqun Liu

    Abstract: Investigating the increasingly popular domain of short video consumption, this study focuses on the impact of Opinion Polarization (OP), a significant factor in the digital landscape influencing public opinions and social interactions. We analyze OP's effect on viewers' perceptions and behaviors, finding that traditional feedback metrics like likes and watch time fail to fully capture and measure… ▽ More

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: 9 pages, 8 figures

    MSC Class: 92C55 ACM Class: H.5.2; K.4.2; J.4

  22. arXiv:2403.00814  [pdf]

    cs.IR cs.CY cs.HC

    Gender Biased Legal Case Retrieval System on Users' Decision Process

    Authors: Ruizhe Zhang, Qingyao Ai, Yiqun Liu, Yueyue Wu, Beining Wang

    Abstract: In the last decade, legal case search has become an important part of a legal practitioner's work. During legal case search, search engines retrieve a number of relevant cases from huge amounts of data and serve them to users. However, it is uncertain whether these cases are gender-biased and whether such bias has an impact on user perceptions. We designed a new user experiment framework to simulate… ▽ More

    Submitted 25 February, 2024; originally announced March 2024.

    Comments: 10 pages, in Chinese language. Accepted by CCIR 2023

  23. arXiv:2402.15708  [pdf, other]

    cs.CL cs.AI cs.IR

    Query Augmentation by Decoding Semantics from Brain Signals

    Authors: Ziyi Ye, Jingtao Zhan, Qingyao Ai, Yiqun Liu, Maarten de Rijke, Christina Lioma, Tuukka Ruotsalo

    Abstract: Query augmentation is a crucial technique for refining semantically imprecise queries. Traditionally, query augmentation relies on extracting information from initially retrieved, potentially relevant documents. If the quality of the initially retrieved documents is low, then the effectiveness of query augmentation would be limited as well. We propose Brain-Aug, which enhances a query by incorpora… ▽ More

    Submitted 3 March, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

  24. arXiv:2401.15641  [pdf, other]

    cs.IR cs.CL

    PRE: A Peer Review Based Large Language Model Evaluator

    Authors: Zhumin Chu, Qingyao Ai, Yiteng Tu, Haitao Li, Yiqun Liu

    Abstract: The impressive performance of large language models (LLMs) has attracted considerable attention from the academic and industrial communities. Besides how to construct and train LLMs, how to effectively evaluate and compare the capacity of LLMs has also been well recognized as an important yet difficult problem. Existing paradigms rely on either human annotators or model-based evaluators to evaluat… ▽ More

    Submitted 3 June, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

    Comments: 11 pages

  25. arXiv:2401.07424  [pdf, ps, other]

    quant-ph physics.atom-ph physics.optics

    Two-Dimensional Electronic Spectroscopy for Three-Level Atoms with Electromagnetically Induced Transparency

    Authors: Jing-Yi-Ran Jin, Hao-Yue Zhang, Yi-Xuan Yao, Qing Ai

    Abstract: Two-dimensional electronic spectroscopy (2DES) has high spectral resolution and is a useful tool for studying atom dynamics. In this paper, we apply the electromagnetically induced transparency (EIT) technique to 2DES in a three-level atom, and find that the number of peaks (troughs) increases with the introduction of EIT. Also, the height of the peaks (the depth of troughs) will chan… ▽ More

    Submitted 14 January, 2024; originally announced January 2024.

    Comments: 8 pages, 10 figures

  26. arXiv:2401.03625  [pdf, other]

    physics.optics cond-mat.quant-gas

    Optically controllable localization of exciton polariton condensates in a potential lattice

    Authors: Qiang Ai, Jan Wingenbach, Xinmiao Yang, Jing Wei, Zaharias Hatzopoulos, Pavlos G. Savvidis, Stefan Schumacher, Xuekai Ma, Tingge Gao

    Abstract: Exciton polaritons are inherently non-Hermitian systems with adjustable gain and loss coefficients. In this work we show that exciton polariton condensates can be selectively localized in an optically-induced lattice with equal potential depth by judiciously controlling a second focused pump with a very small size. Specifically, the localized polariton condensate can be tuned among different poten… ▽ More

    Submitted 7 January, 2024; originally announced January 2024.

  27. arXiv:2312.10661  [pdf, other]

    cs.IR cs.AI

    Wikiformer: Pre-training with Structured Information of Wikipedia for Ad-hoc Retrieval

    Authors: Weihang Su, Qingyao Ai, Xiangsheng Li, Jia Chen, Yiqun Liu, Xiaolong Wu, Shengluan Hou

    Abstract: With the development of deep learning and natural language processing techniques, pre-trained language models have been widely used to solve information retrieval (IR) problems. Benefiting from the pre-training and fine-tuning paradigm, these models achieve state-of-the-art performance. In previous works, plain texts in Wikipedia have been widely used in the pre-training stage. However, the rich s… ▽ More

    Submitted 1 January, 2024; v1 submitted 17 December, 2023; originally announced December 2023.

    Comments: Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI-24)

  28. arXiv:2312.10372  [pdf, other]

    cs.AI

    When Graph Data Meets Multimodal: A New Paradigm for Graph Understanding and Reasoning

    Authors: Qihang Ai, Jianwu Zhou, Haiyun Jiang, Lemao Liu, Shuming Shi

    Abstract: Graph data is ubiquitous in the physical world, and it has always been a challenge to efficiently model graph structures using a unified paradigm for the understanding and reasoning on various graphs. Moreover, in the era of large language models, integrating complex graph information into text sequences has become exceptionally difficult, which hinders the ability to interact with graph data thro… ▽ More

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: 15 pages, 10 figures, 9 tables

  29. arXiv:2312.05669  [pdf, other]

    cs.AI cs.IR

    Relevance Feedback with Brain Signals

    Authors: Ziyi Ye, Xiaohui Xie, Qingyao Ai, Yiqun Liu, Zhihong Wang, Weihang Su, Min Zhang

    Abstract: The Relevance Feedback (RF) process relies on accurate and real-time relevance estimation of feedback documents to improve retrieval performance. Since collecting explicit relevance annotations imposes an extra burden on the user, extensive studies have explored using pseudo-relevance signals and implicit feedback signals as substitutes. However, such signals are indirect indicators of relevance a… ▽ More

    Submitted 9 December, 2023; originally announced December 2023.

  30. Exploration of Superposition Theorem in Spectrum Space for Composite Event Analysis in an ADN

    Authors: Xing He, Qian Ai, Yuezhong Tang, Robert Qiu, Canbing Li

    Abstract: This study presents a formulation of the Superposition Theorem (ST) in the spectrum space, tailored for the analysis of composite events in an active distribution network (ADN). Our formulated ST enables a quantitative analysis on a composite event, uncovering the property of additivity among independent atom events in the spectrum space. This contribution is a significant addition to the existing… ▽ More

    Submitted 9 December, 2023; originally announced December 2023.

    Comments: 12 pages. Accepted by IEEE TPWRS

  31. Quantum Simulation of Bound-State-Enhanced Quantum Metrology

    Authors: Cheng-Ge Liu, Cong-Wei Lu, Na-Na Zhang, Qing Ai

    Abstract: Quantum metrology explores quantum effects to improve the measurement accuracy of some physical quantities beyond the classical limit. However, due to the interaction between the system and the environment, the decoherence can significantly reduce the accuracy of the measurement. Many methods have been proposed to restore the accuracy of the measurement in the long-time limit. Recently, it has bee… ▽ More

    Submitted 2 May, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

    Comments: 9 pages, 9 figures

    Journal ref: Phys. Rev. A 109, 042623 (2024)

  32. arXiv:2311.09889  [pdf, other]

    cs.CL

    Language Generation from Brain Recordings

    Authors: Ziyi Ye, Qingyao Ai, Yiqun Liu, Maarten de Rijke, Min Zhang, Christina Lioma, Tuukka Ruotsalo

    Abstract: Generating human language through non-invasive brain-computer interfaces (BCIs) has the potential to unlock many applications, such as serving disabled patients and improving communication. So far, however, generating language via BCIs has been successful only within a classification setup for selecting pre-generated sentence continuation candidates with the most likely cortical sema… ▽ More

    Submitted 11 March, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Preprint. Under Submission

  33. arXiv:2311.07891  [pdf]

    eess.SY

    Collaborative planning and optimization for electric-thermal-hydrogen-coupled energy systems with portfolio selection of the complete hydrogen energy chain

    Authors: Xinning Yi, Tianguang Lu, Yixiao Li, Qian Ai, Ran Hao

    Abstract: Under the global low-carbon target, the uneven spatiotemporal distribution of renewable energy resources exacerbates the uncertainty and seasonal power imbalance. Additionally, the issue of an incomplete hydrogen energy chain is widely overlooked in planning models, which hinders the complete analysis of the role of hydrogen in energy systems. Therefore, this paper proposes a high-resolution colla… ▽ More

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: 32 pages, 17 figures

  34. arXiv:2311.00333  [pdf, other]

    cs.IR

    Caseformer: Pre-training for Legal Case Retrieval Based on Inter-Case Distinctions

    Authors: Weihang Su, Qingyao Ai, Yueyue Wu, Yixiao Ma, Haitao Li, Yiqun Liu, Zhijing Wu, Min Zhang

    Abstract: Legal case retrieval aims to help legal workers find relevant cases related to their cases at hand, which is important for the guarantee of fairness and justice in legal judgments. While recent advances in neural retrieval methods have significantly improved the performance of open-domain retrieval tasks (e.g., Web search), their advantages have not been observed in legal case retrieval due to the… ▽ More

    Submitted 2 January, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

  35. arXiv:2310.17609  [pdf, other]

    cs.CL cs.IR

    LeCaRDv2: A Large-Scale Chinese Legal Case Retrieval Dataset

    Authors: Haitao Li, Yunqiu Shao, Yueyue Wu, Qingyao Ai, Yixiao Ma, Yiqun Liu

    Abstract: As an important component of intelligent legal systems, legal case retrieval plays a critical role in ensuring judicial justice and fairness. However, the development of legal case retrieval technologies in the Chinese legal system is restricted by three problems in existing datasets: limited data size, narrow definitions of legal relevance, and naive candidate pooling strategies used in data samp… ▽ More

    Submitted 26 October, 2023; originally announced October 2023.

  36. arXiv:2310.04735  [pdf, other]

    cs.IR

    Investigating the Influence of Legal Case Retrieval Systems on Users' Decision Process

    Authors: Beining Wang, Ruizhe Zhang, Yueyue Wu, Qingyao Ai, Min Zhang, Yiqun Liu

    Abstract: Given a specific query case, legal case retrieval systems aim to retrieve a set of case documents relevant to the case at hand. Previous studies on user behavior analysis have shown that information retrieval (IR) systems can significantly influence users' decisions by presenting results in varying orders and formats. However, whether such influence exists in legal case retrieval remains largely u… ▽ More

    Submitted 7 October, 2023; originally announced October 2023.

  37. arXiv:2309.17078  [pdf, other]

    cs.IR

    Unsupervised Large Language Model Alignment for Information Retrieval via Contrastive Feedback

    Authors: Qian Dong, Yiding Liu, Qingyao Ai, Zhijing Wu, Haitao Li, Yiqun Liu, Shuaiqiang Wang, Dawei Yin, Shaoping Ma

    Abstract: Large language models (LLMs) have demonstrated remarkable capabilities across various research domains, including the field of Information Retrieval (IR). However, the responses generated by off-the-shelf LLMs tend to be generic, i.e., cannot capture the distinctiveness of each document with similar content. This limits the performance of LLMs in IR because finding and distinguishing relevant docu… ▽ More

    Submitted 26 March, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

    Comments: Accepted by SIGIR24

  38. arXiv:2309.15515  [pdf, other]

    cs.LG cs.MM

    GNN4EEG: A Benchmark and Toolkit for Electroencephalography Classification with Graph Neural Network

    Authors: Kaiyuan Zhang, Ziyi Ye, Qingyao Ai, Xiaohui Xie, Yiqun Liu

    Abstract: Electroencephalography (EEG) classification is a crucial task in neuroscience, neural engineering, and several commercial applications. Traditional EEG classification models, however, have often overlooked or inadequately leveraged the brain's topological information. Recognizing this shortfall, there has been a burgeoning interest in recent years in harnessing the potential of Graph Neural Network… ▽ More

    Submitted 27 September, 2023; originally announced September 2023.

  39. arXiv:2307.13298  [pdf, other]

    cs.IR cs.CL

    An Intent Taxonomy of Legal Case Retrieval

    Authors: Yunqiu Shao, Haitao Li, Yueyue Wu, Yiqun Liu, Qingyao Ai, Jiaxin Mao, Yixiao Ma, Shaoping Ma

    Abstract: Legal case retrieval is a special Information Retrieval (IR) task focusing on legal case documents. Depending on the downstream tasks of the retrieved case documents, users' information needs in legal case retrieval could be significantly different from those in Web search and traditional ad-hoc retrieval tasks. While there are several studies that retrieve legal cases based on text similarity, th… ▽ More

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: 28 pages, work in process

  40. Universal quantum gates by nonadiabatic holonomic evolution for the surface electron

    Authors: Jun Wang, Wan-Ting He, Hai-Bo Wang, Qing Ai

    Abstract: The nonadiabatic holonomic quantum computation based on the geometric phase is robust against the built-in noise and decoherence. In this work, we theoretically propose a scheme to realize nonadiabatic holonomic quantum gates in a surface electron system, which is a promising two-dimensional platform for quantum computation. The holonomic gate is realized by a three-level structure that combines t… ▽ More

    Submitted 29 October, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

    Journal ref: Front. Phys. 12, 1348804 (2024)

  41. arXiv:2307.09751  [pdf, other]

    cs.IR cs.AI

    Information Retrieval Meets Large Language Models: A Strategic Report from Chinese IR Community

    Authors: Qingyao Ai, Ting Bai, Zhao Cao, Yi Chang, Jiawei Chen, Zhumin Chen, Zhiyong Cheng, Shoubin Dong, Zhicheng Dou, Fuli Feng, Shen Gao, Jiafeng Guo, Xiangnan He, Yanyan Lan, Chenliang Li, Yiqun Liu, Ziyu Lyu, Weizhi Ma, Jun Ma, Zhaochun Ren, Pengjie Ren, Zhiqiang Wang, Mingwen Wang, Ji-Rong Wen, Le Wu, et al. (8 additional authors not shown)

    Abstract: The research field of Information Retrieval (IR) has evolved significantly, expanding beyond traditional search to meet diverse user information needs. Recently, Large Language Models (LLMs) have demonstrated exceptional capabilities in text understanding, generation, and knowledge inference, opening up exciting avenues for IR research. LLMs not only facilitate generative retrieval but also offer… ▽ More

    Submitted 26 July, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

    Comments: 17 pages

  42. arXiv:2307.02005  [pdf]

    quant-ph physics.optics

    Quantum metrology in complex systems and experimental verification by quantum simulation

    Authors: Qing Ai, Yang-Yang Wang, Jing Qiu

    Abstract: Quantum metrology based on quantum entanglement and quantum coherence improves the accuracy of measurement. In this paper, we briefly review the schemes of quantum metrology in various complex systems, including non-Markovian noise, correlated noise, quantum critical system. On the other hand, the booming development of quantum information allows us to utilize quantum simulation experiments to tes… ▽ More

    Submitted 4 July, 2023; originally announced July 2023.

    Comments: 10 pages, in Chinese language, 7 figures

    Journal ref: Journal of Beijing Normal University (Natural Science), 2023, 59(6): 869-877

  43. arXiv:2306.06283  [pdf, other]

    cond-mat.mtrl-sci cs.LG physics.chem-ph

    14 Examples of How LLMs Can Transform Materials Science and Chemistry: A Reflection on a Large Language Model Hackathon

    Authors: Kevin Maik Jablonka, Qianxiang Ai, Alexander Al-Feghali, Shruti Badhwar, Joshua D. Bocarsly, Andres M Bran, Stefan Bringuier, L. Catherine Brinson, Kamal Choudhary, Defne Circi, Sam Cox, Wibe A. de Jong, Matthew L. Evans, Nicolas Gastellu, Jerome Genzling, María Victoria Gil, Ankur K. Gupta, Zhi Hong, Alishba Imran, Sabine Kruschwitz, Anne Labarre, Jakub Lála, Tao Liu, Steven Ma, Sauradeep Majumdar, et al. (28 additional authors not shown)

    Abstract: Large-language models (LLMs) such as GPT-4 caught the interest of many scientists. Recent studies suggested that these models could be useful in chemistry and materials science. To explore these possibilities, we organized a hackathon. This article chronicles the projects built as part of this hackathon. Participants employed LLMs for various applications, including predicting properties of mole…

    Submitted 14 July, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

  44. arXiv:2306.02371  [pdf, other]

    cs.IR

    I^3 Retriever: Incorporating Implicit Interaction in Pre-trained Language Models for Passage Retrieval

    Authors: Qian Dong, Yiding Liu, Qingyao Ai, Haitao Li, Shuaiqiang Wang, Yiqun Liu, Dawei Yin, Shaoping Ma

    Abstract: Passage retrieval is a fundamental task in many information systems, such as web search and question answering, where both efficiency and effectiveness are critical concerns. In recent years, neural retrievers based on pre-trained language models (PLM), such as dual-encoders, have achieved huge success. Yet, studies have found that the performance of dual-encoders are often limited due to the negl…

    Submitted 19 March, 2024; v1 submitted 4 June, 2023; originally announced June 2023.

    Comments: 10 pages

  45. FARA: Future-aware Ranking Algorithm for Fairness Optimization

    Authors: Tao Yang, Zhichao Xu, Zhenduo Wang, Qingyao Ai

    Abstract: Ranking systems are the key components of modern Information Retrieval (IR) applications, such as search engines and recommender systems. Besides the ranking relevance to users, the exposure fairness to item providers has also been considered an important factor in ranking optimization. Many fair ranking algorithms have been proposed to jointly optimize both ranking relevance and fairness. However…

    Submitted 18 August, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: 11 pages, four figures, four tables. CIKM2023

  46. arXiv:2305.16606  [pdf, other]

    cs.IR

    Mitigating Exploitation Bias in Learning to Rank with an Uncertainty-aware Empirical Bayes Approach

    Authors: Tao Yang, Cuize Han, Chen Luo, Parth Gupta, Jeff M. Phillips, Qingyao Ai

    Abstract: Ranking is at the core of many artificial intelligence (AI) applications, including search engines, recommender systems, etc. Modern ranking systems are often constructed with learning-to-rank (LTR) models built from user behavior signals. While previous studies have demonstrated the effectiveness of using user behavior signals (e.g., clicks) as both features and labels of LTR algorithms, we argue…

    Submitted 25 May, 2023; originally announced May 2023.

  47. arXiv:2305.09918  [pdf, ps, other]

    cs.IR

    Unconfounded Propensity Estimation for Unbiased Ranking

    Authors: Dan Luo, Lixin Zou, Qingyao Ai, Zhiyu Chen, Chenliang Li, Dawei Yin, Brian D. Davison

    Abstract: The goal of unbiased learning to rank (ULTR) is to leverage implicit user feedback for optimizing learning-to-rank systems. Among existing solutions, automatic ULTR algorithms that jointly learn user bias models (i.e., propensity models) with unbiased rankers have received a lot of attention due to their superior performance and low deployment cost in practice. Despite their theoretical soundness,…

    Submitted 8 July, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

    Comments: 11 pages, 5 figures

  48. arXiv:2305.06817  [pdf, other]

    cs.CL cs.IR

    THUIR@COLIEE 2023: More Parameters and Legal Knowledge for Legal Case Entailment

    Authors: Haitao Li, Changyue Wang, Weihang Su, Yueyue Wu, Qingyao Ai, Yiqun Liu

    Abstract: This paper describes the approach of the THUIR team at the COLIEE 2023 Legal Case Entailment task. This task requires the participant to identify a specific paragraph from a given supporting case that entails the decision for the query case. We try traditional lexical matching methods and pre-trained language models with different sizes. Furthermore, learning-to-rank methods are employed to furthe…

    Submitted 11 May, 2023; originally announced May 2023.

    Comments: COLIEE 2023

  49. arXiv:2305.06812  [pdf, other]

    cs.IR cs.CL

    THUIR@COLIEE 2023: Incorporating Structural Knowledge into Pre-trained Language Models for Legal Case Retrieval

    Authors: Haitao Li, Weihang Su, Changyue Wang, Yueyue Wu, Qingyao Ai, Yiqun Liu

    Abstract: Legal case retrieval techniques play an essential role in modern intelligent legal systems. As an annually well-known international competition, COLIEE is aiming to achieve the state-of-the-art retrieval model for legal texts. This paper summarizes the approach of the championship team THUIR in COLIEE 2023. To be specific, we design structure-aware pre-trained language models to enhance the unders…

    Submitted 11 May, 2023; originally announced May 2023.

    Comments: COLIEE 2023

  50. arXiv:2305.05393  [pdf, other]

    cs.IR cs.CL

    CaseEncoder: A Knowledge-enhanced Pre-trained Model for Legal Case Encoding

    Authors: Yixiao Ma, Yueyue Wu, Weihang Su, Qingyao Ai, Yiqun Liu

    Abstract: Legal case retrieval is a critical process for modern legal information systems. While recent studies have utilized pre-trained language models (PLMs) based on the general domain self-supervised pre-training paradigm to build models for legal case retrieval, there are limitations in using general domain PLMs as backbones. Specifically, these models may not fully capture the underlying legal featur…

    Submitted 9 May, 2023; originally announced May 2023.

    Comments: 5 pages