Showing 1–50 of 332 results for author: Dou, Z

  1. arXiv:2407.03720  [pdf, other]

    cs.IR cs.CL

    Query-oriented Data Augmentation for Session Search

    Authors: Haonan Chen, Zhicheng Dou, Yutao Zhu, Ji-Rong Wen

    Abstract: Modeling contextual information in a search session has drawn more and more attention when understanding complex user intents. Recent methods are all data-driven, i.e., they train different models on large-scale search log data to identify the relevance between search contexts and candidate documents. The common training paradigm is to pair the search context with different candidate documents and…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: TKDE 2024

  2. arXiv:2407.01964  [pdf, other]

    cs.CL

    Enabling Discriminative Reasoning in LLMs for Legal Judgment Prediction

    Authors: Chenlong Deng, Kelong Mao, Yuyao Zhang, Zhicheng Dou

    Abstract: Legal judgment prediction is essential for enhancing judicial efficiency. In this work, we identify that existing large language models (LLMs) underperform in this domain due to challenges in understanding case complexities and distinguishing between similar charges. To adapt LLMs for effective legal judgment prediction, we introduce the Ask-Discriminate-Predict (ADAPT) reasoning framework inspire…

    Submitted 3 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: repo: https://github.com/ChenlongDeng/ADAPT

  3. arXiv:2406.19853  [pdf, other]

    cs.CL cs.AI

    YuLan: An Open-source Large Language Model

    Authors: Yutao Zhu, Kun Zhou, Kelong Mao, Wentong Chen, Yiding Sun, Zhipeng Chen, Qian Cao, Yihan Wu, Yushuo Chen, Feng Wang, Lei Zhang, Junyi Li, Xiaolei Wang, Lei Wang, Beichen Zhang, Zican Dong, Xiaoxue Cheng, Yuhan Chen, Xinyu Tang, Yupeng Hou, Qiangqiang Ren, Xincheng Pang, Shufang Xie, Wayne Xin Zhao, Zhicheng Dou, et al. (13 additional authors not shown)

    Abstract: Large language models (LLMs) have become the foundation of many applications, leveraging their extensive capabilities in processing and understanding natural language. While many open-source LLMs have been released with technical reports, the lack of training details hinders further research and development. This paper presents the development of YuLan, a series of open-source LLMs with $12$ billi…

    Submitted 28 June, 2024; originally announced June 2024.

  4. arXiv:2406.19760  [pdf, other]

    cs.IR cs.CL

    Learning Interpretable Legal Case Retrieval via Knowledge-Guided Case Reformulation

    Authors: Chenlong Deng, Kelong Mao, Zhicheng Dou

    Abstract: Legal case retrieval for sourcing similar cases is critical in upholding judicial fairness. Different from general web search, legal case retrieval involves processing lengthy, complex, and highly specialized legal documents. Existing methods in this domain often overlook the incorporation of legal expert knowledge, which is crucial for accurately understanding and modeling legal cases, leading to…

    Submitted 28 June, 2024; originally announced June 2024.

  5. arXiv:2406.18676  [pdf, other]

    cs.CL cs.AI cs.LG

    Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation

    Authors: Guanting Dong, Yutao Zhu, Chenghao Zhang, Zechen Wang, Zhicheng Dou, Ji-Rong Wen

    Abstract: Retrieval-augmented generation (RAG) has demonstrated effectiveness in mitigating the hallucination problem of large language models (LLMs). However, the difficulty of aligning the retriever with the diverse LLMs' knowledge preferences poses an inevitable challenge in developing a reliable RAG system. To address this issue, we propose DPA-RAG, a universal framework designed to align div…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Work in progress

  6. arXiv:2406.17988  [pdf, other]

    cs.CV

    DICE: End-to-end Deformation Capture of Hand-Face Interactions from a Single Image

    Authors: Qingxuan Wu, Zhiyang Dou, Sirui Xu, Soshi Shimada, Chen Wang, Zhengming Yu, Yuan Liu, Cheng Lin, Zeyu Cao, Taku Komura, Vladislav Golyanik, Christian Theobalt, Wenping Wang, Lingjie Liu

    Abstract: Reconstructing 3D hand-face interactions with deformations from a single image is a challenging yet crucial task with broad applications in AR, VR, and gaming. The challenges stem from self-occlusions during single-view hand-face interactions, diverse spatial relationships between hands and face, complex deformations, and the ambiguity of the single-view setting. The first and only method for hand…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 23 pages, 9 figures, 3 tables

  7. arXiv:2406.16332  [pdf, other]

    cs.IR cs.CL

    DemoRank: Selecting Effective Demonstrations for Large Language Models in Ranking Task

    Authors: Wenhan Liu, Yutao Zhu, Zhicheng Dou

    Abstract: Recently, there has been increasing interest in applying large language models (LLMs) as zero-shot passage rankers. However, few studies have explored how to select appropriate in-context demonstrations for the passage ranking task, which is the focus of this paper. Previous studies mainly apply a demonstration retriever to retrieve demonstrations and use top-$k$ demonstrations for in-context lear…

    Submitted 24 June, 2024; originally announced June 2024.

  8. arXiv:2406.16213  [pdf, other]

    cs.LG

    Provable Statistical Rates for Consistency Diffusion Models

    Authors: Zehao Dou, Minshuo Chen, Mengdi Wang, Zhuoran Yang

    Abstract: Diffusion models have revolutionized various application domains, including computer vision and audio generation. Despite the state-of-the-art performance, diffusion models are known for their slow sample generation due to the extensive number of steps involved. In response, consistency models have been developed to merge multiple steps in the sampling process, thereby significantly boosting the s…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 28 pages, 2 figures

  9. arXiv:2406.12566  [pdf, other]

    cs.CL

    RichRAG: Crafting Rich Responses for Multi-faceted Queries in Retrieval-Augmented Generation

    Authors: Shuting Wang, Xin Yu, Mang Wang, Weipeng Chen, Yutao Zhu, Zhicheng Dou

    Abstract: Retrieval-augmented generation (RAG) effectively addresses issues of static knowledge and hallucination in large language models. Existing studies mostly focus on question scenarios with clear user intents and concise answers. However, it is prevalent that users issue broad, open-ended queries with diverse sub-intents, for which they desire rich and long-form answers covering multiple relevant asp…

    Submitted 21 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  10. arXiv:2406.10744  [pdf, other]

    cs.CV

    Technique Report of CVPR 2024 PBDL Challenges

    Authors: Ying Fu, Yu Li, Shaodi You, Boxin Shi, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li, Yuze Han, Yingkai Zhang, Jianan Wang, Qinglin Liu, Wei Yu, Xiaoqian Lv, Jianing Li, Shengping Zhang, Xiangyang Ji, Yuanpei Chen, Yuhan Zhang, Weihang Peng, Liwen Zhang, Zhe Xu, Dingyong Gou, Cong Li, Senyan Xu, et al. (75 additional authors not shown)

    Abstract: The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, a…

    Submitted 12 July, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: CVPR 2024 PBDL Challenges: https://pbdl-ws.github.io/pbdl2024/challenge/index.html

  11. arXiv:2406.10367  [pdf, other]

    cs.LG

    Disentangled Hyperbolic Representation Learning for Heterogeneous Graphs

    Authors: Qijie Bai, Changli Nie, Haiwei Zhang, Zhicheng Dou, Xiaojie Yuan

    Abstract: Heterogeneous graphs have attracted considerable research interest recently owing to their success in representing complex real-world systems. However, existing methods have two pain points in embedding them into low-dimensional spaces: the mixing of structural and semantic information, and the distributional mismatch between data and embedding spaces. These two challenges require representation methods…

    Submitted 14 June, 2024; originally announced June 2024.

  12. arXiv:2406.05654  [pdf, other]

    cs.CL cs.IR

    DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation

    Authors: Shuting Wang, Jiongnan Liu, Shiren Song, Jiehan Cheng, Yuqi Fu, Peidong Guo, Kun Fang, Yutao Zhu, Zhicheng Dou

    Abstract: Retrieval-Augmented Generation (RAG) offers a promising solution to address various limitations of Large Language Models (LLMs), such as hallucination and difficulties in keeping up with real-time updates. This approach is particularly critical in expert and domain-specific applications where LLMs struggle to cover expert knowledge. Therefore, evaluating RAG models in such scenarios is crucial, ye…

    Submitted 16 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

  13. arXiv:2406.01495  [pdf, other]

    cs.CL

    Re-ReST: Reflection-Reinforced Self-Training for Language Agents

    Authors: Zi-Yi Dou, Cheng-Fu Yang, Xueqing Wu, Kai-Wei Chang, Nanyun Peng

    Abstract: Finetuning language agents with reasoning-action trajectories is effective, but obtaining these trajectories from human annotations or stronger models is costly and sometimes impractical. In this paper, we investigate the use of self-training in language agents, which can generate supervision from the agent itself, offering a promising alternative without relying on human or stronger model demonst…

    Submitted 7 July, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  14. arXiv:2405.19670  [pdf, other]

    cs.CL

    One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models

    Authors: Yutao Zhu, Zhaoheng Huang, Zhicheng Dou, Ji-Rong Wen

    Abstract: Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs) for generating more factual, accurate, and up-to-date content. Existing methods either optimize prompts to guide LLMs in leveraging retrieved information or directly fine-tune LLMs to adapt to RAG scenarios. Although fine-tuning can yield better performance, it often compromises the LLMs' general genera…

    Submitted 8 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: work in progress, repo: https://github.com/DaoD/SPRING/

  15. arXiv:2405.19315  [pdf, other]

    cs.CV cs.CL cs.LG

    Matryoshka Query Transformer for Large Vision-Language Models

    Authors: Wenbo Hu, Zi-Yi Dou, Liunian Harold Li, Amita Kamath, Nanyun Peng, Kai-Wei Chang

    Abstract: Large Vision-Language Models (LVLMs) typically encode an image into a fixed number of visual tokens (e.g., 576) and process these tokens with a language model. Despite their strong performance, LVLMs face challenges in adapting to varying computational constraints. This raises the question: can we achieve flexibility in the number of visual tokens to suit different tasks and computational resource…

    Submitted 6 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: Preprint. Our code and model are publicly available at https://github.com/gordonhu608/MQT-LLaVA

  16. arXiv:2405.16888  [pdf, other]

    cs.GR cs.CV

    Part123: Part-aware 3D Reconstruction from a Single-view Image

    Authors: Anran Liu, Cheng Lin, Yuan Liu, Xiaoxiao Long, Zhiyang Dou, Hao-Xiang Guo, Ping Luo, Wenping Wang

    Abstract: Recently, the emergence of diffusion models has opened up new opportunities for single-view reconstruction. However, all the existing methods represent the target object as a closed mesh devoid of any structural information, thus neglecting the part-based structure, which is crucial for many downstream applications, of the reconstructed shape. Moreover, the generated meshes usually suffer from lar…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted to SIGGRAPH 2024 (conference track), webpage: https://liuar0512.github.io/part123_official_page/

  17. arXiv:2405.16802  [pdf, other]

    cs.CL cs.LG

    AutoCV: Empowering Reasoning with Automated Process Labeling via Confidence Variation

    Authors: Jianqiao Lu, Zhiyang Dou, Hongru Wang, Zeyu Cao, Jianbo Dai, Yingjia Wan, Yinya Huang, Zhijiang Guo

    Abstract: In this work, we propose a novel method named Automated Process Labeling via Confidence Variation (AutoCV) to enhance the reasoning capabilities of large language models (LLMs) by automatically annotating the reasoning steps. Our approach begins by training a verification model on the correctness of final answers, enabling it to generate automatic proce…

    Submitted 28 May, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

    Comments: 20 pages, 1 figure, 13 tables

  18. arXiv:2405.16635  [pdf, other]

    cs.CL

    Compressing Lengthy Context With UltraGist

    Authors: Peitian Zhang, Zheng Liu, Shitao Xiao, Ninglu Shao, Qiwei Ye, Zhicheng Dou

    Abstract: Compressing lengthy context is a critical but technically challenging problem. In this paper, we propose a new method called UltraGist, which is distinguished for its high-quality compression of lengthy context due to the innovative design of the compression and learning algorithm. UltraGist brings forth the following important benefits. Firstly, it notably contributes to the flexibility of compre…

    Submitted 26 May, 2024; originally announced May 2024.

  19. arXiv:2405.15318  [pdf, other]

    cs.CL cs.AI

    Are Long-LLMs A Necessity For Long-Context Tasks?

    Authors: Hongjin Qian, Zheng Liu, Peitian Zhang, Kelong Mao, Yujia Zhou, Xu Chen, Zhicheng Dou

    Abstract: The learning and deployment of long-LLMs remains a challenging problem despite recent progress. In this work, we argue that the long-LLMs are not a necessity to solve long-context tasks, as common long-context tasks are short-context solvable, i.e. they can be solved by purely working with oracle short-contexts within the long-context tasks' inputs. On top of this argument, we propose a framewor…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 18 pages

  20. arXiv:2405.13576  [pdf, other]

    cs.CL cs.IR

    FlashRAG: A Modular Toolkit for Efficient Retrieval-Augmented Generation Research

    Authors: Jiajie Jin, Yutao Zhu, Xinyu Yang, Chenghao Zhang, Zhicheng Dou

    Abstract: With the advent of Large Language Models (LLMs), the potential of Retrieval Augmented Generation (RAG) techniques has garnered considerable research attention. Numerous novel algorithms and models have been introduced to enhance various aspects of RAG systems. However, the absence of a standardized framework for implementation, coupled with the inherently intricate RAG process, makes it challengi…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 8 pages

  21. arXiv:2405.11186  [pdf, other]

    physics.plasm-ph physics.acc-ph

    Compact Spin-Polarized Positron Acceleration in Multi-Layer Microhole Array Films

    Authors: Zhen-Ke Dou, Chong Lv, Yousef I. Salamin, Nan Zhang, Feng Wan, Zhong-Feng Xu, Jian-Xing Li

    Abstract: Compact spin-polarized positron accelerators play a major role in promoting significant positron application research, which typically requires high acceleration gradients and a high polarization degree, both of which, however, remain very challenging. Here, we put forward a novel spin-polarized positron acceleration method which employs an ultrarelativistic high-density electron beam passing through…

    Submitted 18 May, 2024; originally announced May 2024.

  22. arXiv:2405.10716  [pdf]

    physics.app-ph physics.ins-det

    Scanning Acoustic Microscopy for Quantifying Two-phase Transfer in Operando Alkaline Water Electrolyzer

    Authors: Zehua Dou, Hannes Rox, Zyzi Ramos, Robert Baumann, Rachappa Ravishankar, Peter Czurratis, Xuegeng Yang, Andrés Fabian Lasagni, Kerstin Eckert, Juergen Czarske, David Weik

    Abstract: An improved understanding of two-phase transport in electrochemical gas-evolving systems is increasingly in demand, while high-performance imaging techniques using simplified instrumentation are not readily available. This work presents volumetric scanning acoustic microscopy (SAM) imaging for quantifying the dynamics of gas bubbles and electrolyte in porous nickel electrodes with different wettabi…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: Research article on an emerging field. 33 pages, 6 figures, 61 references, 10 supplementary figures available. Journal submission in progress

  23. arXiv:2405.05001  [pdf, other]

    cs.CV

    HMANet: Hybrid Multi-Axis Aggregation Network for Image Super-Resolution

    Authors: Shu-Chuan Chu, Zhi-Chao Dou, Jeng-Shyang Pan, Shaowei Weng, Junbao Li

    Abstract: Transformer-based methods have demonstrated excellent performance on super-resolution visual tasks, surpassing conventional convolutional neural networks. However, existing work typically restricts self-attention computation to non-overlapping windows to save computational costs. This means that Transformer-based networks can only use input information from a limited spatial range. Therefore, a no…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 12 pages, 10 figures, conference

  24. arXiv:2404.19553  [pdf, other]

    cs.CL

    Extending Llama-3's Context Ten-Fold Overnight

    Authors: Peitian Zhang, Ninglu Shao, Zheng Liu, Shitao Xiao, Hongjin Qian, Qiwei Ye, Zhicheng Dou

    Abstract: We extend the context length of Llama-3-8B-Instruct from 8K to 80K via QLoRA fine-tuning. The entire training cycle is highly efficient, taking 8 hours on one 8xA800 (80G) GPU machine. The resulting model exhibits superior performance across a broad range of evaluation tasks, such as NIHS, topic retrieval, and long-context language understanding; meanwhile, it also well preserves the original…

    Submitted 30 April, 2024; originally announced April 2024.

  25. arXiv:2404.17779  [pdf, other]

    cs.CL

    Medical Vision-Language Pre-Training for Brain Abnormalities

    Authors: Masoud Monajatipoor, Zi-Yi Dou, Aichi Chien, Nanyun Peng, Kai-Wei Chang

    Abstract: Vision-language models have become increasingly powerful for tasks that require an understanding of both visual and linguistic elements, bridging the gap between these modalities. In the context of multimodal clinical AI, there is a growing need for models that possess domain-specific knowledge, as existing models often lack the expertise required for medical applications. In this paper, we take b…

    Submitted 27 April, 2024; originally announced April 2024.

  26. arXiv:2404.16687  [pdf, other]

    cs.CV

    NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

    Authors: Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, Haoning Wu, Yixuan Gao, Yuqin Cao, Zicheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng, et al. (89 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2024. This challenge is to address a major challenge in the field of image and video processing, namely, Image Quality Assessment (IQA) and Video Quality Assessment (VQA) for AI-Generated Conte…

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  27. arXiv:2404.14851  [pdf, other]

    cs.IR cs.AI cs.CL

    From Matching to Generation: A Survey on Generative Information Retrieval

    Authors: Xiaoxi Li, Jiajie Jin, Yujia Zhou, Yuyao Zhang, Peitian Zhang, Yutao Zhu, Zhicheng Dou

    Abstract: Information Retrieval (IR) systems are crucial tools for users to access information, widely applied in scenarios like search engines, question answering, and recommendation systems. Traditional IR methods, based on similarity matching to return ranked lists of documents, have been reliable means of information acquisition, dominating the IR field for years. With the advancement of pre-trained lan…

    Submitted 15 May, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  28. arXiv:2404.13874  [pdf, other]

    cs.CL cs.CV

    VALOR-EVAL: Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models

    Authors: Haoyi Qiu, Wenbo Hu, Zi-Yi Dou, Nanyun Peng

    Abstract: Large Vision-Language Models (LVLMs) suffer from hallucination issues, wherein the models generate plausible-sounding but factually incorrect outputs, undermining their reliability. A comprehensive quantitative evaluation is necessary to identify and understand the extent of hallucinations in these models. However, existing benchmarks are often limited in scope, focusing mainly on object hallucina…

    Submitted 14 July, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: ACL 2024 Findings

  29. arXiv:2404.13556  [pdf, other]

    cs.IR cs.CL

    ChatRetriever: Adapting Large Language Models for Generalized and Robust Conversational Dense Retrieval

    Authors: Kelong Mao, Chenlong Deng, Haonan Chen, Fengran Mo, Zheng Liu, Tetsuya Sakai, Zhicheng Dou

    Abstract: Conversational search requires accurate interpretation of user intent from complex multi-turn contexts. This paper presents ChatRetriever, which inherits the strong generalization capability of large language models to robustly represent complex conversational sessions for dense retrieval. To achieve this, we propose a simple and effective dual-learning approach that adapts LLM for retrieval via c…

    Submitted 21 April, 2024; originally announced April 2024.

  30. arXiv:2404.10840  [pdf, other]

    physics.flu-dyn physics.app-ph

    Uncertainty Quantification of Super-Resolution Flow Mapping in Liquid Metals using Ultrasound Localization Microscopy

    Authors: David Weik, Zehua Dou, Dirk Räbiger, Tobias Vogt, Sven Eckert, Jürgen Czarske, Lars Büttner

    Abstract: Convection of liquid metals drives large natural processes and is important in technical processes. Model experiments are conducted for research purposes where simulations are expensive and the clarification of open questions requires novel flow mapping methods with an increased spatial resolution. In this work, the method of Ultrasound Localization Microscopy (ULM) is investigated for this purpos…

    Submitted 16 April, 2024; originally announced April 2024.

  31. arXiv:2404.09790  [pdf, other]

    cs.CV

    NTIRE 2024 Challenge on Image Super-Resolution ($\times$4): Methods and Results

    Authors: Zheng Chen, Zongwei Wu, Eduard Zamfir, Kai Zhang, Yulun Zhang, Radu Timofte, Xiaokang Yang, Hongyuan Yu, Cheng Wan, Yuxin Hong, Zhijuan Huang, Yajun Zou, Yuan Huang, Jiamin Lin, Bingnan Han, Xianyu Guan, Yongsheng Yu, Daoan Zhang, Xuanwu Yin, Kunlong Zuo, Jinhua Hao, Kai Zhao, Kun Yuan, Ming Sun, Chao Zhou, et al. (63 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 challenge on image super-resolution ($\times$4), highlighting the solutions proposed and the outcomes obtained. The challenge involves generating corresponding high-resolution (HR) images, magnified by a factor of four, from low-resolution (LR) inputs using prior information. The LR images originate from bicubic downsampling degradation. The aim of the challenge i…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: NTIRE 2024 webpage: https://cvlai.net/ntire/2024. Code: https://github.com/zhengchen1999/NTIRE2024_ImageSR_x4

  32. arXiv:2403.13307  [pdf, other]

    cs.CV

    LaserHuman: Language-guided Scene-aware Human Motion Generation in Free Environment

    Authors: Peishan Cong, Ziyi Wang, Zhiyang Dou, Yiming Ren, Wei Yin, Kai Cheng, Yujing Sun, Xiaoxiao Long, Xinge Zhu, Yuexin Ma

    Abstract: Language-guided scene-aware human motion generation has great significance for entertainment and robotics. In response to the limitations of existing datasets, we introduce LaserHuman, a pioneering dataset engineered to revolutionize Scene-Text-to-Motion research. LaserHuman stands out with its inclusion of genuine human motions within 3D environments, unbounded free-form natural language descript…

    Submitted 21 March, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  33. An Analysis on Matching Mechanisms and Token Pruning for Late-interaction Models

    Authors: Qi Liu, Gang Guo, Jiaxin Mao, Zhicheng Dou, Ji-Rong Wen, Hao Jiang, Xinyu Zhang, Zhao Cao

    Abstract: With the development of pre-trained language models, the dense retrieval models have become promising alternatives to the traditional retrieval models that rely on exact match and sparse bag-of-words representations. Different from most dense retrieval models using a bi-encoder to encode each query or document into a dense vector, the recently proposed late-interaction multi-vector models (i.e., C…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: Accepted by ACM Transactions on Information Systems

  34. arXiv:2402.14690  [pdf, other]

    cs.CL

    UFO: a Unified and Flexible Framework for Evaluating Factuality of Large Language Models

    Authors: Zhaoheng Huang, Zhicheng Dou, Yutao Zhu, Ji-rong Wen

    Abstract: Large language models (LLMs) may generate text that lacks consistency with human knowledge, leading to factual inaccuracies or hallucination. Existing research for evaluating the factuality of LLMs involves extracting fact claims using an LLM and verifying them against a predefined fact source. However, these evaluation metrics are task-specific, and not scalable, and the substitutability…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: under review

  35. arXiv:2402.12774  [pdf, other]

    cs.IR

    Interpreting Conversational Dense Retrieval by Rewriting-Enhanced Inversion of Session Embedding

    Authors: Yiruo Cheng, Kelong Mao, Zhicheng Dou

    Abstract: Conversational dense retrieval has been shown to be effective in conversational search. However, a major limitation of conversational dense retrieval is its lack of interpretability, hindering intuitive understanding of model behaviors for targeted improvements. This paper presents CONVINV, a simple yet effective approach to shed light on interpretable conversational dense retrieval models. CONVINV t…

    Submitted 1 June, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: Accepted by ACL 2024. Repo: https://github.com/Ariya12138/ConvInv

  36. arXiv:2402.12174  [pdf, other]

    cs.CL

    BIDER: Bridging Knowledge Inconsistency for Efficient Retrieval-Augmented LLMs via Key Supporting Evidence

    Authors: Jiajie Jin, Yutao Zhu, Yujia Zhou, Zhicheng Dou

    Abstract: Retrieval-augmented large language models (LLMs) have demonstrated efficacy in knowledge-intensive tasks such as open-domain QA, addressing inherent challenges in knowledge update and factual inadequacy. However, inconsistencies between retrieval knowledge and the necessary knowledge for LLMs lead to a decline in LLMs' answer quality. This paper introduces BIDER, an approach that refines retri…

    Submitted 30 May, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Accepted by ACL 2024 Findings

  37. arXiv:2402.12052  [pdf, other]

    cs.CL

    Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs

    Authors: Jiejun Tan, Zhicheng Dou, Yutao Zhu, Peidong Guo, Kun Fang, Ji-Rong Wen

    Abstract: The integration of large language models (LLMs) and search engines represents a significant evolution in knowledge acquisition methodologies. However, determining the knowledge that an LLM already possesses and the knowledge that requires the help of a search engine remains an unresolved issue. Most existing methods solve this problem through the results of preliminary answers or reasoning done by…

    Submitted 30 May, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Accepted by ACL 2024 main conference. Repo: https://github.com/plageon/SlimPLM

  38. arXiv:2402.11626  [pdf, other]

    cs.CL cs.IR

    Metacognitive Retrieval-Augmented Large Language Models

    Authors: Yujia Zhou, Zheng Liu, Jiajie Jin, Jian-Yun Nie, Zhicheng Dou

    Abstract: Retrieval-augmented generation has become central in natural language processing due to its efficacy in generating factual content. While traditional methods employ single-time retrieval, more recent approaches have shifted towards multi-time retrieval for multi-hop reasoning tasks. However, these strategies are bound by predefined reasoning steps, potentially leading to inaccuracies in respons…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: Accepted by WWW 2024

  39. arXiv:2402.10548  [pdf, other]

    cs.IR

    Cognitive Personalized Search Integrating Large Language Models with an Efficient Memory Mechanism

    Authors: Yujia Zhou, Qiannan Zhu, Jiajie Jin, Zhicheng Dou

    Abstract: Traditional search engines usually provide identical search results for all users, overlooking individual preferences. To counter this limitation, personalized search has been developed to re-rank results based on user preferences derived from query logs. Deep learning-based personalized search methods have shown promise, but they rely heavily on abundant training data, making them susceptible to…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: Accepted by WWW 2024

  40. arXiv:2402.09760  [pdf, other]

    cs.CL cs.AI cs.IR

    Grounding Language Model with Chunking-Free In-Context Retrieval

    Authors: Hongjin Qian, Zheng Liu, Kelong Mao, Yujia Zhou, Zhicheng Dou

    Abstract: This paper presents a novel Chunking-Free In-Context (CFIC) retrieval approach, specifically tailored for Retrieval-Augmented Generation (RAG) systems. Traditional RAG systems often struggle with grounding responses using precise evidence text due to the challenges of processing lengthy documents and filtering out irrelevant content. Commonly employed solutions, such as document chunking and adapt…

    Submitted 15 February, 2024; originally announced February 2024.

  41. arXiv:2402.07092  [pdf, other]

    cs.CL cs.IR

    Generalizing Conversational Dense Retrieval via LLM-Cognition Data Augmentation

    Authors: Haonan Chen, Zhicheng Dou, Kelong Mao, Jiongnan Liu, Ziliang Zhao

    Abstract: Conversational search utilizes multi-turn natural language contexts to retrieve relevant passages. Existing conversational dense retrieval models mostly view a conversation as a fixed sequence of questions and responses, overlooking the severe data sparsity problem -- that is, users can perform a conversation in various ways, and these alternate conversations are unrecorded. Consequently, they ofte…

    Submitted 3 June, 2024; v1 submitted 10 February, 2024; originally announced February 2024.

    Comments: ACL 2024

  42. arXiv:2402.07076  [pdf, other]

    cs.IR cs.AI

    Enhancing Multi-field B2B Cloud Solution Matching via Contrastive Pre-training

    Authors: Haonan Chen, Zhicheng Dou, Xuetong Hao, Yunhao Tao, Shiren Song, Zhenli Sheng

    Abstract: Cloud solutions have gained significant popularity in the technology industry as they offer a combination of services and tools to tackle specific problems. However, despite their widespread use, the task of identifying appropriate company customers for a specific target solution to the sales team of a solution provider remains a complex business problem that existing matching systems have yet to…

    Submitted 6 June, 2024; v1 submitted 10 February, 2024; originally announced February 2024.

    Comments: KDD 2024, ADS Track

  43. arXiv:2402.01176  [pdf, other]

    cs.CL cs.IR

    CorpusLM: Towards a Unified Language Model on Corpus for Knowledge-Intensive Tasks

    Authors: Xiaoxi Li, Zhicheng Dou, Yujia Zhou, Fangchao Liu

    Abstract: Large language models (LLMs) have gained significant attention in various fields but are prone to hallucination, especially in knowledge-intensive (KI) tasks. To address this, retrieval-augmented generation (RAG) has emerged as a popular solution to enhance factual accuracy. However, traditional retrieval modules often rely on a large document index and are disconnected from generative tasks. With the advent…

    Submitted 21 April, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  44. arXiv:2401.12946  [pdf, other]

    cs.CV cs.CG cs.GR

    Coverage Axis++: Efficient Inner Point Selection for 3D Shape Skeletonization

    Authors: Zimeng Wang, Zhiyang Dou, Rui Xu, Cheng Lin, Yuan Liu, Xiaoxiao Long, Shiqing Xin, Taku Komura, Xiaoming Yuan, Wenping Wang

    Abstract: We introduce Coverage Axis++, a novel and efficient approach to 3D shape skeletonization. The current state-of-the-art approaches for this task often rely on the watertightness of the input or suffer from substantial computational costs, thereby limiting their practicality. To address this challenge, Coverage Axis++ proposes a heuristic algorithm to select skeletal points, offering a high-accuracy…

    Submitted 10 June, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

    Comments: SGP2024. Project Page: https://frank-zy-dou.github.io/projects/CoverageAxis++/index.html

  45. arXiv:2401.12445  [pdf, other]

    cs.IR

    Session-level Normalization and Click-through Data Enhancement for Session-based Evaluation

    Authors: Haonan Chen, Zhicheng Dou, Jiaxin Mao

    Abstract: Since a user usually has to issue a sequence of queries and examine multiple documents to resolve a complex information need in a search session, researchers have paid much attention to evaluating search systems at the session level rather than the single-query level. Most existing session-level metrics evaluate each query separately and then aggregate the query-level scores using a session-level…

    Submitted 22 January, 2024; originally announced January 2024.

  46. arXiv:2401.08046  [pdf, other]

    cs.CL cs.AI

    Enhancing Robustness of LLM-Synthetic Text Detectors for Academic Writing: A Comprehensive Analysis

    Authors: Zhicheng Dou, Yuchen Guo, Ching-Chun Chang, Huy H. Nguyen, Isao Echizen

    Abstract: The emergence of large language models (LLMs), such as Generative Pre-trained Transformer 4 (GPT-4) used by ChatGPT, has profoundly impacted the academic and broader community. While these models offer numerous advantages in terms of revolutionizing work and study methods, they have also garnered significant attention due to their potential negative consequences. One example is generating academic…

    Submitted 15 January, 2024; originally announced January 2024.

  47. arXiv:2401.06532  [pdf, other]

    cs.CL cs.IR

    INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning

    Authors: Yutao Zhu, Peitian Zhang, Chenghao Zhang, Yifei Chen, Binyu Xie, Zheng Liu, Ji-Rong Wen, Zhicheng Dou

    Abstract: Large language models (LLMs) have demonstrated impressive capabilities in various natural language processing tasks. Despite this, their application to information retrieval (IR) tasks is still challenging due to the infrequent occurrence of many IR-specific concepts in natural language. While prompt-based methods can provide task descriptions to LLMs, they often fall short in facilitating a compr…

    Submitted 28 May, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

    Comments: Accepted by ACL 2024 main conference. Repo: https://github.com/DaoD/INTERS

  48. arXiv:2401.03462  [pdf, other]

    cs.CL cs.AI

    Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon

    Authors: Peitian Zhang, Zheng Liu, Shitao Xiao, Ninglu Shao, Qiwei Ye, Zhicheng Dou

    Abstract: The utilization of long contexts poses a big challenge for LLMs due to their limited context window size. Although the context window can be extended through fine-tuning, it will result in a considerable cost at both training and inference time, and exert an unfavorable impact on the LLM's original capabilities. In this work, we propose a new method called Activation Beacon, which condenses LLM's…

    Submitted 2 February, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

  49. arXiv:2312.11036  [pdf, other]

    cs.IR cs.AI cs.CL

    UniGen: A Unified Generative Framework for Retrieval and Question Answering with Large Language Models

    Authors: Xiaoxi Li, Yujia Zhou, Zhicheng Dou

    Abstract: Generative information retrieval, encompassing two major tasks of Generative Document Retrieval (GDR) and Grounded Answer Generation (GAR), has gained significant attention in the area of information retrieval and natural language processing. Existing methods for GDR and GAR rely on separate retrieval and reader modules, which hinder simultaneous optimization. To overcome this, we present…

    Submitted 18 December, 2023; originally announced December 2023.

  50. arXiv:2312.05295  [pdf, other]

    cs.CV

    Disentangled Clothed Avatar Generation from Text Descriptions

    Authors: Jionghao Wang, Yuan Liu, Zhiyang Dou, Zhengming Yu, Yongqing Liang, Xin Li, Wenping Wang, Rong Xie, Li Song

    Abstract: In this paper, we introduce a novel text-to-avatar generation method that separately generates the human body and the clothes and allows high-quality animation on the generated avatar. While recent advancements in text-to-avatar generation have yielded diverse human avatars from text prompts, these methods typically combine all elements (clothes, hair, and body) into a single 3D representation. Suc…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: Project page: https://shanemankiw.github.io/SO-SMPL/