Showing 1–50 of 50 results for author: Mi, F

  1. arXiv:2406.17626  [pdf, other]

    cs.CL cs.AI

    CoSafe: Evaluating Large Language Model Safety in Multi-Turn Dialogue Coreference

    Authors: Erxin Yu, Jing Li, Ming Liao, Siqi Wang, Zuchen Gao, Fei Mi, Lanqing Hong

    Abstract: As large language models (LLMs) constantly evolve, ensuring their safety remains a critical research problem. Previous red-teaming approaches for LLM safety have primarily focused on single prompt attacks or goal hijacking. To the best of our knowledge, we are the first to study LLM safety in multi-turn dialogue coreference. We created a dataset of 1,400 questions across 14 categories, each featur…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Submitted to EMNLP 2024

  2. arXiv:2406.07850  [pdf, other]

    cs.CL cs.AI

    Dynamic Stochastic Decoding Strategy for Open-Domain Dialogue Generation

    Authors: Yiwei Li, Fei Mi, Yitong Li, Yasheng Wang, Bin Sun, Shaoxiong Feng, Kan Li

    Abstract: Stochastic sampling strategies such as top-k and top-p have been widely used in dialogue generation tasks. However, an open-domain chatting system faces two different conversation scenarios, i.e., chit-chat and knowledge-based question answering. In the former situation, response diversity is essential due to the one-to-many nature of dialogue. The latter, on the other hand, requires le…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: ACL 2024 Findings

  3. arXiv:2405.00557  [pdf, other]

    cs.CL cs.AI

    Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment

    Authors: Zhili Liu, Yunhao Gou, Kai Chen, Lanqing Hong, Jiahui Gao, Fei Mi, Yu Zhang, Zhenguo Li, Xin Jiang, Qun Liu, James T. Kwok

    Abstract: As the capabilities of large language models (LLMs) have expanded dramatically, aligning these models with human values presents a significant challenge. Traditional alignment strategies rely heavily on human intervention, such as Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF), or on the self-alignment capacities of LLMs, which usually require a strong LLM's eme…

    Submitted 8 July, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

  4. arXiv:2403.02756  [pdf, other]

    cs.CL

    Role Prompting Guided Domain Adaptation with General Capability Preserve for Large Language Models

    Authors: Rui Wang, Fei Mi, Yi Chen, Boyang Xue, Hongru Wang, Qi Zhu, Kam-Fai Wong, Ruifeng Xu

    Abstract: The growing interest in Large Language Models (LLMs) for specialized applications has revealed a significant challenge: when tailored to specific domains, LLMs tend to experience catastrophic forgetting, compromising their general capabilities and leading to a suboptimal user experience. Additionally, crafting a versatile model for multiple domains simultaneously often results in a decline in over…

    Submitted 5 March, 2024; originally announced March 2024.

  5. arXiv:2402.16261  [pdf, other]

    cs.CL cs.IR

    UniRetriever: Multi-task Candidates Selection for Various Context-Adaptive Conversational Retrieval

    Authors: Hongru Wang, Boyang Xue, Baohang Zhou, Rui Wang, Fei Mi, Weichao Wang, Yasheng Wang, Kam-Fai Wong

    Abstract: Conversational retrieval refers to an information retrieval system that operates in an iterative and interactive manner, requiring the retrieval of various external resources, such as persona, knowledge, and even response, to effectively engage with the user and successfully complete the dialogue. However, most previous work trained independent retrievers for each specific resource, resulting in s…

    Submitted 28 February, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

  6. arXiv:2402.12319  [pdf, other]

    cs.LG cs.AI cs.CY

    Dynamic Environment Responsive Online Meta-Learning with Fairness Awareness

    Authors: Chen Zhao, Feng Mi, Xintao Wu, Kai Jiang, Latifur Khan, Feng Chen

    Abstract: The fairness-aware online learning framework has emerged as a potent tool within the context of continuous lifelong learning. In this scenario, the learner's objective is to progressively acquire new tasks as they arrive over time, while also guaranteeing statistical parity among various protected sub-populations, such as race and gender, when it comes to the newly introduced tasks. A significant…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: Accepted by TKDD, extended from KDD 2022. arXiv admin note: substantial text overlap with arXiv:2205.11264

  7. arXiv:2401.15670  [pdf, other]

    cs.CL cs.AI cs.LG

    YODA: Teacher-Student Progressive Learning for Language Models

    Authors: Jianqiao Lu, Wanjun Zhong, Yufei Wang, Zhijiang Guo, Qi Zhu, Wenyong Huang, Yanlin Wang, Fei Mi, Baojun Wang, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu

    Abstract: Although large language models (LLMs) have demonstrated adeptness in a range of tasks, they still lag behind human learning efficiency. This disparity is often linked to the inherent human capacity to learn from basic examples, gradually generalize and handle more complex problems, and refine their skills with continuous feedback. Inspired by this, this paper introduces YODA, a novel teacher-stude…

    Submitted 28 January, 2024; originally announced January 2024.

    Comments: 14 pages, 4 figures, 3 tables

  8. arXiv:2401.13256  [pdf, other]

    cs.CL cs.AI

    UniMS-RAG: A Unified Multi-source Retrieval-Augmented Generation for Personalized Dialogue Systems

    Authors: Hongru Wang, Wenyu Huang, Yang Deng, Rui Wang, Zezhong Wang, Yufei Wang, Fei Mi, Jeff Z. Pan, Kam-Fai Wong

    Abstract: Large Language Models (LLMs) have shown exceptional capabilities in many natural language understanding and generation tasks. However, the personalization issue still remains a much-coveted property, especially when it comes to the multiple sources involved in the dialogue system. To better plan and incorporate the use of multiple sources in generating personalized responses, we first decompose it…

    Submitted 24 January, 2024; originally announced January 2024.

  9. arXiv:2312.01700  [pdf, other]

    cs.CL cs.AI

    Data Management For Large Language Models: A Survey

    Authors: Zige Wang, Wanjun Zhong, Yufei Wang, Qi Zhu, Fei Mi, Baojun Wang, Lifeng Shang, Xin Jiang, Qun Liu

    Abstract: Data plays a fundamental role in the training of Large Language Models (LLMs). Effective data management, particularly in the formulation of a well-suited training dataset, holds significance for enhancing model performance and improving training efficiency during pretraining and supervised fine-tuning phases. Despite the considerable importance of data management, the current research community s…

    Submitted 25 December, 2023; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: Work in progress

  10. arXiv:2311.09096  [pdf, other]

    cs.CL

    Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization

    Authors: Zhexin Zhang, Junxiao Yang, Pei Ke, Fei Mi, Hongning Wang, Minlie Huang

    Abstract: While significant attention has been dedicated to exploiting weaknesses in LLMs through jailbreaking attacks, there remains a paucity of effort in defending against these attacks. We point out a pivotal factor contributing to the success of jailbreaks: the intrinsic conflict between the goals of being helpful and ensuring safety. Accordingly, we propose to integrate goal prioritization at both tra…

    Submitted 12 June, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: ACL 2024 Main Conference

  11. arXiv:2310.20410  [pdf, other]

    cs.CL

    FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models

    Authors: Yuxin Jiang, Yufei Wang, Xingshan Zeng, Wanjun Zhong, Liangyou Li, Fei Mi, Lifeng Shang, Xin Jiang, Qun Liu, Wei Wang

    Abstract: The ability to follow instructions is crucial for Large Language Models (LLMs) to handle various real-world applications. Existing benchmarks primarily focus on evaluating pure response quality, rather than assessing whether the response follows constraints stated in the instruction. To fill this research gap, in this paper, we propose FollowBench, a Multi-level Fine-grained Constraints Following…

    Submitted 5 June, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

    Comments: 22 pages, 11 figures, 16 tables. ACL 2024 main camera-ready version

  12. arXiv:2310.10477  [pdf, other]

    cs.CL cs.AI cs.LG

    Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis

    Authors: Kai Chen, Chunwei Wang, Kuo Yang, Jianhua Han, Lanqing Hong, Fei Mi, Hang Xu, Zhengying Liu, Wenyong Huang, Zhenguo Li, Dit-Yan Yeung, Lifeng Shang, Xin Jiang, Qun Liu

    Abstract: The rapid development of large language models (LLMs) has not only provided numerous opportunities but also presented significant challenges. This becomes particularly evident when LLMs inadvertently generate harmful or toxic content, either unintentionally or because of intentional inducement. Existing alignment methods usually direct LLMs toward the favorable outcomes by utilizing human-annotate…

    Submitted 16 February, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: Accepted by ICLR 2024

  13. arXiv:2310.08840  [pdf, other]

    cs.CL cs.AI

    Large Language Models as Source Planner for Personalized Knowledge-grounded Dialogue

    Authors: Hongru Wang, Minda Hu, Yang Deng, Rui Wang, Fei Mi, Weichao Wang, Yasheng Wang, Wai-Chung Kwan, Irwin King, Kam-Fai Wong

    Abstract: Open-domain dialogue systems usually require different sources of knowledge to generate more informative and evidential responses. However, existing knowledge-grounded dialogue systems either focus on a single knowledge source or overlook the dependency between multiple sources of knowledge, which may result in generating inconsistent or even paradoxical responses. To incorporate multiple knowledg…

    Submitted 12 October, 2023; originally announced October 2023.

  14. arXiv:2310.08372  [pdf, other]

    cs.CL

    Improving Factual Consistency for Knowledge-Grounded Dialogue Systems via Knowledge Enhancement and Alignment

    Authors: Boyang Xue, Weichao Wang, Hongru Wang, Fei Mi, Rui Wang, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu, Kam-Fai Wong

    Abstract: Pretrained language models (PLMs) based knowledge-grounded dialogue systems are prone to generate responses that are factually inconsistent with the provided knowledge source. In such inconsistent responses, the dialogue models fail to accurately express the external knowledge they rely upon. Inspired by previous work which identified that feed-forward networks (FFNs) within Transformers are respo…

    Submitted 3 November, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: EMNLP2023 Findings

  15. arXiv:2310.00533  [pdf, other]

    cs.CL cs.AI cs.LG

    SELF: Self-Evolution with Language Feedback

    Authors: Jianqiao Lu, Wanjun Zhong, Wenyong Huang, Yufei Wang, Qi Zhu, Fei Mi, Baojun Wang, Weichao Wang, Xingshan Zeng, Lifeng Shang, Xin Jiang, Qun Liu

    Abstract: Large Language Models (LLMs) have demonstrated remarkable versatility across various domains. To further advance LLMs, we propose 'SELF' (Self-Evolution with Language Feedback), a novel approach that enables LLMs to self-improve through self-reflection, akin to human learning processes. SELF initiates with a meta-skill learning process that equips the LLMs with capabilities for self-feedback and s…

    Submitted 1 February, 2024; v1 submitted 30 September, 2023; originally announced October 2023.

    Comments: 20 pages, 4 figures, 11 tables

  16. arXiv:2309.16090  [pdf, other]

    cs.AI cs.CL

    TPE: Towards Better Compositional Reasoning over Conceptual Tools with Multi-persona Collaboration

    Authors: Hongru Wang, Huimin Wang, Lingzhi Wang, Minda Hu, Rui Wang, Boyang Xue, Hongyuan Lu, Fei Mi, Kam-Fai Wong

    Abstract: Large language models (LLMs) have demonstrated exceptional performance in planning the use of various functional tools, such as calculators and retrievers, particularly in question-answering tasks. In this paper, we expand the definition of these tools, centering on conceptual tools within the context of dialogue systems. A conceptual tool specifies a cognitive concept that aids systematic or inve…

    Submitted 27 September, 2023; originally announced September 2023.

  17. arXiv:2307.12966  [pdf, other]

    cs.CL

    Aligning Large Language Models with Human: A Survey

    Authors: Yufei Wang, Wanjun Zhong, Liangyou Li, Fei Mi, Xingshan Zeng, Wenyong Huang, Lifeng Shang, Xin Jiang, Qun Liu

    Abstract: Large Language Models (LLMs) trained on extensive textual corpora have emerged as leading solutions for a broad array of Natural Language Processing (NLP) tasks. Despite their notable performance, these models are prone to certain limitations such as misunderstanding human instructions, generating potentially biased content, or factually incorrect (hallucinated) information. Hence, aligning LLMs w…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: work in progress

  18. arXiv:2307.06869  [pdf, other]

    cs.CL cs.AI

    DecompEval: Evaluating Generated Texts as Unsupervised Decomposed Question Answering

    Authors: Pei Ke, Fei Huang, Fei Mi, Yasheng Wang, Qun Liu, Xiaoyan Zhu, Minlie Huang

    Abstract: Existing evaluation metrics for natural language generation (NLG) tasks face challenges in generalization ability and interpretability. Specifically, most of the well-performing metrics are required to train on evaluation datasets of specific NLG tasks and evaluation dimensions, which may cause over-fitting to task-specific datasets. Furthermore, existing metrics only provide an evaluation scor…

    Submitted 13 July, 2023; originally announced July 2023.

    Comments: Accepted by ACL 2023 (Main Conference)

  19. arXiv:2306.01007  [pdf, other]

    cs.LG cs.AI

    Towards Fair Disentangled Online Learning for Changing Environments

    Authors: Chen Zhao, Feng Mi, Xintao Wu, Kai Jiang, Latifur Khan, Christan Grant, Feng Chen

    Abstract: In the problem of online learning for changing environments, data are sequentially received one after another over time, and their distribution assumptions may vary frequently. Although existing methods demonstrate the effectiveness of their learning algorithms by providing a tight bound on either dynamic regret or adaptive regret, most of them completely ignore learning with model fairness, defin…

    Submitted 16 July, 2023; v1 submitted 31 May, 2023; originally announced June 2023.

    Comments: Accepted by KDD 2023

  20. arXiv:2305.13733  [pdf, other]

    cs.CL

    Enhancing Large Language Models Against Inductive Instructions with Dual-critique Prompting

    Authors: Rui Wang, Hongru Wang, Fei Mi, Yi Chen, Boyang Xue, Kam-Fai Wong, Ruifeng Xu

    Abstract: Numerous works are proposed to align large language models (LLMs) with human intents to better fulfill instructions, ensuring they are trustful and helpful. Nevertheless, some human instructions are often malicious or misleading and following them will lead to untruthful and unsafe responses. Previous work rarely focused on understanding how LLMs manage instructions based on counterfactual premise…

    Submitted 6 March, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

  21. arXiv:2305.13602  [pdf, other]

    cs.CL

    ReSee: Responding through Seeing Fine-grained Visual Knowledge in Open-domain Dialogue

    Authors: Haoqin Tu, Yitong Li, Fei Mi, Zhongliang Yang

    Abstract: Incorporating visual knowledge into text-only dialogue systems has become a potential direction to imitate the way humans think, imagine, and communicate. However, existing multimodal dialogue systems are either confined by the scale and quality of available datasets or the coarse concept of visual knowledge. To address these issues, we provide a new paradigm of constructing multimodal dialogues a…

    Submitted 20 October, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: 15 pages, accepted to EMNLP 2023 (main)

  22. arXiv:2305.11792  [pdf, other]

    cs.CL cs.AI

    Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue Questions with LLMs

    Authors: Hongru Wang, Rui Wang, Fei Mi, Yang Deng, Zezhong Wang, Bin Liang, Ruifeng Xu, Kam-Fai Wong

    Abstract: Large Language Models (LLMs), such as ChatGPT, greatly empower dialogue systems with strong language understanding and generation capabilities. However, most of the previous works prompt the LLMs to directly generate a response based on the dialogue context, overlooking the underlying linguistic cues about the user status exhibited in the context. Such in-depth dialogue scenarios are chal…

    Submitted 15 October, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

  23. arXiv:2301.08824  [pdf, ps, other]

    cs.CR cs.LG

    An Automated Vulnerability Detection Framework for Smart Contracts

    Authors: Feng Mi, Chen Zhao, Zhuoyi Wang, Sadaf MD Halim, Xiaodi Li, Zhouxiang Wu, Latifur Khan, Bhavani Thuraisingham

    Abstract: With the increase of the adoption of blockchain technology in providing decentralized solutions to various problems, smart contracts have become more popular to the point that billions of US Dollars are currently exchanged every day through such technology. Meanwhile, various vulnerabilities in smart contracts have been exploited by attackers to steal cryptocurrencies worth millions of dollars. Th…

    Submitted 20 January, 2023; originally announced January 2023.

  24. arXiv:2212.10720  [pdf, other]

    cs.CL

    MoralDial: A Framework to Train and Evaluate Moral Dialogue Systems via Moral Discussions

    Authors: Hao Sun, Zhexin Zhang, Fei Mi, Yasheng Wang, Wei Liu, Jianwei Cui, Bin Wang, Qun Liu, Minlie Huang

    Abstract: Morality in dialogue systems has raised great attention in research recently. A moral dialogue system aligned with users' values could enhance conversation engagement and user connections. In this paper, we propose a framework, MoralDial, to train and evaluate moral dialogue systems. In our framework, we first explore the communication mechanisms of morality and resolve expressed morality into thre…

    Submitted 26 May, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: Accepted to ACL 2023

  25. arXiv:2212.01810  [pdf, other]

    cs.CL

    Constructing Highly Inductive Contexts for Dialogue Safety through Controllable Reverse Generation

    Authors: Zhexin Zhang, Jiale Cheng, Hao Sun, Jiawen Deng, Fei Mi, Yasheng Wang, Lifeng Shang, Minlie Huang

    Abstract: Large pretrained language models can easily produce toxic or biased content, which is prohibitive for practical use. In order to detect such toxic generations, existing methods rely on templates, real-world data extraction, crowdsourcing workers, or automatic generation to construct adversarial contexts that are likely to induce toxic generations. However, what type of context is more likely to in…

    Submitted 4 December, 2022; originally announced December 2022.

    Comments: Findings of EMNLP 2022

  26. arXiv:2212.01739  [pdf, other]

    cs.CL

    KPT: Keyword-guided Pre-training for Grounded Dialog Generation

    Authors: Qi Zhu, Fei Mi, Zheng Zhang, Yasheng Wang, Yitong Li, Xin Jiang, Qun Liu, Xiaoyan Zhu, Minlie Huang

    Abstract: Incorporating external knowledge into the response generation process is essential to building more helpful and reliable dialog agents. However, collecting knowledge-grounded conversations is often costly, calling for a better pre-trained model for grounded dialog generation that generalizes well w.r.t. different types of knowledge. In this work, we propose KPT (Keyword-guided Pre-Training), a nov…

    Submitted 3 December, 2022; originally announced December 2022.

    Comments: Accepted by AAAI 2023

  27. arXiv:2212.01145  [pdf, other]

    cs.CL cs.AI

    Towards Diverse, Relevant and Coherent Open-Domain Dialogue Generation via Hybrid Latent Variables

    Authors: Bin Sun, Yitong Li, Fei Mi, Weichao Wang, Yiwei Li, Kan Li

    Abstract: Conditional variational models, using either continuous or discrete latent variables, are powerful for open-domain dialogue response generation. However, previous works show that continuous latent variables tend to reduce the coherence of generated responses. In this paper, we also found that discrete latent variables have difficulty capturing more diverse expressions. To tackle these problems, we…

    Submitted 2 December, 2022; originally announced December 2022.

    Comments: Accepted by AAAI 2023

  28. arXiv:2212.00231  [pdf, other]

    cs.CL cs.AI

    Modeling Complex Dialogue Mappings via Sentence Semantic Segmentation Guided Conditional Variational Auto-Encoder

    Authors: Bin Sun, Shaoxiong Feng, Yiwei Li, Weichao Wang, Fei Mi, Yitong Li, Kan Li

    Abstract: Complex dialogue mappings (CDM), including one-to-many and many-to-one mappings, tend to make dialogue models generate incoherent or dull responses, and modeling these mappings remains a huge challenge for neural dialogue systems. To alleviate these problems, methods like introducing external information, reconstructing the optimization function, and manipulating data samples are proposed, while t…

    Submitted 30 November, 2022; originally announced December 2022.

    Comments: Findings of EMNLP 2022

  29. arXiv:2209.00250  [pdf, other]

    cs.CL

    Exploring Effective Information Utilization in Multi-Turn Topic-Driven Conversations

    Authors: Jiatong Li, Bin He, Fei Mi

    Abstract: Conversations are always related to certain topics. However, it is challenging to fuse dialogue history and topic information from various sources at the same time in current dialogue generation models because of the input length limit of pre-trained language models (PLMs). In order to expand the information that PLMs can utilize, we encode topic and dialogue history information using certain prom…

    Submitted 12 September, 2022; v1 submitted 1 September, 2022; originally announced September 2022.

    Comments: Exploring Chinese Dialogue Systems with external information

  30. arXiv:2205.11264  [pdf, other]

    cs.LG cs.AI

    Adaptive Fairness-Aware Online Meta-Learning for Changing Environments

    Authors: Chen Zhao, Feng Mi, Xintao Wu, Kai Jiang, Latifur Khan, Feng Chen

    Abstract: The fairness-aware online learning framework has arisen as a powerful tool for the continual lifelong learning setting. The goal for the learner is to sequentially learn new tasks as they arrive one after another over time, while ensuring the statistical parity of each newly arriving task across different protected sub-populations (e.g. race and gender). A major drawback of existing methods is t…

    Submitted 26 May, 2022; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2022. arXiv admin note: text overlap with arXiv:2108.09435

  31. arXiv:2203.17090  [pdf, other]

    cs.CL

    PanGu-Bot: Efficient Generative Dialogue Pre-training from Pre-trained Language Model

    Authors: Fei Mi, Yitong Li, Yulong Zeng, Jingyan Zhou, Yasheng Wang, Chuanfei Xu, Lifeng Shang, Xin Jiang, Shiqi Zhao, Qun Liu

    Abstract: In this paper, we introduce PanGu-Bot, a Chinese pre-trained open-domain dialogue generation model based on a large pre-trained language model (PLM) PANGU-alpha (Zeng et al., 2021). Different from other pre-trained dialogue models trained over a massive amount of dialogue data from scratch, we aim to build a powerful dialogue model with relatively fewer data and computation costs by inheriting valu…

    Submitted 5 July, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: Update model and results; add comparison with EVA2.0

  32. arXiv:2203.06654  [pdf, other]

    cs.CL

    Continual Prompt Tuning for Dialog State Tracking

    Authors: Qi Zhu, Bing Li, Fei Mi, Xiaoyan Zhu, Minlie Huang

    Abstract: A desirable dialog system should be able to continually learn new skills without forgetting old ones, and thereby adapt to new domains or tasks in its life cycle. However, continually training a model often leads to a well-known catastrophic forgetting issue. In this paper, we present Continual Prompt Tuning, a parameter-efficient framework that not only avoids forgetting but also enables knowledg…

    Submitted 13 March, 2022; originally announced March 2022.

    Comments: Accepted by ACL 2022, camera-ready version

  33. arXiv:2203.05132  [pdf, other]

    cs.CL cs.AI cs.PL

    Compilable Neural Code Generation with Compiler Feedback

    Authors: Xin Wang, Yasheng Wang, Yao Wan, Fei Mi, Yitong Li, Pingyi Zhou, Jin Liu, Hao Wu, Xin Jiang, Qun Liu

    Abstract: Automatically generating compilable programs with (or without) natural language descriptions has always been a touchstone problem for computational linguistics and automated software engineering. Existing deep-learning approaches model code generation as text generation, either constrained by grammar structures in decoder, or driven by pre-trained language models on large-scale code corpus (e.g.,…

    Submitted 9 March, 2022; originally announced March 2022.

    Comments: Accepted by ACL 2022

  34. arXiv:2202.08011  [pdf, other]

    cs.CL

    Towards Identifying Social Bias in Dialog Systems: Frame, Datasets, and Benchmarks

    Authors: Jingyan Zhou, Jiawen Deng, Fei Mi, Yitong Li, Yasheng Wang, Minlie Huang, Xin Jiang, Qun Liu, Helen Meng

    Abstract: Research on open-domain dialog systems has prospered greatly thanks to neural models trained on large-scale corpora; however, such corpora often introduce various safety problems (e.g., offensive languages, biases, and toxic behaviors) that significantly hinder the deployment of dialog systems in practice. Among all these unsafe issues, addressing social bias is more complex as its negative impa…

    Submitted 28 October, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

    Journal ref: EMNLP 2022

  35. arXiv:2201.11367  [pdf, other]

    cs.CL

    Pan More Gold from the Sand: Refining Open-domain Dialogue Training with Noisy Self-Retrieval Generation

    Authors: Yihe Wang, Yitong Li, Yasheng Wang, Fei Mi, Pingyi Zhou, Xin Wang, Jin Liu, Xin Jiang, Qun Liu

    Abstract: Real human conversation data are complicated, heterogeneous, and noisy, from which building open-domain dialogue systems remains a challenging task. In fact, such dialogue data still contains a wealth of information and knowledge, however, they are not fully explored. In this paper, we show existing open-domain dialogue generation methods that memorize context-response paired data with autoregress…

    Submitted 15 September, 2022; v1 submitted 27 January, 2022; originally announced January 2022.

    Comments: Accepted in COLING 2022

  36. arXiv:2201.06025  [pdf, other]

    cs.CL cs.AI

    COLD: A Benchmark for Chinese Offensive Language Detection

    Authors: Jiawen Deng, Jingyan Zhou, Hao Sun, Chujie Zheng, Fei Mi, Helen Meng, Minlie Huang

    Abstract: Offensive language detection is increasingly crucial for maintaining a civilized social media platform and deploying pre-trained language models. However, this task in Chinese is still under exploration due to the scarcity of reliable datasets. To this end, we propose a benchmark --COLD for Chinese offensive language analysis, including a Chinese Offensive Language Dataset --COLDATASET and a basel…

    Submitted 19 October, 2022; v1 submitted 16 January, 2022; originally announced January 2022.

    Comments: 19 pages

  37. arXiv:2112.07522  [pdf, other]

    cs.CL

    LMTurk: Few-Shot Learners as Crowdsourcing Workers in a Language-Model-as-a-Service Framework

    Authors: Mengjie Zhao, Fei Mi, Yasheng Wang, Minglei Li, Xin Jiang, Qun Liu, Hinrich Schütze

    Abstract: Vast efforts have been devoted to creating high-performance few-shot learners, i.e., large-scale pretrained language models (PLMs) that perform well with little downstream task training data. Training PLMs has incurred significant cost, but utilizing the few-shot learners is still challenging due to their enormous size. This work focuses on a crucial question: How to make effective use of these fe…

    Submitted 2 May, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: Findings of ACL: NAACL 2022

  38. arXiv:2110.08032  [pdf, other]

    cs.CL

    UniDS: A Unified Dialogue System for Chit-Chat and Task-oriented Dialogues

    Authors: Xinyan Zhao, Bin He, Yasheng Wang, Yitong Li, Fei Mi, Yajiao Liu, Xin Jiang, Qun Liu, Huanhuan Chen

    Abstract: With the advances in deep learning, tremendous progress has been made with chit-chat dialogue systems and task-oriented dialogue systems. However, these two systems are often tackled separately in current methods. To achieve more natural interaction with humans, a dialogue agent needs to be capable of both chatting and accomplishing tasks. To this end, we propose a unified dialogue system (UniDS)…

    Submitted 15 October, 2021; originally announced October 2021.

  39. arXiv:2109.04645  [pdf, other]

    cs.CL cs.LG

    CINS: Comprehensive Instruction for Few-shot Learning in Task-oriented Dialog Systems

    Authors: Fei Mi, Yitong Li, Yasheng Wang, Xin Jiang, Qun Liu

    Abstract: As labeling cost for different modules in task-oriented dialog (ToD) systems is high, a major challenge in practice is to learn different tasks with the least amount of labeled data. Recently, prompting methods over pre-trained language models (PLMs) have shown promising results for few-shot learning in ToD. To better utilize the power of PLMs, this paper proposes Comprehensive Instruction (CINS)…

    Submitted 21 March, 2022; v1 submitted 9 September, 2021; originally announced September 2021.

    Comments: Accepted at AAAI2022

  40. arXiv:2108.12596  [pdf, other]

    cs.LG cs.NE

    Representation Memorization for Fast Learning New Knowledge without Forgetting

    Authors: Fei Mi, Tao Lin, Boi Faltings

    Abstract: The ability to quickly learn new knowledge (e.g. new classes or data distributions) is a big step towards human-level intelligence. In this paper, we consider scenarios that require learning new classes or data distributions quickly and incrementally over time, as it often occurs in real-world dynamic environments. We propose "Memory-based Hebbian Parameter Adaptation" (Hebb) to tackle the two maj…

    Submitted 10 September, 2021; v1 submitted 28 August, 2021; originally announced August 2021.

  41. arXiv:2108.12589  [pdf, other]

    cs.CL

    Self-training Improves Pre-training for Few-shot Learning in Task-oriented Dialog Systems

    Authors: Fei Mi, Wanhao Zhou, Fengyu Cai, Lingjing Kong, Minlie Huang, Boi Faltings

    Abstract: As the labeling cost for different modules in task-oriented dialog (ToD) systems is expensive, a major challenge is to train different modules with the least amount of labeled data. Recently, large-scale pre-trained language models have shown promising results for few-shot learning in ToD. In this paper, we devise a self-training approach to utilize the abundant unlabeled dialog data to further i…

    Submitted 28 August, 2021; originally announced August 2021.

    Comments: Accepted as Long Paper at "EMNLP, 2021"

  42. arXiv:2108.11711  [pdf, other

    cs.AI

    SLIM: Explicit Slot-Intent Mapping with BERT for Joint Multi-Intent Detection and Slot Filling

    Authors: Fengyu Cai, Wanhao Zhou, Fei Mi, Boi Faltings

    Abstract: Utterance-level intent detection and token-level slot filling are two key tasks for natural language understanding (NLU) in task-oriented systems. Most existing approaches assume that only a single intent exists in an utterance. However, there are often multiple intents within an utterance in real-life scenarios. In this paper, we propose a multi-intent NLU framework, called SLIM, to jointly learn…

    Submitted 26 August, 2021; originally announced August 2021.

  43. arXiv:2108.04556  [pdf, other

    cs.CL cs.AI cs.PL

    SynCoBERT: Syntax-Guided Multi-Modal Contrastive Pre-Training for Code Representation

    Authors: Xin Wang, Yasheng Wang, Fei Mi, Pingyi Zhou, Yao Wan, Xiao Liu, Li Li, Hao Wu, Jin Liu, Xin Jiang

    Abstract: Code representation learning, which aims to encode the semantics of source code into distributed vectors, plays an important role in recent deep-learning-based models for code intelligence. Recently, many pre-trained language models for source code (e.g., CuBERT and CodeBERT) have been proposed to model the context of code and serve as a basis for downstream code intelligence tasks such as code se…

    Submitted 9 September, 2021; v1 submitted 10 August, 2021; originally announced August 2021.

    Comments: 9 pages, 3 figures, 5 tables

  44. arXiv:2010.00910  [pdf, other

    cs.CL cs.LG

    Continual Learning for Natural Language Generation in Task-oriented Dialog Systems

    Authors: Fei Mi, Liangwei Chen, Mengjie Zhao, Minlie Huang, Boi Faltings

    Abstract: Natural language generation (NLG) is an essential component of task-oriented dialog systems. Despite the recent success of neural approaches for NLG, they are typically developed in an offline manner for particular domains. To better fit real-life applications where new data come in a stream, we study NLG in a "continual learning" setting to expand its knowledge to new domains or functionalities i…

    Submitted 2 October, 2020; originally announced October 2020.

    Comments: Accepted as Long Paper at "Findings of EMNLP, 2020"

  45. arXiv:2007.12000  [pdf, other

    cs.LG cs.IR stat.ML

    ADER: Adaptively Distilled Exemplar Replay Towards Continual Learning for Session-based Recommendation

    Authors: Fei Mi, Xiaoyu Lin, Boi Faltings

    Abstract: Session-based recommendation has received growing attention recently due to increasing privacy concerns. Despite the recent success of neural session-based recommenders, they are typically developed in an offline manner using a static dataset. However, recommendation requires continual adaptation to take into account new and obsolete items and users, and requires "continual learning" in real-li…

    Submitted 23 July, 2020; originally announced July 2020.

    Comments: Accepted at RecSys 2020

  46. arXiv:2005.01573  [pdf, other

    cs.LG cs.IR stat.ML

    Memory Augmented Neural Model for Incremental Session-based Recommendation

    Authors: Fei Mi, Boi Faltings

    Abstract: Increasing concerns with privacy have stimulated interest in Session-based Recommendation (SR) using no personal data other than what is observed in the current browser session. Existing methods are evaluated in static settings which rarely occur in real-world applications. To better address the dynamic nature of SR tasks, we study an incremental SR scenario, where new items and preferences appea…

    Submitted 28 April, 2020; originally announced May 2020.

    Comments: Accepted as a full paper at IJCAI 2020

  47. arXiv:2004.12406  [pdf, other

    cs.CL

    Masking as an Efficient Alternative to Finetuning for Pretrained Language Models

    Authors: Mengjie Zhao, Tao Lin, Fei Mi, Martin Jaggi, Hinrich Schütze

    Abstract: We present an efficient method of utilizing pretrained language models, where we learn selective binary masks for pretrained weights in lieu of modifying them through finetuning. Extensive evaluations of masking BERT and RoBERTa on a series of NLP tasks show that our masking scheme yields performance comparable to finetuning, yet has a much smaller memory footprint when several tasks need to be in…

    Submitted 11 October, 2020; v1 submitted 26 April, 2020; originally announced April 2020.

    Comments: EMNLP 2020; MZ and TL contribute equally

  48. arXiv:1905.05644  [pdf, other

    cs.CL cs.LG

    Meta-Learning for Low-resource Natural Language Generation in Task-oriented Dialogue Systems

    Authors: Fei Mi, Minlie Huang, Jiyong Zhang, Boi Faltings

    Abstract: Natural language generation (NLG) is an essential component of task-oriented dialogue systems. Despite the recent success of neural approaches for NLG, they are typically developed for particular domains with rich annotated training examples. In this paper, we study NLG in a low-resource setting to generate sentences in new scenarios with a handful of training examples. We formulate the problem from a…

    Submitted 14 May, 2019; originally announced May 2019.

    Comments: Accepted as a full paper at IJCAI 2019

  49. arXiv:1806.03733  [pdf, other

    cs.IR

    Context Tree for Adaptive Session-based Recommendation

    Authors: Fei Mi, Boi Faltings

    Abstract: There has been growing interest in recent years, from both practical and research perspectives, in session-based recommendation tasks, as long-term user profiles often do not exist in many real-life recommendation applications. In this case, recommendations for a user's immediate next actions need to be generated based on patterns in anonymous short sessions. An often overlooked aspect is that new it…

    Submitted 10 June, 2018; originally announced June 2018.

  50. arXiv:1706.07503  [pdf, other

    cs.CL cs.LG

    Personalization in Goal-Oriented Dialog

    Authors: Chaitanya K. Joshi, Fei Mi, Boi Faltings

    Abstract: The main goal of modeling human conversation is to create agents which can interact with people in both open-ended and goal-oriented scenarios. End-to-end trained neural dialog systems are an important line of research for such generalized dialog models as they do not resort to any situation-specific handcrafting of rules. However, incorporating personalization into such systems is a largely unexp…

    Submitted 15 December, 2017; v1 submitted 22 June, 2017; originally announced June 2017.

    Comments: Accepted at NIPS 2017 Conversational AI Workshop; Code and data at https://github.com/chaitjo/personalized-dialog