Showing 1–50 of 60 results for author: Shou, L

  1. arXiv:2406.01402  [pdf, other]

    cs.CV cs.AI cs.LG

    Mixture of Rationale: Multi-Modal Reasoning Mixture for Visual Question Answering

    Authors: Tao Li, Linjun Shou, Xuejun Liu

    Abstract: Zero-shot visual question answering (VQA) is a challenging task that requires reasoning across modalities. While some existing methods rely on a single rationale within the Chain of Thoughts (CoT) framework, they may fall short of capturing the complexity of the VQA problem. On the other hand, some other methods that use multiple rationales may still suffer from low diversity, poor modality alignm…

    Submitted 3 June, 2024; originally announced June 2024.

    ACM Class: I.2.10

  2. arXiv:2403.04146  [pdf]

    cs.LG cs.AI cs.DC

    FL-GUARD: A Holistic Framework for Run-Time Detection and Recovery of Negative Federated Learning

    Authors: Hong Lin, Lidan Shou, Ke Chen, Gang Chen, Sai Wu

    Abstract: Federated learning (FL) is a promising approach for learning a model from data distributed on massive clients without exposing data privacy. It works effectively in the ideal federation where clients share homogeneous data distribution and learning behavior. However, FL may fail to function appropriately when the federation is not ideal, amid an unhealthy state called Negative Federated Learning (…

    Submitted 6 March, 2024; originally announced March 2024.

    Journal ref: Data Science and Engineering (2024)

  3. arXiv:2403.01698  [pdf, other]

    cs.CL cs.AI

    Hypertext Entity Extraction in Webpage

    Authors: Yifei Yang, Tianqiao Liu, Bo Shao, Hai Zhao, Linjun Shou, Ming Gong, Daxin Jiang

    Abstract: Webpage entity extraction is a fundamental natural language processing task in both research and applications. Nowadays, the majority of webpage entity extraction models are trained on structured datasets which strive to retain textual content and its structure information. However, existing datasets all overlook the rich hypertext features (e.g., font color, font size) which show their effectiven…

    Submitted 3 March, 2024; originally announced March 2024.

  4. arXiv:2312.10201  [pdf, other]

    cs.MM cs.AI

    CARAT: Contrastive Feature Reconstruction and Aggregation for Multi-Modal Multi-Label Emotion Recognition

    Authors: Cheng Peng, Ke Chen, Lidan Shou, Gang Chen

    Abstract: Multi-modal multi-label emotion recognition (MMER) aims to identify relevant emotions from multiple modalities. The challenge of MMER is how to effectively capture discriminative features for multiple labels from heterogeneous data. Recent studies are mainly devoted to exploring various fusion strategies to integrate multi-modal information into a unified representation for all labels. However, su…

    Submitted 13 January, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

  5. arXiv:2312.04333  [pdf, other]

    cs.CL

    Is Bigger and Deeper Always Better? Probing LLaMA Across Scales and Layers

    Authors: Nuo Chen, Ning Wu, Shining Liang, Ming Gong, Linjun Shou, Dongmei Zhang, Jia Li

    Abstract: This paper presents an in-depth analysis of Large Language Models (LLMs), focusing on LLaMA, a prominent open-source foundational model in natural language processing. Instead of assessing LLaMA through its generative output, we design multiple-choice tasks to probe its intrinsic understanding in high-order tasks such as reasoning and computation. We examine the model horizontally, comparing diffe…

    Submitted 9 January, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: 15 pages

  6. arXiv:2311.03253  [pdf, other]

    cs.CL cs.AI

    Coherent Entity Disambiguation via Modeling Topic and Categorical Dependency

    Authors: Zilin Xiao, Linjun Shou, Xingyao Zhang, Jie Wu, Ming Gong, Jian Pei, Daxin Jiang

    Abstract: Previous entity disambiguation (ED) methods adopt a discriminative paradigm, where prediction is made based on matching scores between mention context and candidate entities using length-limited encoders. However, these methods often struggle to capture explicit discourse-level dependencies, resulting in incoherent predictions at the abstract level (e.g. topic or category). We propose CoherentED,…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: Accepted to EMNLP 2023 Findings

  7. arXiv:2311.03250  [pdf, other]

    cs.CL cs.AI

    Instructed Language Models with Retrievers Are Powerful Entity Linkers

    Authors: Zilin Xiao, Ming Gong, Jie Wu, Xingyao Zhang, Linjun Shou, Jian Pei, Daxin Jiang

    Abstract: Generative approaches powered by large language models (LLMs) have demonstrated emergent abilities in tasks that require complex reasoning abilities. Yet the generative nature still makes the generated content suffer from hallucinations, thus unsuitable for entity-centric tasks like entity linking (EL) requiring precise entity predictions over a large knowledge base. We present Instructed Generati…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: Accepted to EMNLP 2023 Main

  8. RUEL: Retrieval-Augmented User Representation with Edge Browser Logs for Sequential Recommendation

    Authors: Ning Wu, Ming Gong, Linjun Shou, Jian Pei, Daxin Jiang

    Abstract: Online recommender systems (RS) aim to match user needs with the vast amount of resources available on various platforms. A key challenge is to model user preferences accurately under the condition of data sparsity. To address this challenge, some methods have leveraged external user behavior data from multiple platforms to enrich user representation. However, all of these methods require a consis…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: CIKM 2023 ADS

  9. arXiv:2309.08168  [pdf, other]

    cs.CL

    Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding

    Authors: Jun Zhang, Jue Wang, Huan Li, Lidan Shou, Ke Chen, Gang Chen, Sharad Mehrotra

    Abstract: We present a novel inference scheme, self-speculative decoding, for accelerating Large Language Models (LLMs) without the need for an auxiliary model. This approach is characterized by a two-stage process: drafting and verification. The drafting stage generates draft tokens at a slightly lower quality but more quickly, which is achieved by selectively skipping certain intermediate layers during dr…

    Submitted 19 May, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

    Comments: Accepted to ACL 2024

  10. arXiv:2305.06154  [pdf, other]

    cs.CL

    Alleviating Over-smoothing for Unsupervised Sentence Representation

    Authors: Nuo Chen, Linjun Shou, Ming Gong, Jian Pei, Bowen Cao, Jianhui Chang, Daxin Jiang, Jia Li

    Abstract: Currently, learning better unsupervised sentence representations is the pursuit of many natural language processing communities. Lots of approaches based on pre-trained language models (PLMs) and contrastive learning have achieved promising results on this task. Experimentally, we observe that the over-smoothing problem reduces the capacity of these powerful PLMs, leading to sub-optimal sentence r…

    Submitted 9 May, 2023; originally announced May 2023.

    Comments: 13 pages

    Journal ref: ACL 2023

  11. Augmenting Passage Representations with Query Generation for Enhanced Cross-Lingual Dense Retrieval

    Authors: Shengyao Zhuang, Linjun Shou, Guido Zuccon

    Abstract: Effective cross-lingual dense retrieval methods that rely on multilingual pre-trained language models (PLMs) need to be trained to encompass both the relevance matching task and the cross-language alignment task. However, cross-lingual data for training is often scarcely available. In this paper, rather than using more cross-lingual data for training, we propose to use cross-lingual query generati…

    Submitted 6 May, 2023; originally announced May 2023.

    Comments: SIGIR2023 short paper

  12. arXiv:2304.08138  [pdf, other]

    cs.IR

    Typos-aware Bottlenecked Pre-Training for Robust Dense Retrieval

    Authors: Shengyao Zhuang, Linjun Shou, Jian Pei, Ming Gong, Houxing Ren, Guido Zuccon, Daxin Jiang

    Abstract: Current dense retrievers (DRs) are limited in their ability to effectively process misspelled queries, which constitute a significant portion of query traffic in commercial search engines. The main issue is that the pre-trained language model-based encoders used by DRs are typically trained and fine-tuned using clean, well-curated text data. Misspelled queries are typically not found in the data u…

    Submitted 26 November, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

    Comments: 10 pages, accepted at SIGIR-AP

  13. arXiv:2303.16434  [pdf, other]

    cs.AI cs.CL

    TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs

    Authors: Yaobo Liang, Chenfei Wu, Ting Song, Wenshan Wu, Yan Xia, Yu Liu, Yang Ou, Shuai Lu, Lei Ji, Shaoguang Mao, Yun Wang, Linjun Shou, Ming Gong, Nan Duan

    Abstract: Artificial Intelligence (AI) has made incredible progress recently. On the one hand, advanced foundation models like ChatGPT can offer powerful conversation, in-context learning and code generation abilities on a broad range of open-domain tasks. They can also generate high-level solution outlines for domain-specific tasks based on the common sense knowledge they have acquired. However, they still…

    Submitted 28 March, 2023; originally announced March 2023.

  14. arXiv:2303.15078  [pdf, other]

    cs.CL

    Large Language Models are Diverse Role-Players for Summarization Evaluation

    Authors: Ning Wu, Ming Gong, Linjun Shou, Shining Liang, Daxin Jiang

    Abstract: Text summarization has a wide range of applications in many scenarios. The evaluation of the quality of the generated text is a complex problem. A big challenge to language evaluation is that there is a clear divergence between existing metrics and human evaluation. A document summary's quality can be assessed by human annotators on various criteria, both objective ones like grammar and correctnes…

    Submitted 19 September, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

    Comments: NLPCC 2023

  15. arXiv:2303.14991  [pdf, other]

    cs.IR

    Empowering Dual-Encoder with Query Generator for Cross-Lingual Dense Retrieval

    Authors: Houxing Ren, Linjun Shou, Ning Wu, Ming Gong, Daxin Jiang

    Abstract: In monolingual dense retrieval, lots of works focus on how to distill knowledge from cross-encoder re-ranker to dual-encoder retriever and these methods achieve better performance due to the effectiveness of cross-encoder re-ranker. However, we find that the performance of the cross-encoder re-ranker is heavily influenced by the number of training samples and the quality of negative samples, which…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: EMNLP 2022 main conference

  16. arXiv:2303.14979  [pdf, other]

    cs.IR

    Lexicon-Enhanced Self-Supervised Training for Multilingual Dense Retrieval

    Authors: Houxing Ren, Linjun Shou, Jian Pei, Ning Wu, Ming Gong, Daxin Jiang

    Abstract: Recent multilingual pre-trained models have shown better performance in various multilingual tasks. However, these models perform poorly on multilingual retrieval tasks due to lacking multilingual training data. In this paper, we propose to mine and generate self-supervised training data based on a large-scale unlabeled corpus. We carefully design a mining method which combines the sparse and dens…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: EMNLP 2022 Findings

  17. arXiv:2302.09302  [pdf, other]

    cs.CL

    Bridge the Gap between Language models and Tabular Understanding

    Authors: Nuo Chen, Linjun Shou, Ming Gong, Jian Pei, Chenyu You, Jianhui Chang, Daxin Jiang, Jia Li

    Abstract: Table pretrain-then-finetune paradigm has been proposed and employed at a rapid pace after the success of pre-training in the natural language domain. Despite the promising findings in tabular pre-trained language models (TPLMs), there is an input gap between pre-training and fine-tuning phases. For instance, TPLMs jointly pre-trained with table and text input could be effective for tasks also wit…

    Submitted 16 February, 2023; originally announced February 2023.

    Comments: 7 pages

  18. arXiv:2206.10128  [pdf, other]

    cs.IR cs.CL

    Bridging the Gap Between Indexing and Retrieval for Differentiable Search Index with Query Generation

    Authors: Shengyao Zhuang, Houxing Ren, Linjun Shou, Jian Pei, Ming Gong, Guido Zuccon, Daxin Jiang

    Abstract: The Differentiable Search Index (DSI) is an emerging paradigm for information retrieval. Unlike traditional retrieval architectures where index and retrieval are two different and separate components, DSI uses a single transformer model to perform both indexing and retrieval. In this paper, we identify and tackle an important issue of current DSI models: the data distribution mismatch that occur…

    Submitted 7 July, 2023; v1 submitted 21 June, 2022; originally announced June 2022.

    Comments: 11 pages

  19. arXiv:2206.03281  [pdf, other]

    cs.IR

    Unsupervised Context Aware Sentence Representation Pretraining for Multi-lingual Dense Retrieval

    Authors: Ning Wu, Yaobo Liang, Houxing Ren, Linjun Shou, Nan Duan, Ming Gong, Daxin Jiang

    Abstract: Recent research demonstrates the effectiveness of using pretrained language models (PLM) to improve dense retrieval and multilingual dense retrieval. In this work, we present a simple but effective monolingual pretraining task called contrastive context prediction (CCP) to learn sentence representation by modeling sentence level contextual relation. By pushing the embedding of sentences in a local…

    Submitted 7 June, 2022; originally announced June 2022.

  20. arXiv:2206.00212  [pdf, ps, other]

    cs.IR

    Negative Sampling for Contrastive Representation Learning: A Review

    Authors: Lanling Xu, Jianxun Lian, Wayne Xin Zhao, Ming Gong, Linjun Shou, Daxin Jiang, Xing Xie, Ji-Rong Wen

    Abstract: The learn-to-compare paradigm of contrastive representation learning (CRL), which compares positive samples with negative ones for representation learning, has achieved great success in a wide range of domains, including natural language processing, computer vision, information retrieval and graph learning. While many research works focus on data augmentations, nonlinear transformations or other c…

    Submitted 31 May, 2022; originally announced June 2022.

    Comments: 6 pages

  21. arXiv:2205.03656  [pdf, other]

    cs.CL cs.AI

    Label-aware Multi-level Contrastive Learning for Cross-lingual Spoken Language Understanding

    Authors: Shining Liang, Linjun Shou, Jian Pei, Ming Gong, Wanli Zuo, Xianglin Zuo, Daxin Jiang

    Abstract: Despite the great success of spoken language understanding (SLU) in high-resource languages, it remains challenging in low-resource languages mainly due to the lack of labeled training data. The recent multilingual code-switching approach achieves better alignments of model representations across languages by constructing a mixed-language context in zero-shot cross-lingual SLU. However, current co…

    Submitted 24 October, 2022; v1 submitted 7 May, 2022; originally announced May 2022.

    Comments: EMNLP 2022 Long paper

  22. arXiv:2204.05210  [pdf]

    cs.CL cs.AI

    Bridging the Gap between Language Models and Cross-Lingual Sequence Labeling

    Authors: Nuo Chen, Linjun Shou, Ming Gong, Jian Pei, Daxin Jiang

    Abstract: Large-scale cross-lingual pre-trained language models (xPLMs) have shown effectiveness in cross-lingual sequence labeling tasks (xSL), such as cross-lingual machine reading comprehension (xMRC) by transferring knowledge from a high-resource language to low-resource languages. Despite the great success, we draw an empirical observation that there is a training objective gap between pre-training and…

    Submitted 11 April, 2022; originally announced April 2022.

    Comments: 15 pages

  23. arXiv:2204.00849  [pdf, other]

    cs.IR

    Transformer-Empowered Content-Aware Collaborative Filtering

    Authors: Weizhe Lin, Linjun Shou, Ming Gong, Pei Jian, Zhilin Wang, Bill Byrne, Daxin Jiang

    Abstract: Knowledge graph (KG) based Collaborative Filtering is an effective approach to personalizing recommendation systems for relatively static domains such as movies and books, by leveraging structured information from KG to enrich both item and user representations. Motivated by the use of Transformers for understanding rich text in content-based filtering recommender systems, we propose Content-aware…

    Submitted 2 April, 2022; originally announced April 2022.

  24. arXiv:2112.04735  [pdf, other]

    cs.LG cs.AI

    From Good to Best: Two-Stage Training for Cross-lingual Machine Reading Comprehension

    Authors: Nuo Chen, Linjun Shou, Min Gong, Jian Pei, Daxin Jiang

    Abstract: Cross-lingual Machine Reading Comprehension (xMRC) is challenging due to the lack of training data in low-resource languages. The recent approaches use training data only in a resource-rich language like English to fine-tune large-scale cross-lingual pre-trained language models. Due to the big difference between languages, a model fine-tuned only by a source language may not perform well for targe…

    Submitted 9 December, 2021; originally announced December 2021.

  25. arXiv:2109.01583  [pdf, other]

    cs.CL cs.AI

    Learning from Multiple Noisy Augmented Data Sets for Better Cross-Lingual Spoken Language Understanding

    Authors: Yingmei Guo, Linjun Shou, Jian Pei, Ming Gong, Mingxing Xu, Zhiyong Wu, Daxin Jiang

    Abstract: Lack of training data presents a grand challenge to scaling out spoken language understanding (SLU) to low-resource languages. Although various data augmentation approaches have been proposed to synthesize training data in low-resource target languages, the augmented data sets are often noisy, and thus impede the performance of SLU models. In this paper we focus on mitigating noise in augmented da…

    Submitted 3 September, 2021; originally announced September 2021.

    Comments: Long paper at EMNLP 2021

  26. arXiv:2107.11768  [pdf, other]

    cs.CL cs.AI

    A Joint and Domain-Adaptive Approach to Spoken Language Understanding

    Authors: Linhao Zhang, Yu Shi, Linjun Shou, Ming Gong, Houfeng Wang, Michael Zeng

    Abstract: Spoken Language Understanding (SLU) is composed of two subtasks: intent detection (ID) and slot filling (SF). There are two lines of research on SLU. One jointly tackles these two subtasks to improve their prediction accuracy, and the other focuses on the domain-adaptation ability of one of the subtasks. In this paper, we attempt to bridge these two lines of research and propose a joint and domain…

    Submitted 25 July, 2021; originally announced July 2021.

  27. arXiv:2106.00241  [pdf, other]

    cs.CL cs.AI

    Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition

    Authors: Shining Liang, Ming Gong, Jian Pei, Linjun Shou, Wanli Zuo, Xianglin Zuo, Daxin Jiang

    Abstract: Named entity recognition (NER) is a fundamental component in many applications, such as Web Search and Voice Assistants. Although deep neural networks greatly improve the performance of NER, due to the requirement of large amounts of training data, deep neural networks can hardly scale out to many languages in an industry setting. To tackle this challenge, cross-lingual NER transfers knowledge fro…

    Submitted 1 June, 2021; originally announced June 2021.

    Comments: KDD 2021

  28. arXiv:2105.13239  [pdf, other]

    cs.CL cs.SE

    CoSQA: 20,000+ Web Queries for Code Search and Question Answering

    Authors: Junjie Huang, Duyu Tang, Linjun Shou, Ming Gong, Ke Xu, Daxin Jiang, Ming Zhou, Nan Duan

    Abstract: Finding codes given natural language query is beneficial to the productivity of software developers. Future progress towards better semantic matching between query and code requires richer supervised training resources. To remedy this, we introduce the CoSQA dataset. It includes 20,604 labels for pairs of natural language queries and codes, each annotated by at least 3 human annotators. We further…

    Submitted 27 May, 2021; originally announced May 2021.

    Comments: ACL 2021 main conference. The CoSQA data and leaderboard are available at https://github.com/microsoft/CodeXGLUE/tree/main/Text-Code/NL-code-search-WebQuery. The code is available at https://github.com/Jun-jie-Huang/CoCLR

  29. arXiv:2105.11174  [pdf, other]

    cs.CL cs.AI

    Retrieval Enhanced Model for Commonsense Generation

    Authors: Han Wang, Yang Liu, Chenguang Zhu, Linjun Shou, Ming Gong, Yichong Xu, Michael Zeng

    Abstract: Commonsense generation is a challenging task of generating a plausible sentence describing an everyday scenario using provided concepts. Its requirement of reasoning over commonsense knowledge and compositional generalization ability even puzzles strong pre-trained language generation models. We propose a novel framework using retrieval methods to enhance both the pre-training and fine-tuning for…

    Submitted 24 May, 2021; originally announced May 2021.

    Comments: Findings of ACL-IJCNLP 2021

  30. Towards Crowd-aware Indoor Path Planning (Extended Version)

    Authors: Tiantian Liu, Huan Li, Hua Lu, Muhammad Aamir Cheema, Lidan Shou

    Abstract: Indoor venues accommodate many people who collectively form crowds. Such crowds in turn influence people's routing choices, e.g., people may prefer to avoid crowded rooms when walking from A to B. This paper studies two types of crowd-aware indoor path planning queries. The Indoor Crowd-Aware Fastest Path Query (FPQ) finds a path with the shortest travel time in the presence of crowds, whereas the…

    Submitted 29 April, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

    Comments: The extension of a VLDB'21 paper "Towards Crowd-aware Indoor Path Planning"

  31. arXiv:2104.01767  [pdf, other]

    cs.CL

    WhiteningBERT: An Easy Unsupervised Sentence Embedding Approach

    Authors: Junjie Huang, Duyu Tang, Wanjun Zhong, Shuai Lu, Linjun Shou, Ming Gong, Daxin Jiang, Nan Duan

    Abstract: Producing the embedding of a sentence in an unsupervised way is valuable to natural language matching and retrieval problems in practice. In this work, we conduct a thorough examination of pretrained model based unsupervised sentence embeddings. We study four pretrained models and conduct massive experiments on seven datasets regarding sentence semantics. We have three main findings. First, ave…

    Submitted 8 April, 2021; v1 submitted 5 April, 2021; originally announced April 2021.

  32. arXiv:2102.11114  [pdf, other]

    cs.CL cs.SD eess.AS

    Generating Human Readable Transcript for Automatic Speech Recognition with Pre-trained Language Model

    Authors: Junwei Liao, Yu Shi, Ming Gong, Linjun Shou, Sefik Eskimez, Liyang Lu, Hong Qu, Michael Zeng

    Abstract: Modern Automatic Speech Recognition (ASR) systems can achieve high performance in terms of recognition accuracy. However, a perfectly accurate transcript still can be challenging to read due to disfluency, filter words, and other errata common in spoken communication. Many downstream tasks and human readers rely on the output of the ASR system; therefore, errors introduced by the speaker and ASR s…

    Submitted 22 February, 2021; originally announced February 2021.

    Comments: Accepted in 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2021)

  33. arXiv:2102.06578  [pdf, other]

    cs.CL

    Improving Zero-shot Neural Machine Translation on Language-specific Encoders-Decoders

    Authors: Junwei Liao, Yu Shi, Ming Gong, Linjun Shou, Hong Qu, Michael Zeng

    Abstract: Recently, universal neural machine translation (NMT) with shared encoder-decoder gained good performance on zero-shot translation. Unlike universal NMT, jointly trained language-specific encoders-decoders aim to achieve universal representation across non-shared modules, each of which is for a language or language family. The non-shared architecture has the advantage of mitigating internal languag…

    Submitted 12 February, 2021; originally announced February 2021.

  34. arXiv:2102.04664  [pdf, other]

    cs.SE cs.CL

    CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation

    Authors: Shuai Lu, Daya Guo, Shuo Ren, Junjie Huang, Alexey Svyatkovskiy, Ambrosio Blanco, Colin Clement, Dawn Drain, Daxin Jiang, Duyu Tang, Ge Li, Lidong Zhou, Linjun Shou, Long Zhou, Michele Tufano, Ming Gong, Ming Zhou, Nan Duan, Neel Sundaresan, Shao Kun Deng, Shengyu Fu, Shujie Liu

    Abstract: Benchmark datasets have a significant impact on accelerating research in programming language tasks. In this paper, we introduce CodeXGLUE, a benchmark dataset to foster machine learning research for program understanding and generation. CodeXGLUE includes a collection of 10 tasks across 14 datasets and a platform for model evaluation and comparison. CodeXGLUE also features three baseline systems,…

    Submitted 16 March, 2021; v1 submitted 9 February, 2021; originally announced February 2021.

    Comments: 14 pages; Revise CodeBLEU scores for all models on text-to-code task

  35. arXiv:2012.14116  [pdf, other]

    cs.CL

    Syntax-Enhanced Pre-trained Model

    Authors: Zenan Xu, Daya Guo, Duyu Tang, Qinliang Su, Linjun Shou, Ming Gong, Wanjun Zhong, Xiaojun Quan, Nan Duan, Daxin Jiang

    Abstract: We study the problem of leveraging the syntactic structure of text to enhance pre-trained models such as BERT and RoBERTa. Existing methods utilize syntax of text either in the pre-training stage or in the fine-tuning stage, so that they suffer from discrepancy between the two stages. Such a problem would lead to the necessity of having human-annotated syntactic information, which limits the appli…

    Submitted 29 May, 2021; v1 submitted 28 December, 2020; originally announced December 2020.

    Comments: Accepted by ACL-IJCNLP 2021: The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing

  36. arXiv:2012.06048  [pdf, other]

    cs.CL cs.LG

    Reinforced Multi-Teacher Selection for Knowledge Distillation

    Authors: Fei Yuan, Linjun Shou, Jian Pei, Wutao Lin, Ming Gong, Yan Fu, Daxin Jiang

    Abstract: In natural language processing (NLP) tasks, slow inference speed and huge footprints in GPU usage remain the bottleneck of applying pre-trained deep models in production. As a popular method for model compression, knowledge distillation transfers knowledge from one or multiple large (teacher) models to a small (student) model. When multiple teacher models are available in distillation, the state-o…

    Submitted 13 December, 2020; v1 submitted 11 December, 2020; originally announced December 2020.

    Comments: AAAI 2021

  37. arXiv:2011.11928  [pdf, other]

    cs.CL

    GLGE: A New General Language Generation Evaluation Benchmark

    Authors: Dayiheng Liu, Yu Yan, Yeyun Gong, Weizhen Qi, Hang Zhang, Jian Jiao, Weizhu Chen, Jie Fu, Linjun Shou, Ming Gong, Pengcheng Wang, Jiusheng Chen, Daxin Jiang, Jiancheng Lv, Ruofei Zhang, Winnie Wu, Ming Zhou, Nan Duan

    Abstract: Multi-task benchmarks such as GLUE and SuperGLUE have driven great progress of pretraining and transfer learning in Natural Language Processing (NLP). These benchmarks mostly focus on a range of Natural Language Understanding (NLU) tasks, without considering the Natural Language Generation (NLG) models. In this paper, we present the General Language Generation Evaluation (GLGE), a new multi-task b…

    Submitted 1 June, 2021; v1 submitted 24 November, 2020; originally announced November 2020.

    Comments: Findings of Association for Computational Linguistics. ACL 2021

  38. arXiv:2011.11160  [pdf, other]

    cs.LG cs.DC

    LINDT: Tackling Negative Federated Learning with Local Adaptation

    Authors: Hong Lin, Lidan Shou, Ke Chen, Gang Chen, Sai Wu

    Abstract: Federated Learning (FL) is a promising distributed learning paradigm, which allows a number of data owners (also called clients) to collaboratively learn a shared model without disclosing each client's data. However, FL may fail to proceed properly, amid a state that we call negative federated learning (NFL). This paper addresses the problem of negative federated learning. We formulate a rigorous…

    Submitted 22 November, 2020; originally announced November 2020.

  39. arXiv:2011.05723  [pdf, other]

    cs.CL cs.LG

    CalibreNet: Calibration Networks for Multilingual Sequence Labeling

    Authors: Shining Liang, Linjun Shou, Jian Pei, Ming Gong, Wanli Zuo, Daxin Jiang

    Abstract: Lack of training data in low-resource languages presents huge challenges to sequence labeling tasks such as named entity recognition (NER) and machine reading comprehension (MRC). One major obstacle is the errors on the boundary of predicted answers. To tackle this problem, we propose CalibreNet, which predicts answers in two steps. In the first step, any existing sequence labeling method can be a…

    Submitted 11 November, 2020; originally announced November 2020.

    Comments: Long paper in WSDM 2021

  40. arXiv:2010.14271  [pdf, other]

    cs.CL cs.AI

    Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation

    Authors: Junhao Liu, Linjun Shou, Jian Pei, Ming Gong, Min Yang, Daxin Jiang

    Abstract: Cross-lingual Machine Reading Comprehension (CLMRC) remains a challenging problem due to the lack of large-scale annotated datasets in low-source languages, such as Arabic, Hindi, and Vietnamese. Many previous approaches use translation data by translating from a rich-source language, such as English, to low-source languages as auxiliary supervision. However, how to effectively leverage translatio…

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: Accepted as long paper in COLING 2020

  41. arXiv:2010.07606   

    cs.CL

    Learning Better Representation for Tables by Self-Supervised Tasks

    Authors: Liang Li, Can Ma, Yinliang Yue, Linjun Shou, Dayong Hu

    Abstract: Table-to-text generation aims at automatically generating natural text to help people to conveniently obtain the important information in tables. Although neural models for table-to-text have achieved remarkable progress, some problems are still overlooked. The first is that the values recorded in many tables are mostly numbers in practice. The existing approaches do not do special treatment for these…

    Submitted 30 March, 2021; v1 submitted 15 October, 2020; originally announced October 2020.

    Comments: This article is messily written, and some of the experiments are inadequate, which may mislead the reader about our work

  42. arXiv:2010.06801  [pdf, other]

    cs.CL cs.AI

    A Graph Representation of Semi-structured Data for Web Question Answering

    Authors: Xingyao Zhang, Linjun Shou, Jian Pei, Ming Gong, Lijie Wen, Daxin Jiang

    Abstract: The abundant semi-structured data on the Web, such as HTML-based tables and lists, provide commercial search engines a rich information source for question answering (QA). Different from plain text passages in Web documents, Web tables and lists have inherent structures, which carry semantic correlations among various elements in tables and lists. Many existing studies treat tables and lists as fl…

    Submitted 14 October, 2020; originally announced October 2020.

    Comments: Accepted as long paper in COLING 2020

  43. arXiv:2010.03910  [pdf, other]

    cs.DB cs.DS

    An Experimental Analysis of Indoor Spatial Queries: Modeling, Indexing, and Processing

    Authors: Tiantian Liu, Huan Li, Hua Lu, Muhammad Aamir Cheema, Lidan Shou

    Abstract: Indoor location-based services (LBS), such as POI search and routing, are often built on top of typical indoor spatial queries. To support such queries and indoor LBS, multiple techniques including model/indexes and search algorithms have been proposed. In this work, we conduct an extensive experimental study on existing proposals for indoor spatial queries. We survey five model/indexes, compare t…

    Submitted 8 October, 2020; originally announced October 2020.

    Comments: An Experiment and Analysis Paper

  44. arXiv:2009.14348  [pdf, other]

    cs.CL

    MaP: A Matrix-based Prediction Approach to Improve Span Extraction in Machine Reading Comprehension

    Authors: Huaishao Luo, Yu Shi, Ming Gong, Linjun Shou, Tianrui Li

    Abstract: Span extraction is an essential problem in machine reading comprehension. Most of the existing algorithms predict the start and end positions of an answer span in the given corresponding context by generating two probability vectors. In this paper, we propose a novel approach that extends the probability vector to a probability matrix. Such a matrix can cover more start-end position pairs. Precise…

    Submitted 29 September, 2020; originally announced September 2020.

    Comments: to appear at AACL-IJCNLP 2020

  45. arXiv:2009.12056  [pdf, other]

    cs.CL cs.LG

    No Answer is Better Than Wrong Answer: A Reflection Model for Document Level Machine Reading Comprehension

    Authors: Xuguang Wang, Linjun Shou, Ming Gong, Nan Duan, Daxin Jiang

    Abstract: The Natural Questions (NQ) benchmark set brings new challenges to Machine Reading Comprehension: the answers are not only at different levels of granularity (long and short), but also of richer types (including no-answer, yes/no, single-span and multi-span). In this paper, we target this challenge and handle all answer types systematically. In particular, we propose a novel approach called Refl…

    Submitted 29 September, 2020; v1 submitted 25 September, 2020; originally announced September 2020.

    Comments: Accepted by Findings of EMNLP 2020

  46. arXiv:2009.07406  [pdf, other]

    cs.CL cs.AI

    Tag and Correct: Question aware Open Information Extraction with Two-stage Decoding

    Authors: Martin Kuo, Yaobo Liang, Lei Ji, Nan Duan, Linjun Shou, Ming Gong, Peng Chen

    Abstract: Question Aware Open Information Extraction (Question aware Open IE) takes question and passage as inputs, outputting an answer tuple which contains a subject, a predicate, and one or more arguments. Each field of answer is a natural language word sequence and is extracted from the passage. The semi-structured answer has two advantages which are more readable and falsifiable compared to span answer…

    Submitted 15 September, 2020; originally announced September 2020.

    Comments: 11 pages, 1 figure, 4 tables

    MSC Class: 68T50; 68T01

  47. Mining Implicit Relevance Feedback from User Behavior for Web Question Answering

    Authors: Linjun Shou, Shining Bo, Feixiang Cheng, Ming Gong, Jian Pei, Daxin Jiang

    Abstract: Training and refreshing a web-scale Question Answering (QA) system for a multi-lingual commercial search engine often requires a huge amount of training examples. One principled idea is to mine implicit relevance feedback from user behavior recorded in search engine logs. All previous works on mining implicit relevance feedback target the relevance of web documents rather than passages. Due to seve…

    Submitted 15 June, 2020; v1 submitted 13 June, 2020; originally announced June 2020.

    Comments: Accepted by KDD 2020

  48. arXiv:2004.14069  [pdf, other]

    cs.CL cs.AI

    Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension

    Authors: Fei Yuan, Linjun Shou, Xuanyu Bai, Ming Gong, Yaobo Liang, Nan Duan, Yan Fu, Daxin Jiang

    Abstract: Multilingual pre-trained models could leverage the training data from a rich source language (such as English) to improve performance on low resource languages. However, the transfer quality for multilingual Machine Reading Comprehension (MRC) is significantly worse than sentence classification tasks mainly due to the requirement of MRC to detect the word level answer boundary. In this paper, we p…

    Submitted 8 May, 2020; v1 submitted 29 April, 2020; originally announced April 2020.

    Comments: Accepted to ACL 2020

  49. arXiv:2004.13659  [pdf, other]

    cs.CL

    LogicalFactChecker: Leveraging Logical Operations for Fact Checking with Graph Module Network

    Authors: Wanjun Zhong, Duyu Tang, Zhangyin Feng, Nan Duan, Ming Zhou, Ming Gong, Linjun Shou, Daxin Jiang, Jiahai Wang, Jian Yin

    Abstract: Verifying the correctness of a textual statement requires not only semantic reasoning about the meaning of words, but also symbolic reasoning about logical operations like count, superlative, aggregation, etc. In this work, we propose LogicalFactChecker, a neural network approach capable of leveraging logical operations for fact checking. It achieves the state-of-the-art performance on TABFACT, a…

    Submitted 28 April, 2020; originally announced April 2020.

    Comments: 13 pages; 7 figures; Accepted by ACL2020 as a long paper

  50. arXiv:2004.05568  [pdf, other]

    cs.CL

    Pre-training Text Representations as Meta Learning

    Authors: Shangwen Lv, Yuechen Wang, Daya Guo, Duyu Tang, Nan Duan, Fuqing Zhu, Ming Gong, Linjun Shou, Ryan Ma, Daxin Jiang, Guihong Cao, Ming Zhou, Songlin Hu

    Abstract: Pre-training text representations has recently been shown to significantly improve the state-of-the-art in many natural language processing tasks. The central goal of pre-training is to learn text representations that are useful for subsequent tasks. However, existing approaches are optimized by minimizing a proxy objective, such as the negative log likelihood of language modeling. In this work, w…

    Submitted 12 April, 2020; originally announced April 2020.

    Comments: 2 figures, 3 tables