Showing 1–50 of 99 results for author: Xiang, B

  1. arXiv:2406.08522  [pdf, other]

    cs.SI cs.LG

    Predicting Cascading Failures with a Hyperparametric Diffusion Model

    Authors: Bin Xiang, Bogdan Cautis, Xiaokui Xiao, Olga Mula, Dusit Niyato, Laks V. S. Lakshmanan

    Abstract: In this paper, we study cascading failures in power grids through the lens of information diffusion models. Similar to the spread of rumors or influence in an online social network, it has been observed that failures (outages) in a power grid can spread contagiously, driven by viral spread mechanisms. We employ a stochastic diffusion model that is Markovian (memoryless) and local (the activation o…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: KDD 2024

  2. arXiv:2406.01359  [pdf, other]

    cs.CL cs.SE

    R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models

    Authors: Ken Deng, Jiaheng Liu, He Zhu, Congnan Liu, Jingxin Li, Jiakai Wang, Peng Zhao, Chenchen Zhang, Yanan Wu, Xueqiao Yin, Yuanxing Zhang, Wenbo Su, Bangyu Xiang, Tiezheng Ge, Bo Zheng

    Abstract: Code completion models have made significant progress in recent years. Recently, repository-level code completion has drawn more attention in modern software development, and several baseline methods and benchmarks have been proposed. However, existing repository-level code completion methods often fall short of fully using the extensive context of a project repository, such as the intricacies of…

    Submitted 3 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  3. arXiv:2403.10833  [pdf, other]

    cs.RO

    Deep Reinforcement Learning-based Large-scale Robot Exploration

    Authors: Yuhong Cao, Rui Zhao, Yizhuo Wang, Bairan Xiang, Guillaume Sartoretti

    Abstract: In this work, we propose a deep reinforcement learning (DRL) based reactive planner to solve large-scale Lidar-based autonomous robot exploration problems in 2D action space. Our DRL-based planner allows the agent to reactively plan its exploration path by making implicit predictions about unknown areas, based on a learned estimation of the underlying transition model of the environment. To this e…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: © 20XX IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  4. arXiv:2403.08845  [pdf, other]

    cs.LG cs.AI

    Bifurcated Attention: Accelerating Massively Parallel Decoding with Shared Prefixes in LLMs

    Authors: Ben Athiwaratkun, Sujan Kumar Gonugondla, Sanjay Krishna Gouda, Haifeng Qian, Hantian Ding, Qing Sun, Jun Wang, Jiacheng Guo, Liangfu Chen, Parminder Bhatia, Ramesh Nallapati, Sudipta Sengupta, Bing Xiang

    Abstract: This study introduces bifurcated attention, a method designed to enhance language model inference in shared-context batch decoding scenarios. Our approach addresses the challenge of redundant memory IO costs, a critical factor contributing to latency in high batch sizes and extended context lengths. Bifurcated attention achieves this by strategically dividing the attention mechanism during increme…

    Submitted 11 July, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  5. arXiv:2403.08688  [pdf, other]

    cs.CL cs.AI

    Token Alignment via Character Matching for Subword Completion

    Authors: Ben Athiwaratkun, Shiqi Wang, Mingyue Shang, Yuchen Tian, Zijian Wang, Sujan Kumar Gonugondla, Sanjay Krishna Gouda, Rob Kwiatowski, Ramesh Nallapati, Bing Xiang

    Abstract: Generative models, widely utilized in various applications, can often struggle with prompts corresponding to partial tokens. This struggle stems from tokenization, where partial tokens fall out of distribution during inference, leading to incorrect or nonsensical outputs. This paper examines a technique to alleviate the tokenization artifact on text completion in generative models, maintaining per…

    Submitted 13 March, 2024; originally announced March 2024.

  6. arXiv:2402.02694  [pdf, other]

    eess.AS cs.LG cs.SD

    Description on IEEE ICME 2024 Grand Challenge: Semi-supervised Acoustic Scene Classification under Domain Shift

    Authors: Jisheng Bai, Mou Wang, Haohe Liu, Han Yin, Yafei Jia, Siwei Huang, Yutong Du, Dongzhe Zhang, Dongyuan Shi, Woon-Seng Gan, Mark D. Plumbley, Susanto Rahardja, Bin Xiang, Jianfeng Chen

    Abstract: Acoustic scene classification (ASC) is a crucial research problem in computational auditory scene analysis, and it aims to recognize the unique acoustic characteristics of an environment. One of the challenges of the ASC task is the domain shift between training and testing data. Since 2018, ASC challenges have focused on the generalization of ASC models across different recording devices. Althoug…

    Submitted 28 February, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

  7. arXiv:2402.01935  [pdf, other]

    cs.CL

    Code Representation Learning At Scale

    Authors: Dejiao Zhang, Wasi Ahmad, Ming Tan, Hantian Ding, Ramesh Nallapati, Dan Roth, Xiaofei Ma, Bing Xiang

    Abstract: Recent studies have shown that code language models at scale demonstrate significant performance gains on downstream tasks, i.e., code generation. However, most of the existing works on code representation learning train models at a hundred million parameter scale using very limited pretraining corpora. In this work, we fuel code representation learning with a vast amount of code data via a two-st…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: 10 pages

    Journal ref: ICLR 2024

  8. arXiv:2401.15739  [pdf]

    cs.CV cs.LG

    SegmentAnyTree: A sensor and platform agnostic deep learning model for tree segmentation using laser scanning data

    Authors: Maciej Wielgosz, Stefano Puliti, Binbin Xiang, Konrad Schindler, Rasmus Astrup

    Abstract: This research advances individual tree crown (ITC) segmentation in lidar data, using a deep learning model applicable to various laser scanning types: airborne (ULS), terrestrial (TLS), and mobile (MLS). It addresses the challenge of transferability across different data characteristics in 3D forest scene analysis. The study evaluates the model's performance based on platform (ULS, MLS) and data d…

    Submitted 28 January, 2024; originally announced January 2024.

  9. arXiv:2312.15084  [pdf, other]

    cs.CV

    Automated forest inventory: analysis of high-density airborne LiDAR point clouds with 3D deep learning

    Authors: Binbin Xiang, Maciej Wielgosz, Theodora Kontogianni, Torben Peters, Stefano Puliti, Rasmus Astrup, Konrad Schindler

    Abstract: Detailed forest inventories are critical for sustainable and flexible management of forest resources, to conserve various ecosystem services. Modern airborne laser scanners deliver high-density point clouds with great potential for fine-scale forest inventory and analysis, but automatically partitioning those point clouds into meaningful entities like individual trees or tree components remains a…

    Submitted 23 February, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

  10. arXiv:2310.11248  [pdf, other]

    cs.LG cs.CL cs.SE

    CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion

    Authors: Yangruibo Ding, Zijian Wang, Wasi Uddin Ahmad, Hantian Ding, Ming Tan, Nihal Jain, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, Bing Xiang

    Abstract: Code completion models have made significant progress in recent years, yet current popular evaluation datasets, such as HumanEval and MBPP, predominantly focus on code completion tasks within a single file. This over-simplified setting falls short of representing the real-world software development scenario where repositories span multiple files with numerous cross-file dependencies, and accessing…

    Submitted 16 November, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: To appear at NeurIPS 2023 (Datasets and Benchmarks Track)

  11. arXiv:2308.05317  [pdf, other]

    cs.CL

    Few-Shot Data-to-Text Generation via Unified Representation and Multi-Source Learning

    Authors: Alexander Hanbo Li, Mingyue Shang, Evangelia Spiliopoulou, Jie Ma, Patrick Ng, Zhiguo Wang, Bonan Min, William Wang, Kathleen McKeown, Vittorio Castelli, Dan Roth, Bing Xiang

    Abstract: We present a novel approach for structured data-to-text generation that addresses the limitations of existing methods that primarily focus on specific types of structured data. Our proposed method aims to improve performance in multi-task training, zero-shot and few-shot scenarios by providing a unified representation that can handle various forms of structured data such as tables, knowledge graph…

    Submitted 9 August, 2023; originally announced August 2023.

  12. arXiv:2307.06857  [pdf, other]

    cs.AI cs.CL cs.LG

    Lightweight reranking for language model generations

    Authors: Siddhartha Jain, Xiaofei Ma, Anoop Deoras, Bing Xiang

    Abstract: Large Language Models (LLMs) can exhibit considerable variation in the quality of their sampled outputs. Reranking and selecting the best generation from the sampled set is a popular way of obtaining strong gains in generation quality. In this paper, we present a novel approach for reranking LLM generations. Unlike other techniques that might involve additional inferences or training a specialized…

    Submitted 11 January, 2024; v1 submitted 11 July, 2023; originally announced July 2023.

  13. arXiv:2307.02877  [pdf, other]

    cs.CV

    Towards accurate instance segmentation in large-scale LiDAR point clouds

    Authors: Binbin Xiang, Torben Peters, Theodora Kontogianni, Frawa Vetterli, Stefano Puliti, Rasmus Astrup, Konrad Schindler

    Abstract: Panoptic segmentation is the combination of semantic and instance segmentation: assign the points in a 3D point cloud to semantic categories and partition them into distinct object instances. It has many obvious applications for outdoor scene understanding, from city mapping to forest management. Existing methods struggle to segment nearby instances of the same semantic category, like adjacent pie…

    Submitted 6 July, 2023; originally announced July 2023.

  14. arXiv:2307.02435  [pdf, other]

    cs.LG cs.CL cs.SE

    Exploring Continual Learning for Code Generation Models

    Authors: Prateek Yadav, Qing Sun, Hantian Ding, Xiaopeng Li, Dejiao Zhang, Ming Tan, Xiaofei Ma, Parminder Bhatia, Ramesh Nallapati, Murali Krishna Ramanathan, Mohit Bansal, Bing Xiang

    Abstract: Large-scale code generation models such as Codex and CodeT5 have achieved impressive performance. However, libraries are upgraded or deprecated very frequently and re-training large-scale language models is computationally expensive. Therefore, Continual Learning (CL) is an important aspect that remains underexplored in the code domain. In this paper, we introduce a benchmark called CodeTask-CL th…

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: ACL 2023

  15. arXiv:2306.03203  [pdf, other]

    cs.CL cs.SE

    A Static Evaluation of Code Completion by Large Language Models

    Authors: Hantian Ding, Varun Kumar, Yuchen Tian, Zijian Wang, Rob Kwiatkowski, Xiaopeng Li, Murali Krishna Ramanathan, Baishakhi Ray, Parminder Bhatia, Sudipta Sengupta, Dan Roth, Bing Xiang

    Abstract: Large language models trained on code have shown great potential to increase productivity of software developers. Several execution-based benchmarks have been proposed to evaluate functional correctness of model-generated code on simple programming problems. Nevertheless, it is expensive to perform the same evaluation on complex real-world projects considering the execution cost. On the contrary,…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: Accepted by ACL 2023 industry track

  16. arXiv:2305.19998  [pdf, other]

    cs.CL cs.LG

    Efficient Shapley Values Estimation by Amortization for Text Classification

    Authors: Chenghao Yang, Fan Yin, He He, Kai-Wei Chang, Xiaofei Ma, Bing Xiang

    Abstract: Despite the popularity of Shapley Values in explaining neural text classification models, computing them is prohibitive for large pretrained models due to a large number of model evaluations. In practice, Shapley Values are often estimated with a small number of stochastic model evaluations. However, we show that the estimated Shapley Values are sensitive to random seed choices -- the top-ranked f…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: ACL 2023 Camera Ready

  17. arXiv:2305.18842  [pdf, other]

    cs.CL cs.AI cs.CV

    Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge

    Authors: Xingyu Fu, Sheng Zhang, Gukyeong Kwon, Pramuditha Perera, Henghui Zhu, Yuhao Zhang, Alexander Hanbo Li, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Dan Roth, Bing Xiang

    Abstract: The open-ended Visual Question Answering (VQA) task requires AI models to jointly reason over visual and natural language inputs using world knowledge. Recently, pre-trained Language Models (PLM) such as GPT-3 have been applied to the task and shown to be powerful world knowledge sources. However, these methods suffer from low knowledge coverage caused by PLM bias -- the tendency to generate certa…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023 Findings

  18. arXiv:2305.17337  [pdf, other]

    cs.CL cs.AI

    Benchmarking Diverse-Modal Entity Linking with Generative Models

    Authors: Sijia Wang, Alexander Hanbo Li, Henry Zhu, Sheng Zhang, Chung-Wei Hang, Pramuditha Perera, Jie Ma, William Wang, Zhiguo Wang, Vittorio Castelli, Bing Xiang, Patrick Ng

    Abstract: Entities can be expressed in diverse formats, such as texts, images, or column names and cell values in tables. While existing entity linking (EL) models work well on per modality configuration, such as text-only EL, visual grounding, or schema linking, it is more challenging to design a unified model for diverse modality configurations. To bring various modality configurations together, we constr…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: 15 pages. ACL 2023

  19. arXiv:2305.16265  [pdf, other]

    cs.CL

    UNITE: A Unified Benchmark for Text-to-SQL Evaluation

    Authors: Wuwei Lan, Zhiguo Wang, Anuj Chauhan, Henghui Zhu, Alexander Li, Jiang Guo, Sheng Zhang, Chung-Wei Hang, Joseph Lilien, Yiqun Hu, Lin Pan, Mingwen Dong, Jun Wang, Jiarong Jiang, Stephen Ash, Vittorio Castelli, Patrick Ng, Bing Xiang

    Abstract: A practical text-to-SQL system should generalize well on a wide variety of natural language questions, unseen database schemas, and novel SQL query structures. To comprehensively evaluate text-to-SQL systems, we introduce a UNIfied benchmark for Text-to-SQL Evaluation (UNITE). It is composed of publicly available text-to-SQL datasets, containing natural language questions from more than 12 domains…

    Submitted 14 July, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: 5 pages

  20. arXiv:2304.13980  [pdf, other]

    cs.CV

    A Review of Panoptic Segmentation for Mobile Mapping Point Clouds

    Authors: Binbin Xiang, Yuanwen Yue, Torben Peters, Konrad Schindler

    Abstract: 3D point cloud panoptic segmentation is the combined task to (i) assign each point to a semantic class and (ii) separate the points in each class into object instances. Recently there has been an increased interest in such comprehensive 3D scene understanding, building on the rapid advances of semantic segmentation due to the advent of deep 3D neural networks. Yet, to date there is very little wor…

    Submitted 17 August, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

  21. arXiv:2303.05378  [pdf, other]

    cs.LG cs.SE

    Greener yet Powerful: Taming Large Code Generation Models with Quantization

    Authors: Xiaokai Wei, Sujan Gonugondla, Wasi Ahmad, Shiqi Wang, Baishakhi Ray, Haifeng Qian, Xiaopeng Li, Varun Kumar, Zijian Wang, Yuchen Tian, Qing Sun, Ben Athiwaratkun, Mingyue Shang, Murali Krishna Ramanathan, Parminder Bhatia, Bing Xiang

    Abstract: ML-powered code generation aims to assist developers to write code in a more productive manner, by intelligently generating code blocks based on natural language prompts. Recently, large pretrained deep learning models have substantially pushed the boundary of code generation and achieved impressive performance. Despite their great power, the huge number of model parameters poses a significant thr…

    Submitted 9 March, 2023; originally announced March 2023.

    Comments: 10 pages, 7 figures, 10 tables

  22. arXiv:2303.00605  [pdf, other]

    cs.RO

    SCRIMP: Scalable Communication for Reinforcement- and Imitation-Learning-Based Multi-Agent Pathfinding

    Authors: Yutong Wang, Bairan Xiang, Shinan Huang, Guillaume Sartoretti

    Abstract: Trading off performance guarantees in favor of scalability, the Multi-Agent Path Finding (MAPF) community has recently started to embrace Multi-Agent Reinforcement Learning (MARL), where agents learn to collaboratively generate individual, collision-free (but often suboptimal) paths. Scalability is usually achieved by assuming a local field of view (FOV) around the agents, helping scale to arbitra…

    Submitted 31 August, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

    Comments: © 20XX IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  23. arXiv:2302.06729  [pdf, other]

    cs.CL cs.AI

    STREET: A Multi-Task Structured Reasoning and Explanation Benchmark

    Authors: Danilo Ribeiro, Shen Wang, Xiaofei Ma, Henry Zhu, Rui Dong, Deguang Kong, Juliette Burger, Anjelica Ramos, William Wang, Zhiheng Huang, George Karypis, Bing Xiang, Dan Roth

    Abstract: We introduce STREET, a unified multi-task and multi-domain natural language reasoning and explanation benchmark. Unlike most existing question-answering (QA) datasets, we expect models to not only answer questions, but also produce step-by-step structured explanations describing how premises in the question are used to produce intermediate conclusions that can prove the correctness of a certain an…

    Submitted 13 February, 2023; originally announced February 2023.

    Comments: Published in ICLR 2023

    ACM Class: I.2.7; I.2.6

  24. arXiv:2301.08881  [pdf, other]

    cs.CL

    Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness

    Authors: Shuaichen Chang, Jun Wang, Mingwen Dong, Lin Pan, Henghui Zhu, Alexander Hanbo Li, Wuwei Lan, Sheng Zhang, Jiarong Jiang, Joseph Lilien, Steve Ash, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Bing Xiang

    Abstract: Neural text-to-SQL models have achieved remarkable performance in translating natural language questions into SQL queries. However, recent studies reveal that text-to-SQL models are vulnerable to task-specific perturbations. Previous curated robustness test sets usually focus on individual phenomena. In this paper, we propose a comprehensive robustness benchmark based on Spider, a cross-domain tex…

    Submitted 28 January, 2023; v1 submitted 20 January, 2023; originally announced January 2023.

    Comments: ICLR 2023

  25. arXiv:2212.10264  [pdf, other]

    cs.LG cs.CL cs.SE

    ReCode: Robustness Evaluation of Code Generation Models

    Authors: Shiqi Wang, Zheng Li, Haifeng Qian, Chenghao Yang, Zijian Wang, Mingyue Shang, Varun Kumar, Samson Tan, Baishakhi Ray, Parminder Bhatia, Ramesh Nallapati, Murali Krishna Ramanathan, Dan Roth, Bing Xiang

    Abstract: Code generation models have achieved impressive performance. However, they tend to be brittle as slight edits to a prompt could lead to very different generations; these robustness properties, critical for user experience when deployed in real-life applications, are not well understood. Most existing works on robustness in text or code tasks have focused on classification, while robustness in gene…

    Submitted 20 December, 2022; originally announced December 2022.

    Comments: Code and data available at https://github.com/amazon-science/recode

  26. arXiv:2212.10007  [pdf, other]

    cs.CL cs.SE

    CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context

    Authors: Yangruibo Ding, Zijian Wang, Wasi Uddin Ahmad, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, Bing Xiang

    Abstract: While pre-trained language models (LM) for code have achieved great success in code completion, they generate code conditioned only on the contents within the file, i.e., in-file context, but ignore the rich semantics in other files within the same project, i.e., cross-file context, a critical source of information that is especially useful in modern modular software development. Such overlooking…

    Submitted 24 May, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

  27. arXiv:2212.08785  [pdf, other]

    cs.CL

    Importance of Synthesizing High-quality Data for Text-to-SQL Parsing

    Authors: Yiyun Zhao, Jiarong Jiang, Yiqun Hu, Wuwei Lan, Henry Zhu, Anuj Chauhan, Alexander Li, Lin Pan, Jun Wang, Chung-Wei Hang, Sheng Zhang, Marvin Dong, Joe Lilien, Patrick Ng, Zhiguo Wang, Vittorio Castelli, Bing Xiang

    Abstract: Recently, there has been increasing interest in synthesizing data to improve downstream text-to-SQL tasks. In this paper, we first examined the existing synthesized datasets and discovered that state-of-the-art text-to-SQL algorithms did not further improve on popular benchmarks when trained with augmented synthetic data. We observed two shortcomings: illogical synthetic SQL queries from independe…

    Submitted 16 December, 2022; originally announced December 2022.

  28. arXiv:2210.14868  [pdf, other]

    cs.LG cs.CL

    Multi-lingual Evaluation of Code Generation Models

    Authors: Ben Athiwaratkun, Sanjay Krishna Gouda, Zijian Wang, Xiaopeng Li, Yuchen Tian, Ming Tan, Wasi Uddin Ahmad, Shiqi Wang, Qing Sun, Mingyue Shang, Sujan Kumar Gonugondla, Hantian Ding, Varun Kumar, Nathan Fulton, Arash Farahani, Siddhartha Jain, Robert Giaquinto, Haifeng Qian, Murali Krishna Ramanathan, Ramesh Nallapati, Baishakhi Ray, Parminder Bhatia, Sudipta Sengupta, Dan Roth, Bing Xiang

    Abstract: We present new benchmarks on evaluation code generation models: MBXP and Multilingual HumanEval, and MathQA-X. These datasets cover over 10 programming languages and are generated using a scalable conversion framework that transpiles prompts and test cases from the original Python datasets into the corresponding data in the target language. Using these benchmarks, we are able to assess the perform…

    Submitted 28 March, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: Code and data release: https://github.com/amazon-research/mxeval

  29. arXiv:2210.01185  [pdf, other]

    cs.CL

    ContraCLM: Contrastive Learning For Causal Language Model

    Authors: Nihal Jain, Dejiao Zhang, Wasi Uddin Ahmad, Zijian Wang, Feng Nan, Xiaopeng Li, Ming Tan, Ramesh Nallapati, Baishakhi Ray, Parminder Bhatia, Xiaofei Ma, Bing Xiang

    Abstract: Despite exciting progress in causal language models, the expressiveness of the representations is largely limited due to poor discrimination ability. To remedy this issue, we present ContraCLM, a novel contrastive learning framework at both token-level and sequence-level. We assess ContraCLM on a variety of downstream tasks. We show that ContraCLM enhances discrimination of the representations and…

    Submitted 2 May, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: 10 pages

    Journal ref: ACL 2023

  30. arXiv:2210.00063  [pdf, other]

    cs.CL cs.AI cs.LG

    DecAF: Joint Decoding of Answers and Logical Forms for Question Answering over Knowledge Bases

    Authors: Donghan Yu, Sheng Zhang, Patrick Ng, Henghui Zhu, Alexander Hanbo Li, Jun Wang, Yiqun Hu, William Wang, Zhiguo Wang, Bing Xiang

    Abstract: Question answering over knowledge bases (KBs) aims to answer natural language questions with factual information such as entities and relations in KBs. Previous methods either generate logical forms that can be executed over KBs to obtain final answers or predict answers directly. Empirical results show that the former often produces more accurate answers, but it suffers from non-execution issues…

    Submitted 14 April, 2023; v1 submitted 30 September, 2022; originally announced October 2022.

    Comments: ICLR 2023. Code link: https://github.com/awslabs/decode-answer-logical-form

  31. arXiv:2209.14415  [pdf, other]

    cs.CL

    Improving Text-to-SQL Semantic Parsing with Fine-grained Query Understanding

    Authors: Jun Wang, Patrick Ng, Alexander Hanbo Li, Jiarong Jiang, Zhiguo Wang, Ramesh Nallapati, Bing Xiang, Sudipta Sengupta

    Abstract: Most recent research on Text-to-SQL semantic parsing relies on either parser itself or simple heuristic based approach to understand natural language query (NLQ). When synthesizing a SQL query, there is no explicit semantic information of NLQ available to the parser which leads to undesirable generalization performance. In addition, without lexical-level fine-grained query understanding, linking b…

    Submitted 28 September, 2022; originally announced September 2022.

    Comments: EMNLP Industry Track 2022

  32. arXiv:2209.03761  [pdf, other]

    eess.SP cs.LG

    Too Fine or Too Coarse? The Goldilocks Composition of Data Complexity for Robust Left-Right Eye-Tracking Classifiers

    Authors: Brian Xiang, Abdelrahman Abdelmonsef

    Abstract: The differences in distributional patterns between benchmark data and real-world data have been one of the main challenges of using electroencephalogram (EEG) signals for eye-tracking (ET) classification. Therefore, increasing the robustness of machine learning models in predicting eye-tracking positions from EEG data is integral for both research and consumer use. Previously, we compared the perf…

    Submitted 24 August, 2022; originally announced September 2022.

    Comments: arXiv admin note: text overlap with arXiv:2208.00465

  33. arXiv:2208.00465  [pdf, other]

    cs.LG eess.SP

    Vector-Based Data Improves Left-Right Eye-Tracking Classifier Performance After a Covariate Distributional Shift

    Authors: Brian Xiang, Abdelrahman Abdelmonsef

    Abstract: The main challenges of using electroencephalogram (EEG) signals to make eye-tracking (ET) predictions are the differences in distributional patterns between benchmark data and real-world data and the noise resulting from the unintended interference of brain signals from multiple sources. Increasing the robustness of machine learning models in predicting eye-tracking position from EEG data is there…

    Submitted 31 July, 2022; originally announced August 2022.

  34. arXiv:2206.05123  [pdf, other]

    cs.CL cs.IR

    REKnow: Enhanced Knowledge for Joint Entity and Relation Extraction

    Authors: Sheng Zhang, Patrick Ng, Zhiguo Wang, Bing Xiang

    Abstract: Relation extraction is an important but challenging task that aims to extract all hidden relational facts from the text. With the development of deep language models, relation extraction methods have achieved good performance on various benchmarks. However, we observe two shortcomings of previous methods: first, there is no unified framework that works well under various relation extraction settin…

    Submitted 15 August, 2022; v1 submitted 10 June, 2022; originally announced June 2022.

  35. arXiv:2205.13568  [pdf, other]

    cs.CL cs.LG

    Learning Dialogue Representations from Consecutive Utterances

    Authors: Zhihan Zhou, Dejiao Zhang, Wei Xiao, Nicholas Dingwall, Xiaofei Ma, Andrew O. Arnold, Bing Xiang

    Abstract: Learning high-quality dialogue representations is essential for solving a variety of dialogue-oriented tasks, especially considering that dialogue systems often suffer from data scarcity. In this paper, we introduce Dialogue Sentence Embedding (DSE), a self-supervised contrastive learning method that learns effective dialogue representations suitable for a wide range of dialogue tasks. DSE learns…

    Submitted 21 July, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: NAACL 2022 main conference

  36. arXiv:2203.11239  [pdf, other]

    cs.CL

    DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization

    Authors: Zheng Li, Zijian Wang, Ming Tan, Ramesh Nallapati, Parminder Bhatia, Andrew Arnold, Bing Xiang, Dan Roth

    Abstract: Large-scale pre-trained sequence-to-sequence models like BART and T5 achieve state-of-the-art performance on many generative NLP tasks. However, such models pose a great challenge in resource-constrained scenarios owing to their large memory requirements and high latency. To alleviate this issue, we propose to jointly distill and quantize the model, where knowledge is transferred from the full-pre…

    Submitted 21 March, 2022; originally announced March 2022.

    Comments: ACL 2022

  37. arXiv:2110.10778  [pdf, other]

    cs.CL

    Contrastive Document Representation Learning with Graph Attention Networks

    Authors: Peng Xu, Xinchi Chen, Xiaofei Ma, Zhiheng Huang, Bing Xiang

    Abstract: Recent progress in pretrained Transformer-based language models has shown great success in learning contextual representation of text. However, due to the quadratic self-attention complexity, most of the pretrained Transformers models can only handle relatively short text. It is still a challenge when it comes to modeling very long documents. In this work, we propose to use a graph attention netwo…

    Submitted 20 October, 2021; originally announced October 2021.

    Comments: Findings of EMNLP 2021

  38. arXiv:2110.06393  [pdf, other]

    cs.CL cs.IR

    Attention-guided Generative Models for Extractive Question Answering

    Authors: Peng Xu, Davis Liang, Zhiheng Huang, Bing Xiang

    Abstract: We propose a novel method for applying Transformer models to extractive question answering (QA) tasks. Recently, pretrained generative sequence-to-sequence (seq2seq) models have achieved great success in question answering. Contributing to the success of these models are internal attention mechanisms such as cross-attention. We propose a simple strategy to obtain an extractive answer span from the…

    Submitted 12 October, 2021; originally announced October 2021.

    Comments: 10 pages

  39. arXiv:2109.12788  [pdf, other]

    cs.CL cs.AI

    Multiplicative Position-aware Transformer Models for Language Understanding

    Authors: Zhiheng Huang, Davis Liang, Peng Xu, Bing Xiang

    Abstract: Transformer models, which leverage architectural improvements like self-attention, perform remarkably well on Natural Language Processing (NLP) tasks. The self-attention mechanism is position agnostic. In order to capture positional ordering information, various flavors of absolute and relative position embeddings have been proposed. However, there is no systematic analysis on their contributions…

    Submitted 27 September, 2021; originally announced September 2021.

    Comments: arXiv admin note: text overlap with arXiv:2009.13658

  40. arXiv:2109.05424  [pdf, other]

    cs.CL cs.LG

    Pairwise Supervised Contrastive Learning of Sentence Representations

    Authors: Dejiao Zhang, Shang-Wen Li, Wei Xiao, Henghui Zhu, Ramesh Nallapati, Andrew O. Arnold, Bing Xiang

    Abstract: Many recent successes in sentence representation learning have been achieved by simply fine-tuning on the Natural Language Inference (NLI) datasets with triplet loss or siamese loss. Nevertheless, they share a common weakness: sentences in a contradiction pair are not necessarily from different semantic categories. Therefore, optimizing the semantic entailment and contradiction reasoning objective…

    Submitted 29 January, 2022; v1 submitted 12 September, 2021; originally announced September 2021.

    Comments: 9 pages, EMNLP 2021

  41. arXiv:2108.02866  [pdf, other]

    cs.CL

    Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open Domain Question Answering

    Authors: Alexander Hanbo Li, Patrick Ng, Peng Xu, Henghui Zhu, Zhiguo Wang, Bing Xiang

    Abstract: The current state-of-the-art generative models for open-domain question answering (ODQA) have focused on generating direct answers from unstructured textual information. However, a large amount of world's knowledge is stored in structured databases, and need to be accessed using query languages such as SQL. Furthermore, query languages can answer questions that require complex reasoning, as well a…

    Submitted 7 December, 2021; v1 submitted 5 August, 2021; originally announced August 2021.

    Comments: 15 pages, LaTeX; typos corrected, add the open source code link; published to ACL 2021

  42. arXiv:2105.05052  [pdf, other]

    cs.CL cs.LG

    Joint Text and Label Generation for Spoken Language Understanding

    Authors: Yang Li, Ben Athiwaratkun, Cicero Nogueira dos Santos, Bing Xiang

    Abstract: Generalization is a central problem in machine learning, especially when data is limited. Using prior information to enforce constraints is the principled way of encouraging generalization. In this work, we propose to leverage the prior information embedded in pretrained language models (LM) to improve generalization for intent classification and slot labeling tasks with limited training data. Spe…

    Submitted 11 May, 2021; originally announced May 2021.

  43. arXiv:2105.04623  [pdf, other]

    cs.CL cs.AI

    Improving Factual Consistency of Abstractive Summarization via Question Answering

    Authors: Feng Nan, Cicero Nogueira dos Santos, Henghui Zhu, Patrick Ng, Kathleen McKeown, Ramesh Nallapati, Dejiao Zhang, Zhiguo Wang, Andrew O. Arnold, Bing Xiang

    Abstract: A commonly observed problem with the state-of-the art abstractive summarization models is that the generated summaries can be factually inconsistent with the input documents. The fact that automatic summarization may produce plausible-sounding yet inaccurate summaries is a major concern that limits its wide application. In this paper we present an approach to address factual consistency in summari…

    Submitted 10 May, 2021; originally announced May 2021.

    Comments: ACL-IJCNLP 2021

  44. arXiv:2104.08744  [pdf, other]

    cs.CL

    Generative Context Pair Selection for Multi-hop Question Answering

    Authors: Dheeru Dua, Cicero Nogueira dos Santos, Patrick Ng, Ben Athiwaratkun, Bing Xiang, Matt Gardner, Sameer Singh

    Abstract: Compositional reasoning tasks like multi-hop question answering, require making latent decisions to get the final answer, given a question. However, crowdsourced datasets often capture only a slice of the underlying task distribution, which can induce unanticipated biases in models performing compositional reasoning. Furthermore, discriminatively trained models exploit such biases to get a better…

    Submitted 18 April, 2021; originally announced April 2021.

  45. arXiv:2103.12953  [pdf, other]

    cs.LG cs.CL

    Supporting Clustering with Contrastive Learning

    Authors: Dejiao Zhang, Feng Nan, Xiaokai Wei, Shangwen Li, Henghui Zhu, Kathleen McKeown, Ramesh Nallapati, Andrew Arnold, Bing Xiang

    Abstract: Unsupervised clustering aims at discovering the semantic categories of data according to some distance measured in the representation space. However, different categories often overlap with each other in the representation space at the beginning of the learning process, which poses a significant challenge for distance-based clustering in achieving good separation between different categories. To t…

    Submitted 28 May, 2021; v1 submitted 23 March, 2021; originally announced March 2021.

    Comments: NAACL 2021

  46. arXiv:2103.06406  [pdf, other]

    cs.LG cs.DC eess.SP math.OC

    Distributed Principal Subspace Analysis for Partitioned Big Data: Algorithms, Analysis, and Implementation

    Authors: Arpita Gang, Bingqing Xiang, Waheed U. Bajwa

    Abstract: Principal Subspace Analysis (PSA) -- and its sibling, Principal Component Analysis (PCA) -- is one of the most popular approaches for dimensionality reduction in signal processing and machine learning. But centralized PSA/PCA solutions are fast becoming irrelevant in the modern era of big data, in which the number of samples and/or the dimensionality of samples often exceed the storage and/or comp…

    Submitted 12 October, 2021; v1 submitted 10 March, 2021; originally announced March 2021.

    Comments: 16 pages; Final accepted version; To appear in IEEE Transactions on Signal and Information Processing Over Networks

    Journal ref: IEEE Trans. Signal Inform. Proc. over Netw., vol. 7, pp. 699-715, Oct. 2021

  47. arXiv:2102.09902  [pdf]

    physics.med-ph cs.CE physics.flu-dyn

    Numerical study of COVID-19 spatial-temporal spreading in London

    Authors: J. Zheng, X. Wu, F. Fang, J. Li, Z. Wang, H. Xiao, J. Zhu, C. C. Pain, P. F. Linden, B. Xiang

    Abstract: Recent study reported that an aerosolised virus (COVID-19) can survive in the air for a few hours. It is highly possible that people get infected with the disease by breathing and contact with items contaminated by the aerosolised virus. However, the aerosolised virus transmission and trajectories in various meteorological environments remain unclear. This paper has investigated the movement of ae…

    Submitted 22 February, 2021; v1 submitted 19 February, 2021; originally announced February 2021.

    Comments: 15 pages, 6 figures

  48. arXiv:2102.09130  [pdf, other]

    cs.CL cs.AI

    Entity-level Factual Consistency of Abstractive Text Summarization

    Authors: Feng Nan, Ramesh Nallapati, Zhiguo Wang, Cicero Nogueira dos Santos, Henghui Zhu, Dejiao Zhang, Kathleen McKeown, Bing Xiang

    Abstract: A key challenge for abstractive summarization is ensuring factual consistency of the generated summary with respect to the original document. For example, state-of-the-art models trained on existing datasets exhibit entity hallucination, generating names of entities that are not present in the source document. We propose a set of new metrics to quantify the entity-level factual consistency of gene…

    Submitted 17 February, 2021; originally announced February 2021.

    Comments: EACL 2021

  49. arXiv:2101.11131  [pdf, other]

    cs.CL

    CLiMP: A Benchmark for Chinese Language Model Evaluation

    Authors: Beilei Xiang, Changbing Yang, Yu Li, Alex Warstadt, Katharina Kann

    Abstract: Linguistically informed analyses of language models (LMs) contribute to the understanding and improvement of these models. Here, we introduce the corpus of Chinese linguistic minimal pairs (CLiMP), which can be used to investigate what knowledge Chinese LMs acquire. CLiMP consists of sets of 1,000 minimal pairs (MPs) for 16 syntactic contrasts in Mandarin, covering 9 major Mandarin linguistic phen…

    Submitted 26 January, 2021; originally announced January 2021.

  50. arXiv:2101.05779  [pdf, other]

    cs.LG cs.CL

    Structured Prediction as Translation between Augmented Natural Languages

    Authors: Giovanni Paolini, Ben Athiwaratkun, Jason Krone, Jie Ma, Alessandro Achille, Rishita Anubhai, Cicero Nogueira dos Santos, Bing Xiang, Stefano Soatto

    Abstract: We propose a new framework, Translation between Augmented Natural Languages (TANL), to solve many structured prediction language tasks including joint entity and relation extraction, nested named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, and dialogue state tracking. Instead of tackling the problem by training task-specific discri…

    Submitted 2 December, 2021; v1 submitted 14 January, 2021; originally announced January 2021.

    Journal ref: International Conference on Learning Representations (ICLR) 2021