Skip to main content

Showing 1–13 of 13 results for author: Perot, V

  1. arXiv:2407.08223  [pdf, other

    cs.CL cs.AI

    Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting

    Authors: Zilong Wang, Zifeng Wang, Long Le, Huaixiu Steven Zheng, Swaroop Mishra, Vincent Perot, Yuwei Zhang, Anush Mattapalli, Ankur Taly, Jingbo Shang, Chen-Yu Lee, Tomas Pfister

    Abstract: Retrieval augmented generation (RAG) combines the generative abilities of large language models (LLMs) with external knowledge sources to provide more accurate and up-to-date responses. Recent RAG advancements focus on improving retrieval outcomes through iterative LLM refinement or self-critique capabilities acquired through additional instruction tuning of LLMs. In this work, we introduce Specul… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Preprint

  2. arXiv:2406.13121  [pdf, other

    cs.CL cs.AI cs.IR

    Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?

    Authors: Jinhyuk Lee, Anthony Chen, Zhuyun Dai, Dheeru Dua, Devendra Singh Sachan, Michael Boratko, Yi Luan, Sébastien M. R. Arnold, Vincent Perot, Siddharth Dalmia, Hexiang Hu, Xudong Lin, Panupong Pasupat, Aida Amini, Jeremy R. Cole, Sebastian Riedel, Iftekhar Naim, Ming-Wei Chang, Kelvin Guu

    Abstract: Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases. Leveraging LCLMs' ability to natively ingest and process entire corpora of information offers numerous advantages. It enhances user-friendliness by eliminating the need for specialized knowledge of tools, provides robust end-to-… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 29 pages. Dataset available at https://github.com/google-deepmind/loft

  3. arXiv:2404.05875  [pdf, other

    cs.CL cs.AI cs.LG

    CodecLM: Aligning Language Models with Tailored Synthetic Data

    Authors: Zifeng Wang, Chun-Liang Li, Vincent Perot, Long T. Le, Jin Miao, Zizhao Zhang, Chen-Yu Lee, Tomas Pfister

    Abstract: Instruction tuning has emerged as the key in aligning large language models (LLMs) with specific task instructions, thereby mitigating the discrepancy between the next-token prediction objective and users' actual goals. To reduce the labor and time cost to collect or annotate data by humans, researchers start to explore the use of LLMs to generate instruction-aligned synthetic data. Recent works f… ▽ More

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted to Findings of NAACL 2024

  4. arXiv:2404.00488  [pdf

    cs.CL cs.AI cs.LG

    Noise-Aware Training of Layout-Aware Language Models

    Authors: Ritesh Sarkhel, Xiaoqi Ren, Lauro Beltrao Costa, Guolong Su, Vincent Perot, Yanan Xie, Emmanouil Koukoumidis, Arnab Nandi

    Abstract: A visually rich document (VRD) utilizes visual features along with linguistic cues to disseminate information. Training a custom extractor that identifies named entities from a document requires a large number of instances of the target document type annotated at textual and visual modalities. This is an expensive bottleneck in enterprise scenarios, where we want to train custom extractors for tho… ▽ More

    Submitted 30 March, 2024; originally announced April 2024.

  5. arXiv:2401.04398  [pdf, other

    cs.CL

    Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding

    Authors: Zilong Wang, Hao Zhang, Chun-Liang Li, Julian Martin Eisenschlos, Vincent Perot, Zifeng Wang, Lesly Miculicich, Yasuhisa Fujii, Jingbo Shang, Chen-Yu Lee, Tomas Pfister

    Abstract: Table-based reasoning with large language models (LLMs) is a promising direction to tackle many table understanding tasks, such as table-based question answering and fact verification. Compared with generic reasoning, table-based reasoning requires the extraction of underlying semantics from both free-form questions and semi-structured tabular data. Chain-of-Thought and its similar approaches inco… ▽ More

    Submitted 18 January, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: Accepted to ICLR 2024

  6. arXiv:2309.10952  [pdf, other

    cs.CL cs.AI cs.LG

    LMDX: Language Model-based Document Information Extraction and Localization

    Authors: Vincent Perot, Kai Kang, Florian Luisier, Guolong Su, Xiaoyu Sun, Ramya Sree Boppana, Zilong Wang, Zifeng Wang, Jiaqi Mu, Hao Zhang, Chen-Yu Lee, Nan Hua

    Abstract: Large Language Models (LLM) have revolutionized Natural Language Processing (NLP), improving state-of-the-art and exhibiting emergent capabilities across various tasks. However, their application in extracting information from visually rich documents, which is at the core of many document processing workflows and involving the extraction of key entities from semi-structured documents, has not yet… ▽ More

    Submitted 21 June, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

  7. arXiv:2305.02549  [pdf, other

    cs.CL cs.CV cs.LG

    FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction

    Authors: Chen-Yu Lee, Chun-Liang Li, Hao Zhang, Timothy Dozat, Vincent Perot, Guolong Su, Xiang Zhang, Kihyuk Sohn, Nikolai Glushnev, Renshen Wang, Joshua Ainslie, Shangbang Long, Siyang Qin, Yasuhisa Fujii, Nan Hua, Tomas Pfister

    Abstract: The recent advent of self-supervised pre-training techniques has led to a surge in the use of multimodal learning in form document understanding. However, existing approaches that extend the mask language modeling to other modalities require careful multi-task tuning, complex reconstruction target designs, or additional pre-training data. In FormNetV2, we introduce a centralized multimodal graph c… ▽ More

    Submitted 13 June, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023

  8. arXiv:2211.07730  [pdf, other

    cs.LG cs.AI cs.CL

    QueryForm: A Simple Zero-shot Form Entity Query Framework

    Authors: Zifeng Wang, Zizhao Zhang, Jacob Devlin, Chen-Yu Lee, Guolong Su, Hao Zhang, Jennifer Dy, Vincent Perot, Tomas Pfister

    Abstract: Zero-shot transfer learning for document understanding is a crucial yet under-investigated scenario to help reduce the high cost involved in annotating document entities. We present a novel query-based framework, QueryForm, that extracts entity values from form-like documents in a zero-shot fashion. QueryForm contains a dual prompting mechanism that composes both the document schema and a specific… ▽ More

    Submitted 27 June, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

    Comments: Accepted to Findings of ACL 2023

  9. arXiv:2204.04799  [pdf, other

    cs.LG cs.CV

    DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning

    Authors: Zifeng Wang, Zizhao Zhang, Sayna Ebrahimi, Ruoxi Sun, Han Zhang, Chen-Yu Lee, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer Dy, Tomas Pfister

    Abstract: Continual learning aims to enable a single model to learn a sequence of tasks without catastrophic forgetting. Top-performing methods usually require a rehearsal buffer to store past pristine examples for experience replay, which, however, limits their practical value due to privacy and memory constraints. In this work, we present a simple yet effective framework, DualPrompt, which learns a tiny s… ▽ More

    Submitted 5 August, 2022; v1 submitted 10 April, 2022; originally announced April 2022.

    Comments: Published at ECCV 2022 as a conference paper

  10. arXiv:2203.08411  [pdf, other

    cs.CL cs.CV cs.LG

    FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction

    Authors: Chen-Yu Lee, Chun-Liang Li, Timothy Dozat, Vincent Perot, Guolong Su, Nan Hua, Joshua Ainslie, Renshen Wang, Yasuhisa Fujii, Tomas Pfister

    Abstract: Sequence modeling has demonstrated state-of-the-art performance on natural language and document understanding tasks. However, it is challenging to correctly serialize tokens in form-like documents in practice due to their variety of layout patterns. We propose FormNet, a structure-aware sequence model to mitigate the suboptimal serialization of forms. First, we design Rich Attention that leverage… ▽ More

    Submitted 23 March, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

    Comments: Accepted to ACL 2022

  11. arXiv:2112.08654  [pdf, other

    cs.LG cs.CV

    Learning to Prompt for Continual Learning

    Authors: Zifeng Wang, Zizhao Zhang, Chen-Yu Lee, Han Zhang, Ruoxi Sun, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer Dy, Tomas Pfister

    Abstract: The mainstream paradigm behind continual learning has been to adapt the model parameters to non-stationary data distributions, where catastrophic forgetting is the central challenge. Typical methods rely on a rehearsal buffer or known task identity at test time to retrieve learned knowledge and address forgetting, while this work presents a new paradigm for continual learning that aims to train a… ▽ More

    Submitted 21 March, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

    Comments: Published at CVPR 2022 as a conference paper

  12. arXiv:2005.08469  [pdf, other

    cs.CL

    Text Classification with Few Examples using Controlled Generalization

    Authors: Abhijit Mahabal, Jason Baldridge, Burcu Karagol Ayan, Vincent Perot, Dan Roth

    Abstract: Training data for text classification is often limited in practice, especially for applications with many output classes or involving many related classification problems. This means classifiers must generalize from limited evidence, but the manner and extent of generalization is task dependent. Current practice primarily relies on pre-trained word embeddings to map words unseen in training to sim… ▽ More

    Submitted 18 May, 2020; originally announced May 2020.

    Journal ref: Proceedings of NAACL-HLT 2019

  13. arXiv:1809.10610  [pdf, other

    cs.LG stat.ML

    Counterfactual Fairness in Text Classification through Robustness

    Authors: Sahaj Garg, Vincent Perot, Nicole Limtiaco, Ankur Taly, Ed H. Chi, Alex Beutel

    Abstract: In this paper, we study counterfactual fairness in text classification, which asks the question: How would the prediction change if the sensitive attribute referenced in the example were different? Toxicity classifiers demonstrate a counterfactual fairness issue by predicting that "Some people are gay" is toxic while "Some people are straight" is nontoxic. We offer a metric, counterfactual token f… ▽ More

    Submitted 13 February, 2019; v1 submitted 27 September, 2018; originally announced September 2018.