Showing 1–5 of 5 results for author: Agostinelli, V

  1. arXiv:2405.13046  [pdf, other]

    cs.CL cs.LG

    LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions

    Authors: Victor Agostinelli, Sanghyun Hong, Lizhong Chen

    Abstract: A promising approach to preserving model performance in linearized transformers is to employ position-based re-weighting functions. However, state-of-the-art re-weighting functions rely heavily on target sequence lengths, making it difficult or impossible to apply them to autoregressive and simultaneous tasks, where the target and sometimes even the input sequence length are unknown. To address th…

    Submitted 18 May, 2024; originally announced May 2024.

    Comments: Submitted and accepted at ICML 2024

  2. arXiv:2405.10443  [pdf, other]

    cs.CL cs.LG

    Simultaneous Masking, Not Prompting Optimization: A Paradigm Shift in Fine-tuning LLMs for Simultaneous Translation

    Authors: Matthew Raffel, Victor Agostinelli, Lizhong Chen

    Abstract: Large language models (LLMs) have achieved state-of-the-art performance in various language processing tasks, motivating their adoption in simultaneous translation. Current fine-tuning methods to adapt LLMs for simultaneous translation focus on prompting optimization strategies using either data augmentation or prompt structure modifications. However, these methods suffer from several issues, such…

    Submitted 26 June, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

  3. arXiv:2312.04691  [pdf, other]

    cs.CL cs.AI

    Simul-LLM: A Framework for Exploring High-Quality Simultaneous Translation with Large Language Models

    Authors: Victor Agostinelli, Max Wild, Matthew Raffel, Kazi Ahmed Asif Fuad, Lizhong Chen

    Abstract: Large language models (LLMs) with billions of parameters and pretrained on massive amounts of data are now capable of near or better than state-of-the-art performance in a variety of downstream natural language processing tasks. Neural machine translation (NMT) is one such task that LLMs have been applied to with great success. However, little research has focused on applying LLMs to the more diff…

    Submitted 4 July, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: ACL 2024

  4. arXiv:2306.14031  [pdf, other]

    cs.LG cs.AI

    Partitioning-Guided K-Means: Extreme Empty Cluster Resolution for Extreme Model Compression

    Authors: Tianhong Huang, Victor Agostinelli, Lizhong Chen

    Abstract: Compactness in deep learning can be critical to a model's viability in low-resource applications, and a common approach to extreme model compression is quantization. We consider Iterative Product Quantization (iPQ) with Quant-Noise to be state-of-the-art in this area, but this quantization framework suffers from preventable inference quality degradation due to prevalent empty clusters. In this pap…

    Submitted 24 June, 2023; originally announced June 2023.

  5. arXiv:2304.08453  [pdf, other]

    cs.CL cs.AI

    Improving Autoregressive NLP Tasks via Modular Linearized Attention

    Authors: Victor Agostinelli, Lizhong Chen

    Abstract: Various natural language processing (NLP) tasks necessitate models that are efficient and small based on their ultimate application at the edge or in other resource-constrained environments. While prior research has reduced the size of these models, increasing computational efficiency without considerable performance impacts remains difficult, especially for autoregressive tasks. This paper propos…

    Submitted 24 June, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

    Comments: Submitted and accepted at ECML PKDD 2023