Showing 1–13 of 13 results for author: Lomeli, M

  1. arXiv:2402.14158  [pdf, other]

    cs.CL

    TOOLVERIFIER: Generalization to New Tools via Self-Verification

    Authors: Dheeraj Mekala, Jason Weston, Jack Lanchantin, Roberta Raileanu, Maria Lomeli, Jingbo Shang, Jane Dwivedi-Yu

    Abstract: Teaching language models to use tools is an important milestone towards building general assistants, but remains an open problem. While there has been significant progress on learning to use specific tools via fine-tuning, language models still struggle with learning how to robustly use new tools from only a few demonstrations. In this work we introduce a self-verification method which distinguish…

    Submitted 13 March, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

  2. arXiv:2401.08281  [pdf, other]

    cs.LG cs.CV cs.SE

    The Faiss library

    Authors: Matthijs Douze, Alexandr Guzhva, Chengqi Deng, Jeff Johnson, Gergely Szilvasy, Pierre-Emmanuel Mazaré, Maria Lomeli, Lucas Hosseini, Hervé Jégou

    Abstract: Vector databases manage large collections of embedding vectors. As AI applications are growing rapidly, so are the number of embeddings that need to be stored and indexed. The Faiss library is dedicated to vector similarity search, a core functionality of vector databases. Faiss is a toolkit of indexing methods and related primitives used to search, cluster, compress and transform vectors. This pa…

    Submitted 16 January, 2024; originally announced January 2024.

  3. arXiv:2310.10638  [pdf, other]

    cs.CL cs.AI cs.LG

    In-context Pretraining: Language Modeling Beyond Document Boundaries

    Authors: Weijia Shi, Sewon Min, Maria Lomeli, Chunting Zhou, Margaret Li, Gergely Szilvasy, Rich James, Xi Victoria Lin, Noah A. Smith, Luke Zettlemoyer, Scott Yih, Mike Lewis

    Abstract: Large language models (LMs) are currently trained to predict tokens given document prefixes, enabling them to directly perform long-form generation and prompting-style tasks which can be reduced to document completion. Existing pretraining pipelines train LMs by concatenating random sets of short documents to create input contexts but the prior documents provide no signal for predicting the next d…

    Submitted 24 June, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

  4. arXiv:2310.01352  [pdf, other]

    cs.CL cs.AI

    RA-DIT: Retrieval-Augmented Dual Instruction Tuning

    Authors: Xi Victoria Lin, Xilun Chen, Mingda Chen, Weijia Shi, Maria Lomeli, Rich James, Pedro Rodriguez, Jacob Kahn, Gergely Szilvasy, Mike Lewis, Luke Zettlemoyer, Scott Yih

    Abstract: Retrieval-augmented language models (RALMs) improve performance by accessing long-tail and up-to-date knowledge from external data stores, but are challenging to build. Existing approaches require either expensive retrieval-specific modifications to LM pre-training or use post-hoc integration of the data store that leads to suboptimal performance. We introduce Retrieval-Augmented Dual Instruction…

    Submitted 6 May, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: v4: ICLR 2024 camera-ready version

  5. arXiv:2302.07842  [pdf, ps, other]

    cs.CL

    Augmented Language Models: a Survey

    Authors: Grégoire Mialon, Roberto Dessì, Maria Lomeli, Christoforos Nalmpantis, Ram Pasunuru, Roberta Raileanu, Baptiste Rozière, Timo Schick, Jane Dwivedi-Yu, Asli Celikyilmaz, Edouard Grave, Yann LeCun, Thomas Scialom

    Abstract: This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools. The former is defined as decomposing a potentially complex task into simpler subtasks while the latter consists in calling external modules such as a code interpreter. LMs can leverage these augmentations separately or in combination via heuristics, or learn to do so from demo…

    Submitted 15 February, 2023; originally announced February 2023.

  6. arXiv:2302.04761  [pdf, other]

    cs.CL

    Toolformer: Language Models Can Teach Themselves to Use Tools

    Authors: Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, Thomas Scialom

    Abstract: Language models (LMs) exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale. They also, paradoxically, struggle with basic functionality, such as arithmetic or factual lookup, where much simpler and smaller models excel. In this paper, we show that LMs can teach themselves to use external tools via simple APIs and achieve the best of…

    Submitted 9 February, 2023; originally announced February 2023.

  7. arXiv:2209.13331  [pdf, other]

    cs.CL cs.LG

    EditEval: An Instruction-Based Benchmark for Text Improvements

    Authors: Jane Dwivedi-Yu, Timo Schick, Zhengbao Jiang, Maria Lomeli, Patrick Lewis, Gautier Izacard, Edouard Grave, Sebastian Riedel, Fabio Petroni

    Abstract: Evaluation of text generation to date has primarily focused on content created sequentially, rather than improvements on a piece of text. Writing, however, is naturally an iterative and incremental process that requires expertise in different modular skills such as fixing outdated information or making the style more consistent. Even so, comprehensive evaluation of a model's capacity to perform th…

    Submitted 27 September, 2022; originally announced September 2022.

  8. arXiv:2208.03299  [pdf, other]

    cs.CL

    Atlas: Few-shot Learning with Retrieval Augmented Language Models

    Authors: Gautier Izacard, Patrick Lewis, Maria Lomeli, Lucas Hosseini, Fabio Petroni, Timo Schick, Jane Dwivedi-Yu, Armand Joulin, Sebastian Riedel, Edouard Grave

    Abstract: Large language models have shown impressive few-shot results on a wide range of tasks. However, when knowledge is key for such results, as is the case for tasks such as question answering and fact checking, massive parameter counts to store knowledge seem to be needed. Retrieval augmented models are known to excel at knowledge intensive tasks without the need for as many parameters, but it is uncl…

    Submitted 16 November, 2022; v1 submitted 5 August, 2022; originally announced August 2022.

  9. arXiv:2207.06220  [pdf, other]

    cs.IR cs.AI

    Improving Wikipedia Verifiability with AI

    Authors: Fabio Petroni, Samuel Broscheit, Aleksandra Piktus, Patrick Lewis, Gautier Izacard, Lucas Hosseini, Jane Dwivedi-Yu, Maria Lomeli, Timo Schick, Pierre-Emmanuel Mazaré, Armand Joulin, Edouard Grave, Sebastian Riedel

    Abstract: Verifiability is a core content policy of Wikipedia: claims that are likely to be challenged need to be backed by citations. There are millions of articles available online and thousands of new articles are released each month. For this reason, finding relevant sources is a difficult task: many claims do not have any references that support them. Furthermore, even existing citations might not supp…

    Submitted 8 July, 2022; originally announced July 2022.

  10. arXiv:2001.05895  [pdf, other]

    cs.LG stat.ML

    Masking schemes for universal marginalisers

    Authors: Divya Gautam, Maria Lomeli, Kostis Gourgoulias, Daniel H. Thompson, Saurabh Johri

    Abstract: We consider the effect of structure-agnostic and structure-dependent masking schemes when training a universal marginaliser (arXiv:1711.00695) in order to learn conditional distributions of the form $P(x_i |\mathbf x_{\mathbf b})$, where $x_i$ is a given random variable and $\mathbf x_{\mathbf b}$ is some arbitrary subset of all random variables of the generative model of interest. In other words,…

    Submitted 16 January, 2020; originally announced January 2020.

    Comments: To be published in Proceedings of the 2nd Symposium on Advances in Approximate Bayesian Inference, 2019

  11. arXiv:1910.07474  [pdf, other]

    cs.LG cs.AI stat.ML

    Universal Marginaliser for Deep Amortised Inference for Probabilistic Programs

    Authors: Robert Walecki, Kostis Gourgoulias, Adam Baker, Chris Hart, Chris Lucas, Max Zwiessele, Albert Buchard, Maria Lomeli, Yura Perov, Saurabh Johri

    Abstract: Probabilistic programming languages (PPLs) are powerful modelling tools which allow to formalise our knowledge about the world and reason about its inherent uncertainty. Inference methods used in PPL can be computationally costly due to significant time burden and/or storage requirements; or they can lack theoretical guarantees of convergence and accuracy when applied to large scale graphical mode…

    Submitted 16 October, 2019; originally announced October 2019.

  12. arXiv:1811.04727  [pdf, other]

    cs.AI cs.LG stat.ML

    Universal Marginalizer for Amortised Inference and Embedding of Generative Models

    Authors: Robert Walecki, Albert Buchard, Kostis Gourgoulias, Chris Hart, Maria Lomeli, A. K. W. Navarro, Max Zwiessele, Yura Perov, Saurabh Johri

    Abstract: Probabilistic graphical models are powerful tools which allow us to formalise our knowledge about the world and reason about its inherent uncertainty. There exist a considerable number of methods for performing inference in probabilistic graphical models; however, they can be computationally costly due to significant time burden and/or storage requirements; or they lack theoretical guarantees of c…

    Submitted 12 November, 2018; originally announced November 2018.

  13. arXiv:1807.00400  [pdf, other]

    stat.ML cs.LG

    Antithetic and Monte Carlo kernel estimators for partial rankings

    Authors: Maria Lomeli, Mark Rowland, Arthur Gretton, Zoubin Ghahramani

    Abstract: In the modern age, rankings data is ubiquitous and it is useful for a variety of applications such as recommender systems, multi-object tracking and preference learning. However, most rankings data encountered in the real world is incomplete, which prevents the direct application of existing modelling tools for complete rankings. Our contribution is a novel way to extend kernel methods for complet…

    Submitted 25 July, 2018; v1 submitted 1 July, 2018; originally announced July 2018.