Showing 1–6 of 6 results for author: Viaud, G

  1. arXiv:2407.01449  [pdf, other]

    cs.IR cs.CL cs.CV

    ColPali: Efficient Document Retrieval with Vision Language Models

    Authors: Manuel Faysse, Hugues Sibille, Tony Wu, Bilel Omrani, Gautier Viaud, Céline Hudelot, Pierre Colombo

    Abstract: Documents are visually rich structures that convey information through text, as well as tables, figures, page layouts, or fonts. While modern document retrieval systems exhibit strong performance on query-to-text matching, they struggle to exploit visual cues efficiently, hindering their performance on practical document retrieval applications such as Retrieval Augmented Generation. To benchmark c…

    Submitted 2 July, 2024; v1 submitted 27 June, 2024; originally announced July 2024.

    Comments: Under Review

  2. arXiv:2402.00786  [pdf, other]

    cs.CL cs.LG

    CroissantLLM: A Truly Bilingual French-English Language Model

    Authors: Manuel Faysse, Patrick Fernandes, Nuno M. Guerreiro, António Loison, Duarte M. Alves, Caio Corro, Nicolas Boizard, João Alves, Ricardo Rei, Pedro H. Martins, Antoni Bigata Casademunt, François Yvon, André F. T. Martins, Gautier Viaud, Céline Hudelot, Pierre Colombo

    Abstract: We introduce CroissantLLM, a 1.3B language model pretrained on a set of 3T English and French tokens, to bring to the research and industrial community a high-performance, fully open-sourced bilingual model that runs swiftly on consumer-grade local hardware. To that end, we pioneer the approach of training an intrinsically bilingual model with a 1:1 English-to-French pretraining data ratio, a cust…

    Submitted 29 March, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  3. Revisiting Instruction Fine-tuned Model Evaluation to Guide Industrial Applications

    Authors: Manuel Faysse, Gautier Viaud, Céline Hudelot, Pierre Colombo

    Abstract: Instruction Fine-Tuning (IFT) is a powerful paradigm that strengthens the zero-shot capabilities of Large Language Models (LLMs), but in doing so induces new evaluation metric requirements. We show LLM-based metrics to be well adapted to these requirements, and leverage them to conduct an investigation of task-specialization strategies, quantifying the trade-offs that emerge in practical industria…

    Submitted 21 October, 2023; originally announced October 2023.

    Comments: Short paper accepted at EMNLP 2023

    Journal ref: 2023.emnlp-main.559

  4. arXiv:2111.11296  [pdf, other]

    cs.IR cs.AI

    Improving Next-Application Prediction with Deep Personalized-Attention Neural Network

    Authors: Jun Zhu, Gautier Viaud, Céline Hudelot

    Abstract: Recently, due to the ubiquity and supremacy of E-recruitment platforms, job recommender systems have been largely studied. In this paper, we tackle the next job application problem, which has many practical applications. In particular, we propose to leverage next-item recommendation approaches to consider better the job seeker's career preference to discover the next relevant job postings (referre…

    Submitted 9 November, 2021; originally announced November 2021.

  5. arXiv:2109.13209  [pdf, other]

    cs.CL cs.AI cs.LG

    FQuAD2.0: French Question Answering and knowing that you know nothing

    Authors: Quentin Heinrich, Gautier Viaud, Wacim Belblidia

    Abstract: Question Answering, including Reading Comprehension, is one of the NLP research areas that has seen significant scientific breakthroughs over the past few years, thanks to the concomitant advances in Language Modeling. Most of these breakthroughs, however, are centered on the English language. In 2020, as a first strong initiative to bridge the gap to the French language, Illuin Technology introdu…

    Submitted 27 September, 2021; originally announced September 2021.

    Comments: 12 pages, 2 figures

  6. arXiv:1407.8078  [pdf, ps, other]

    math.NA cs.MS

    Zolotarev Quadrature Rules and Load Balancing for the FEAST Eigensolver

    Authors: Stefan Guettel, Eric Polizzi, Ping Tak Peter Tang, Gautier Viaud

    Abstract: The FEAST method for solving large sparse eigenproblems is equivalent to subspace iteration with an approximate spectral projector and implicit orthogonalization. This relation allows to characterize the convergence of this method in terms of the error of a certain rational approximant to an indicator function. We propose improved rational approximants leading to FEAST variants with faster converg…

    Submitted 30 July, 2014; originally announced July 2014.

    Comments: 22 pages, 8 figures