Showing 1–4 of 4 results for author: de la Clergerie, É V

  1. arXiv:2406.06589

    cs.CL cs.AI

    PatentEval: Understanding Errors in Patent Generation

    Authors: You Zuo, Kim Gerdes, Eric Villemonte de La Clergerie, Benoît Sagot

    Abstract: In this work, we introduce a comprehensive error typology specifically designed for evaluating two distinct tasks in machine-generated patent texts: claims-to-abstract generation, and the generation of the next claim given previous ones. We have also developed a benchmark, PatentEval, for systematically assessing language models in this context. Our study includes a comparative analysis, annotated…

    Submitted 25 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Journal ref: NAACL2024 - 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Jun 2024, Mexico City, Mexico

  2. arXiv:2104.07560

    cs.CL

    Rethinking Automatic Evaluation in Sentence Simplification

    Authors: Thomas Scialom, Louis Martin, Jacopo Staiano, Éric Villemonte de la Clergerie, Benoît Sagot

    Abstract: Automatic evaluation remains an open research question in Natural Language Generation. In the context of Sentence Simplification, this is particularly challenging: the task by nature requires replacing complex words with simpler ones that share the same meaning. This limits the effectiveness of n-gram based metrics like BLEU. Going hand in hand with the recent advances in NLG, new metrics have b…

    Submitted 16 April, 2021; v1 submitted 15 April, 2021; originally announced April 2021.

    Comments: updated affiliation and link to data

  3. CamemBERT: a Tasty French Language Model

    Authors: Louis Martin, Benjamin Muller, Pedro Javier Ortiz Suárez, Yoann Dupont, Laurent Romary, Éric Villemonte de la Clergerie, Djamé Seddah, Benoît Sagot

    Abstract: Pretrained language models are now ubiquitous in Natural Language Processing. Despite their success, most available models have either been trained on English data or on the concatenation of data in multiple languages. This makes practical use of such models, in all languages except English, very limited. In this paper, we investigate the feasibility of training monolingual Transformer-based lan…

    Submitted 21 May, 2020; v1 submitted 10 November, 2019; originally announced November 2019.

    Comments: ACL 2020 long paper. Web site: https://camembert-model.fr

    Journal ref: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, July 2020, Online

  4. arXiv:1901.10746

    cs.CL

    Reference-less Quality Estimation of Text Simplification Systems

    Authors: Louis Martin, Samuel Humeau, Pierre-Emmanuel Mazaré, Antoine Bordes, Éric Villemonte de La Clergerie, Benoît Sagot

    Abstract: The evaluation of text simplification (TS) systems remains an open challenge. As the task has common points with machine translation (MT), TS is often evaluated using MT metrics such as BLEU. However, such metrics require high-quality reference data, which is rarely available for TS. TS has the advantage over MT of being a monolingual task, which allows for direct comparisons to be made between th…

    Submitted 30 January, 2019; originally announced January 2019.

    Journal ref: 1st Workshop on Automatic Text Adaptation (ATA), Nov 2018, Tilburg, Netherlands. https://www.ida.liu.se/~evere22/ATA-18/