Showing 1–32 of 32 results for author: Toral, A

  1. arXiv:2403.08693  [pdf, other]

    cs.CL

    Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages

    Authors: Rik van Noord, Taja Kuzman, Peter Rupnik, Nikola Ljubešić, Miquel Esplà-Gomis, Gema Ramírez-Sánchez, Antonio Toral

    Abstract: Large, curated, web-crawled corpora play a vital role in training language models (LMs). They form the lion's share of the training data in virtually all recent LMs, such as the well-known GPT, LLaMA and XLM-RoBERTa models. However, despite this importance, relatively little attention has been given to the quality of these corpora. In this paper, we compare four of the currently most relevant larg…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: Accepted to LREC-COLING 2024 (long)

  2. arXiv:2307.02358  [pdf]

    cs.CL cs.HC

    To be or not to be: a translation reception study of a literary text translated into Dutch and Catalan using machine translation

    Authors: Ana Guerberof Arenas, Antonio Toral

    Abstract: This article presents the results of a study involving the reception of a fictional story by Kurt Vonnegut translated from English into Catalan and Dutch in three conditions: machine-translated (MT), post-edited (PE) and translated from scratch (HT). 223 participants were recruited who rated the reading conditions using three scales: Narrative Engagement, Enjoyment and Translation Reception. The r…

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: 39 pages, 9 figures, authors' manuscript approved for publication in Target International Journal of Translation published by John Benjamins

  3. arXiv:2306.00121  [pdf, other]

    cs.CL

    Multilingual Multi-Figurative Language Detection

    Authors: Huiyuan Lai, Antonio Toral, Malvina Nissim

    Abstract: Figures of speech help people express abstract concepts and evoke stronger emotions than literal expressions, thereby making texts more creative and engaging. Due to its pervasive and fundamental character, figurative language understanding has been addressed in Natural Language Processing, but it's highly understudied in a multilingual setting and when considering more than one figure of speech a…

    Submitted 31 May, 2023; originally announced June 2023.

    Comments: Accepted to ACL 2023 (Findings)

  4. arXiv:2305.19757  [pdf, other]

    cs.CL

    Automatic Discrimination of Human and Neural Machine Translation in Multilingual Scenarios

    Authors: Malina Chichirau, Rik van Noord, Antonio Toral

    Abstract: We tackle the task of automatically discriminating between human and machine translations. As opposed to most previous work, we perform experiments in a multilingual setting, considering multiple languages and multilingual pretrained language models. We show that a classifier trained on parallel data with a single source language (in our case German-English) can still perform well on English trans…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: Accepted at EAMT2023

  5. arXiv:2305.01633  [pdf, other]

    cs.CL

    Missing Information, Unresponsive Authors, Experimental Flaws: The Impossibility of Assessing the Reproducibility of Previous Human Evaluations in NLP

    Authors: Anya Belz, Craig Thomson, Ehud Reiter, Gavin Abercrombie, Jose M. Alonso-Moral, Mohammad Arvan, Anouck Braggaar, Mark Cieliebak, Elizabeth Clark, Kees van Deemter, Tanvi Dinkar, Ondřej Dušek, Steffen Eger, Qixiang Fang, Mingqi Gao, Albert Gatt, Dimitra Gkatzia, Javier González-Corbelle, Dirk Hovy, Manuela Hürlimann, Takumi Ito, John D. Kelleher, Filip Klubicka, Emiel Krahmer, Huiyuan Lai, et al. (17 additional authors not shown)

    Abstract: We report our efforts in identifying a set of previous human evaluations in NLP that would be suitable for a coordinated study examining what makes human evaluations in NLP more/less reproducible. We present our results and findings, which include that just 13% of papers had (i) sufficiently low barriers to reproduction, and (ii) enough obtainable information, to be considered for reproduction, a…

    Submitted 7 August, 2023; v1 submitted 2 May, 2023; originally announced May 2023.

    Comments: 5 pages plus appendix, 4 tables, 1 figure. To appear at "Workshop on Insights from Negative Results in NLP" (co-located with EACL2023). Updated author list and acknowledgements

    MSC Class: 68 ACM Class: I.2.7

  6. arXiv:2304.13462  [pdf, other]

    cs.CL

    Multidimensional Evaluation for Text Style Transfer Using ChatGPT

    Authors: Huiyuan Lai, Antonio Toral, Malvina Nissim

    Abstract: We investigate the potential of ChatGPT as a multidimensional evaluator for the task of Text Style Transfer, alongside, and in comparison to, existing automatic metrics as well as human judgements. We focus on a zero-shot setting, i.e. prompting ChatGPT with specific task instructions, and test its performance on three commonly-used dimensions of text style transfer evaluation: style streng…

    Submitted 26 April, 2023; originally announced April 2023.

  7. Are Character-level Translations Worth the Wait? Comparing ByT5 and mT5 for Machine Translation

    Authors: Lukas Edman, Gabriele Sarti, Antonio Toral, Gertjan van Noord, Arianna Bisazza

    Abstract: Pretrained character-level and byte-level language models have been shown to be competitive with popular subword models across a range of Natural Language Processing (NLP) tasks. However, there has been little research on their effectiveness for neural machine translation (NMT), particularly within the popular pretrain-then-finetune paradigm. This work performs an extensive comparison across multi…

    Submitted 26 January, 2024; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: This version of our work is a pre-MIT Press publication version

  8. arXiv:2212.01304  [pdf, other]

    cs.CL

    Subword-Delimited Downsampling for Better Character-Level Translation

    Authors: Lukas Edman, Antonio Toral, Gertjan van Noord

    Abstract: Subword-level models have been the dominant paradigm in NLP. However, character-level models have the benefit of seeing each character individually, providing the model with more detailed information that ultimately could lead to better models. Recent works have shown character-level models to be competitive with subword models, but costly in terms of time and computation. Character-level models w…

    Submitted 2 December, 2022; originally announced December 2022.

    Comments: This paper is a modified version of the one published in Findings of EMNLP2022, adapted to be compatible to ArXiv

  9. arXiv:2205.14086  [pdf, other]

    cs.CL

    Patching Leaks in the Charformer for Efficient Character-Level Generation

    Authors: Lukas Edman, Antonio Toral, Gertjan van Noord

    Abstract: Character-based representations have important advantages over subword-based ones for morphologically rich languages. They come with increased robustness to noisy input and do not need a separate tokenization step. However, they also have a crucial disadvantage: they notably increase the length of text sequences. The GBST method from Charformer groups (aka downsamples) characters to solve this, bu…

    Submitted 27 May, 2022; originally announced May 2022.

  10. DivEMT: Neural Machine Translation Post-Editing Effort Across Typologically Diverse Languages

    Authors: Gabriele Sarti, Arianna Bisazza, Ana Guerberof Arenas, Antonio Toral

    Abstract: We introduce DivEMT, the first publicly available post-editing study of Neural Machine Translation (NMT) over a typologically diverse set of target languages. Using a strictly controlled setup, 18 professional translators were instructed to translate or post-edit the same set of English documents into Arabic, Dutch, Italian, Turkish, Ukrainian, and Vietnamese. During the process, their edits, keys…

    Submitted 18 October, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: EMNLP 2022, materials: https://github.com/gsarti/divemt

    Journal ref: Proceedings of EMNLP (2022) 7795-7816

  11. arXiv:2205.04810  [pdf, ps, other]

    cs.CL

    The Importance of Context in Very Low Resource Language Modeling

    Authors: Lukas Edman, Antonio Toral, Gertjan van Noord

    Abstract: This paper investigates very low resource language model pretraining, when less than 100 thousand sentences are available. We find that, in very low resource scenarios, statistical n-gram language models outperform state-of-the-art neural models. Our experiments show that this is mainly due to the focus of the former on a local context. As such, we introduce three methods to improve a neural model…

    Submitted 10 May, 2022; originally announced May 2022.

  12. arXiv:2204.07549  [pdf, other]

    cs.CL

    Human Judgement as a Compass to Navigate Automatic Metrics for Formality Transfer

    Authors: Huiyuan Lai, Jiali Mao, Antonio Toral, Malvina Nissim

    Abstract: Although text style transfer has witnessed rapid development in recent years, there is as yet no established standard for evaluation, which is performed using several automatic metrics, lacking the possibility of always resorting to human judgement. We focus on the task of formality transfer, and on the three aspects that are usually evaluated: style strength, content preservation, and fluency. To…

    Submitted 15 April, 2022; originally announced April 2022.

    Comments: Accepted to HumEval 2022

  13. Creativity in translation: machine translation as a constraint for literary texts

    Authors: Ana Guerberof Arenas, Antonio Toral

    Abstract: This article presents the results of a study involving the translation of a short story by Kurt Vonnegut from English to Catalan and Dutch using three modalities: machine-translation (MT), post-editing (PE) and translation without aid (HT). Our aim is to explore creativity, understood to involve novelty and acceptability, from a quantitative perspective. The results show that HT has the highest cr…

    Submitted 12 April, 2022; originally announced April 2022.

    Comments: 28 pages, 2 figures, 10 tables

    Journal ref: Translation Spaces 2022

  14. arXiv:2203.08552  [pdf, other]

    cs.CL

    Multilingual Pre-training with Language and Task Adaptation for Multilingual Text Style Transfer

    Authors: Huiyuan Lai, Antonio Toral, Malvina Nissim

    Abstract: We exploit the pre-trained seq2seq model mBART for multilingual text style transfer. Using machine translated data as well as gold aligned English sentences yields state-of-the-art results in the three target languages we consider. Besides, in view of the general scarcity of parallel data, we propose a modular approach for multilingual formality transfer, which consists of two training strategies…

    Submitted 16 March, 2022; originally announced March 2022.

    Comments: Accepted to ACL 2022

  15. arXiv:2109.12012  [pdf, other]

    cs.CL

    Unsupervised Translation of German–Lower Sorbian: Exploring Training and Novel Transfer Methods on a Low-Resource Language

    Authors: Lukas Edman, Ahmet Üstün, Antonio Toral, Gertjan van Noord

    Abstract: This paper describes the methods behind the systems submitted by the University of Groningen for the WMT 2021 Unsupervised Machine Translation task for German–Lower Sorbian (DE–DSB): a high-resource language to a low-resource one. Our system uses a transformer encoder-decoder architecture in which we make three changes to the standard training procedure. First, our training focuses on two langua…

    Submitted 24 September, 2021; originally announced September 2021.

  16. arXiv:2109.04543  [pdf, other]

    cs.CL

    Generic resources are what you need: Style transfer tasks without task-specific parallel training data

    Authors: Huiyuan Lai, Antonio Toral, Malvina Nissim

    Abstract: Style transfer aims to rewrite a source text in a different target style while preserving its content. We propose a novel approach to this task that leverages generic resources, and without using any task-specific parallel (source-target) data outperforms existing unsupervised approaches on the two most popular style transfer tasks: formality transfer and polarity swap. In practice, we adopt a mul…

    Submitted 9 September, 2021; originally announced September 2021.

    Comments: Accepted to EMNLP2021 (main conference)

  17. arXiv:2105.06947  [pdf, other]

    cs.CL

    Thank you BART! Rewarding Pre-Trained Models Improves Formality Style Transfer

    Authors: Huiyuan Lai, Antonio Toral, Malvina Nissim

    Abstract: Scarcity of parallel data causes formality style transfer models to have scarce success in preserving content. We show that fine-tuning pre-trained language (GPT-2) and sequence-to-sequence (BART) models boosts content preservation, and that this is possible even with limited amounts of parallel data. Augmenting these models with rewards that target style and content -- the two core aspects of the…

    Submitted 5 July, 2021; v1 submitted 14 May, 2021; originally announced May 2021.

  18. The Impact of Post-editing and Machine Translation on Creativity and Reading Experience

    Authors: Ana Guerberof Arenas, Antonio Toral

    Abstract: This article presents the results of a study involving the translation of a fictional story from English into Catalan in three modalities: machine-translated (MT), post-edited (MTPE) and translated without aid (HT). Each translation was analysed to evaluate its creativity. Subsequently, a cohort of 88 Catalan participants read the story in a randomly assigned modality and completed a survey. The r…

    Submitted 15 January, 2021; originally announced January 2021.

    Comments: 28 pages, 10 tables, 4 figures. Translation Spaces (2020)

  19. arXiv:2011.14979  [pdf]

    cs.CL

    Machine Translation of Novels in the Age of Transformer

    Authors: Antonio Toral, Antoni Oliver, Pau Ribas Ballestín

    Abstract: In this chapter we build a machine translation (MT) system tailored to the literary domain, specifically to novels, based on the state-of-the-art architecture in neural MT (NMT), the Transformer (Vaswani et al., 2017), for the translation direction English-to-Catalan. Subsequently, we assess to what extent such a system can be useful by evaluating its translations, by comparing this MT system agai…

    Submitted 30 November, 2020; originally announced November 2020.

    Comments: Chapter published in the book Maschinelle Übersetzung für Übersetzungsprofis (pp. 276-295). Jörg Porsiel (Ed.), BDÜ Fachverlag, 2020. ISBN 978-3-946702-09-2

  20. arXiv:2011.04308  [pdf, other]

    cs.CL

    Character-level Representations Improve DRS-based Semantic Parsing Even in the Age of BERT

    Authors: Rik van Noord, Antonio Toral, Johan Bos

    Abstract: We combine character-level and contextual language model representations to improve performance on Discourse Representation Structure parsing. Character representations can easily be added in a sequence-to-sequence model in either one encoder or as a fully separate encoder, with improvements that are robust to different language models, languages and data sets. For English, these improvements are…

    Submitted 9 November, 2020; originally announced November 2020.

    Comments: EMNLP 2020 (long)

  21. arXiv:2006.08297  [pdf, other]

    cs.CL

    Fine-grained Human Evaluation of Transformer and Recurrent Approaches to Neural Machine Translation for English-to-Chinese

    Authors: Yuying Ye, Antonio Toral

    Abstract: This research presents a fine-grained human evaluation to compare the Transformer and recurrent approaches to neural machine translation (MT), on the translation direction English-to-Chinese. To this end, we develop an error taxonomy compliant with the Multidimensional Quality Metrics (MQM) framework that is customised to the relevant phenomena of this translation direction. We then conduct an err…

    Submitted 15 June, 2020; originally announced June 2020.

    Comments: Accepted at the 22nd Annual Conference of the European Association for Machine Translation (EAMT 2020)

  22. arXiv:2005.05738  [pdf, other]

    cs.CL

    Reassessing Claims of Human Parity and Super-Human Performance in Machine Translation at WMT 2019

    Authors: Antonio Toral

    Abstract: We reassess the claims of human parity and super-human performance made at the news shared task of WMT 2019 for three translation directions: English-to-German, English-to-Russian and German-to-English. First we identify three potential issues in the human evaluation of that shared task: (i) the limited amount of intersentential context available, (ii) the limited translation proficiency of the ev…

    Submitted 12 May, 2020; originally announced May 2020.

    Comments: Accepted at the 22nd Annual Conference of the European Association for Machine Translation (EAMT 2020)

  23. A Set of Recommendations for Assessing Human-Machine Parity in Language Translation

    Authors: Samuel Läubli, Sheila Castilho, Graham Neubig, Rico Sennrich, Qinlan Shen, Antonio Toral

    Abstract: The quality of machine translation has increased remarkably over the past years, to the degree that it was found to be indistinguishable from professional human translation in a number of empirical investigations. We reassess Hassan et al.'s 2018 investigation into Chinese to English news translation, showing that the finding of human-machine parity was owed to weaknesses in the evaluation design…

    Submitted 3 April, 2020; originally announced April 2020.

    Journal ref: Journal of Artificial Intelligence Research 67 (2020) 653-672

  24. arXiv:1907.00900  [pdf, other]

    cs.CL

    Post-editese: an Exacerbated Translationese

    Authors: Antonio Toral

    Abstract: Post-editing (PE) machine translation (MT) is widely used for dissemination because it leads to higher productivity than human translation from scratch (HT). In addition, PE translations are found to be of equal or better quality than HTs. However, most such studies measure quality solely as the number of errors. We conduct a set of computational analyses in which we compare PE against HT on three…

    Submitted 3 October, 2019; v1 submitted 1 July, 2019; originally announced July 2019.

    Comments: Accepted at the 17th Machine Translation Summit. v2: two references added

  25. arXiv:1906.08069  [pdf, other]

    cs.CL

    The Effect of Translationese in Machine Translation Test Sets

    Authors: Mike Zhang, Antonio Toral

    Abstract: The effect of translationese has been studied in the field of machine translation (MT), mostly with respect to training data. We study in depth the effect of translationese on test data, using the test sets from the last three editions of WMT's news shared task, containing 17 translation directions. We show evidence that (i) the use of translationese in test sets results in inflated human evaluati…

    Submitted 19 June, 2019; originally announced June 2019.

    Comments: 9 pages, 10 pages appendix, 3 figures, 20 tables, accepted in WMT19

  26. arXiv:1810.12579  [pdf, other]

    cs.CL

    Exploring Neural Methods for Parsing Discourse Representation Structures

    Authors: Rik van Noord, Lasha Abzianidze, Antonio Toral, Johan Bos

    Abstract: Neural methods have had several recent successes in semantic parsing, though they have yet to face the challenge of producing meaning representations based on formal semantics. We present a sequence-to-sequence neural semantic parser that is able to produce Discourse Representation Structures (DRSs) for English sentences with high accuracy, outperforming traditional DRS parsers. To facilitate the…

    Submitted 30 October, 2018; originally announced October 2018.

    Comments: to appear in TACL 2018

  27. arXiv:1808.10432  [pdf, other]

    cs.CL

    Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Machine Translation

    Authors: Antonio Toral, Sheila Castilho, Ke Hu, Andy Way

    Abstract: We reassess a recent study (Hassan et al., 2018) that claimed that machine translation (MT) has reached human parity for the translation of news from Chinese into English, using pairwise ranking and considering three variables that were not taken into account in that previous study: the language in which the source side of the test set was originally written, the translation proficiency of the eva…

    Submitted 30 August, 2018; originally announced August 2018.

    Comments: WMT 2018

  28. Quantitative Fine-Grained Human Evaluation of Machine Translation Systems: a Case Study on English to Croatian

    Authors: Filip Klubička, Antonio Toral, Víctor M. Sánchez-Cartagena

    Abstract: This paper presents a quantitative fine-grained manual evaluation approach to comparing the performance of different machine translation (MT) systems. We build upon the well-established Multidimensional Quality Metrics (MQM) error taxonomy and implement a novel method that assesses whether the differences in performance for MQM error types between different MT systems are statistically significant…

    Submitted 2 February, 2018; originally announced February 2018.

    Comments: 22 pages, 2 figures, 9 tables, 1 equation. This is a post-peer-review, pre-copyedit version of an article published in Machine Translation Journal. The final authenticated version will be available online at the journal page. arXiv admin note: substantial text overlap with arXiv:1706.04389

    MSC Class: 68T50

    Journal ref: Machine Translation, pp 1-21, (2018), http://rdcu.be/GIkb

  29. arXiv:1801.04962  [pdf, other]

    cs.CL

    What Level of Quality can Neural Machine Translation Attain on Literary Text?

    Authors: Antonio Toral, Andy Way

    Abstract: Given the rise of a new approach to MT, Neural MT (NMT), and its promising performance on different text types, we assess the translation quality it can attain on what is perceived to be the greatest challenge for MT: literary text. Specifically, we target novels, arguably the most popular type of literary text. We build a literary-adapted NMT system for the English-to-Catalan translation directio…

    Submitted 15 January, 2018; originally announced January 2018.

    Comments: Chapter for the forthcoming book "Translation Quality Assessment: From Principles to Practice" (Springer)

  30. Fine-grained human evaluation of neural versus phrase-based machine translation

    Authors: Filip Klubička, Antonio Toral, Víctor M. Sánchez-Cartagena

    Abstract: We compare three approaches to statistical machine translation (pure phrase-based, factored phrase-based and neural) by performing a fine-grained manual evaluation via error annotation of the systems' outputs. The error types in our annotation are compliant with the multidimensional quality metrics (MQM), and the annotation is performed by two annotators. Inter-annotator agreement is high for such…

    Submitted 14 June, 2017; originally announced June 2017.

    Comments: 12 pages, 2 figures, The Prague Bulletin of Mathematical Linguistics

    ACM Class: I.2.7

    Journal ref: The Prague Bulletin of Mathematical Linguistics No. 108, pp. 121-132 (2017)

  31. arXiv:1701.02901  [pdf, other]

    cs.CL

    A Multifaceted Evaluation of Neural versus Phrase-Based Machine Translation for 9 Language Directions

    Authors: Antonio Toral, Víctor M. Sánchez-Cartagena

    Abstract: We aim to shed light on the strengths and weaknesses of the newly introduced neural machine translation paradigm. To that end, we conduct a multifaceted evaluation in which we compare outputs produced by state-of-the-art neural machine translation and phrase-based machine translation systems for 9 language directions across a number of dimensions. Specifically, we measure the similarity of the out…

    Submitted 11 January, 2017; originally announced January 2017.

    Comments: Conference of the European Chapter of the Association for Computational Linguistics (EACL). April 2017, València, Spain

  32. arXiv:1303.1932  [pdf]

    cs.CL

    Mining and Exploiting Domain-Specific Corpora in the PANACEA Platform

    Authors: Núria Bel, Vassilis Papavasiliou, Prokopis Prokopidis, Antonio Toral, Victoria Arranz

    Abstract: The objective of the PANACEA ICT-2007.2.2 EU project is to build a platform that automates the stages involved in the acquisition, production, updating and maintenance of the large language resources required by, among others, MT systems. The development of a Corpus Acquisition Component (CAC) for extracting monolingual and bilingual data from the web is one of the most innovative building blocks…

    Submitted 8 March, 2013; originally announced March 2013.

    Comments: 3 pages. Also available in UPF institutional repository (http://hdl.handle.net/10230/20416)

    Journal ref: Proceedings of the 5th Workshop on Building and Using Comparable Corpora at the Eighth International Conference on Language Resources and Evaluation (LREC-2012); 2012 May 23-25; Istanbul, Turkey. Paris: ELRA; 2012. p. 24-26