Showing 1–14 of 14 results for author: Cettolo, M

  1. arXiv:2405.10741  [pdf, other]

    cs.CL

    SBAAM! Eliminating Transcript Dependency in Automatic Subtitling

    Authors: Marco Gaido, Sara Papi, Matteo Negri, Mauro Cettolo, Luisa Bentivogli

    Abstract: Subtitling plays a crucial role in enhancing the accessibility of audiovisual content and encompasses three primary subtasks: translating spoken dialogue, segmenting translations into concise textual units, and estimating timestamps that govern their on-screen duration. Past attempts to automate this process rely, to varying degrees, on automatic transcripts, employed diversely for the three subta…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: Accepted to ACL 2024 main conference

  2. arXiv:2310.15752  [pdf, other]

    cs.CL cs.AI

    Integrating Language Models into Direct Speech Translation: An Inference-Time Solution to Control Gender Inflection

    Authors: Dennis Fucci, Marco Gaido, Sara Papi, Mauro Cettolo, Matteo Negri, Luisa Bentivogli

    Abstract: When translating words referring to the speaker, speech translation (ST) systems should not resort to default masculine generics nor rely on potentially misleading vocal traits. Rather, they should assign gender according to the speakers' preference. The existing solutions to do so, though effective, are hardly feasible in practice as they involve dedicated model re-training on gender-labeled ST d…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: Accepted at EMNLP 2023

  3. arXiv:2310.06590  [pdf, ps, other]

    cs.CL

    No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation

    Authors: Dennis Fucci, Marco Gaido, Matteo Negri, Mauro Cettolo, Luisa Bentivogli

    Abstract: Automatic speech recognition (ASR) systems are known to be sensitive to the sociolinguistic variability of speech data, in which gender plays a crucial role. This can result in disparities in recognition accuracy between male and female speakers, primarily due to the under-representation of the latter group in the training data. While in the context of hybrid ASR models several solutions have been…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: Accepted at ASRU 2023

  4. arXiv:2209.13192  [pdf, other]

    cs.CL

    Direct Speech Translation for Automatic Subtitling

    Authors: Sara Papi, Marco Gaido, Alina Karakanta, Mauro Cettolo, Matteo Negri, Marco Turchi

    Abstract: Automatic subtitling is the task of automatically translating the speech of audiovisual content into short pieces of timed text, i.e. subtitles and their corresponding timestamps. The generated subtitles need to conform to space and time requirements, while being synchronised with the speech and segmented in a way that facilitates comprehension. Given its considerable complexity, the task has so f…

    Submitted 25 July, 2023; v1 submitted 27 September, 2022; originally announced September 2022.

    Comments: Accepted at TACL

  5. arXiv:2205.09360  [pdf, other]

    cs.CL

    Evaluating Subtitle Segmentation for End-to-end Generation Systems

    Authors: Alina Karakanta, François Buet, Mauro Cettolo, François Yvon

    Abstract: Subtitles appear on screen as short pieces of text, segmented based on formal constraints (length) and syntactic/semantic criteria. Subtitle segmentation can be evaluated with sequence segmentation metrics against a human reference. However, standard segmentation metrics cannot be applied when systems generate outputs different than the reference, e.g. with end-to-end subtitling systems. In this p…

    Submitted 19 May, 2022; originally announced May 2022.

    Comments: Accepted at LREC 2022

  6. arXiv:2106.01045  [pdf, other]

    cs.CL

    Cascade versus Direct Speech Translation: Do the Differences Still Make a Difference?

    Authors: Luisa Bentivogli, Mauro Cettolo, Marco Gaido, Alina Karakanta, Alberto Martinelli, Matteo Negri, Marco Turchi

    Abstract: Five years after the first published proofs of concept, direct approaches to speech translation (ST) are now competing with traditional cascade solutions. In light of this steady progress, can we claim that the performance gap between the two is closed? Starting from this question, we present a systematic comparison between state-of-the-art systems representative of the two paradigms. Focusing on…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: Accepted at ACL2021

  7. arXiv:2104.11710  [pdf, other]

    cs.SD cs.CL eess.AS

    Beyond Voice Activity Detection: Hybrid Audio Segmentation for Direct Speech Translation

    Authors: Marco Gaido, Matteo Negri, Mauro Cettolo, Marco Turchi

    Abstract: The audio segmentation mismatch between training data and those seen at run-time is a major problem in direct speech translation. Indeed, while systems are usually trained on manually segmented corpora, in real use cases they are often presented with continuous audio requiring automatic (and sub-optimal) segmentation. After comparing existing techniques (VAD-based, fixed-length and hybrid segmenta…

    Submitted 14 October, 2021; v1 submitted 23 April, 2021; originally announced April 2021.

    Comments: Accepted to ICNLSP 2021

  8. arXiv:2102.01578  [pdf, other]

    cs.CL

    CTC-based Compression for Direct Speech Translation

    Authors: Marco Gaido, Mauro Cettolo, Matteo Negri, Marco Turchi

    Abstract: Previous studies demonstrated that a dynamic phone-informed compression of the input audio is beneficial for speech translation (ST). However, they required a dedicated model for phone recognition and did not test this solution for direct ST, in which a single model translates the input audio into the target language without intermediate representations. In this work, we propose the first method a…

    Submitted 2 February, 2021; originally announced February 2021.

    Comments: Accepted at EACL2021

    Journal ref: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume (2021), 690-696

  9. arXiv:2008.02270  [pdf, other]

    cs.CL

    Contextualized Translation of Automatically Segmented Speech

    Authors: Marco Gaido, Mattia Antonino Di Gangi, Matteo Negri, Mauro Cettolo, Marco Turchi

    Abstract: Direct speech-to-text translation (ST) models are usually trained on corpora segmented at sentence level, but at inference time they are commonly fed with audio split by a voice activity detector (VAD). Since VAD segmentation is not syntax-informed, the resulting segments do not necessarily correspond to well-formed sentences uttered by the speaker but, most likely, to fragments of one or more sen…

    Submitted 5 August, 2020; originally announced August 2020.

    Comments: Interspeech 2020

  10. arXiv:1911.12091  [pdf, ps, other]

    cs.CL cs.AI cs.IR

    Findings of the 2016 WMT Shared Task on Cross-lingual Pronoun Prediction

    Authors: Liane Guillou, Christian Hardmeier, Preslav Nakov, Sara Stymne, Jörg Tiedemann, Yannick Versley, Mauro Cettolo, Bonnie Webber, Andrei Popescu-Belis

    Abstract: We describe the design, the evaluation setup, and the results of the 2016 WMT shared task on cross-lingual pronoun prediction. This is a classification task in which participants are asked to provide predictions on what pronoun class label should replace a placeholder value in the target-language text, provided in lemmatised and PoS-tagged form. We provided four subtasks, for the English-French an…

    Submitted 27 November, 2019; originally announced November 2019.

    Comments: cross-lingual pronoun prediction, WMT, shared task, English, German, French

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: WMT-2016

  11. arXiv:1806.06957  [pdf, ps, other]

    cs.CL

    A Comparison of Transformer and Recurrent Neural Networks on Multilingual Neural Machine Translation

    Authors: Surafel M. Lakew, Mauro Cettolo, Marcello Federico

    Abstract: Recently, neural machine translation (NMT) has been extended to multilinguality, that is to handle more than one translation direction with a single system. Multilingual NMT showed competitive performance against pure bilingual systems. Notably, in low-resource settings, it proved to work effectively and efficiently, thanks to shared representation space that is forced across languages and induces…

    Submitted 20 June, 2018; v1 submitted 18 June, 2018; originally announced June 2018.

    Comments: 12 pages, to appear on the 27th International Conference on Computational Linguistics (COLING 2018)

  12. arXiv:1612.04683  [pdf, other]

    cs.CL

    Unsupervised Clustering of Commercial Domains for Adaptive Machine Translation

    Authors: Mauro Cettolo, Mara Chinea Rios, Roldano Cattoni

    Abstract: In this paper, we report on domain clustering in the ambit of an adaptive MT architecture. A standard bottom-up hierarchical clustering algorithm has been instantiated with five different distances, which have been compared, on an MT benchmark built on 40 commercial domains, in terms of dendrograms, intrinsic and extrinsic evaluations. The main outcome is that the most expensive distance is also t…

    Submitted 14 December, 2016; originally announced December 2016.

    Comments: 9 pages report on Summer Internship at FBK

  13. arXiv:1610.00572  [pdf, other]

    cs.CL cs.IR

    An Arabic-Hebrew parallel corpus of TED talks

    Authors: Mauro Cettolo

    Abstract: We describe an Arabic-Hebrew parallel corpus of TED talks built upon WIT3, the Web inventory that repurposes the original content of the TED website in a way which is more convenient for MT researchers. The benchmark consists of about 2,000 talks, whose subtitles in Arabic and Hebrew have been accurately aligned and rearranged in sentences, for a total of about 3.5M tokens per language. Talks have…

    Submitted 3 October, 2016; originally announced October 2016.

    Comments: To appear in Proceedings of the AMTA 2016 Workshop on Semitic Machine Translation (SeMaT)

  14. arXiv:1608.04631  [pdf, other]

    cs.CL

    Neural versus Phrase-Based Machine Translation Quality: a Case Study

    Authors: Luisa Bentivogli, Arianna Bisazza, Mauro Cettolo, Marcello Federico

    Abstract: Within the field of Statistical Machine Translation (SMT), the neural approach (NMT) has recently emerged as the first technology able to challenge the long-standing dominance of phrase-based approaches (PBMT). In particular, at the IWSLT 2015 evaluation campaign, NMT outperformed well established state-of-the-art PBMT systems on English-German, a language pair known to be particularly hard becaus…

    Submitted 9 October, 2016; v1 submitted 16 August, 2016; originally announced August 2016.

    Comments: Conference on Empirical Methods in Natural Language Processing (EMNLP), November 1-5, 2016, Austin, Texas, USA