
Showing 1–1 of 1 results for author: Sanchez, L K C

  1. arXiv:2210.02509  [pdf, other]

    cs.CL

    Revisiting Syllables in Language Modelling and their Application on Low-Resource Machine Translation

    Authors: Arturo Oncevay, Kervy Dante Rivas Rojas, Liz Karen Chavez Sanchez, Roberto Zariquiey

    Abstract: Language modelling and machine translation tasks mostly use subword or character inputs, but syllables are seldom used. Syllables provide shorter sequences than characters, require less-specialised extracting rules than morphemes, and their segmentation is not impacted by the corpus size. In this study, we first explore the potential of syllables for open-vocabulary language modelling in 21 langua…

    Submitted 5 October, 2022; originally announced October 2022.

    Comments: COLING 2022, short-paper