Showing 1–1 of 1 results for author: Thinwa, C

  1. arXiv:2304.12155  [pdf, ps, other]

    cs.CL cs.LG

    The African Stopwords project: curating stopwords for African languages

    Authors: Chris Emezue, Hellina Nigatu, Cynthia Thinwa, Helper Zhou, Shamsuddeen Muhammad, Lerato Louis, Idris Abdulmumin, Samuel Oyerinde, Benjamin Ajibade, Olanrewaju Samuel, Oviawe Joshua, Emeka Onwuegbuzia, Handel Emezue, Ifeoluwatayo A. Ige, Atnafu Lambebo Tonja, Chiamaka Chukwuneke, Bonaventure F. P. Dossou, Naome A. Etori, Mbonu Chinedu Emmanuel, Oreen Yousuf, Kaosarat Aina, Davis David

    Abstract: Stopwords are fundamental in Natural Language Processing (NLP) techniques for information retrieval. One of the common tasks in preprocessing of text data is the removal of stopwords. Currently, while high-resource languages like English benefit from the availability of several stopwords, low-resource languages, such as those found in the African continent, have none that are standardized and avai…

    Submitted 21 March, 2023; originally announced April 2023.

    Comments: Accepted at the AfricaNLP workshop at ICLR2022