Showing 1–5 of 5 results for author: Inciarte, A A

  1. arXiv:2311.08844  [pdf, other]

    cs.CV cs.CL

    Violet: A Vision-Language Model for Arabic Image Captioning with Gemini Decoder

    Authors: Abdelrahman Mohamed, Fakhraddin Alwajih, El Moatez Billah Nagoudi, Alcides Alcoba Inciarte, Muhammad Abdul-Mageed

    Abstract: Although image captioning has a vast array of applications, it has not reached its full potential in languages other than English. Arabic, for instance, although the native language of more than 400 million people, remains largely underrepresented in this area. This is due to the lack of labeled data and powerful Arabic generative models. We alleviate this issue by presenting a novel vision-langua…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: Accepted in ArabicNLP Conference

  2. arXiv:2304.13292  [pdf, other]

    cs.CL

    Zero-Shot Slot and Intent Detection in Low-Resource Languages

    Authors: Sang Yun Kwon, Gagan Bhatia, El Moatez Billah Nagoudi, Alcides Alcoba Inciarte, Muhammad Abdul-Mageed

    Abstract: Intent detection and slot filling are critical tasks in spoken and natural language understanding for task-oriented dialog systems. In this work we describe our participation in the slot and intent detection for low-resource language varieties (SID4LR; Aepli et al. (2023)). We investigate the slot and intent detection (SID) tasks using a wide range of models and settings. Given the recent success…

    Submitted 26 April, 2023; originally announced April 2023.

    Comments: VarDial @ EACL

  3. arXiv:2212.10785  [pdf, other]

    cs.CL cs.AI

    SERENGETI: Massively Multilingual Language Models for Africa

    Authors: Ife Adebara, AbdelRahim Elmadany, Muhammad Abdul-Mageed, Alcides Alcoba Inciarte

    Abstract: Multilingual pretrained language models (mPLMs) acquire valuable, generalizable linguistic information during pretraining and have advanced the state of the art on task-specific finetuning. To date, only ~31 out of ~2,000 African languages are covered in existing language models. We ameliorate this limitation by developing SERENGETI, a massively multilingual language model that covers 517 African…

    Submitted 26 May, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

    Comments: To appear in Findings of ACL 2023

  4. arXiv:2212.10755  [pdf, other]

    cs.CL

    JASMINE: Arabic GPT Models for Few-Shot Learning

    Authors: El Moatez Billah Nagoudi, Muhammad Abdul-Mageed, AbdelRahim Elmadany, Alcides Alcoba Inciarte, Md Tawkat Islam Khondaker

    Abstract: Scholarship on generative pretraining (GPT) remains acutely Anglocentric, leaving serious gaps in our understanding of the whole class of autoregressive models. For example, we have little knowledge about the potential of these models and their societal impacts in diverse linguistic and cultural settings. We alleviate this issue for Arabic, a wide collection of languages and dialectal varieties wi…

    Submitted 24 October, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

  5. arXiv:2210.11744  [pdf, other]

    cs.CL cs.LG

    AfroLID: A Neural Language Identification Tool for African Languages

    Authors: Ife Adebara, AbdelRahim Elmadany, Muhammad Abdul-Mageed, Alcides Alcoba Inciarte

    Abstract: Language identification (LID) is a crucial precursor for NLP, especially for mining web data. Problematically, most of the world's 7000+ languages today are not covered by LID technologies. We address this pressing issue for Africa by introducing AfroLID, a neural LID toolkit for 517 African languages and varieties. AfroLID exploits a multi-domain web dataset manually curated from across 14 lang…

    Submitted 6 December, 2022; v1 submitted 21 October, 2022; originally announced October 2022.

    Comments: To appear at EMNLP 2022 Main conference