Showing 1–7 of 7 results for author: Baali, M

  1. Assistant, Parrot, or Colonizing Loudspeaker? ChatGPT Metaphors for Developing Critical AI Literacies

    Authors: Anuj Gupta, Yasser Atef, Anna Mills, Maha Bali

    Abstract: This study explores how discussing metaphors for AI can help build awareness of the frames that shape our understanding of AI systems, particularly large language models (LLMs) like ChatGPT. Given the pressing need to teach "critical AI literacy", discussion of metaphor provides an opportunity for inquiry and dialogue with space for nuance, playfulness, and critique. Using a collaborative autoethn…

    Submitted 15 January, 2024; originally announced January 2024.

    Comments: This is a preprint (accepted version) of an article that has been accepted for publication at the journal Open Praxis: https://openpraxis.org/

    ACM Class: I.2.0; K.3.0; K.3.1; K.4.0; K.4.2; J.4; J.5

  2. arXiv:2310.04445  [pdf, other]

    cs.CL cs.AI cs.LG

    LoFT: Local Proxy Fine-tuning For Improving Transferability Of Adversarial Attacks Against Large Language Model

    Authors: Muhammad Ahmed Shah, Roshan Sharma, Hira Dhamyal, Raphael Olivier, Ankit Shah, Joseph Konan, Dareen Alharthi, Hazim T Bukhari, Massa Baali, Soham Deshmukh, Michael Kuhlmann, Bhiksha Raj, Rita Singh

    Abstract: It has been shown that Large Language Model (LLM) alignments can be circumvented by appending specially crafted attack suffixes with harmful queries to elicit harmful responses. To conduct attacks against private target models whose characterization is unknown, public models can be used as proxies to fashion the attack, with successful attacks being transferred from public proxies to private targe…

    Submitted 21 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

  3. arXiv:2308.16149  [pdf, other]

    cs.CL cs.AI cs.LG

    Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models

    Authors: Neha Sengupta, Sunil Kumar Sahu, Bokang Jia, Satheesh Katipomu, Haonan Li, Fajri Koto, William Marshall, Gurpreet Gosal, Cynthia Liu, Zhiming Chen, Osama Mohammed Afzal, Samta Kamboj, Onkar Pandit, Rahul Pal, Lalit Pradhan, Zain Muhammad Mujahid, Massa Baali, Xudong Han, Sondos Mahmoud Bsharat, Alham Fikri Aji, Zhiqiang Shen, Zhengzhong Liu, Natalia Vassilieva, Joel Hestness, Andy Hock, et al. (7 additional authors not shown)

    Abstract: We introduce Jais and Jais-chat, new state-of-the-art Arabic-centric foundation and instruction-tuned open generative large language models (LLMs). The models are based on the GPT-3 decoder-only architecture and are pretrained on a mixture of Arabic and English texts, including source code in various programming languages. With 13 billion parameters, they demonstrate better knowledge and reasoning…

    Submitted 29 September, 2023; v1 submitted 30 August, 2023; originally announced August 2023.

    Comments: Arabic-centric, foundation model, large-language model, LLM, generative model, instruction-tuned, Jais, Jais-chat

    MSC Class: 68T50

    ACM Class: F.2.2; I.2.7

  4. arXiv:2306.07936  [pdf, other]

    eess.AS cs.CL cs.SD

    FOOCTTS: Generating Arabic Speech with Acoustic Environment for Football Commentator

    Authors: Massa Baali, Ahmed Ali

    Abstract: This paper presents FOOCTTS, an automatic pipeline for a football commentator that generates speech with background crowd noise. The application gets the text from the user, applies text pre-processing such as vowelization, followed by the commentator's speech synthesizer. Our pipeline included Arabic automatic speech recognition for data labeling, CTC segmentation, transcription vowelization to m…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: Accepted at Interspeech 2023 Show & Tell Demo Session

  5. arXiv:2306.04368  [pdf, other]

    cs.SD cs.CL eess.AS

    Arabic Dysarthric Speech Recognition Using Adversarial and Signal-Based Augmentation

    Authors: Massa Baali, Ibrahim Almakky, Shady Shehata, Fakhri Karray

    Abstract: Despite major advancements in Automatic Speech Recognition (ASR), the state-of-the-art ASR systems struggle to deal with impaired speech even with high-resource languages. In Arabic, this challenge gets amplified, with added complexities in collecting data from dysarthric speakers. In this paper, we aim to improve the performance of Arabic dysarthric automatic speech recognition through a multi-st…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: Accepted to Interspeech 2023

  6. arXiv:2301.09099  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Unsupervised Data Selection for TTS: Using Arabic Broadcast News as a Case Study

    Authors: Massa Baali, Tomoki Hayashi, Hamdy Mubarak, Soumi Maiti, Shinji Watanabe, Wassim El-Hajj, Ahmed Ali

    Abstract: Several high-resource Text to Speech (TTS) systems currently produce natural, well-established human-like speech. In contrast, low-resource languages, including Arabic, have very limited TTS systems due to the lack of resources. We propose a fully unsupervised method for building TTS, including automatic data selection and pre-training/fine-tuning strategies for TTS training, using broadcast news…

    Submitted 26 January, 2023; v1 submitted 22 January, 2023; originally announced January 2023.

  7. arXiv:2203.03601  [pdf, other]

    cs.CL

    Creating Speech-to-Speech Corpus from Dubbed Series

    Authors: Massa Baali, Wassim El-Hajj, Ahmed Ali

    Abstract: Dubbed series are gaining a lot of popularity in recent years with strong support from major media service providers. Such popularity is fueled by studies that showed that dubbed versions of TV shows are more popular than their subtitled equivalents. We propose an unsupervised approach to construct speech-to-speech corpus, aligned on short segment levels, to produce a parallel speech corpus in the…

    Submitted 7 March, 2022; originally announced March 2022.