Showing 1–24 of 24 results for author: Hamza, W

  1. arXiv:2401.02921  [pdf, other]

    cs.CL eess.AS

    Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks

    Authors: Kevin Everson, Yile Gu, Huck Yang, Prashanth Gurunath Shivakumar, Guan-Ting Lin, Jari Kolehmainen, Ivan Bulyko, Ankur Gandhe, Shalini Ghosh, Wael Hamza, Hung-yi Lee, Ariya Rastrow, Andreas Stolcke

    Abstract: In the realm of spoken language understanding (SLU), numerous natural language understanding (NLU) methodologies have been adapted by supplying large language models (LLMs) with transcribed speech instead of conventional written text. In real-world scenarios, prior to input into an LLM, an automated speech recognition (ASR) system generates an output transcript hypothesis, where inherent errors ca…

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: Accepted to ICASSP 2024

  2. arXiv:2306.08756  [pdf, other]

    cs.CL cs.AI cs.LG

    Recipes for Sequential Pre-training of Multilingual Encoder and Seq2Seq Models

    Authors: Saleh Soltan, Andy Rosenbaum, Tobias Falke, Qin Lu, Anna Rumshisky, Wael Hamza

    Abstract: Pre-trained encoder-only and sequence-to-sequence (seq2seq) models each have advantages, however training both model types from scratch is computationally expensive. We explore recipes to improve pre-training efficiency by initializing one model from the other. (1) Extracting the encoder from a seq2seq model, we show it under-performs a Masked Language Modeling (MLM) encoder, particularly on seque…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: ACL Findings 2023 and SustaiNLP Workshop 2023

  3. Scalable and Accurate Self-supervised Multimodal Representation Learning without Aligned Video and Text Data

    Authors: Vladislav Lialin, Stephen Rawls, David Chan, Shalini Ghosh, Anna Rumshisky, Wael Hamza

    Abstract: Scaling up weakly-supervised datasets has shown to be highly effective in the image-text domain and has contributed to most of the recent state-of-the-art computer vision and multimodal neural networks. However, existing large-scale video-text datasets and mining techniques suffer from several limitations, such as the scarcity of aligned data, the lack of diversity in the data, and the difficulty…

    Submitted 4 April, 2023; originally announced April 2023.

    Journal ref: 2023 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW)

  4. arXiv:2301.09809  [pdf, other]

    cs.CL

    Low-Resource Compositional Semantic Parsing with Concept Pretraining

    Authors: Subendhu Rongali, Mukund Sridhar, Haidar Khan, Konstantine Arkoudas, Wael Hamza, Andrew McCallum

    Abstract: Semantic parsing plays a key role in digital voice assistants such as Alexa, Siri, and Google Assistant by mapping natural language to structured meaning representations. When we want to improve the capabilities of a voice assistant by adding a new domain, the underlying semantic parsing model needs to be retrained using thousands of annotated examples from the new domain, which is time-consuming…

    Submitted 30 January, 2023; v1 submitted 23 January, 2023; originally announced January 2023.

    Comments: EACL 2023

  5. arXiv:2210.07074  [pdf, other]

    cs.CL cs.AI cs.LG

    CLASP: Few-Shot Cross-Lingual Data Augmentation for Semantic Parsing

    Authors: Andy Rosenbaum, Saleh Soltan, Wael Hamza, Amir Saffari, Marco Damonte, Isabel Groves

    Abstract: A bottleneck to developing Semantic Parsing (SP) models is the need for a large volume of human-labeled training data. Given the complexity and cost of human annotation for SP, labeled data is often scarce, particularly in multilingual settings. Large Language Models (LLMs) excel at SP given only a few examples, however LLMs are unsuitable for runtime systems which require low latency. In this wor…

    Submitted 14 October, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: Accepted to AACL-IJCNLP 2022: The 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, November 20-23, 2022. See https://www.aacl2022.org/

  6. arXiv:2209.09900  [pdf, other]

    cs.CL cs.AI cs.LG

    LINGUIST: Language Model Instruction Tuning to Generate Annotated Utterances for Intent Classification and Slot Tagging

    Authors: Andy Rosenbaum, Saleh Soltan, Wael Hamza, Yannick Versley, Markus Boese

    Abstract: We present LINGUIST, a method for generating annotated data for Intent Classification and Slot Tagging (IC+ST), via fine-tuning AlexaTM 5B, a 5-billion-parameter multilingual sequence-to-sequence (seq2seq) model, on a flexible instruction prompt. In a 10-shot novel intent setting for the SNIPS dataset, LINGUIST surpasses state-of-the-art approaches (Back-Translation and Example Extrapolation) by a…

    Submitted 20 September, 2022; originally announced September 2022.

    Comments: Accepted to The 29th International Conference on Computational Linguistics (COLING 2022) October 12-17, 2022, Gyeongju, Republic of Korea https://coling2022.org/

  7. arXiv:2208.01448  [pdf, other]

    cs.CL cs.LG

    AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model

    Authors: Saleh Soltan, Shankar Ananthakrishnan, Jack FitzGerald, Rahul Gupta, Wael Hamza, Haidar Khan, Charith Peris, Stephen Rawls, Andy Rosenbaum, Anna Rumshisky, Chandana Satya Prakash, Mukund Sridhar, Fabian Triefenbach, Apurv Verma, Gokhan Tur, Prem Natarajan

    Abstract: In this work, we demonstrate that multilingual large-scale sequence-to-sequence (seq2seq) models, pre-trained on a mixture of denoising and Causal Language Modeling (CLM) tasks, are more efficient few-shot learners than decoder-only models on various tasks. In particular, we train a 20 billion parameter multilingual seq2seq model called Alexa Teacher Model (AlexaTM 20B) and show that it achieves s…

    Submitted 3 August, 2022; v1 submitted 2 August, 2022; originally announced August 2022.

  8. arXiv:2206.07808  [pdf, other]

    cs.CL cs.AI cs.LG

    Alexa Teacher Model: Pretraining and Distilling Multi-Billion-Parameter Encoders for Natural Language Understanding Systems

    Authors: Jack FitzGerald, Shankar Ananthakrishnan, Konstantine Arkoudas, Davide Bernardi, Abhishek Bhagia, Claudio Delli Bovi, Jin Cao, Rakesh Chada, Amit Chauhan, Luoxin Chen, Anurag Dwarakanath, Satyam Dwivedi, Turan Gojayev, Karthik Gopalakrishnan, Thomas Gueudre, Dilek Hakkani-Tur, Wael Hamza, Jonathan Hueser, Kevin Martin Jose, Haidar Khan, Beiye Liu, Jianhua Lu, Alessandro Manzotti, Pradeep Natarajan, Karolina Owczarzak, et al. (16 additional authors not shown)

    Abstract: We present results from a large-scale experiment on pretraining encoders with non-embedding parameter counts ranging from 700M to 9.3B, their subsequent distillation into smaller models ranging from 17M-170M parameters, and their application to the Natural Language Understanding (NLU) component of a virtual assistant system. Though we train using 70% spoken-form data, our teacher models perform co…

    Submitted 15 June, 2022; originally announced June 2022.

    Comments: KDD 2022

    ACM Class: I.2.7

    Journal ref: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '22), August 14-18, 2022, Washington, DC, USA

  9. arXiv:2204.14243  [pdf, other]

    cs.CL

    Training Naturalized Semantic Parsers with Very Little Data

    Authors: Subendhu Rongali, Konstantine Arkoudas, Melanie Rubino, Wael Hamza

    Abstract: Semantic parsing is an important NLP problem, particularly for voice assistants such as Alexa and Google Assistant. State-of-the-art (SOTA) semantic parsers are seq2seq architectures based on large language models that have been pretrained on vast amounts of text. To better leverage that pretraining, recent work has explored a reformulation of semantic parsing whereby the output sequences are them…

    Submitted 4 May, 2022; v1 submitted 29 April, 2022; originally announced April 2022.

    Comments: IJCAI 2022

  10. arXiv:2204.13796  [pdf, other]

    cs.CL cs.AI

    Instilling Type Knowledge in Language Models via Multi-Task QA

    Authors: Shuyang Li, Mukund Sridhar, Chandana Satya Prakash, Jin Cao, Wael Hamza, Julian McAuley

    Abstract: Understanding human language often necessitates understanding entities and their place in a taxonomy of knowledge -- their types. Previous methods to learn entity types rely on training classifiers on datasets with coarse, noisy, and incomplete labels. We introduce a method to instill fine-grained type knowledge in language models with text-to-text pre-training on type-centric questions leveraging…

    Submitted 28 April, 2022; originally announced April 2022.

    Comments: Findings of NAACL 2022; dataset link: https://github.com/amazon-research/wikiwiki-dataset

  11. arXiv:2101.08333  [pdf, other]

    cs.CL

    Zero-shot Generalization in Dialog State Tracking through Generative Question Answering

    Authors: Shuyang Li, Jin Cao, Mukund Sridhar, Henghui Zhu, Shang-Wen Li, Wael Hamza, Julian McAuley

    Abstract: Dialog State Tracking (DST), an integral part of modern dialog systems, aims to track user preferences and constraints (slots) in task-oriented dialogs. In real-world settings with constantly changing services, DST systems must generalize to new domains and unseen slot types. Existing methods for DST do not generalize well to new slot names and many require known ontologies of slot types and value…

    Submitted 20 January, 2021; originally announced January 2021.

    Comments: Accepted as a Long Paper at EACL 2021

  12. arXiv:2012.08549  [pdf, other]

    cs.CL

    Exploring Transfer Learning For End-to-End Spoken Language Understanding

    Authors: Subendhu Rongali, Beiye Liu, Liwei Cai, Konstantine Arkoudas, Chengwei Su, Wael Hamza

    Abstract: Voice Assistants such as Alexa, Siri, and Google Assistant typically use a two-stage Spoken Language Understanding pipeline; first, an Automatic Speech Recognition (ASR) component to process customer speech and generate text transcriptions, followed by a Natural Language Understanding (NLU) component to map transcriptions to an actionable hypothesis. An end-to-end (E2E) system that goes directly f…

    Submitted 15 December, 2020; originally announced December 2020.

    Comments: AAAI 2021

  13. arXiv:2012.02763  [pdf, other]

    cs.CL cs.AI cs.LG

    Delexicalized Paraphrase Generation

    Authors: Boya Yu, Konstantine Arkoudas, Wael Hamza

    Abstract: We present a neural model for paraphrasing and train it to generate delexicalized sentences. We achieve this by creating training data in which each input is paired with a number of reference paraphrases. These sets of reference paraphrases represent a weak type of semantic equivalence based on annotated slots and intents. To understand semantics from different types of slots, other than anonymizi…

    Submitted 4 December, 2020; originally announced December 2020.

  14. arXiv:2010.04355  [pdf]

    cs.CL cs.AI cs.LG

    Style Attuned Pre-training and Parameter Efficient Fine-tuning for Spoken Language Understanding

    Authors: Jin Cao, Jun Wang, Wael Hamza, Kelly Vanee, Shang-Wen Li

    Abstract: Neural models have yielded state-of-the-art results in deciphering spoken language understanding (SLU) problems; however, these models require a significant amount of domain-specific labeled examples for training, which is prohibitively expensive. While pre-trained language models like BERT have been shown to capture a massive amount of knowledge by learning from unlabeled corpora and solve SLU us…

    Submitted 8 October, 2020; originally announced October 2020.

    Comments: Accepted at INTERSPEECH 2020

  15. arXiv:2010.03714  [pdf, other]

    cs.CL cs.AI cs.LG

    Don't Parse, Insert: Multilingual Semantic Parsing with Insertion Based Decoding

    Authors: Qile Zhu, Haidar Khan, Saleh Soltan, Stephen Rawls, Wael Hamza

    Abstract: Semantic parsing is one of the key components of natural language understanding systems. A successful parse transforms an input utterance to an action that is easily understood by the system. Many algorithms have been proposed to solve this problem, from conventional rule-based or statistical slot-filling systems to shift-reduce based neural parsers. For complex parsing tasks, the state-of-the-art m…

    Submitted 7 October, 2020; originally announced October 2020.

    Comments: Presented at CoNLL 2020

  16. Don't Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

    Authors: Subendhu Rongali, Luca Soldaini, Emilio Monti, Wael Hamza

    Abstract: Virtual assistants such as Amazon Alexa, Apple Siri, and Google Assistant often rely on a semantic parsing component to understand which action(s) to execute for an utterance spoken by its users. Traditionally, rule-based or statistical slot-filling systems have been used to parse "simple" queries; that is, queries that contain a single action and can be decomposed into a set of non-overlapping en…

    Submitted 30 January, 2020; originally announced January 2020.

    Comments: To be published in The Web Conference (WWW 2020)

  17. arXiv:2001.05284  [pdf, other]

    cs.CL cs.SD eess.AS

    Improving Spoken Language Understanding By Exploiting ASR N-best Hypotheses

    Authors: Mingda Li, Weitong Ruan, Xinyue Liu, Luca Soldaini, Wael Hamza, Chengwei Su

    Abstract: In a modern spoken language understanding (SLU) system, the natural language understanding (NLU) module takes interpretations of a speech from the automatic speech recognition (ASR) module as the input. The NLU module usually uses the first best interpretation of a given speech in downstream tasks such as domain and intent classification. However, the ASR module might misrecognize some speeches an…

    Submitted 11 January, 2020; originally announced January 2020.

    Comments: Submitted to ICASSP 2020. Have signed an e-copyright agreement with the IEEE during ICASSP 2020 submission

  18. arXiv:1806.10201  [pdf, other]

    cs.CL

    Neural Cross-Lingual Coreference Resolution and its Application to Entity Linking

    Authors: Gourab Kundu, Avirup Sil, Radu Florian, Wael Hamza

    Abstract: We propose an entity-centric neural cross-lingual coreference model that builds on multi-lingual embeddings and language-independent features. We perform both intrinsic and extrinsic evaluations of our model. In the intrinsic evaluation, we show that our model, when trained on English and tested on Chinese and Spanish, achieves competitive results to the models trained directly on Chinese and Span…

    Submitted 26 June, 2018; originally announced June 2018.

    Journal ref: ACL 2018

  19. arXiv:1712.01813  [pdf, other]

    cs.CL

    Neural Cross-Lingual Entity Linking

    Authors: Avirup Sil, Gourab Kundu, Radu Florian, Wael Hamza

    Abstract: A major challenge in Entity Linking (EL) is making effective use of contextual information to disambiguate mentions to Wikipedia that might refer to different entities in different contexts. The problem exacerbates with cross-lingual EL which involves linking mentions written in non-English documents to entries in the English Wikipedia: to compare textual clues across languages we need to compute…

    Submitted 5 December, 2017; originally announced December 2017.

    Comments: Association for the Advancement of Artificial Intelligence (AAAI), 2018

  20. arXiv:1709.01058  [pdf, other]

    cs.CL

    A Unified Query-based Generative Model for Question Generation and Question Answering

    Authors: Linfeng Song, Zhiguo Wang, Wael Hamza

    Abstract: We propose a query-based generative model for solving both tasks of question generation (QG) and question answering (QA). The model follows the classic encoder-decoder framework. The encoder takes a passage and a query as input then performs query understanding by matching the query with the passage from multiple perspectives. The decoder is an attention-based Long Short Term Memory (LSTM) mo…

    Submitted 28 August, 2018; v1 submitted 4 September, 2017; originally announced September 2017.

  21. arXiv:1708.07863  [pdf, other]

    cs.CL cs.AI

    $k$-Nearest Neighbor Augmented Neural Networks for Text Classification

    Authors: Zhiguo Wang, Wael Hamza, Linfeng Song

    Abstract: In recent years, many deep-learning based models are proposed for text classification. This kind of models well fits the training set from the statistical point of view. However, it lacks the capacity of utilizing instance-level information from individual instances in the training set. In this work, we propose to enhance neural network models by allowing them to leverage information from $k$-near…

    Submitted 25 August, 2017; originally announced August 2017.

  22. arXiv:1703.04489  [pdf, ps, other]

    cs.CL cs.AI

    Reinforcement Learning for Transition-Based Mention Detection

    Authors: Georgiana Dinu, Wael Hamza, Radu Florian

    Abstract: This paper describes an application of reinforcement learning to the mention detection task. We define a novel action-based formulation for the mention detection task, in which a model can flexibly revise past labeling decisions by grouping together tokens and assigning partial mention labels. We devise a method to create mention-level episodes and we train a model by rewarding correctly labeled c…

    Submitted 13 March, 2017; originally announced March 2017.

    Comments: Deep Reinforcement Learning Workshop, NIPS 2016

  23. arXiv:1702.03814  [pdf, other]

    cs.AI cs.CL

    Bilateral Multi-Perspective Matching for Natural Language Sentences

    Authors: Zhiguo Wang, Wael Hamza, Radu Florian

    Abstract: Natural language sentence matching is a fundamental technology for a variety of tasks. Previous approaches either match sentences from a single direction or only apply single granular (word-by-word or sentence-by-sentence) matching. In this work, we propose a bilateral multi-perspective matching (BiMPM) model under the "matching-aggregation" framework. Given two sentences $P$ and $Q$, our model fi…

    Submitted 14 July, 2017; v1 submitted 13 February, 2017; originally announced February 2017.

    Comments: To appear in Proceedings of IJCAI 2017

  24. arXiv:1612.04211  [pdf, other]

    cs.CL

    Multi-Perspective Context Matching for Machine Comprehension

    Authors: Zhiguo Wang, Haitao Mi, Wael Hamza, Radu Florian

    Abstract: Previous machine comprehension (MC) datasets are either too small to train end-to-end deep learning models, or not difficult enough to evaluate the ability of current MC techniques. The newly released SQuAD dataset alleviates these limitations, and gives us a chance to develop more realistic MC models. Based on this dataset, we propose a Multi-Perspective Context Matching (MPCM) model, which is an…

    Submitted 13 December, 2016; originally announced December 2016.

    Comments: 8