Showing 1–15 of 15 results for author: Fazel-Zarandi, M

  1. arXiv:2312.05180  [pdf, other]

    cs.CL

    PathFinder: Guided Search over Multi-Step Reasoning Paths

    Authors: Olga Golovneva, Sean O'Brien, Ramakanth Pasunuru, Tianlu Wang, Luke Zettlemoyer, Maryam Fazel-Zarandi, Asli Celikyilmaz

    Abstract: With recent advancements in large language models, methods like chain-of-thought prompting to elicit reasoning chains have been shown to improve results on reasoning tasks. However, tasks that require multiple steps of reasoning still pose significant challenges to state-of-the-art models. Drawing inspiration from the beam search algorithm, we propose PathFinder, a tree-search-based reasoning path…

    Submitted 12 December, 2023; v1 submitted 8 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023 R0-FoMo Workshop

  2. arXiv:2310.02804  [pdf, other]

    cs.CL cs.CV cs.LG

    DOMINO: A Dual-System for Multi-step Visual Language Reasoning

    Authors: Peifang Wang, Olga Golovneva, Armen Aghajanyan, Xiang Ren, Muhao Chen, Asli Celikyilmaz, Maryam Fazel-Zarandi

    Abstract: Visual language reasoning requires a system to extract text or numbers from information-dense images like charts or plots and perform logical or arithmetic reasoning to arrive at an answer. To tackle this task, existing work relies on either (1) an end-to-end vision-language model trained on a large amount of data, or (2) a two-stage pipeline where a captioning model converts the image into text t…

    Submitted 4 October, 2023; originally announced October 2023.

  3. arXiv:2309.02591  [pdf, other]

    cs.LG cs.CL cs.CV

    Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning

    Authors: Lili Yu, Bowen Shi, Ramakanth Pasunuru, Benjamin Muller, Olga Golovneva, Tianlu Wang, Arun Babu, Binh Tang, Brian Karrer, Shelly Sheynin, Candace Ross, Adam Polyak, Russell Howes, Vasu Sharma, Puxin Xu, Hovhannes Tamoyan, Oron Ashual, Uriel Singer, Shang-Wen Li, Susan Zhang, Richard James, Gargi Ghosh, Yaniv Taigman, Maryam Fazel-Zarandi, Asli Celikyilmaz, et al. (2 additional authors not shown)

    Abstract: We present CM3Leon (pronounced "Chameleon"), a retrieval-augmented, token-based, decoder-only multi-modal language model capable of generating and infilling both text and images. CM3Leon uses the CM3 multi-modal architecture but additionally shows the extreme benefits of scaling up and tuning on more diverse instruction-style data. It is the first multi-modal model trained with a recipe adapted fr…

    Submitted 5 September, 2023; originally announced September 2023.

  4. arXiv:2308.04592  [pdf, other]

    cs.CL cs.AI

    Shepherd: A Critic for Language Model Generation

    Authors: Tianlu Wang, Ping Yu, Xiaoqing Ellen Tan, Sean O'Brien, Ramakanth Pasunuru, Jane Dwivedi-Yu, Olga Golovneva, Luke Zettlemoyer, Maryam Fazel-Zarandi, Asli Celikyilmaz

    Abstract: As large language models improve, there is increasing interest in techniques that leverage these models' capabilities to refine their own outputs. In this work, we introduce Shepherd, a language model specifically tuned to critique responses and suggest refinements, extending beyond the capabilities of an untuned model to identify diverse errors and provide suggestions to remedy them. At the core…

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: 7 figures, 7 tables

  5. arXiv:2305.13516  [pdf, other]

    cs.CL cs.SD eess.AS

    Scaling Speech Technology to 1,000+ Languages

    Authors: Vineel Pratap, Andros Tjandra, Bowen Shi, Paden Tomasello, Arun Babu, Sayani Kundu, Ali Elkahky, Zhaoheng Ni, Apoorv Vyas, Maryam Fazel-Zarandi, Alexei Baevski, Yossi Adi, Xiaohui Zhang, Wei-Ning Hsu, Alexis Conneau, Michael Auli

    Abstract: Expanding the language coverage of speech technology has the potential to improve access to information for many more people. However, current speech technology is restricted to about one hundred languages which is a small fraction of the over 7,000 languages spoken around the world. The Massively Multilingual Speech (MMS) project increases the number of supported languages by 10-40x, depending on…

    Submitted 22 May, 2023; originally announced May 2023.

  6. arXiv:2303.11131  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Cocktail HuBERT: Generalized Self-Supervised Pre-training for Mixture and Single-Source Speech

    Authors: Maryam Fazel-Zarandi, Wei-Ning Hsu

    Abstract: Self-supervised learning leverages unlabeled data effectively, improving label efficiency and generalization to domains without labeled data. While recent work has studied generalization to more acoustic/linguistic domains, languages, and modalities, these investigations are limited to single-source speech with one primary speaker in the recording. This paper presents Cocktail HuBERT, a self-super…

    Submitted 20 March, 2023; originally announced March 2023.

    Comments: ICASSP 2023

  7. arXiv:2212.07919  [pdf, other]

    cs.CL cs.LG

    ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning

    Authors: Olga Golovneva, Moya Chen, Spencer Poff, Martin Corredor, Luke Zettlemoyer, Maryam Fazel-Zarandi, Asli Celikyilmaz

    Abstract: Large language models show improved downstream task performance when prompted to generate step-by-step reasoning to justify their final answers. These reasoning steps greatly improve model interpretability and verification, but objectively studying their correctness (independent of the final answer) is difficult without reliable methods for automatic evaluation. We simply do not know how often the…

    Submitted 12 September, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

  8. arXiv:2203.10610  [pdf, other]

    cs.CL cs.AI

    Towards Large-Scale Interpretable Knowledge Graph Reasoning for Dialogue Systems

    Authors: Yi-Lin Tuan, Sajjad Beygi, Maryam Fazel-Zarandi, Qiaozi Gao, Alessandra Cervone, William Yang Wang

    Abstract: Users interacting with voice assistants today need to phrase their requests in a very specific manner to elicit an appropriate response. This limits the user experience, and is partly due to the lack of reasoning capabilities of dialogue platforms and the hand-crafted rules that require extensive labor. One possible way to improve user experience and relieve the manual efforts of designers is to b…

    Submitted 20 March, 2022; originally announced March 2022.

    Comments: accepted to the Findings of ACL 2022

  9. arXiv:2202.04161  [pdf, other]

    cs.CL cs.AI

    Logical Reasoning for Task Oriented Dialogue Systems

    Authors: Sajjad Beygi, Maryam Fazel-Zarandi, Alessandra Cervone, Prakash Krishnan, Siddhartha Reddy Jonnalagadda

    Abstract: In recent years, large pretrained models have been used in dialogue systems to improve successful task completion rates. However, lack of reasoning capabilities of dialogue platforms make it difficult to provide relevant and fluent responses, unless the designers of a conversational experience spend a considerable amount of time implementing these capabilities in external rule based modules. In th…

    Submitted 8 February, 2022; originally announced February 2022.

  10. arXiv:2104.09088  [pdf, other]

    cs.CL cs.LG

    Alexa Conversations: An Extensible Data-driven Approach for Building Task-oriented Dialogue Systems

    Authors: Anish Acharya, Suranjit Adhikari, Sanchit Agarwal, Vincent Auvray, Nehal Belgamwar, Arijit Biswas, Shubhra Chandra, Tagyoung Chung, Maryam Fazel-Zarandi, Raefer Gabriel, Shuyang Gao, Rahul Goel, Dilek Hakkani-Tur, Jan Jezabek, Abhay Jha, Jiun-Yu Kao, Prakash Krishnan, Peter Ku, Anuj Goyal, Chien-Wei Lin, Qing Liu, Arindam Mandal, Angeliki Metallinou, Vishal Naik, Yi Pan, et al. (6 additional authors not shown)

    Abstract: Traditional goal-oriented dialogue systems rely on various components such as natural language understanding, dialogue state tracking, policy learning and response generation. Training each component requires annotations which are hard to obtain for every new domain, limiting scalability of such systems. Similarly, rule-based dialogue systems require extensive writing and maintenance of rules and…

    Submitted 19 April, 2021; originally announced April 2021.

    Journal ref: NAACL 2021 System Demonstrations Track

  11. arXiv:2011.08243  [pdf, other]

    cs.CL cs.AI cs.LG

    Dialog Simulation with Realistic Variations for Training Goal-Oriented Conversational Systems

    Authors: Chien-Wei Lin, Vincent Auvray, Daniel Elkind, Arijit Biswas, Maryam Fazel-Zarandi, Nehal Belgamwar, Shubhra Chandra, Matt Zhao, Angeliki Metallinou, Tagyoung Chung, Charlie Shucheng Zhu, Suranjit Adhikari, Dilek Hakkani-Tur

    Abstract: Goal-oriented dialog systems enable users to complete specific goals like requesting information about a movie or booking a ticket. Typically the dialog system pipeline contains multiple ML models, including natural language understanding, state tracking and action prediction (policy learning). These models are trained through a combination of supervised or reinforcement learning methods and there…

    Submitted 16 November, 2020; originally announced November 2020.

    Comments: To be presented at Human in the Loop Dialogue Systems Workshop, NeurIPS 2020

  12. arXiv:2006.05635  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Data Augmentation for Training Dialog Models Robust to Speech Recognition Errors

    Authors: Longshaokan Wang, Maryam Fazel-Zarandi, Aditya Tiwari, Spyros Matsoukas, Lazaros Polymenakos

    Abstract: Speech-based virtual assistants, such as Amazon Alexa, Google assistant, and Apple Siri, typically convert users' audio signals to text data through automatic speech recognition (ASR) and feed the text to downstream dialog models for natural language understanding and response generation. The ASR output is error-prone; however, the downstream dialog models are often trained on error-free text data…

    Submitted 9 June, 2020; originally announced June 2020.

    Comments: To be presented at 2nd Workshop on NLP for ConvAI, ACL 2020

  13. arXiv:1911.06747  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    Towards Personalized Dialog Policies for Conversational Skill Discovery

    Authors: Maryam Fazel-Zarandi, Sampat Biswas, Ryan Summers, Ahmed Elmalt, Andy McCraw, Michael McPhilips, John Peach

    Abstract: Many businesses and consumers are extending the capabilities of voice-based services such as Amazon Alexa, Google Home, Microsoft Cortana, and Apple Siri to create custom voice experiences (also known as skills). As the number of these experiences increases, a key problem is the discovery of skills that can be used to address a user's request. In this paper, we focus on conversational skill discov…

    Submitted 15 November, 2019; originally announced November 2019.

    Comments: The 3rd Conversational AI workshop - today's practice and tomorrow's potential

  14. arXiv:1911.03378  [pdf, other]

    cs.CL cs.AI cs.LG

    Investigation of Error Simulation Techniques for Learning Dialog Policies for Conversational Error Recovery

    Authors: Maryam Fazel-Zarandi, Longshaokan Wang, Aditya Tiwari, Spyros Matsoukas

    Abstract: Training dialog policies for speech-based virtual assistants requires a plethora of conversational data. The data collection phase is often expensive and time consuming due to human involvement. To address this issue, a common solution is to build user simulators for data generation. For the successful deployment of the trained policies into real world domains, it is vital that the user simulator…

    Submitted 8 November, 2019; originally announced November 2019.

    Comments: The 3rd Conversational AI workshop - today's practice and tomorrow's potential

  15. arXiv:1712.04034  [pdf, other]

    cs.CL cs.AI

    Learning Robust Dialog Policies in Noisy Environments

    Authors: Maryam Fazel-Zarandi, Shang-Wen Li, Jin Cao, Jared Casale, Peter Henderson, David Whitney, Alborz Geramifard

    Abstract: Modern virtual personal assistants provide a convenient interface for completing daily tasks via voice commands. An important consideration for these assistants is the ability to recover from automatic speech recognition (ASR) and natural language understanding (NLU) errors. In this paper, we focus on learning robust dialog policies to recover from these errors. To this end, we develop a user simu…

    Submitted 11 December, 2017; originally announced December 2017.

    Comments: 1st Workshop on Conversational AI at NIPS 2017