Showing 1–14 of 14 results for author: Rasooli, M S

  1. arXiv:2305.17304  [pdf, other]

    cs.CL

    External Language Model Integration for Factorized Neural Transducers

    Authors: Michael Levit, Sarangarajan Parthasarathy, Cem Aksoylar, Mohammad Sadegh Rasooli, Shuangyu Chang

    Abstract: We propose an adaptation method for factorized neural transducers (FNT) with external language models. We demonstrate that both neural and n-gram external LMs add significantly more value when linearly interpolated with predictor output compared to shallow fusion, thus confirming that FNT forces the predictor to act like regular language models. Further, we propose a method to integrate class-base…

    Submitted 26 May, 2023; originally announced May 2023.

  2. arXiv:2209.14500  [pdf, other]

    cs.LG cs.CL

    Bidirectional Language Models Are Also Few-shot Learners

    Authors: Ajay Patel, Bryan Li, Mohammad Sadegh Rasooli, Noah Constant, Colin Raffel, Chris Callison-Burch

    Abstract: Large language models such as GPT-3 (Brown et al., 2020) can perform arbitrary tasks without undergoing fine-tuning after being prompted with only a few labeled examples. An arbitrary task can be reformulated as a natural language prompt, and a language model can be asked to generate the completion, indirectly performing the task in a paradigm known as prompt-based learning. To date, emergent prom…

    Submitted 5 February, 2023; v1 submitted 28 September, 2022; originally announced September 2022.

    Comments: To appear at ICLR 2023

  3. arXiv:2209.02821  [pdf, other]

    cs.CL

    Multilingual Bidirectional Unsupervised Translation Through Multilingual Finetuning and Back-Translation

    Authors: Bryan Li, Mohammad Sadegh Rasooli, Ajay Patel, Chris Callison-Burch

    Abstract: We propose a two-stage approach for training a single NMT model to translate unseen languages both to and from English. For the first stage, we initialize an encoder-decoder model to pretrained XLM-R and RoBERTa weights, then perform multilingual fine-tuning on parallel data in 40 languages to English. We find this model can generalize to zero-shot translations on unseen languages. For the second…

    Submitted 3 April, 2023; v1 submitted 6 September, 2022; originally announced September 2022.

    Comments: LoResMT @ EACL 2023

  4. arXiv:2104.08384  [pdf, other]

    cs.CL cs.CV

    "Wikily" Supervised Neural Translation Tailored to Cross-Lingual Tasks

    Authors: Mohammad Sadegh Rasooli, Chris Callison-Burch, Derry Tanti Wijaya

    Abstract: We present a simple but effective approach for leveraging Wikipedia for neural machine translation as well as cross-lingual tasks of image captioning and dependency parsing without using any direct supervision from external parallel data or supervised models in the target language. We show that first sentences and titles of linked Wikipedia pages, as well as cross-lingual image captions, are stron…

    Submitted 10 September, 2021; v1 submitted 16 April, 2021; originally announced April 2021.

    Comments: To appear in EMNLP 2021 main conference

  5. arXiv:2012.06154  [pdf, other]

    cs.CL cs.AI

    ParsiNLU: A Suite of Language Understanding Challenges for Persian

    Authors: Daniel Khashabi, Arman Cohan, Siamak Shakeri, Pedram Hosseini, Pouya Pezeshkpour, Malihe Alikhani, Moin Aminnaseri, Marzieh Bitaab, Faeze Brahman, Sarik Ghazarian, Mozhdeh Gheini, Arman Kabiri, Rabeeh Karimi Mahabadi, Omid Memarrast, Ahmadreza Mosallanezhad, Erfan Noury, Shahab Raji, Mohammad Sadegh Rasooli, Sepideh Sadeghi, Erfan Sadeqi Azer, Niloofar Safi Samghabadi, Mahsa Shafaei, Saber Sheybani, Ali Tazarv, Yadollah Yaghoobzadeh

    Abstract: Despite the progress made in recent years in addressing natural language understanding (NLU) challenges, the majority of this progress remains to be concentrated on resource-rich languages like English. This work focuses on Persian language, one of the widely spoken languages in the world, and yet there are few NLU datasets available for this rich language. The availability of high-quality evaluat…

    Submitted 13 July, 2021; v1 submitted 11 December, 2020; originally announced December 2020.

    Comments: To appear on Transactions of the Association for Computational Linguistics (TACL), 2021

  6. arXiv:2012.05879  [pdf, other]

    cs.CL

    Automatic Standardization of Colloquial Persian

    Authors: Mohammad Sadegh Rasooli, Farzane Bakhtyari, Fatemeh Shafiei, Mahsa Ravanbakhsh, Chris Callison-Burch

    Abstract: The Iranian Persian language has two varieties: standard and colloquial. Most natural language processing tools for Persian assume that the text is in standard form: this assumption is wrong in many real applications especially web content. This paper describes a simple and effective standardization approach based on sequence-to-sequence translation. We design an algorithm for generating artificia…

    Submitted 10 December, 2020; originally announced December 2020.

  7. arXiv:2009.10205  [pdf, other]

    cs.CL

    The Persian Dependency Treebank Made Universal

    Authors: Mohammad Sadegh Rasooli, Pegah Safari, Amirsaeid Moloodi, Alireza Nourian

    Abstract: We describe an automatic method for converting the Persian Dependency Treebank (Rasooli et al, 2013) to Universal Dependencies. This treebank contains 29107 sentences. Our experiments along with manual linguistic analysis show that our data is more compatible with Universal Dependencies than the Uppsala Persian Universal Dependency Treebank (Seraji et al., 2016), and is larger in size and more div…

    Submitted 22 September, 2020; v1 submitted 21 September, 2020; originally announced September 2020.

  8. arXiv:2004.14961  [pdf, other]

    cs.CL

    Mutlitask Learning for Cross-Lingual Transfer of Semantic Dependencies

    Authors: Maryam Aminian, Mohammad Sadegh Rasooli, Mona Diab

    Abstract: We describe a method for developing broad-coverage semantic dependency parsers for languages for which no semantically annotated resource is available. We leverage a multitask learning framework coupled with an annotation projection method. We transfer supervised semantic dependency parse annotations from a rich-resource language to a low-resource language through parallel data, and train a semant…

    Submitted 30 April, 2020; originally announced April 2020.

  9. arXiv:1904.03256  [pdf, other]

    cs.CL

    Cross-Lingual Transfer of Semantic Roles: From Raw Text to Semantic Roles

    Authors: Maryam Aminian, Mohammad Sadegh Rasooli, Mona Diab

    Abstract: We describe a transfer method based on annotation projection to develop a dependency-based semantic role labeling system for languages for which no supervised linguistic information other than parallel data is available. Unlike previous work that presumes the availability of supervised features such as lemmas, part-of-speech tags, and dependency parse trees, we only make use of word and character…

    Submitted 5 April, 2019; originally announced April 2019.

    Comments: Accepted at the 13th International Conference on Computational Semantics (IWCS 2019)

  10. arXiv:1903.05683  [pdf, other]

    cs.CL

    Low-Resource Syntactic Transfer with Unsupervised Source Reordering

    Authors: Mohammad Sadegh Rasooli, Michael Collins

    Abstract: We describe a cross-lingual transfer method for dependency parsing that takes into account the problem of word order differences between source and target languages. Our model only relies on the Bible, a considerably smaller parallel data than the commonly used parallel data in transfer methods. We use the concatenation of projected trees from the Bible corpus, and the gold-standard treebanks in m…

    Submitted 13 March, 2019; originally announced March 2019.

    Comments: Accepted in NAACL 2019

  11. arXiv:1803.04291  [pdf, other]

    cs.CL

    Entity-Aware Language Model as an Unsupervised Reranker

    Authors: Mohammad Sadegh Rasooli, Sarangarajan Parthasarathy

    Abstract: In language modeling, it is difficult to incorporate entity relationships from a knowledge-base. One solution is to use a reranker trained with global features, in which global features are derived from n-best lists. However, training such a reranker requires manually annotated n-best lists, which is expensive to obtain. We propose a method based on the contrastive estimation method that alleviate…

    Submitted 17 June, 2018; v1 submitted 12 March, 2018; originally announced March 2018.

    Journal ref: Interspeech 2018

  12. arXiv:1710.01411  [pdf, ps, other]

    cs.CL

    Transferring Semantic Roles Using Translation and Syntactic Information

    Authors: Maryam Aminian, Mohammad Sadegh Rasooli, Mona Diab

    Abstract: Our paper addresses the problem of annotation projection for semantic role labeling for resource-poor languages using supervised annotations from a resource-rich language through parallel data. We propose a transfer method that employs information from source and target syntactic dependencies as well as word alignment density to improve the quality of an iterative bootstrapping method. Our experim…

    Submitted 3 October, 2017; originally announced October 2017.

  13. arXiv:1610.06227  [pdf, ps, other]

    cs.CL

    Cross-Lingual Syntactic Transfer with Limited Resources

    Authors: Mohammad Sadegh Rasooli, Michael Collins

    Abstract: We describe a simple but effective method for cross-lingual syntactic transfer of dependency parsers, in the scenario where a large amount of translation data is not available. The method makes use of three steps: 1) a method for deriving cross-lingual word clusters, which can then be used in a multilingual parser; 2) a method for transferring lexical information from a target language to source l…

    Submitted 3 February, 2017; v1 submitted 19 October, 2016; originally announced October 2016.

  14. arXiv:1503.06733  [pdf, other]

    cs.CL

    Yara Parser: A Fast and Accurate Dependency Parser

    Authors: Mohammad Sadegh Rasooli, Joel Tetreault

    Abstract: Dependency parsers are among the most crucial tools in natural language processing as they have many important applications in downstream tasks such as information retrieval, machine translation and knowledge acquisition. We introduce the Yara Parser, a fast and accurate open-source dependency parser based on the arc-eager algorithm and beam search. It achieves an unlabeled accuracy of 93.32 on th…

    Submitted 24 March, 2015; v1 submitted 23 March, 2015; originally announced March 2015.