Showing 51–100 of 120 results for author: Haffari, G

  1. arXiv:2211.14843

    cs.CV

    Learning Object-Language Alignments for Open-Vocabulary Object Detection

    Authors: Chuang Lin, Peize Sun, Yi Jiang, Ping Luo, Lizhen Qu, Gholamreza Haffari, Zehuan Yuan, Jianfei Cai

    Abstract: Existing object detection methods are bounded in a fixed-set vocabulary by costly labeled data. When dealing with novel categories, the model has to be retrained with more bounding box annotations. Natural language supervision is an attractive alternative for its annotation-free attributes and broader object concepts. However, learning open-vocabulary object detection from language is challenging…

    Submitted 27 November, 2022; originally announced November 2022.

    Comments: Technical Report

  2. arXiv:2211.03277

    cs.CL

    Complex Reading Comprehension Through Question Decomposition

    Authors: Xiao-Yu Guo, Yuan-Fang Li, Gholamreza Haffari

    Abstract: Multi-hop reading comprehension requires not only the ability to reason over raw text but also the ability to combine multiple pieces of evidence. We propose a novel learning approach that helps language models better understand difficult multi-hop questions and perform "complex, compositional" reasoning. Our model first learns to decompose each multi-hop question into several sub-questions by a trainable q…

    Submitted 6 November, 2022; originally announced November 2022.

    Comments: 10 pages, 1 figure, accepted at ALTA 2022

  3. arXiv:2210.13030

    cs.CL cs.SD eess.AS

    Self-supervised Rewiring of Pre-trained Speech Encoders: Towards Faster Fine-tuning with Less Labels in Speech Processing

    Authors: Hao Yang, Jinming Zhao, Gholamreza Haffari, Ehsan Shareghi

    Abstract: Pre-trained speech Transformers have facilitated great success across various speech processing tasks. However, fine-tuning these encoders for downstream tasks requires sufficiently large training data to converge or to achieve state-of-the-art. In the text domain this has been partly attributed to sub-optimality of the representation space in pre-trained Transformers. In this work, we take a sober loo…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: 8 pages, 3 figures

  4. arXiv:2210.11628

    cs.CL

    Can Domains Be Transferred Across Languages in Multi-Domain Multilingual Neural Machine Translation?

    Authors: Thuy-Trang Vu, Shahram Khadivi, Xuanli He, Dinh Phung, Gholamreza Haffari

    Abstract: Previous works mostly focus on either multilingual or multi-domain aspects of neural machine translation (NMT). This paper investigates whether the domain information can be transferred across languages on the composition of multi-domain and multilingual NMT, particularly for the incomplete data condition where in-domain bitext is missing for some language pairs. Our results in the curated leave-o…

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: WMT2022

  5. arXiv:2210.10317

    cs.CV

    LAVA: Label-efficient Visual Learning and Adaptation

    Authors: Islam Nassar, Munawar Hayat, Ehsan Abbasnejad, Hamid Rezatofighi, Mehrtash Harandi, Gholamreza Haffari

    Abstract: We present LAVA, a simple yet effective method for multi-domain visual transfer learning with limited data. LAVA builds on a few recent innovations to enable adapting to partially labelled datasets with class and domain shifts. First, LAVA learns self-supervised visual representations on the source dataset and grounds them using class label semantics to overcome transfer collapse problems associate…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: Accepted in WACV2023

  6. arXiv:2210.08759

    cs.CL cs.MM

    Towards Relation Extraction From Speech

    Authors: Tongtong Wu, Guitao Wang, Jinming Zhao, Zhaoran Liu, Guilin Qi, Yuan-Fang Li, Gholamreza Haffari

    Abstract: Relation extraction typically aims to extract semantic relationships between entities from unstructured text. One of the most essential data sources for relation extraction is spoken language, such as interviews and dialogues. However, the error propagation introduced in automatic speech recognition (ASR) has been ignored in relation extraction, and the end-to-end speech-based relation ext…

    Submitted 17 October, 2022; originally announced October 2022.

    Comments: Accepted by EMNLP 2022

  7. arXiv:2210.08475

    cs.CL cs.AI

    RedApt: An Adaptor for wav2vec 2 Encoding Faster and Smaller Speech Translation without Quality Compromise

    Authors: Jinming Zhao, Hao Yang, Gholamreza Haffari, Ehsan Shareghi

    Abstract: Pre-trained speech Transformers in speech translation (ST) have facilitated state-of-the-art (SotA) results; yet, using such encoders is computationally expensive. To improve this, we present a novel Reducer Adaptor block, RedApt, that could be seamlessly integrated within any Transformer-based speech encoding architecture. Integrating the pretrained wav2vec 2 speech encoder with RedApt brings 41%…

    Submitted 16 October, 2022; originally announced October 2022.

    Comments: EMNLP 2022 Findings

  8. arXiv:2210.02703

    cs.CL

    Teaching Neural Module Networks to Do Arithmetic

    Authors: Jiayi Chen, Xiao-Yu Guo, Yuan-Fang Li, Gholamreza Haffari

    Abstract: Answering complex questions that require multi-step multi-type reasoning over raw text is challenging, especially when conducting numerical reasoning. Neural Module Networks (NMNs) follow the programmer-interpreter framework and design trainable modules to learn different reasoning skills. However, NMNs only have limited reasoning abilities, and lack numerical reasoning capability. We upgrade NMN…

    Submitted 6 October, 2022; originally announced October 2022.

    Comments: 9 pages including appendix, camera-ready version of COLING 2022

  9. Feature-based Learning for Diverse and Privacy-Preserving Counterfactual Explanations

    Authors: Vy Vo, Trung Le, Van Nguyen, He Zhao, Edwin Bonilla, Gholamreza Haffari, Dinh Phung

    Abstract: Interpretable machine learning seeks to understand the reasoning process of complex black-box systems that are long notorious for lack of explainability. One flourishing approach is through counterfactual explanations, which provide suggestions on what a user can do to alter an outcome. Not only must a counterfactual example counter the original prediction from the black-box classifier but it shou…

    Submitted 31 May, 2023; v1 submitted 27 September, 2022; originally announced September 2022.

    Journal ref: In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, August 6-10, 2023, Long Beach, CA, USA. ACM, New York, NY, USA, 18 pages

  10. arXiv:2207.03113

    cs.LG cs.AI

    An Additive Instance-Wise Approach to Multi-class Model Interpretation

    Authors: Vy Vo, Van Nguyen, Trung Le, Quan Hung Tran, Gholamreza Haffari, Seyit Camtepe, Dinh Phung

    Abstract: Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system. A large number of interpreting methods focus on identifying explanatory input features, which generally fall into two main categories: attribution and selection. A popular attribution-based approach is to exploit local neighborhoods for learning instance-specific explainers in an addi…

    Submitted 9 February, 2023; v1 submitted 7 July, 2022; originally announced July 2022.

    Journal ref: In The Eleventh International Conference on Learning Representations, 2023

  11. arXiv:2207.00952

    cs.CL cs.LG cs.SD eess.AS

    M-Adapter: Modality Adaptation for End-to-End Speech-to-Text Translation

    Authors: Jinming Zhao, Hao Yang, Ehsan Shareghi, Gholamreza Haffari

    Abstract: End-to-end speech-to-text translation models are often initialized with a pre-trained speech encoder and a pre-trained text decoder. This leads to a significant training gap between pre-training and fine-tuning, largely due to the modality differences between speech outputs from the encoder and text inputs to the decoder. In this work, we aim to bridge the modality gap between speech and text to impro…

    Submitted 3 July, 2022; originally announced July 2022.

    Comments: Interspeech2022

  12. arXiv:2202.13363

    cs.CL

    Variational Autoencoder with Disentanglement Priors for Low-Resource Task-Specific Natural Language Generation

    Authors: Zhuang Li, Lizhen Qu, Qiongkai Xu, Tongtong Wu, Tianyang Zhan, Gholamreza Haffari

    Abstract: In this paper, we propose a variational autoencoder with disentanglement priors, VAE-DPRIOR, for task-specific natural language generation with none or a handful of task-specific labeled examples. In order to tackle compositional generalization across tasks, our model performs disentangled representation learning by introducing a conditional prior for the latent content space and another condition…

    Submitted 29 October, 2022; v1 submitted 27 February, 2022; originally announced February 2022.

    Comments: 22 pages, EMNLP 2022

  13. arXiv:2112.15124

    cs.CL

    Utilizing Wordnets for Cognate Detection among Indian Languages

    Authors: Diptesh Kanojia, Kevin Patel, Pushpak Bhattacharyya, Malhar Kulkarni, Gholamreza Haffari

    Abstract: Automatic Cognate Detection (ACD) is a challenging task which has been utilized to help NLP applications like Machine Translation, Information Retrieval and Computational Phylogenetics. Unidentified cognate pairs can pose a challenge to these applications and result in a degradation of performance. In this paper, we detect cognate word pairs among ten Indian languages with Hindi and use deep learn…

    Submitted 30 December, 2021; originally announced December 2021.

    Comments: Published at GWC 2019

  14. arXiv:2112.09526

    cs.CL

    Challenge Dataset of Cognates and False Friend Pairs from Indian Languages

    Authors: Diptesh Kanojia, Pushpak Bhattacharyya, Malhar Kulkarni, Gholamreza Haffari

    Abstract: Cognates are present in multiple variants of the same text across different languages (e.g., "hund" in German and "hound" in English language mean "dog"). They pose a challenge to various Natural Language Processing (NLP) applications such as Machine Translation, Cross-lingual Sense Disambiguation, Computational Phylogenetics, and Information Retrieval. A possible solution to address this challeng…

    Submitted 17 December, 2021; originally announced December 2021.

    Comments: Published at LREC 2020

  15. arXiv:2112.08789

    cs.CL

    Harnessing Cross-lingual Features to Improve Cognate Detection for Low-resource Languages

    Authors: Diptesh Kanojia, Raj Dabre, Shubham Dewangan, Pushpak Bhattacharyya, Gholamreza Haffari, Malhar Kulkarni

    Abstract: Cognates are variants of the same lexical form across different languages; for example 'fonema' in Spanish and 'phoneme' in English are cognates, both of which mean 'a unit of sound'. The task of automatic detection of cognates among any two languages can help downstream NLP tasks such as Cross-lingual Information Retrieval, Computational Phylogenetics, and Machine Translation. In this paper, we d…

    Submitted 16 December, 2021; originally announced December 2021.

    Comments: Published at COLING 2020

  16. arXiv:2112.08087

    cs.CL cs.AI

    Cognition-aware Cognate Detection

    Authors: Diptesh Kanojia, Prashant Sharma, Sayali Ghodekar, Pushpak Bhattacharyya, Gholamreza Haffari, Malhar Kulkarni

    Abstract: Automatic detection of cognates helps downstream NLP tasks of Machine Translation, Cross-lingual Information Retrieval, Computational Phylogenetics and Cross-lingual Named Entity Recognition. Previous approaches for the task of cognate detection use orthographic, phonetic and semantic similarity-based feature sets. In this paper, we propose a novel method for enriching the feature sets, with cogn…

    Submitted 15 December, 2021; originally announced December 2021.

    Comments: Published at EACL 2021

  17. arXiv:2111.13204

    cs.LG

    BaLeNAS: Differentiable Architecture Search via the Bayesian Learning Rule

    Authors: Miao Zhang, Jilin Hu, Steven Su, Shirui Pan, Xiaojun Chang, Bin Yang, Gholamreza Haffari

    Abstract: Differentiable Architecture Search (DARTS) has received massive attention in recent years, mainly because it significantly reduces the computational cost through weight sharing and continuous relaxation. However, more recent works find that existing differentiable NAS techniques struggle to outperform naive baselines, yielding deteriorative architectures as the search proceeds. Rather than directl…

    Submitted 25 November, 2021; originally announced November 2021.

  18. Medical Visual Question Answering: A Survey

    Authors: Zhihong Lin, Donghao Zhang, Qingyi Tao, Danli Shi, Gholamreza Haffari, Qi Wu, Mingguang He, Zongyuan Ge

    Abstract: Medical Visual Question Answering (VQA) is a combination of medical artificial intelligence and popular VQA challenges. Given a medical image and a clinically relevant question in natural language, the medical VQA system is expected to predict a plausible and convincing answer. Although the general-domain VQA has been extensively studied, the medical VQA still needs specific investigation and expl…

    Submitted 6 June, 2023; v1 submitted 19 November, 2021; originally announced November 2021.

  19. arXiv:2111.05759

    cs.CV

    Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation

    Authors: Chuang Lin, Yi Jiang, Jianfei Cai, Lizhen Qu, Gholamreza Haffari, Zehuan Yuan

    Abstract: Vision-and-Language Navigation (VLN) is a task in which an agent is required to follow a language instruction to navigate to the goal position, which relies on ongoing interactions with the environment while moving. Recent Transformer-based VLN methods have made great progress benefiting from the direct connections between visual observations and the language instruction via the multimodal cross-…

    Submitted 18 July, 2022; v1 submitted 10 November, 2021; originally announced November 2021.

    Comments: ECCV 2022

  20. arXiv:2110.07816

    cs.CL cs.AI

    Multilingual Neural Machine Translation: Can Linguistic Hierarchies Help?

    Authors: Fahimeh Saleh, Wray Buntine, Gholamreza Haffari, Lan Du

    Abstract: Multilingual Neural Machine Translation (MNMT) trains a single NMT model that supports translation between multiple languages, rather than training separate models for different languages. Learning a single model can enhance the low-resource translation by leveraging data from multiple languages. However, the performance of an MNMT model is highly dependent on the type of languages used in trainin…

    Submitted 14 October, 2021; originally announced October 2021.

  21. arXiv:2110.05213

    cs.CL cs.LG

    It is Not as Good as You Think! Evaluating Simultaneous Machine Translation on Interpretation Data

    Authors: Jinming Zhao, Philip Arthur, Gholamreza Haffari, Trevor Cohn, Ehsan Shareghi

    Abstract: Most existing simultaneous machine translation (SiMT) systems are trained and evaluated on offline translation corpora. We argue that SiMT systems should be trained and tested on real interpretation data. To illustrate this argument, we propose an interpretation test set and conduct a realistic evaluation of SiMT trained on offline translations. Our results, on our test set along with 3 existing s…

    Submitted 11 October, 2021; originally announced October 2021.

    Comments: EMNLP2021

  22. arXiv:2109.05186

    cs.CL

    Total Recall: a Customized Continual Learning Method for Neural Semantic Parsers

    Authors: Zhuang Li, Lizhen Qu, Gholamreza Haffari

    Abstract: This paper investigates continual learning for semantic parsing. In this setting, a neural semantic parser learns tasks sequentially without accessing full training data from previous tasks. Direct application of the SOTA continual learning algorithms to this problem fails to achieve comparable performance with re-training models with all seen tasks because they have not considered the special pro…

    Submitted 15 September, 2021; v1 submitted 11 September, 2021; originally announced September 2021.

    Comments: 9 pages, accepted to EMNLP2021

  23. arXiv:2109.04292

    cs.CL

    Generalised Unsupervised Domain Adaptation of Neural Machine Translation with Cross-Lingual Data Selection

    Authors: Thuy-Trang Vu, Xuanli He, Dinh Phung, Gholamreza Haffari

    Abstract: This paper considers the unsupervised domain adaptation problem for neural machine translation (NMT), where we assume access to only monolingual text in either the source or target language in the new domain. We propose a cross-lingual data selection method to extract in-domain sentences in the missing language side from a large generic monolingual corpus. Our proposed method trains an adaptiv…

    Submitted 9 September, 2021; originally announced September 2021.

    Comments: EMNLP2021

  24. arXiv:2109.02289

    cs.CL cs.AI

    Improving Numerical Reasoning Skills in the Modular Approach for Complex Question Answering on Text

    Authors: Xiao-Yu Guo, Yuan-Fang Li, Gholamreza Haffari

    Abstract: Numerical reasoning skills are essential for complex question answering (CQA) over text. It requires operations including counting, comparison, addition and subtraction. A successful approach to CQA on text, Neural Module Networks (NMNs), follows the programmer-interpreter paradigm and leverages specialised modules to perform compositional reasoning. However, the NMNs framework does not consider t…

    Submitted 6 September, 2021; originally announced September 2021.

  25. arXiv:2109.02284

    cs.CL

    Uncertainty-Aware Balancing for Multilingual and Multi-Domain Neural Machine Translation Training

    Authors: Minghao Wu, Yitong Li, Meng Zhang, Liangyou Li, Gholamreza Haffari, Qun Liu

    Abstract: Learning a multilingual and multi-domain translation model is challenging as the heterogeneous and imbalanced data make the model converge inconsistently over different corpora in the real world. One common practice is to adjust the share of each corpus in the training, so that the learning process is balanced and low-resource cases can benefit from the high resource ones. However, automatic balancing m…

    Submitted 6 September, 2021; originally announced September 2021.

    Comments: 15 pages, 4 figures, to appear at EMNLP 2021 main conference

  26. arXiv:2108.13873

    cs.CR cs.CL cs.LG

    Student Surpasses Teacher: Imitation Attack for Black-Box NLP APIs

    Authors: Qiongkai Xu, Xuanli He, Lingjuan Lyu, Lizhen Qu, Gholamreza Haffari

    Abstract: Machine-learning-as-a-service (MLaaS) has attracted millions of users to their splendid large-scale models. Although published as black-box APIs, the valuable models behind these services are still vulnerable to imitation attacks. Recently, a series of works have demonstrated that attackers manage to steal or extract the victim models. Nonetheless, none of the previous stolen models can outperform…

    Submitted 4 September, 2022; v1 submitted 29 August, 2021; originally announced August 2021.

    Comments: COLING 2022 (oral)

  27. arXiv:2106.06168

    cs.LG

    Generate, Annotate, and Learn: NLP with Synthetic Text

    Authors: Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi

    Abstract: This paper studies the use of language models as a source of synthetic unlabeled text for NLP. We formulate a general framework called "generate, annotate, and learn (GAL)" to take advantage of synthetic text within knowledge distillation, self-training, and few-shot learning applications. To generate high-quality task-specific text, we either fine-tune LMs on inputs from the task of interest, o…

    Submitted 31 May, 2022; v1 submitted 11 June, 2021; originally announced June 2021.

    Comments: accepted to TACL2022

  28. arXiv:2105.09509

    cs.CL cs.AI

    Adaptive Knowledge-Enhanced Bayesian Meta-Learning for Few-shot Event Detection

    Authors: Shirong Shen, Tongtong Wu, Guilin Qi, Yuan-Fang Li, Gholamreza Haffari, Sheng Bi

    Abstract: Event detection (ED) aims at detecting event trigger words in sentences and classifying them into specific event types. In real-world applications, ED typically does not have sufficient labelled data, thus can be formulated as a few-shot learning problem. To tackle the issue of low sample diversity in few-shot ED, we propose a novel knowledge-based few-shot event detection method which uses a defi…

    Submitted 30 May, 2021; v1 submitted 20 May, 2021; originally announced May 2021.

    Comments: Accepted by ACL2021 Findings

  29. arXiv:2105.06717

    cs.AI cs.CL

    Neural-Symbolic Commonsense Reasoner with Relation Predictors

    Authors: Farhad Moghimifar, Lizhen Qu, Yue Zhuo, Gholamreza Haffari, Mahsa Baktashmotlagh

    Abstract: Commonsense reasoning aims to incorporate sets of commonsense facts, retrieved from Commonsense Knowledge Graphs (CKG), to draw conclusions about ordinary situations. The dynamic nature of commonsense knowledge postulates models capable of performing multi-hop reasoning over new situations. This feature also results in having large-scale sparse Knowledge Graphs, where such reasoning process is need…

    Submitted 14 May, 2021; originally announced May 2021.

    Comments: ACL2021

  30. arXiv:2104.05248

    cs.CV cs.LG

    All Labels Are Not Created Equal: Enhancing Semi-supervision via Label Grouping and Co-training

    Authors: Islam Nassar, Samitha Herath, Ehsan Abbasnejad, Wray Buntine, Gholamreza Haffari

    Abstract: Pseudo-labeling is a key component in semi-supervised learning (SSL). It relies on iteratively using the model to generate artificial labels for the unlabeled data to train against. A common property among its various methods is that they only rely on the model's prediction to make labeling decisions without considering any prior knowledge about the visual similarity among the classes. In this pap…

    Submitted 12 April, 2021; originally announced April 2021.

    Comments: Accepted in CVPR2021

  31. arXiv:2101.10708

    cs.CL cs.AI cs.LG

    Few-Shot Semantic Parsing for New Predicates

    Authors: Zhuang Li, Lizhen Qu, Shuo Huang, Gholamreza Haffari

    Abstract: In this work, we investigate the problems of semantic parsing in a few-shot learning setting. In this setting, we are provided with k utterance-logical form pairs per new predicate. The state-of-the-art neural semantic parsers achieve less than 25% accuracy on benchmark datasets when k = 1. To tackle this problem, we propose to i) apply a designated meta-learning method to train the model; ii) regul…

    Submitted 26 January, 2021; originally announced January 2021.

    Comments: Accepted to EACL 2021

  32. arXiv:2011.13549

    cs.CL

    Domain Adaptative Causality Encoder

    Authors: Farhad Moghimifar, Gholamreza Haffari, Mahsa Baktashmotlagh

    Abstract: Current approaches which are mainly based on the extraction of low-level relations among individual events are limited by the shortage of publicly available labelled data. Therefore, the resulting models perform poorly when applied to a distributionally different domain for which labelled data did not exist at the time of training. To overcome this limitation, in this paper, we leverage the charac…

    Submitted 26 November, 2020; originally announced November 2020.

    Comments: ALTA2020

  33. arXiv:2011.09911

    cs.LG

    Multi-objective semi-supervised clustering to identify health service patterns for injured patients

    Authors: Hadi Akbarzadeh Khorshidi, Uwe Aickelin, Gholamreza Haffari, Behrooz Hassani-Mahmooei

    Abstract: This study develops a pattern recognition method that identifies patterns based on their similarity and their association with the outcome of interest. The practical purpose of developing this pattern recognition method is to group patients, who are injured in transport accidents, in the early stages post-injury. This grouping is based on distinctive patterns in health service use within the first…

    Submitted 16 November, 2020; originally announced November 2020.

    Comments: Health information science and systems, Volume 7, Issue 1

  34. arXiv:2011.00797

    cs.CL

    Context Dependent Semantic Parsing: A Survey

    Authors: Zhuang Li, Lizhen Qu, Gholamreza Haffari

    Abstract: Semantic parsing is the task of translating natural language utterances into machine-readable meaning representations. Currently, most semantic parsing methods are not able to utilize contextual information (e.g. dialogue and comments history), which has a great potential to boost semantic parsing performance. To address this issue, context dependent semantic parsing has recently drawn a lot of at…

    Submitted 2 November, 2020; originally announced November 2020.

    Comments: 10 pages, accepted by COLING 2020

  35. arXiv:2011.00777

    cs.CL

    COSMO: Conditional SEQ2SEQ-based Mixture Model for Zero-Shot Commonsense Question Answering

    Authors: Farhad Moghimifar, Lizhen Qu, Yue Zhuo, Mahsa Baktashmotlagh, Gholamreza Haffari

    Abstract: Commonsense reasoning refers to the ability of evaluating a social situation and acting accordingly. Identification of the implicit causes and effects of a social context is the driving capability which can enable machines to perform commonsense reasoning. The dynamic world of social interactions requires context-dependent on-demand systems to infer such underlying information. However, current ap…

    Submitted 2 November, 2020; originally announced November 2020.

    Comments: COLING2020

  36. arXiv:2010.15877

    cs.CL cs.AI

    Few-Shot Complex Knowledge Base Question Answering via Meta Reinforcement Learning

    Authors: Yuncheng Hua, Yuan-Fang Li, Gholamreza Haffari, Guilin Qi, Tongtong Wu

    Abstract: Complex question-answering (CQA) involves answering complex natural-language questions on a knowledge base (KB). However, the conventional neural program induction (NPI) approach exhibits uneven performance when the questions have different types, harboring inherently different characteristics, e.g., difficulty level. This paper proposes a meta-reinforcement learning approach to program induction…

    Submitted 29 October, 2020; originally announced October 2020.

    Comments: 11 pages, 1 figure, accepted in EMNLP 2020

  37. arXiv:2010.15875

    cs.AI cs.CL

    Retrieve, Program, Repeat: Complex Knowledge Base Question Answering via Alternate Meta-learning

    Authors: Yuncheng Hua, Yuan-Fang Li, Gholamreza Haffari, Guilin Qi, Wei Wu

    Abstract: A compelling approach to complex question answering is to convert the question to a sequence of actions, which can then be executed on the knowledge base to yield the answer, aka the programmer-interpreter approach. Using training questions similar to the test question, meta-learning enables the programmer to adapt to unseen questions to tackle potential distributional biases quickly. However, this…

    Submitted 29 October, 2020; originally announced October 2020.

    Comments: 8 pages, 2 figures, published in IJCAI 2020

    Journal ref: IJCAI 2020: 3679-3686

  38. arXiv:2010.09366

    cs.CL cs.AI

    Understanding Unnatural Questions Improves Reasoning over Text

    Authors: Xiao-Yu Guo, Yuan-Fang Li, Gholamreza Haffari

    Abstract: Complex question answering (CQA) over raw text is a challenging task. A prominent approach to this task is based on the programmer-interpreter framework, where the programmer maps the question into a sequence of reasoning actions which is then executed on the raw text by the interpreter. Learning an effective CQA model requires large amounts of human-annotated data, consisting of the ground-truth s…

    Submitted 19 October, 2020; originally announced October 2020.

  39. arXiv:2010.05445

    cs.CL

    Collective Wisdom: Improving Low-resource Neural Machine Translation using Adaptive Knowledge Distillation

    Authors: Fahimeh Saleh, Wray Buntine, Gholamreza Haffari

    Abstract: Scarcity of parallel sentence-pairs poses a significant hurdle for training high-quality Neural Machine Translation (NMT) models in bilingually low-resource scenarios. A standard approach is transfer learning, which involves taking a model trained on a high-resource language-pair and fine-tuning it on the data of the low-resource MT condition of interest. However, it is not clear generally which h…

    Submitted 12 October, 2020; originally announced October 2020.

  40. arXiv:2010.03732

    cs.CL

    Leveraging Discourse Rewards for Document-Level Neural Machine Translation

    Authors: Inigo Jauregi Unanue, Nazanin Esmaili, Gholamreza Haffari, Massimo Piccardi

    Abstract: Document-level machine translation focuses on the translation of entire documents from a source to a target language. It is widely regarded as a challenging task since the translation of the individual sentences in the document needs to retain aspects of the discourse at document level. However, document-level translation models are usually not trained to explicitly ensure discourse quality. There…

    Submitted 19 October, 2020; v1 submitted 7 October, 2020; originally announced October 2020.

    Comments: Accepted at COLING 2020

  41. arXiv:2010.02591

    cs.CL

    Scene Graph Modification Based on Natural Language Commands

    Authors: Xuanli He, Quan Hung Tran, Gholamreza Haffari, Walter Chang, Trung Bui, Zhe Lin, Franck Dernoncourt, Nhan Dam

    Abstract: Structured representations like graphs and parse trees play a crucial role in many Natural Language Processing systems. In recent years, the advancements in multi-turn user interfaces necessitate the need for controlling and updating these structured representations given new sources of information. Although there have been many efforts focusing on improving the performance of the parsers that map…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: Accepted to the Findings of EMNLP 2020

  42. arXiv:2010.01739

    cs.CL

    Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models

    Authors: Thuy-Trang Vu, Dinh Phung, Gholamreza Haffari

    Abstract: Recent work has shown the importance of adaptation of broad-coverage contextualised embedding models on the domain of the target task of interest. Current self-supervised adaptation methods are simplistic, as the training signal comes from a small percentage of randomly masked-out tokens. In this paper, we show that careful masking strategies can bridge the knowledge gap of masked language…

    Submitted 4 October, 2020; originally announced October 2020.

    Comments: EMNLP2020

  43. arXiv:2007.08954

    cs.CL cs.IR cs.LG

    SummPip: Unsupervised Multi-Document Summarization with Sentence Graph Compression

    Authors: Jinming Zhao, Ming Liu, Longxiang Gao, Yuan Jin, Lan Du, He Zhao, He Zhang, Gholamreza Haffari

    Abstract: Obtaining training data for multi-document summarization (MDS) is time-consuming and resource-intensive, so recent neural models can only be trained for limited domains. In this paper, we propose SummPip: an unsupervised method for multi-document summarization, in which we convert the original documents to a sentence graph, taking both linguistic and deep representation into account, then apply sp… ▽ More

    Submitted 20 July, 2020; v1 submitted 17 July, 2020; originally announced July 2020.

    Comments: accepted to SIGIR 2020

  44. arXiv:2005.06606  [pdf, other

    cs.CL cs.LG

    Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation

    Authors: Xuanli He, Gholamreza Haffari, Mohammad Norouzi

    Abstract: This paper introduces Dynamic Programming Encoding (DPE), a new segmentation algorithm for tokenizing sentences into subword units. We view the subword segmentation of output sentences as a latent variable that should be marginalized out for learning and inference. A mixed character-subword transformer is proposed, which enables exact log marginal likelihood estimation and exact MAP inference to f… ▽ More

    Submitted 1 August, 2020; v1 submitted 3 May, 2020; originally announced May 2020.

    Comments: update related work

  45. arXiv:2004.09894  [pdf, other

    cs.CL

    Contextual Neural Machine Translation Improves Translation of Cataphoric Pronouns

    Authors: KayYen Wong, Sameen Maruf, Gholamreza Haffari

    Abstract: The advent of context-aware NMT has resulted in promising improvements in the overall translation quality and specifically in the translation of discourse phenomena such as pronouns. Previous works have mainly focused on the use of past sentences as context with a focus on anaphora translation. In this work, we investigate the effect of future sentences as context by comparing the performance of a… ▽ More

    Submitted 28 April, 2020; v1 submitted 21 April, 2020; originally announced April 2020.

    Comments: Accepted to ACL 2020

  46. arXiv:2002.04306  [pdf, other

    cs.CL cs.AI cs.LG

    Learning Coupled Policies for Simultaneous Machine Translation using Imitation Learning

    Authors: Philip Arthur, Trevor Cohn, Gholamreza Haffari

    Abstract: We present a novel approach to efficiently learn a simultaneous translation model with coupled programmer-interpreter policies. First, we present an algorithmic oracle to produce oracle READ/WRITE actions for training bilingual sentence-pairs using the notion of word alignments. These oracle actions are designed to capture enough information from the partial input before writing the output. Next, we… ▽ More

    Submitted 25 January, 2021; v1 submitted 11 February, 2020; originally announced February 2020.

    Comments: 9 pages

  47. arXiv:2001.03294  [pdf, other

    cs.CL cs.LG

    Learning to Multi-Task Learn for Better Neural Machine Translation

    Authors: Poorya Zaremoodi, Gholamreza Haffari

    Abstract: Scarcity of parallel sentence pairs is a major challenge for training high quality neural machine translation (NMT) models in bilingually low-resource scenarios, as NMT is data-hungry. Multi-task learning is an elegant approach to inject linguistic-related inductive biases into NMT, using auxiliary syntactic and semantic tasks, to improve generalisation. The challenge, however, is to devise effect… ▽ More

    Submitted 9 January, 2020; originally announced January 2020.

  48. arXiv:1912.08494  [pdf, other

    cs.CL

    A Survey on Document-level Neural Machine Translation: Methods and Evaluation

    Authors: Sameen Maruf, Fahimeh Saleh, Gholamreza Haffari

    Abstract: Machine translation (MT) is an important task in natural language processing (NLP) as it automates the translation process and reduces the reliance on human translators. With the resurgence of neural networks, the translation quality surpasses that of the translations obtained using statistical techniques for most language-pairs. Up until a few years ago, almost all of the neural translation model… ▽ More

    Submitted 12 January, 2021; v1 submitted 18 December, 2019; originally announced December 2019.

    Comments: Accepted for publication by ACM Computing Surveys

  49. arXiv:1911.03407  [pdf, other

    cs.CL

    Question Generation from Paragraphs: A Tale of Two Hierarchical Models

    Authors: Vishwajeet Kumar, Raktim Chaki, Sai Teja Talluri, Ganesh Ramakrishnan, Yuan-Fang Li, Gholamreza Haffari

    Abstract: Automatic question generation from paragraphs is an important and challenging problem, particularly due to the long context from paragraphs. In this paper, we propose and study two hierarchical models for the task of question generation from paragraphs. Specifically, we propose (a) a novel hierarchical BiLSTM model with selective attention and (b) a novel hierarchical Transformer architecture, bot… ▽ More

    Submitted 8 November, 2019; originally announced November 2019.

  50. arXiv:1903.08788  [pdf, other

    cs.CL

    Selective Attention for Context-aware Neural Machine Translation

    Authors: Sameen Maruf, André F. T. Martins, Gholamreza Haffari

    Abstract: Despite the progress made in sentence-level NMT, current systems still fall short at achieving fluent, good quality translation for a full document. Recent works in context-aware NMT consider only a few previous sentences as context and may not scale to entire documents. To this end, we propose a novel and scalable top-down approach to hierarchical attention for context-aware NMT which uses sparse… ▽ More

    Submitted 23 May, 2019; v1 submitted 20 March, 2019; originally announced March 2019.

    Comments: Accepted at NAACL-HLT 2019