Showing 1–38 of 38 results for author: Mehdad, Y

  1. arXiv:2311.02772  [pdf, ps, other]

    cs.SD cs.CL eess.AS

    Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency

    Authors: Sungho Jeon, Ching-Feng Yeh, Hakan Inan, Wei-Ning Hsu, Rashi Rungta, Yashar Mehdad, Daniel Bikel

    Abstract: In this paper, we show that a simple self-supervised pre-trained audio model can achieve comparable inference efficiency to more complicated pre-trained models with speech transformer encoders. These speech transformers rely on mixing convolutional modules with self-attention modules. They achieve state-of-the-art performance on ASR with top efficiency. We first show that employing these speech tr…

    Submitted 8 February, 2024; v1 submitted 5 November, 2023; originally announced November 2023.

    Comments: 5 pages; accepted to Self-supervision in Audio, Speech and Beyond (SASB) workshop in ICASSP24

  2. arXiv:2309.16039  [pdf, other]

    cs.CL

    Effective Long-Context Scaling of Foundation Models

    Authors: Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, Hao Ma

    Abstract: We present a series of long-context LLMs that support effective context windows of up to 32,768 tokens. Our model series are built through continual pretraining from Llama 2 with longer training sequences and on a dataset where long texts are upsampled. We perform extensive evaluation on language modeling, synthetic context probing tasks, and a wide range of research benchmarks. On research benchm…

    Submitted 13 November, 2023; v1 submitted 27 September, 2023; originally announced September 2023.

  3. arXiv:2305.17888  [pdf, other]

    cs.CL

    LLM-QAT: Data-Free Quantization Aware Training for Large Language Models

    Authors: Zechun Liu, Barlas Oguz, Changsheng Zhao, Ernie Chang, Pierre Stock, Yashar Mehdad, Yangyang Shi, Raghuraman Krishnamoorthi, Vikas Chandra

    Abstract: Several post-training quantization methods have been applied to large language models (LLMs), and have been shown to perform well down to 8-bits. We find that these methods break down at lower bit precision, and investigate quantization aware training for LLMs (LLM-QAT) to push quantization levels even further. We propose a data-free distillation method that leverages generations produced by the p…

    Submitted 29 May, 2023; originally announced May 2023.

  4. arXiv:2305.03204  [pdf, other]

    cs.CV cs.CL

    VideoOFA: Two-Stage Pre-Training for Video-to-Text Generation

    Authors: Xilun Chen, Lili Yu, Wenhan Xiong, Barlas Oğuz, Yashar Mehdad, Wen-tau Yih

    Abstract: We propose a new two-stage pre-training framework for video-to-text generation tasks such as video captioning and video question answering: A generative encoder-decoder model is first jointly pre-trained on massive image-text data to learn fundamental vision-language concepts, and then adapted to video data in an intermediate video-text pre-training stage to learn video-specific skills such as spa…

    Submitted 4 May, 2023; originally announced May 2023.

  5. arXiv:2303.16406  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Hierarchical Video-Moment Retrieval and Step-Captioning

    Authors: Abhay Zala, Jaemin Cho, Satwik Kottur, Xilun Chen, Barlas Oğuz, Yashar Mehdad, Mohit Bansal

    Abstract: There is growing interest in searching for information from large video corpora. Prior works have studied relevant tasks, such as text-based video retrieval, moment retrieval, video summarization, and video captioning in isolation, without an end-to-end setup that can jointly search from video corpora and generate summaries. Such an end-to-end setup would allow for many interesting applications, e…

    Submitted 28 March, 2023; originally announced March 2023.

    Comments: CVPR 2023 (15 pages; the first two authors contributed equally; Project website: https://hirest-cvpr2023.github.io)

  6. arXiv:2302.07452  [pdf, other]

    cs.IR cs.CL

    How to Train Your DRAGON: Diverse Augmentation Towards Generalizable Dense Retrieval

    Authors: Sheng-Chieh Lin, Akari Asai, Minghan Li, Barlas Oguz, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, Xilun Chen

    Abstract: Various techniques have been developed in recent years to improve dense retrieval (DR), such as unsupervised contrastive learning and pseudo-query generation. Existing DRs, however, often suffer from effectiveness tradeoffs between supervised and zero-shot retrieval, which some argue was due to the limited model capacity. We contradict this hypothesis and show that a generalizable DR can be traine…

    Submitted 14 February, 2023; originally announced February 2023.

  7. arXiv:2212.12652  [pdf, other]

    cs.CL

    STRUDEL: Structured Dialogue Summarization for Dialogue Comprehension

    Authors: Borui Wang, Chengcheng Feng, Arjun Nair, Madelyn Mao, Jai Desai, Asli Celikyilmaz, Haoran Li, Yashar Mehdad, Dragomir Radev

    Abstract: Abstractive dialogue summarization has long been viewed as an important standalone task in natural language processing, but no previous work has explored the possibility of whether abstractive dialogue summarization can also be used as a means to boost an NLP system's performance on other important dialogue comprehension tasks. In this paper, we propose a novel type of dialogue summarization task…

    Submitted 23 December, 2022; originally announced December 2022.

    Comments: EMNLP 2022

  8. arXiv:2212.09726  [pdf, other]

    cs.CL cs.AI cs.LG

    Improving Faithfulness of Abstractive Summarization by Controlling Confounding Effect of Irrelevant Sentences

    Authors: Asish Ghoshal, Arash Einolghozati, Ankit Arun, Haoran Li, Lili Yu, Vera Gor, Yashar Mehdad, Scott Wen-tau Yih, Asli Celikyilmaz

    Abstract: Lack of factual correctness is an issue that still plagues state-of-the-art summarization systems despite their impressive progress on generating seemingly fluent summaries. In this paper, we show that factual inconsistency can be caused by irrelevant parts of the input text, which act as confounders. To that end, we leverage information-theoretic measures of causal effects to quantify the amount…

    Submitted 18 January, 2024; v1 submitted 19 December, 2022; originally announced December 2022.

  9. arXiv:2211.10411  [pdf, other]

    cs.IR cs.CL

    CITADEL: Conditional Token Interaction via Dynamic Lexical Routing for Efficient and Effective Multi-Vector Retrieval

    Authors: Minghan Li, Sheng-Chieh Lin, Barlas Oguz, Asish Ghoshal, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, Xilun Chen

    Abstract: Multi-vector retrieval methods combine the merits of sparse (e.g. BM25) and dense (e.g. DPR) retrievers and have achieved state-of-the-art performance on various retrieval tasks. These methods, however, are orders of magnitude slower and need much more space to store their indices compared to their single-vector counterparts. In this paper, we unify different multi-vector retrieval models from a t…

    Submitted 18 November, 2022; originally announced November 2022.

  10. arXiv:2210.13678  [pdf, other]

    cs.CL cs.IR cs.LG

    Bridging the Training-Inference Gap for Dense Phrase Retrieval

    Authors: Gyuwan Kim, Jinhyuk Lee, Barlas Oguz, Wenhan Xiong, Yizhe Zhang, Yashar Mehdad, William Yang Wang

    Abstract: Building dense retrievers requires a series of standard procedures, including training and validating neural models and creating indexes for efficient search. However, these procedures are often misaligned in that training objectives do not exactly reflect the retrieval scenario at inference time. In this paper, we explore how the gap between training and inference in dense retrieval can be reduce…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: Findings of EMNLP 2022; 12 pages, 3 figures

  11. arXiv:2209.13759  [pdf, other]

    cs.CL

    Structured Summarization: Unified Text Segmentation and Segment Labeling as a Generation Task

    Authors: Hakan Inan, Rashi Rungta, Yashar Mehdad

    Abstract: Text segmentation aims to divide text into contiguous, semantically coherent segments, while segment labeling deals with producing labels for each segment. Past work has shown success in tackling segmentation and labeling for documents and conversations. This has been possible with a combination of task-specific pipelines, supervised and unsupervised learning objectives. In this work, we propose a…

    Submitted 27 September, 2022; originally announced September 2022.

  12. arXiv:2209.10052  [pdf, other]

    cs.CL

    Adapting Pretrained Text-to-Text Models for Long Text Sequences

    Authors: Wenhan Xiong, Anchit Gupta, Shubham Toshniwal, Yashar Mehdad, Wen-tau Yih

    Abstract: We present an empirical study of adapting an existing pretrained text-to-text model for long-sequence inputs. Through a comprehensive study along three axes of the pretraining pipeline -- model architecture, optimization objective, and pretraining corpus, we propose an effective recipe to build long-context models from existing short-context models. Specifically, we replace the full attention in t…

    Submitted 16 November, 2022; v1 submitted 20 September, 2022; originally announced September 2022.

  13. arXiv:2205.13016  [pdf, other]

    cs.LG cs.CL

    BiT: Robustly Binarized Multi-distilled Transformer

    Authors: Zechun Liu, Barlas Oguz, Aasish Pappu, Lin Xiao, Scott Yih, Meng Li, Raghuraman Krishnamoorthi, Yashar Mehdad

    Abstract: Modern pre-trained transformers have rapidly advanced the state-of-the-art in machine learning, but have also grown in parameters and computational complexity, making them increasingly difficult to deploy in resource-constrained environments. Binarization of the weights and activations of the network can significantly alleviate these issues, however, is technically challenging from an optimization…

    Submitted 2 October, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: NeurIPS 2022

  14. CONFIT: Toward Faithful Dialogue Summarization with Linguistically-Informed Contrastive Fine-tuning

    Authors: Xiangru Tang, Arjun Nair, Borui Wang, Bingyao Wang, Jai Desai, Aaron Wade, Haoran Li, Asli Celikyilmaz, Yashar Mehdad, Dragomir Radev

    Abstract: Factual inconsistencies in generated summaries severely limit the practical applications of abstractive dialogue summarization. Although significant progress has been achieved by using pre-trained models, substantial amounts of hallucinated content are found during the human evaluation. Pre-trained models are most commonly fine-tuned with cross-entropy loss for text summarization, which may not be…

    Submitted 9 July, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

    Journal ref: NAACL 2022

  15. arXiv:2112.07210  [pdf, other]

    cs.CL

    Simple Local Attentions Remain Competitive for Long-Context Tasks

    Authors: Wenhan Xiong, Barlas Oğuz, Anchit Gupta, Xilun Chen, Diana Liskovich, Omer Levy, Wen-tau Yih, Yashar Mehdad

    Abstract: Many NLP tasks require processing long contexts beyond the length limit of pretrained models. In order to scale these models to longer text sequences, many efficient long-range attention variants have been proposed. Despite the abundance of research along this direction, it is still difficult to gauge the relative effectiveness of these models in practical use cases, e.g., if we apply these models…

    Submitted 3 May, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: NAACL 2022 Main Conference

  16. arXiv:2110.06918  [pdf, other]

    cs.CL cs.IR cs.LG

    Salient Phrase Aware Dense Retrieval: Can a Dense Retriever Imitate a Sparse One?

    Authors: Xilun Chen, Kushal Lakhotia, Barlas Oğuz, Anchit Gupta, Patrick Lewis, Stan Peshterliev, Yashar Mehdad, Sonal Gupta, Wen-tau Yih

    Abstract: Despite their recent popularity and well-known advantages, dense retrievers still lag behind sparse methods such as BM25 in their ability to reliably match salient phrases and rare entities in the query and to generalize to out-of-domain data. It has been argued that this is an inherent limitation of dense models. We rebut this claim by introducing the Salient Phrase Aware Retriever (SPAR), a dens…

    Submitted 11 November, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

  17. arXiv:2109.09195  [pdf, other]

    cs.CL

    Investigating Crowdsourcing Protocols for Evaluating the Factual Consistency of Summaries

    Authors: Xiangru Tang, Alexander Fabbri, Haoran Li, Ziming Mao, Griffin Thomas Adams, Borui Wang, Asli Celikyilmaz, Yashar Mehdad, Dragomir Radev

    Abstract: Current pre-trained models applied to summarization are prone to factual inconsistencies which either misrepresent the source text or introduce extraneous information. Thus, comparing the factual consistency of summaries is necessary as we develop improved models. However, the optimal human evaluation setup for factual consistency has not been standardized. To address this issue, we crowdsourced e…

    Submitted 9 July, 2022; v1 submitted 19 September, 2021; originally announced September 2021.

  18. arXiv:2107.13602  [pdf, other]

    cs.CL cs.IR

    Domain-matched Pre-training Tasks for Dense Retrieval

    Authors: Barlas Oğuz, Kushal Lakhotia, Anchit Gupta, Patrick Lewis, Vladimir Karpukhin, Aleksandra Piktus, Xilun Chen, Sebastian Riedel, Wen-tau Yih, Sonal Gupta, Yashar Mehdad

    Abstract: Pre-training on larger datasets with ever increasing model size is now a proven recipe for increased performance across almost all NLP tasks. A notable exception is information retrieval, where additional pre-training has so far failed to produce convincing results. We show that, with the right pre-training setup, this barrier can be overcome. We demonstrate this by pre-training large bi-encoder m…

    Submitted 28 July, 2021; originally announced July 2021.

  19. arXiv:2106.02134  [pdf, other]

    cs.CL

    Syntax-augmented Multilingual BERT for Cross-lingual Transfer

    Authors: Wasi Uddin Ahmad, Haoran Li, Kai-Wei Chang, Yashar Mehdad

    Abstract: In recent years, we have seen a colossal effort in pre-training multilingual text encoders using large-scale corpora in many languages to facilitate cross-lingual transfer learning. However, due to typological differences across languages, the cross-lingual transfer is challenging. Nevertheless, language syntax, e.g., syntactic dependencies, can bridge the typological gap. Previous works have show…

    Submitted 3 June, 2021; originally announced June 2021.

    Comments: ACL 2021 (camera ready)

  20. arXiv:2106.00829  [pdf, other]

    cs.CL

    ConvoSumm: Conversation Summarization Benchmark and Improved Abstractive Summarization with Argument Mining

    Authors: Alexander R. Fabbri, Faiaz Rahman, Imad Rizvi, Borui Wang, Haoran Li, Yashar Mehdad, Dragomir Radev

    Abstract: While online conversations can cover a vast amount of information in many different formats, abstractive text summarization has primarily focused on modeling solely news articles. This research gap is due, in part, to the lack of standardized datasets for summarizing online discussions. To address this gap, we design annotation protocols motivated by an issues--viewpoints--assertions framework to…

    Submitted 1 June, 2021; originally announced June 2021.

    Comments: ACL 2021

  21. arXiv:2105.06982  [pdf, other]

    cs.CL

    EASE: Extractive-Abstractive Summarization with Explanations

    Authors: Haoran Li, Arash Einolghozati, Srinivasan Iyer, Bhargavi Paranjape, Yashar Mehdad, Sonal Gupta, Marjan Ghazvininejad

    Abstract: Current abstractive summarization systems outperform their extractive counterparts, but their widespread adoption is inhibited by the inherent lack of interpretability. To achieve the best of both worlds, we propose EASE, an extractive-abstractive framework for evidence-based text generation and apply it to document summarization. We present an explainable summarization system based on the Informa…

    Submitted 14 May, 2021; originally announced May 2021.

  22. arXiv:2101.00977  [pdf, other]

    cs.LG

    Towards Understanding the Behaviors of Optimal Deep Active Learning Algorithms

    Authors: Yilun Zhou, Adithya Renduchintala, Xian Li, Sida Wang, Yashar Mehdad, Asish Ghoshal

    Abstract: Active learning (AL) algorithms may achieve better performance with fewer data because the model guides the data selection process. While many algorithms have been proposed, there is little study on what the optimal AL algorithm looks like, which would help researchers understand where their models fall short and iterate on the design. In this paper, we present a simulated annealing algorithm to s…

    Submitted 20 February, 2021; v1 submitted 29 December, 2020; originally announced January 2021.

    Comments: AISTATS 2021

  23. arXiv:2101.00133  [pdf, other]

    cs.CL cs.AI

    NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned

    Authors: Sewon Min, Jordan Boyd-Graber, Chris Alberti, Danqi Chen, Eunsol Choi, Michael Collins, Kelvin Guu, Hannaneh Hajishirzi, Kenton Lee, Jennimaria Palomaki, Colin Raffel, Adam Roberts, Tom Kwiatkowski, Patrick Lewis, Yuxiang Wu, Heinrich Küttler, Linqing Liu, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel, Sohee Yang, Minjoon Seo, Gautier Izacard, Fabio Petroni, Lucas Hosseini, et al. (28 additional authors not shown)

    Abstract: We review the EfficientQA competition from NeurIPS 2020. The competition focused on open-domain question answering (QA), where systems take natural language questions as input and return natural language answers. The aim of the competition was to build systems that can predict correct answers while also satisfying strict on-disk memory budgets. These memory budgets were designed to encourage conte…

    Submitted 19 September, 2021; v1 submitted 31 December, 2020; originally announced January 2021.

    Comments: 26 pages; Published in Proceedings of Machine Learning Research (PMLR), NeurIPS 2020 Competition and Demonstration Track

  24. arXiv:2012.15482  [pdf, other]

    cs.CL

    FiD-Ex: Improving Sequence-to-Sequence Models for Extractive Rationale Generation

    Authors: Kushal Lakhotia, Bhargavi Paranjape, Asish Ghoshal, Wen-tau Yih, Yashar Mehdad, Srinivasan Iyer

    Abstract: Natural language (NL) explanations of model predictions are gaining popularity as a means to understand and verify decisions made by large black-box pre-trained models, for NLP tasks such as Question Answering (QA) and Fact Verification. Recently, pre-trained sequence to sequence (seq2seq) models have proven to be very effective in jointly making predictions, as well as generating NL explanations.…

    Submitted 31 December, 2020; originally announced December 2020.

  25. arXiv:2012.15075  [pdf, other]

    cs.CL

    Human Evaluation of Spoken vs. Visual Explanations for Open-Domain QA

    Authors: Ana Valeria Gonzalez, Gagan Bansal, Angela Fan, Robin Jia, Yashar Mehdad, Srinivasan Iyer

    Abstract: While research on explaining predictions of open-domain QA systems (ODQA) to users is gaining momentum, most works have failed to evaluate the extent to which explanations improve user trust. While few works evaluate explanations using user studies, they employ settings that may deviate from the end-user's usage in-the-wild: ODQA is most ubiquitous in voice-assistants, yet current research only ev…

    Submitted 30 December, 2020; originally announced December 2020.

    Comments: pre-print

  26. arXiv:2012.14610  [pdf, other]

    cs.CL

    UniK-QA: Unified Representations of Structured and Unstructured Knowledge for Open-Domain Question Answering

    Authors: Barlas Oguz, Xilun Chen, Vladimir Karpukhin, Stan Peshterliev, Dmytro Okhonko, Michael Schlichtkrull, Sonal Gupta, Yashar Mehdad, Scott Yih

    Abstract: We study open-domain question answering with structured, unstructured and semi-structured knowledge sources, including text, tables, lists and knowledge bases. Departing from prior work, we propose a unifying approach that homogenizes all sources by reducing them to text and applies the retriever-reader model which has so far been limited to text sources only. Our approach greatly improves the res…

    Submitted 3 May, 2022; v1 submitted 29 December, 2020; originally announced December 2020.

    Comments: NAACL-HLT 2022 Findings

  27. arXiv:2010.12836  [pdf, other]

    cs.CL

    Improving Zero and Few-Shot Abstractive Summarization with Intermediate Fine-tuning and Data Augmentation

    Authors: Alexander R. Fabbri, Simeng Han, Haoyuan Li, Haoran Li, Marjan Ghazvininejad, Shafiq Joty, Dragomir Radev, Yashar Mehdad

    Abstract: Models pretrained with self-supervised objectives on large text corpora achieve state-of-the-art performance on English text summarization tasks. However, these models are typically fine-tuned on hundreds of thousands of data points, an infeasible requirement when applying summarization to new, niche domains. In this work, we introduce a novel and generalizable method, called WikiTransfer, for fin…

    Submitted 11 April, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

    Comments: NAACL 2021

  28. arXiv:2010.10757  [pdf, other]

    cs.CL

    RECONSIDER: Re-Ranking using Span-Focused Cross-Attention for Open Domain Question Answering

    Authors: Srinivasan Iyer, Sewon Min, Yashar Mehdad, Wen-tau Yih

    Abstract: State-of-the-art Machine Reading Comprehension (MRC) models for Open-domain Question Answering (QA) are typically trained for span selection using distantly supervised positive examples and heuristically retrieved negative examples. This training scheme possibly explains empirical observations that these models achieve a high recall amongst their top few predictions, but a low overall accuracy, mo…

    Submitted 21 October, 2020; originally announced October 2020.

  29. arXiv:2010.03546  [pdf, other]

    cs.CL

    Low-Resource Domain Adaptation for Compositional Task-Oriented Semantic Parsing

    Authors: Xilun Chen, Asish Ghoshal, Yashar Mehdad, Luke Zettlemoyer, Sonal Gupta

    Abstract: Task-oriented semantic parsing is a critical component of virtual assistants, which is responsible for understanding the user's intents (set reminder, play music, etc.). Recent advances in deep learning have enabled several approaches to successfully parse more complex queries (Gupta et al., 2018; Rongali et al., 2020), but these models require a large amount of annotated training data to parse que…

    Submitted 7 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020

  30. arXiv:2010.02413  [pdf, other]

    cs.CL cs.AI

    Efficient One-Pass End-to-End Entity Linking for Questions

    Authors: Belinda Z. Li, Sewon Min, Srinivasan Iyer, Yashar Mehdad, Wen-tau Yih

    Abstract: We present ELQ, a fast end-to-end entity linking model for questions, which uses a biencoder to jointly perform mention detection and linking in one pass. Evaluated on WebQSP and GraphQuestions with extended annotations that cover multiple entities per question, ELQ outperforms the previous state of the art by a large margin of +12.7% and +19.6% F1, respectively. With a very fast inference time (1…

    Submitted 5 October, 2020; originally announced October 2020.

    Comments: 9 pages, EMNLP 2020

  31. arXiv:2009.13655  [pdf, other]

    cs.CL cs.LG

    Conversational Semantic Parsing

    Authors: Armen Aghajanyan, Jean Maillard, Akshat Shrivastava, Keith Diedrick, Mike Haeger, Haoran Li, Yashar Mehdad, Ves Stoyanov, Anuj Kumar, Mike Lewis, Sonal Gupta

    Abstract: The structured representation for semantic parsing in task-oriented assistant systems is geared towards simple understanding of one-turn queries. Due to the limitations of the representation, the session-based properties such as co-reference resolution and context carryover are processed downstream in a pipelined system. In this paper, we propose a semantic representation for such task-oriented co…

    Submitted 28 September, 2020; originally announced September 2020.

  32. arXiv:2009.12756  [pdf, other]

    cs.CL

    Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval

    Authors: Wenhan Xiong, Xiang Lorraine Li, Srini Iyer, Jingfei Du, Patrick Lewis, William Yang Wang, Yashar Mehdad, Wen-tau Yih, Sebastian Riedel, Douwe Kiela, Barlas Oğuz

    Abstract: We propose a simple and efficient multi-hop dense retrieval approach for answering complex open-domain questions, which achieves state-of-the-art performance on two multi-hop datasets, HotpotQA and multi-evidence FEVER. Contrary to previous work, our method does not require access to any corpus-specific information, such as inter-document hyperlinks or human-annotated entity markers, and can be ap…

    Submitted 19 February, 2021; v1 submitted 27 September, 2020; originally announced September 2020.

  33. arXiv:2008.09335  [pdf, other]

    cs.CL cs.LG

    MTOP: A Comprehensive Multilingual Task-Oriented Semantic Parsing Benchmark

    Authors: Haoran Li, Abhinav Arora, Shuohui Chen, Anchit Gupta, Sonal Gupta, Yashar Mehdad

    Abstract: Scaling semantic parsing models for task-oriented dialog systems to new languages is often expensive and time-consuming due to the lack of available datasets. Available datasets suffer from several shortcomings: a) they contain few languages b) they contain small amounts of labeled examples per language c) they are based on the simple intent and slot detection paradigm for non-compositional querie…

    Submitted 26 January, 2021; v1 submitted 21 August, 2020; originally announced August 2020.

    Comments: 13 pages, 2 figures, Accepted at EACL 2021

    Journal ref: EACL 2021

  34. arXiv:1707.04596  [pdf, other]

    cs.CL cs.IR

    DocTag2Vec: An Embedding Based Multi-label Learning Approach for Document Tagging

    Authors: Sheng Chen, Akshay Soni, Aasish Pappu, Yashar Mehdad

    Abstract: Tagging news articles or blog posts with relevant tags from a collection of predefined ones is coined as document tagging in this work. Accurate tagging of articles can benefit several downstream applications such as recommendation and search. In this work, we propose a novel yet simple approach called DocTag2Vec to accomplish this task. We substantially extend Word2Vec and Doc2Vec---two popular m…

    Submitted 14 July, 2017; originally announced July 2017.

    Comments: 10 pages

  35. arXiv:1705.05765  [pdf, other]

    cs.AI cs.IR

    Online Article Ranking as a Constrained, Dynamic, Multi-Objective Optimization Problem

    Authors: Jeya Balaji Balasubramanian, Akshay Soni, Yashar Mehdad, Nikolay Laptev

    Abstract: The content ranking problem in a social news website, is typically a function that maximizes a scalar metric of interest like dwell-time. However, like in most real-world applications we are interested in more than one metric---for instance simultaneously maximizing click-through rate, monetization metrics, dwell-time---and also satisfy the traffic requirements promised to different publishers. Al…

    Submitted 16 May, 2017; originally announced May 2017.

    Comments: 7 pages

  36. arXiv:1702.07798  [pdf, other]

    stat.ML cs.LG

    Rank-to-engage: New Listwise Approaches to Maximize Engagement

    Authors: Swayambhoo Jain, Akshay Soni, Nikolay Laptev, Yashar Mehdad

    Abstract: For many internet businesses, presenting a given list of items in an order that maximizes a certain metric of interest (e.g., click-through-rate, average engagement time etc.) is crucial. We approach the aforementioned task from a learning-to-rank perspective which reveals a new problem setup. In traditional learning-to-rank literature, it is implicitly assumed that during the training data genera…

    Submitted 24 February, 2017; originally announced February 2017.

  37. arXiv:1702.05181  [pdf, other]

    cs.IR cs.LG stat.ML

    RIPML: A Restricted Isometry Property based Approach to Multilabel Learning

    Authors: Akshay Soni, Yashar Mehdad

    Abstract: The multilabel learning problem with large number of labels, features, and data-points has generated a tremendous interest recently. A recurring theme of these problems is that only a few labels are active in any given datapoint as compared to the total number of labels. However, only a small number of existing work take direct advantage of this inherent extreme sparsity in the label space. By the…

    Submitted 16 February, 2017; originally announced February 2017.

    Comments: 6 pages

  38. arXiv:1612.00148  [pdf, other]

    cs.CL cs.IR

    Domain Adaptation for Named Entity Recognition in Online Media with Word Embeddings

    Authors: Vivek Kulkarni, Yashar Mehdad, Troy Chevalier

    Abstract: Content on the Internet is heterogeneous and arises from various domains like News, Entertainment, Finance and Technology. Understanding such content requires identifying named entities (persons, places and organizations) as one of the key steps. Traditionally Named Entity Recognition (NER) systems have been built using available annotated datasets (like CoNLL, MUC) and demonstrate excellent perfo…

    Submitted 1 December, 2016; originally announced December 2016.

    Comments: 12 pages, 3 figures, 8 tables arxiv preprint