Showing 1–36 of 36 results for author: Oğuz, B

  1. arXiv:2407.08020  [pdf, other]

    cs.CV

    Interactive Segmentation Model for Placenta Segmentation from 3D Ultrasound images

    Authors: Hao Li, Baris Oguz, Gabriel Arenas, Xing Yao, Jiacheng Wang, Alison Pouch, Brett Byram, Nadav Schwartz, Ipek Oguz

    Abstract: Placenta volume measurement from 3D ultrasound images is critical for predicting pregnancy outcomes, and manual annotation is the gold standard. However, such manual annotation is expensive and time-consuming. Automated segmentation algorithms can often successfully segment the placenta, but these methods may not consistently produce robust segmentations suitable for practical use. Recently, inspi…

    Submitted 10 July, 2024; originally announced July 2024.

  2. arXiv:2405.01525  [pdf, other]

    cs.CL cs.AI

    FLAME: Factuality-Aware Alignment for Large Language Models

    Authors: Sheng-Chieh Lin, Luyu Gao, Barlas Oguz, Wenhan Xiong, Jimmy Lin, Wen-tau Yih, Xilun Chen

    Abstract: Alignment is a standard procedure to fine-tune pre-trained large language models (LLMs) to follow natural language instructions and serve as helpful AI assistants. We have observed, however, that the conventional alignment process fails to enhance the factual accuracy of LLMs, and often leads to the generation of more false facts (i.e. hallucination). In this paper, we study how to make the LLM al…

    Submitted 2 May, 2024; originally announced May 2024.

  3. arXiv:2311.09193  [pdf, other]

    cs.CL cs.AI cs.CV

    The Role of Chain-of-Thought in Complex Vision-Language Reasoning Task

    Authors: Yifan Wu, Pengchuan Zhang, Wenhan Xiong, Barlas Oguz, James C. Gee, Yixin Nie

    Abstract: The study explores the effectiveness of the Chain-of-Thought approach, known for its proficiency in language tasks by breaking them down into sub-tasks and intermediate steps, in improving vision-language tasks that demand sophisticated perception and reasoning. We present the "Description then Decision" strategy, which is inspired by how humans process signals. This strategy significantly improve…

    Submitted 15 November, 2023; originally announced November 2023.

  4. arXiv:2309.16039  [pdf, other]

    cs.CL

    Effective Long-Context Scaling of Foundation Models

    Authors: Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, Hao Ma

    Abstract: We present a series of long-context LLMs that support effective context windows of up to 32,768 tokens. Our model series are built through continual pretraining from Llama 2 with longer training sequences and on a dataset where long texts are upsampled. We perform extensive evaluation on language modeling, synthetic context probing tasks, and a wide range of research benchmarks. On research benchm…

    Submitted 13 November, 2023; v1 submitted 27 September, 2023; originally announced September 2023.

  5. arXiv:2309.15564  [pdf, other]

    cs.LG cs.CL cs.CV

    Jointly Training Large Autoregressive Multimodal Models

    Authors: Emanuele Aiello, Lili Yu, Yixin Nie, Armen Aghajanyan, Barlas Oguz

    Abstract: In recent years, advances in the large-scale pretraining of language and text-to-image models have revolutionized the field of machine learning. Yet, integrating these two modalities into a single, robust model capable of generating seamless multimodal outputs remains a significant challenge. To address this gap, we present the Joint Autoregressive Mixture (JAM) framework, a modular approach that…

    Submitted 28 September, 2023; v1 submitted 27 September, 2023; originally announced September 2023.

  6. arXiv:2308.10382  [pdf, other]

    cs.CV cs.AI

    False Negative/Positive Control for SAM on Noisy Medical Images

    Authors: Xing Yao, Han Liu, Dewei Hu, Daiwei Lu, Ange Lou, Hao Li, Ruining Deng, Gabriel Arenas, Baris Oguz, Nadav Schwartz, Brett C Byram, Ipek Oguz

    Abstract: The Segment Anything Model (SAM) is a recently developed all-range foundation model for image segmentation. It can use sparse manual prompts such as bounding boxes to generate pixel-level segmentation in natural images but struggles in medical images such as low-contrast, noisy ultrasound images. We propose a refined test-phase prompt augmentation technique designed to improve SAM's performance in…

    Submitted 20 August, 2023; originally announced August 2023.

  7. arXiv:2306.04845  [pdf, other]

    cs.CL

    Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts

    Authors: Ganesh Jawahar, Haichuan Yang, Yunyang Xiong, Zechun Liu, Dilin Wang, Fei Sun, Meng Li, Aasish Pappu, Barlas Oguz, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Raghuraman Krishnamoorthi, Vikas Chandra

    Abstract: Weight-sharing supernet has become a vital component for performance estimation in the state-of-the-art (SOTA) neural architecture search (NAS) frameworks. Although supernet can directly generate different subnetworks without retraining, there is no guarantee for the quality of these subnetworks because of weight sharing. In NLP tasks such as machine translation and pre-trained language modeling,…

    Submitted 7 June, 2023; originally announced June 2023.

  8. arXiv:2306.01841  [pdf, other]

    cs.CL

    Binary and Ternary Natural Language Generation

    Authors: Zechun Liu, Barlas Oguz, Aasish Pappu, Yangyang Shi, Raghuraman Krishnamoorthi

    Abstract: Ternary and binary neural networks enable multiplication-free computation and promise multiple orders of magnitude efficiency gains over full-precision networks if implemented on specialized hardware. However, since both the parameter and the output space are highly discretized, such networks have proven very difficult to optimize. The difficulties are compounded for the class of transformer text…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: ACL 2023 Oral

  9. arXiv:2305.17888  [pdf, other]

    cs.CL

    LLM-QAT: Data-Free Quantization Aware Training for Large Language Models

    Authors: Zechun Liu, Barlas Oguz, Changsheng Zhao, Ernie Chang, Pierre Stock, Yashar Mehdad, Yangyang Shi, Raghuraman Krishnamoorthi, Vikas Chandra

    Abstract: Several post-training quantization methods have been applied to large language models (LLMs), and have been shown to perform well down to 8-bits. We find that these methods break down at lower bit precision, and investigate quantization aware training for LLMs (LLM-QAT) to push quantization levels even further. We propose a data-free distillation method that leverages generations produced by the p…

    Submitted 29 May, 2023; originally announced May 2023.

  10. arXiv:2305.14312  [pdf, other]

    cs.CV

    Text-guided 3D Human Generation from 2D Collections

    Authors: Tsu-Jui Fu, Wenhan Xiong, Yixin Nie, Jingyu Liu, Barlas Oğuz, William Yang Wang

    Abstract: 3D human modeling has been widely used for engaging interaction in gaming, film, and animation. The customization of these characters is crucial for creativity and scalability, which highlights the importance of controllability. In this work, we introduce Text-guided 3D Human Generation (T3H), where a model is to generate a 3D human, guided by the fashion description. There are two goals:…

    Submitted 20 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: EMNLP'23 (Findings); Project website: https://text-3dh.github.io/

  11. arXiv:2305.03204  [pdf, other]

    cs.CV cs.CL

    VideoOFA: Two-Stage Pre-Training for Video-to-Text Generation

    Authors: Xilun Chen, Lili Yu, Wenhan Xiong, Barlas Oğuz, Yashar Mehdad, Wen-tau Yih

    Abstract: We propose a new two-stage pre-training framework for video-to-text generation tasks such as video captioning and video question answering: A generative encoder-decoder model is first jointly pre-trained on massive image-text data to learn fundamental vision-language concepts, and then adapted to video data in an intermediate video-text pre-training stage to learn video-specific skills such as spa…

    Submitted 4 May, 2023; originally announced May 2023.

  12. arXiv:2303.16406  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Hierarchical Video-Moment Retrieval and Step-Captioning

    Authors: Abhay Zala, Jaemin Cho, Satwik Kottur, Xilun Chen, Barlas Oğuz, Yashar Mehdad, Mohit Bansal

    Abstract: There is growing interest in searching for information from large video corpora. Prior works have studied relevant tasks, such as text-based video retrieval, moment retrieval, video summarization, and video captioning in isolation, without an end-to-end setup that can jointly search from video corpora and generate summaries. Such an end-to-end setup would allow for many interesting applications, e…

    Submitted 28 March, 2023; originally announced March 2023.

    Comments: CVPR 2023 (15 pages; the first two authors contributed equally; Project website: https://hirest-cvpr2023.github.io)

  13. arXiv:2303.05371  [pdf, other]

    cs.CV cs.GR

    3DGen: Triplane Latent Diffusion for Textured Mesh Generation

    Authors: Anchit Gupta, Wenhan Xiong, Yixin Nie, Ian Jones, Barlas Oğuz

    Abstract: Latent diffusion models for image generation have crossed a quality threshold which enabled them to achieve mass adoption. Recently, a series of works have made advancements towards replicating this success in the 3D domain, introducing techniques such as point cloud VAE, triplane representation, neural implicit surfaces and differentiable rendering based training. We take another step along this…

    Submitted 27 March, 2023; v1 submitted 9 March, 2023; originally announced March 2023.

  14. arXiv:2303.03565  [pdf, other]

    cs.CV

    CLIP-Layout: Style-Consistent Indoor Scene Synthesis with Semantic Furniture Embedding

    Authors: Jingyu Liu, Wenhan Xiong, Ian Jones, Yixin Nie, Anchit Gupta, Barlas Oğuz

    Abstract: Indoor scene synthesis involves automatically picking and placing furniture appropriately on a floor plan, so that the scene looks realistic and is functionally plausible. Such scenes can serve as homes for immersive 3D experiences, or be used to train embodied agents. Existing methods for this task rely on labeled categories of furniture, e.g. bed, chair or table, to generate contextually relevan…

    Submitted 2 June, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

    Comments: Changed paper template and cleaned up tables

  15. arXiv:2302.07452  [pdf, other]

    cs.IR cs.CL

    How to Train Your DRAGON: Diverse Augmentation Towards Generalizable Dense Retrieval

    Authors: Sheng-Chieh Lin, Akari Asai, Minghan Li, Barlas Oguz, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, Xilun Chen

    Abstract: Various techniques have been developed in recent years to improve dense retrieval (DR), such as unsupervised contrastive learning and pseudo-query generation. Existing DRs, however, often suffer from effectiveness tradeoffs between supervised and zero-shot retrieval, which some argue was due to the limited model capacity. We contradict this hypothesis and show that a generalizable DR can be traine…

    Submitted 14 February, 2023; originally announced February 2023.

  16. arXiv:2211.10411  [pdf, other]

    cs.IR cs.CL

    CITADEL: Conditional Token Interaction via Dynamic Lexical Routing for Efficient and Effective Multi-Vector Retrieval

    Authors: Minghan Li, Sheng-Chieh Lin, Barlas Oguz, Asish Ghoshal, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, Xilun Chen

    Abstract: Multi-vector retrieval methods combine the merits of sparse (e.g. BM25) and dense (e.g. DPR) retrievers and have achieved state-of-the-art performance on various retrieval tasks. These methods, however, are orders of magnitude slower and need much more space to store their indices compared to their single-vector counterparts. In this paper, we unify different multi-vector retrieval models from a t…

    Submitted 18 November, 2022; originally announced November 2022.

  17. arXiv:2210.13678  [pdf, other]

    cs.CL cs.IR cs.LG

    Bridging the Training-Inference Gap for Dense Phrase Retrieval

    Authors: Gyuwan Kim, Jinhyuk Lee, Barlas Oguz, Wenhan Xiong, Yizhe Zhang, Yashar Mehdad, William Yang Wang

    Abstract: Building dense retrievers requires a series of standard procedures, including training and validating neural models and creating indexes for efficient search. However, these procedures are often misaligned in that training objectives do not exactly reflect the retrieval scenario at inference time. In this paper, we explore how the gap between training and inference in dense retrieval can be reduce…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: Findings of EMNLP 2022; 12 pages, 3 figures

  18. arXiv:2210.01371  [pdf, other]

    cs.IR cs.CL

    A Study on the Efficiency and Generalization of Light Hybrid Retrievers

    Authors: Man Luo, Shashank Jain, Anchit Gupta, Arash Einolghozati, Barlas Oguz, Debojeet Chatterjee, Xilun Chen, Chitta Baral, Peyman Heidari

    Abstract: Hybrid retrievers can take advantage of both sparse and dense retrievers. Previous hybrid retrievers leverage indexing-heavy dense retrievers. In this work, we study "Is it possible to reduce the indexing memory of hybrid retrievers without sacrificing performance?" Driven by this question, we leverage an indexing-efficient dense retriever (i.e. DrBoost) and introduce a LITE retriever that further…

    Submitted 23 May, 2023; v1 submitted 4 October, 2022; originally announced October 2022.

    Comments: accepted to ACL23

  19. arXiv:2205.13016  [pdf, other]

    cs.LG cs.CL

    BiT: Robustly Binarized Multi-distilled Transformer

    Authors: Zechun Liu, Barlas Oguz, Aasish Pappu, Lin Xiao, Scott Yih, Meng Li, Raghuraman Krishnamoorthi, Yashar Mehdad

    Abstract: Modern pre-trained transformers have rapidly advanced the state-of-the-art in machine learning, but have also grown in parameters and computational complexity, making them increasingly difficult to deploy in resource-constrained environments. Binarization of the weights and activations of the network can significantly alleviate these issues, however, is technically challenging from an optimization…

    Submitted 2 October, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: NeurIPS 2022

  20. arXiv:2112.09924  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    The Web Is Your Oyster - Knowledge-Intensive NLP against a Very Large Web Corpus

    Authors: Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Dmytro Okhonko, Samuel Broscheit, Gautier Izacard, Patrick Lewis, Barlas Oğuz, Edouard Grave, Wen-tau Yih, Sebastian Riedel

    Abstract: In order to address increasing demands of real-world applications, the research for knowledge-intensive NLP (KI-NLP) should advance by capturing the challenges of a truly open-domain environment: web-scale knowledge, lack of structure, inconsistent quality and noise. To this end, we propose a new setup for evaluating existing knowledge intensive tasks in which we generalize the background corpus t…

    Submitted 24 May, 2022; v1 submitted 18 December, 2021; originally announced December 2021.

  21. arXiv:2112.07771  [pdf, other]

    cs.CL cs.IR

    Boosted Dense Retriever

    Authors: Patrick Lewis, Barlas Oğuz, Wenhan Xiong, Fabio Petroni, Wen-tau Yih, Sebastian Riedel

    Abstract: We propose DrBoost, a dense retrieval ensemble inspired by boosting. DrBoost is trained in stages: each component model is learned sequentially and specialized by focusing only on retrieval mistakes made by the current ensemble. The final representation is the concatenation of the output vectors of all the component models, making it a drop-in replacement for standard dense retrievers at test time…

    Submitted 14 December, 2021; originally announced December 2021.

  22. arXiv:2112.07210  [pdf, other]

    cs.CL

    Simple Local Attentions Remain Competitive for Long-Context Tasks

    Authors: Wenhan Xiong, Barlas Oğuz, Anchit Gupta, Xilun Chen, Diana Liskovich, Omer Levy, Wen-tau Yih, Yashar Mehdad

    Abstract: Many NLP tasks require processing long contexts beyond the length limit of pretrained models. In order to scale these models to longer text sequences, many efficient long-range attention variants have been proposed. Despite the abundance of research along this direction, it is still difficult to gauge the relative effectiveness of these models in practical use cases, e.g., if we apply these models…

    Submitted 3 May, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: NAACL 2022 Main Conference

  23. arXiv:2110.07731  [pdf, other]

    cs.CL cs.LG

    CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training

    Authors: Patrick Huber, Armen Aghajanyan, Barlas Oğuz, Dmytro Okhonko, Wen-tau Yih, Sonal Gupta, Xilun Chen

    Abstract: With the rise of large-scale pre-trained language models, open-domain question-answering (ODQA) has become an important research topic in NLP. Based on the popular pre-training fine-tuning approach, we posit that an additional in-domain pre-training stage using a large-scale, natural, and diverse question-answering (QA) dataset can be beneficial for ODQA. Consequently, we propose a novel QA datase…

    Submitted 2 May, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: 9 pages, Findings of NAACL 2022

  24. arXiv:2110.06918  [pdf, other]

    cs.CL cs.IR cs.LG

    Salient Phrase Aware Dense Retrieval: Can a Dense Retriever Imitate a Sparse One?

    Authors: Xilun Chen, Kushal Lakhotia, Barlas Oğuz, Anchit Gupta, Patrick Lewis, Stan Peshterliev, Yashar Mehdad, Sonal Gupta, Wen-tau Yih

    Abstract: Despite their recent popularity and well-known advantages, dense retrievers still lag behind sparse methods such as BM25 in their ability to reliably match salient phrases and rare entities in the query and to generalize to out-of-domain data. It has been argued that this is an inherent limitation of dense models. We rebut this claim by introducing the Salient Phrase Aware Retriever (SPAR), a dens…

    Submitted 11 November, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

  25. arXiv:2107.13602  [pdf, other]

    cs.CL cs.IR

    Domain-matched Pre-training Tasks for Dense Retrieval

    Authors: Barlas Oğuz, Kushal Lakhotia, Anchit Gupta, Patrick Lewis, Vladimir Karpukhin, Aleksandra Piktus, Xilun Chen, Sebastian Riedel, Wen-tau Yih, Sonal Gupta, Yashar Mehdad

    Abstract: Pre-training on larger datasets with ever increasing model size is now a proven recipe for increased performance across almost all NLP tasks. A notable exception is information retrieval, where additional pre-training has so far failed to produce convincing results. We show that, with the right pre-training setup, this barrier can be overcome. We demonstrate this by pre-training large bi-encoder m…

    Submitted 28 July, 2021; originally announced July 2021.

  26. arXiv:2103.06500  [pdf, other]

    cs.CL

    Conversational Answer Generation and Factuality for Reading Comprehension Question-Answering

    Authors: Stan Peshterliev, Barlas Oguz, Debojeet Chatterjee, Hakan Inan, Vikas Bhardwaj

    Abstract: Question answering (QA) is an important use case on voice assistants. A popular approach to QA is extractive reading comprehension (RC) which finds an answer span in a text passage. However, extractive answers are often unnatural in a conversational context which results in suboptimal user experience. In this work, we investigate conversational answer generation for QA. We propose AnswerBART, an e…

    Submitted 11 March, 2021; originally announced March 2021.

  27. arXiv:2101.00133  [pdf, other]

    cs.CL cs.AI

    NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned

    Authors: Sewon Min, Jordan Boyd-Graber, Chris Alberti, Danqi Chen, Eunsol Choi, Michael Collins, Kelvin Guu, Hannaneh Hajishirzi, Kenton Lee, Jennimaria Palomaki, Colin Raffel, Adam Roberts, Tom Kwiatkowski, Patrick Lewis, Yuxiang Wu, Heinrich Küttler, Linqing Liu, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel, Sohee Yang, Minjoon Seo, Gautier Izacard, Fabio Petroni, Lucas Hosseini, et al. (28 additional authors not shown)

    Abstract: We review the EfficientQA competition from NeurIPS 2020. The competition focused on open-domain question answering (QA), where systems take natural language questions as input and return natural language answers. The aim of the competition was to build systems that can predict correct answers while also satisfying strict on-disk memory budgets. These memory budgets were designed to encourage conte…

    Submitted 19 September, 2021; v1 submitted 31 December, 2020; originally announced January 2021.

    Comments: 26 pages; Published in Proceedings of Machine Learning Research (PMLR), NeurIPS 2020 Competition and Demonstration Track

  28. arXiv:2101.00117  [pdf, other]

    cs.CL

    Multi-task Retrieval for Knowledge-Intensive Tasks

    Authors: Jean Maillard, Vladimir Karpukhin, Fabio Petroni, Wen-tau Yih, Barlas Oğuz, Veselin Stoyanov, Gargi Ghosh

    Abstract: Retrieving relevant contexts from a large corpus is a crucial step for tasks such as open-domain question answering and fact checking. Although neural retrieval outperforms traditional methods like tf-idf and BM25, its performance degrades considerably when applied to out-of-domain data. Driven by the question of whether a neural retrieval model can be universal and perform robustly on a wide va…

    Submitted 31 December, 2020; originally announced January 2021.

  29. Joint Verification and Reranking for Open Fact Checking Over Tables

    Authors: Michael Schlichtkrull, Vladimir Karpukhin, Barlas Oğuz, Mike Lewis, Wen-tau Yih, Sebastian Riedel

    Abstract: Structured information is an important knowledge source for automatic verification of factual claims. Nevertheless, the majority of existing research into this task has focused on textual data, and the few recent inquiries into structured data have been for the closed-domain setting where appropriate evidence for each claim is assumed to have already been retrieved. In this paper, we investigate v…

    Submitted 20 August, 2021; v1 submitted 30 December, 2020; originally announced December 2020.

  30. arXiv:2012.14610  [pdf, other]

    cs.CL

    UniK-QA: Unified Representations of Structured and Unstructured Knowledge for Open-Domain Question Answering

    Authors: Barlas Oguz, Xilun Chen, Vladimir Karpukhin, Stan Peshterliev, Dmytro Okhonko, Michael Schlichtkrull, Sonal Gupta, Yashar Mehdad, Scott Yih

    Abstract: We study open-domain question answering with structured, unstructured and semi-structured knowledge sources, including text, tables, lists and knowledge bases. Departing from prior work, we propose a unifying approach that homogenizes all sources by reducing them to text and applies the retriever-reader model which has so far been limited to text sources only. Our approach greatly improves the res…

    Submitted 3 May, 2022; v1 submitted 29 December, 2020; originally announced December 2020.

    Comments: NAACL-HLT 2022 Findings

  31. arXiv:2009.12756  [pdf, other]

    cs.CL

    Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval

    Authors: Wenhan Xiong, Xiang Lorraine Li, Srini Iyer, Jingfei Du, Patrick Lewis, William Yang Wang, Yashar Mehdad, Wen-tau Yih, Sebastian Riedel, Douwe Kiela, Barlas Oğuz

    Abstract: We propose a simple and efficient multi-hop dense retrieval approach for answering complex open-domain questions, which achieves state-of-the-art performance on two multi-hop datasets, HotpotQA and multi-evidence FEVER. Contrary to previous work, our method does not require access to any corpus-specific information, such as inter-document hyperlinks or human-annotated entity markers, and can be ap…

    Submitted 19 February, 2021; v1 submitted 27 September, 2020; originally announced September 2020.

  32. arXiv:2004.04906  [pdf, other]

    cs.CL

    Dense Passage Retrieval for Open-Domain Question Answering

    Authors: Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, Wen-tau Yih

    Abstract: Open-domain question answering relies on efficient passage retrieval to select candidate contexts, where traditional sparse vector space models, such as TF-IDF or BM25, are the de facto method. In this work, we show that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder fra…

    Submitted 30 September, 2020; v1 submitted 10 April, 2020; originally announced April 2020.

    Comments: EMNLP 2020

  33. arXiv:1910.07475  [pdf, other]

    cs.CL cs.AI cs.LG

    MLQA: Evaluating Cross-lingual Extractive Question Answering

    Authors: Patrick Lewis, Barlas Oğuz, Ruty Rinott, Sebastian Riedel, Holger Schwenk

    Abstract: Question answering (QA) models have shown rapid progress enabled by the availability of large, high-quality benchmark datasets. Such annotated datasets are difficult and costly to collect, and rarely exist in languages other than English, making training QA systems in other languages challenging. An alternative to building large monolingual training datasets is to develop cross-lingual systems whi…

    Submitted 3 May, 2020; v1 submitted 16 October, 2019; originally announced October 2019.

    Comments: To appear in ACL 2020

  34. arXiv:1909.07009  [pdf, other]

    cs.CL

    Bridging the domain gap in cross-lingual document classification

    Authors: Guokun Lai, Barlas Oguz, Yiming Yang, Veselin Stoyanov

    Abstract: The scarcity of labeled training data often prohibits the internationalization of NLP models to multiple languages. Recent developments in cross-lingual understanding (XLU) have made progress in this area, trying to bridge the language barrier using language universal representations. However, even if the language problem was resolved, models trained in one language would not transfer to another la…

    Submitted 20 September, 2019; v1 submitted 16 September, 2019; originally announced September 2019.

  35. arXiv:1812.08729  [pdf, other]

    cs.CL

    PyText: A Seamless Path from NLP research to production

    Authors: Ahmed Aly, Kushal Lakhotia, Shicong Zhao, Mrinal Mohit, Barlas Oguz, Abhinav Arora, Sonal Gupta, Christopher Dewan, Stef Nelson-Lindall, Rushin Shah

    Abstract: We introduce PyText - a deep learning based NLP modeling framework built on PyTorch. PyText addresses the often-conflicting requirements of enabling rapid experimentation and of serving models at scale. It achieves this by providing simple and extensible interfaces for model components, and by using PyTorch's capabilities of exporting models for inference via the optimized Caffe2 execution engine.…

    Submitted 12 December, 2018; originally announced December 2018.

  36. arXiv:1107.3166  [pdf, other]

    cs.NI cs.DC cs.SI eess.SY

    Stable, scalable, decentralized P2P file sharing with non-altruistic peers

    Authors: Barlas Oğuz, Venkat Anantharam, Ilkka Norros

    Abstract: P2P systems provide a scalable solution for distributing large files in a network. The file is split into many chunks, and peers contact other peers to collect missing chunks to eventually complete the entire file. The so-called 'rare chunk' phenomenon, where a single chunk becomes rare and prevents peers from completing the file, is a threat to the stability of such systems. Practical systems suc…

    Submitted 15 July, 2011; originally announced July 2011.