Showing 1–50 of 55 results for author: Joshi, M

  1. arXiv:2403.08140  [pdf, other]

    cs.CL

    BAGEL: Bootstrapping Agents by Guiding Exploration with Language

    Authors: Shikhar Murty, Christopher Manning, Peter Shaw, Mandar Joshi, Kenton Lee

    Abstract: Following natural language instructions by executing actions in digital environments (e.g. web-browsers and REST APIs) is a challenging task for language model (LM) agents. Unfortunately, LM agents often fail to generalize to new environments without human demonstrations. This work presents BAGEL, a method for bootstrapping LM agents without human supervision. BAGEL converts a seed set of randomly… ▽ More

    Submitted 8 June, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: ICML 2024 Camera ready version

  2. arXiv:2403.02054  [pdf, other]

    cs.AI

    Large Language Model-Based Evolutionary Optimizer: Reasoning with elitism

    Authors: Shuvayan Brahmachary, Subodh M. Joshi, Aniruddha Panda, Kaushik Koneripalli, Arun Kumar Sagotra, Harshil Patel, Ankush Sharma, Ameya D. Jagtap, Kaushic Kalyanaraman

    Abstract: Large Language Models (LLMs) have demonstrated remarkable reasoning abilities, prompting interest in their application as black-box optimizers. This paper asserts that LLMs possess the capability for zero-shot optimization across diverse scenarios, including multi-objective and high-dimensional problems. We introduce a novel population-based method for numerical optimization using LLMs called Lang… ▽ More

    Submitted 4 March, 2024; originally announced March 2024.

  3. arXiv:2402.19109  [pdf, other]

    stat.ME cs.IT

    Confidence and Assurance of Percentiles

    Authors: Sanjay M. Joshi

    Abstract: Confidence interval of mean is often used when quoting statistics. The same rigor is often missing when quoting percentiles and tolerance or percentile intervals. This article derives the expression for confidence in percentiles of a sample population. Confidence intervals of median is compared to those of mean for a few sample distributions. The concept of assurance from reliability engineering i… ▽ More

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 5 pages, 4 Figures

  4. arXiv:2311.09612  [pdf, other]

    cs.CV cs.CL

    Efficient End-to-End Visual Document Understanding with Rationale Distillation

    Authors: Wang Zhu, Alekh Agarwal, Mandar Joshi, Robin Jia, Jesse Thomason, Kristina Toutanova

    Abstract: Understanding visually situated language requires interpreting complex layouts of textual and visual elements. Pre-processing tools, such as optical character recognition (OCR), can map document image inputs to textual tokens, then large language models (LLMs) can reason over text. However, such methods have high computational and engineering complexity. Can small pretrained image-to-text models a… ▽ More

    Submitted 1 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Accepted by NAACL 2024

  5. arXiv:2311.07897  [pdf, other]

    cs.CL

    CPopQA: Ranking Cultural Concept Popularity by LLMs

    Authors: Ming Jiang, Mansi Joshi

    Abstract: Prior work has demonstrated large language models' (LLMs) potential to discern statistical tendencies within their pre-training corpora. Despite that, many examinations of LLMs' knowledge capacity focus on knowledge explicitly appearing in the training data or implicitly inferable from similar contexts. How well an LLM captures the corpus-level statistical trends of concepts for reasoning, especia… ▽ More

    Submitted 13 November, 2023; originally announced November 2023.

  6. arXiv:2311.07191  [pdf, other]

    cs.AI cs.LG stat.AP

    Applying Large Language Models for Causal Structure Learning in Non Small Cell Lung Cancer

    Authors: Narmada Naik, Ayush Khandelwal, Mohit Joshi, Madhusudan Atre, Hollis Wright, Kavya Kannan, Scott Hill, Giridhar Mamidipudi, Ganapati Srinivasa, Carlo Bifulco, Brian Piening, Kevin Matlock

    Abstract: Causal discovery is becoming a key part in medical AI research. These methods can enhance healthcare by identifying causal links between biomarkers, demographics, treatments and outcomes. They can aid medical professionals in choosing more impactful treatments and strategies. In parallel, Large Language Models (LLMs) have shown great potential in identifying patterns and generating insights from t… ▽ More

    Submitted 13 November, 2023; originally announced November 2023.

  7. arXiv:2306.00245  [pdf, other]

    cs.LG cs.CL cs.CV cs.HC

    From Pixels to UI Actions: Learning to Follow Instructions via Graphical User Interfaces

    Authors: Peter Shaw, Mandar Joshi, James Cohan, Jonathan Berant, Panupong Pasupat, Hexiang Hu, Urvashi Khandelwal, Kenton Lee, Kristina Toutanova

    Abstract: Much of the previous work towards digital agents for graphical user interfaces (GUIs) has relied on text-based representations (derived from HTML or other structured data sources), which are not always readily available. These input representations have been often coupled with custom, task-specific action spaces. This paper focuses on creating agents that interact with the digital world using the… ▽ More

    Submitted 6 December, 2023; v1 submitted 31 May, 2023; originally announced June 2023.

  8. arXiv:2305.18565  [pdf, other]

    cs.CV cs.CL cs.LG

    PaLI-X: On Scaling up a Multilingual Vision and Language Model

    Authors: Xi Chen, Josip Djolonga, Piotr Padlewski, Basil Mustafa, Soravit Changpinyo, Jialin Wu, Carlos Riquelme Ruiz, Sebastian Goodman, Xiao Wang, Yi Tay, Siamak Shakeri, Mostafa Dehghani, Daniel Salz, Mario Lucic, Michael Tschannen, Arsha Nagrani, Hexiang Hu, Mandar Joshi, Bo Pang, Ceslee Montgomery, Paulina Pietrzyk, Marvin Ritter, AJ Piergiovanni, Matthias Minderer, Filip Pavetic, et al. (18 additional authors not shown)

    Abstract: We present the training recipe and results of scaling up PaLI-X, a multilingual vision and language model, both in terms of size of the components and the breadth of its training task mixture. Our model achieves new levels of performance on a wide-range of varied and complex tasks, including multiple image-based captioning and question-answering tasks, image-based document understanding and few-sh… ▽ More

    Submitted 29 May, 2023; originally announced May 2023.

  9. arXiv:2305.16578  [pdf, other]

    cs.IT stat.ME

    Computation of Reliability Statistics for Finite Samples of Success-Failure Experiments

    Authors: Sanjay M. Joshi

    Abstract: Computational method for statistical measures of reliability, confidence, and assurance are available for infinite population size. If the population size is finite and small compared to the number of samples tested, these computational methods need to be improved for a better representation of reality. This article discusses how to compute reliability, confidence, and assurance statistics for fin… ▽ More

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: 6 pages, 4 figures, 1 table

  10. arXiv:2303.07451  [pdf]

    cs.HC cs.AI

    DRISHTI: Visual Navigation Assistant for Visually Impaired

    Authors: Malay Joshi, Aditi Shukla, Jayesh Srivastava, Manya Rastogi

    Abstract: In today's society, where independent living is becoming increasingly important, it can be extremely constricting for those who are blind. Blind and visually impaired (BVI) people face challenges because they need manual support to prompt information about their environment. In this work, we took our first step towards developing an affordable and high-performing eye wearable assistive device, DRI… ▽ More

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: Paper presented at International Conference on Advancements and Key Challenges in Green Energy and Computing (AKGEC 2023) is accepted to be published in the proceedings of the Journal of Physics

  11. arXiv:2302.11154  [pdf, other]

    cs.CV cs.AI cs.CL

    Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities

    Authors: Hexiang Hu, Yi Luan, Yang Chen, Urvashi Khandelwal, Mandar Joshi, Kenton Lee, Kristina Toutanova, Ming-Wei Chang

    Abstract: Large-scale multi-modal pre-training models such as CLIP and PaLI exhibit strong generalization on various visual domains and tasks. However, existing image classification benchmarks often evaluate recognition on a specific domain (e.g., outdoor images) or a specific task (e.g., classifying plant species), which falls short of evaluating whether pre-trained foundational models are universal visual… ▽ More

    Submitted 23 February, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

    Comments: Dataset available at https://open-vision-language.github.io/oven

  12. arXiv:2212.10505  [pdf, other]

    cs.CL cs.AI cs.CV

    DePlot: One-shot visual language reasoning by plot-to-table translation

    Authors: Fangyu Liu, Julian Martin Eisenschlos, Francesco Piccinno, Syrine Krichene, Chenxi Pang, Kenton Lee, Mandar Joshi, Wenhu Chen, Nigel Collier, Yasemin Altun

    Abstract: Visual language such as charts and plots is ubiquitous in the human world. Comprehending plots and charts requires strong reasoning skills. Prior state-of-the-art (SOTA) models require at least tens of thousands of training examples and their reasoning capabilities are still much limited, especially on complex human-written queries. This paper presents the first one-shot solution to visual languag… ▽ More

    Submitted 23 May, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: ACL 2023 (Findings)

  13. arXiv:2212.09662  [pdf, other]

    cs.CL cs.AI cs.CV

    MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering

    Authors: Fangyu Liu, Francesco Piccinno, Syrine Krichene, Chenxi Pang, Kenton Lee, Mandar Joshi, Yasemin Altun, Nigel Collier, Julian Martin Eisenschlos

    Abstract: Visual language data such as plots, charts, and infographics are ubiquitous in the human world. However, state-of-the-art vision-language models do not perform well on these data. We propose MatCha (Math reasoning and Chart derendering pretraining) to enhance visual language models' capabilities in jointly modeling charts/plots and language data. Specifically, we propose several pretraining tasks… ▽ More

    Submitted 23 May, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

    Comments: ACL 2023

  14. arXiv:2212.08022  [pdf, other]

    cs.CY

    iCardo: A Machine Learning Based Smart Healthcare Framework for Cardiovascular Disease Prediction

    Authors: Nidhi Sinha, Teena Jangid, Amit M. Joshi, Saraju P. Mohanty

    Abstract: The point of care services and medication have become simpler with efficient consumer electronics devices in a smart healthcare system. Cardiovascular disease is a critical illness which causes heart failure, and early and prompt identification can lessen damage and prevent premature mortality. Machine learning has been used to predict cardiovascular disease (CVD) in the literature. The article ex… ▽ More

    Submitted 7 December, 2022; originally announced December 2022.

    Comments: 19 Pages, 9 Figures, 5 Tables

  15. arXiv:2211.07893  [pdf, other]

    cs.LG cs.AI cs.CR cs.DC

    Federated Learning for Healthcare Domain - Pipeline, Applications and Challenges

    Authors: Madhura Joshi, Ankit Pal, Malaikannan Sankarasubbu

    Abstract: Federated learning is the process of developing machine learning models over datasets distributed across data centers such as hospitals, clinical research labs, and mobile devices while preventing data leakage. This survey examines previous research and studies on federated learning in the healthcare sector across a range of use cases and applications. Our survey shows what challenges, methods, an… ▽ More

    Submitted 19 November, 2022; v1 submitted 14 November, 2022; originally announced November 2022.

    Comments: ACM Transactions on Computing for Healthcare, Vol. 3, No. 4, Article 40. Publication date: October 2022

    Journal ref: ACM Transactions on Computing for Healthcare, Vol. 3, No. 4, Article 40. Publication date: October 2022

  16. arXiv:2210.03347  [pdf, other]

    cs.CL cs.CV

    Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding

    Authors: Kenton Lee, Mandar Joshi, Iulia Turc, Hexiang Hu, Fangyu Liu, Julian Eisenschlos, Urvashi Khandelwal, Peter Shaw, Ming-Wei Chang, Kristina Toutanova

    Abstract: Visually-situated language is ubiquitous -- sources range from textbooks with diagrams to web pages with images and tables, to mobile apps with buttons and forms. Perhaps due to this diversity, previous work has typically relied on domain-specific recipes with limited sharing of the underlying data, model architectures, and objectives. We present Pix2Struct, a pretrained image-to-text model for pu… ▽ More

    Submitted 15 June, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

    Comments: Accepted at ICML

  17. OysterSim: Underwater Simulation for Enhancing Oyster Reef Monitoring

    Authors: Xiaomin Lin, Nitesh Jha, Mayank Joshi, Nare Karapetyan, Yiannis Aloimonos, Miao Yu

    Abstract: Oysters are the living vacuum cleaners of the oceans. There is an exponential decline in the oyster population due to over-harvesting. With the current development of the automation and AI, robots are becoming an integral part of the environmental monitoring process that can be also utilized for oyster reef preservation. Nevertheless, the underwater environment poses many difficulties, both from t… ▽ More

    Submitted 19 September, 2022; originally announced September 2022.

    Journal ref: OCEANS 2022, Hampton Roads, 2022, pp. 1-6

  18. arXiv:2205.04050  [pdf, other]

    cs.CL

    Few-shot Mining of Naturally Occurring Inputs and Outputs

    Authors: Mandar Joshi, Terra Blevins, Mike Lewis, Daniel S. Weld, Luke Zettlemoyer

    Abstract: Creating labeled natural language training data is expensive and requires significant human effort. We mine input output examples from large corpora using a supervised mining function trained using a small seed set of only 100 examples. The mining consists of two stages -- (1) a biencoder-based recall-oriented dense search which pairs inputs with potential outputs, and (2) a crossencoder-based fil… ▽ More

    Submitted 9 May, 2022; originally announced May 2022.

  19. arXiv:2204.07496  [pdf, other]

    cs.CL cs.IR

    Improving Passage Retrieval with Zero-Shot Question Generation

    Authors: Devendra Singh Sachan, Mike Lewis, Mandar Joshi, Armen Aghajanyan, Wen-tau Yih, Joelle Pineau, Luke Zettlemoyer

    Abstract: We propose a simple and effective re-ranking method for improving passage retrieval in open question answering. The re-ranker re-scores retrieved passages with a zero-shot question generation model, which uses a pre-trained language model to compute the probability of the input question conditioned on a retrieved passage. This approach can be applied on top of any retrieval method (e.g. neural or… ▽ More

    Submitted 2 April, 2023; v1 submitted 15 April, 2022; originally announced April 2022.

    Comments: EMNLP 2022 camera-ready version. Code is available at: https://github.com/DevSinghSachan/unsupervised-passage-reranking

  20. arXiv:2203.01412  [pdf, other]

    cs.CV

    Effect of Timing Error: A Case Study of Navigation Camera

    Authors: Sandeep S. Kulkarni, Sanjay M. Joshi

    Abstract: We focus on the problem of timing errors in navigation camera as a case study in a broader problem of the effect of a timing error in cyber-physical systems. These systems rely on the requirement that certain things happen at the same time or certain things happen periodically at some period $T$. However, as these systems get more complex, timing errors can occur between the components thereby vio… ▽ More

    Submitted 1 March, 2022; originally announced March 2022.

  21. arXiv:2202.06744  [pdf]

    cs.DC

    Overhead Management in Multi-Core Environment

    Authors: Urmila Shrawankar, Mayuri Joshi

    Abstract: In multi-core systems, various factors like inter-process communication, dependency, resource sharing and scheduling, level of parallelism, synchronization, number of available cores etc. influence the extent of possible High Performance Computing parallelization. These parameters if not managed to the root level, later surface as overheads during execution. This paper emphasizes on these paramete… ▽ More

    Submitted 31 January, 2022; originally announced February 2022.

    Comments: 06 pages, 05 figures, 03 tables

  22. arXiv:2201.07520  [pdf, other]

    cs.CL

    CM3: A Causal Masked Multimodal Model of the Internet

    Authors: Armen Aghajanyan, Bernie Huang, Candace Ross, Vladimir Karpukhin, Hu Xu, Naman Goyal, Dmytro Okhonko, Mandar Joshi, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer

    Abstract: We introduce CM3, a family of causally masked generative models trained over a large corpus of structured multi-modal documents that can contain both text and image tokens. Our new causally masked approach generates tokens left to right while also masking out a small number of long token spans that are generated at the end of the string, instead of their original positions. The causal masking obje… ▽ More

    Submitted 19 January, 2022; originally announced January 2022.

  23. arXiv:2111.11298  [pdf, other]

    eess.SP cs.HC cs.LG

    Novel EEG based Schizophrenia Detection with IoMT Framework for Smart Healthcare

    Authors: Geetanjali Sharma, Amit M. Joshi

    Abstract: In the field of neuroscience, brain activity analysis is always considered an important area. Schizophrenia (Sz) is a brain disorder that severely affects the thinking, behaviour, and feelings of people all around the world. Electroencephalography (EEG) is proved to be an efficient biomarker in Sz detection. EEG is a non-linear time-series signal and utilizing it for investigation is rather cru… ▽ More

    Submitted 19 November, 2021; originally announced November 2021.

    Comments: 18 pages, 9 Figures

  24. arXiv:2109.04194  [pdf, other]

    cs.HC eess.SP

    Novel Time Domain Based Upper-Limb Prosthesis Control using Incremental Learning Approach

    Authors: Sidharth Pancholi, Amit M. Joshi, Deepak Joshi, Bradly S. Duerstock

    Abstract: The upper limb of the body is a vital for various kind of activities for human. The complete or partial loss of the upper limb would lead to a significant impact on daily activities of the amputees. EMG carries important information of human physique which helps to decode the various functionalities of human arm. EMG signal based bionics and prosthesis have gained huge research attention over the… ▽ More

    Submitted 13 January, 2024; v1 submitted 25 August, 2021; originally announced September 2021.

    Comments: 15 Pages, 8 Figures, This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  25. arXiv:2107.08514  [pdf, other]

    eess.SP cs.HC cs.LG q-bio.NC

    Classification of Upper Arm Movements from EEG signals using Machine Learning with ICA Analysis

    Authors: Pranali Kokate, Sidharth Pancholi, Amit M. Joshi

    Abstract: The Brain-Computer Interface system is a profoundly developing area of experimentation for Motor activities which plays vital role in decoding cognitive activities. Classification of Cognitive-Motor Imagery activities from EEG signals is a critical task. Hence proposed a unique algorithm for classifying left/right-hand movements by utilizing Multi-layer Perceptron Neural Network. Handcrafted stati… ▽ More

    Submitted 18 July, 2021; originally announced July 2021.

    Comments: 41 Pages, Figures 32, Table 9

  26. arXiv:2107.06955  [pdf, ps, other]

    cs.CL cs.LG

    HTLM: Hyper-Text Pre-Training and Prompting of Language Models

    Authors: Armen Aghajanyan, Dmytro Okhonko, Mike Lewis, Mandar Joshi, Hu Xu, Gargi Ghosh, Luke Zettlemoyer

    Abstract: We introduce HTLM, a hyper-text language model trained on a large-scale web crawl. Modeling hyper-text has a number of advantages: (1) it is easily gathered at scale, (2) it provides rich document-level and end-task-adjacent supervision (e.g. class and id attributes often encode document category information), and (3) it allows for new structured prompting that follows the established semantics of… ▽ More

    Submitted 14 July, 2021; originally announced July 2021.

  27. arXiv:2106.05365  [pdf, other]

    cs.CL cs.AI

    DESCGEN: A Distantly Supervised Dataset for Generating Abstractive Entity Descriptions

    Authors: Weijia Shi, Mandar Joshi, Luke Zettlemoyer

    Abstract: Short textual descriptions of entities provide summaries of their key attributes and have been shown to be useful sources of background knowledge for tasks such as entity linking and question answering. However, generating entity descriptions, especially for new and long-tail entities, can be challenging since relevant information is often scattered across multiple sources with varied content and… ▽ More

    Submitted 16 June, 2021; v1 submitted 9 June, 2021; originally announced June 2021.

    Journal ref: ACL-IJCNLP 2021

  28. arXiv:2106.04192  [pdf, other]

    cs.CL

    Realistic Evaluation Principles for Cross-document Coreference Resolution

    Authors: Arie Cattan, Alon Eirew, Gabriel Stanovsky, Mandar Joshi, Ido Dagan

    Abstract: We point out that common evaluation practices for cross-document coreference resolution have been unrealistically permissive in their assumed settings, yielding inflated results. We propose addressing this issue via two evaluation methodology principles. First, as in other tasks, models should be evaluated on predicted mentions rather than on gold mentions. Doing this raises a subtle issue regardi… ▽ More

    Submitted 8 June, 2021; originally announced June 2021.

    Comments: *SEM 2021

  29. arXiv:2106.01210  [pdf, other]

    cs.CL

    Cross-document Coreference Resolution over Predicted Mentions

    Authors: Arie Cattan, Alon Eirew, Gabriel Stanovsky, Mandar Joshi, Ido Dagan

    Abstract: Coreference resolution has been mostly investigated within a single document scope, showing impressive progress in recent years based on end-to-end models. However, the more challenging task of cross-document (CD) coreference resolution remained relatively under-explored, with the few recent models applied only to gold mentions. Here, we introduce the first end-to-end model for CD coreference reso… ▽ More

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: Findings of ACL 2021

  30. arXiv:2102.09866  [pdf]

    cs.CL cs.LG

    KBCNMUJAL@HASOC-Dravidian-CodeMix-FIRE2020: Using Machine Learning for Detection of Hate Speech and Offensive Code-Mixed Social Media text

    Authors: Varsha Pathak, Manish Joshi, Prasad Joshi, Monica Mundada, Tanmay Joshi

    Abstract: This paper describes the system submitted by our team, KBCNMUJAL, for Task 2 of the shared task Hate Speech and Offensive Content Identification in Indo-European Languages (HASOC), at Forum for Information Retrieval Evaluation, December 16-20, 2020, Hyderabad, India. The datasets of two Dravidian languages Viz. Malayalam and Tamil of size 4000 observations, each were shared by the HASOC organizers… ▽ More

    Submitted 19 February, 2021; originally announced February 2021.

  31. arXiv:2102.07983  [pdf, other]

    cs.CL

    FEWS: Large-Scale, Low-Shot Word Sense Disambiguation with the Dictionary

    Authors: Terra Blevins, Mandar Joshi, Luke Zettlemoyer

    Abstract: Current models for Word Sense Disambiguation (WSD) struggle to disambiguate rare senses, despite reaching human performance on global WSD metrics. This stems from a lack of data for both modeling and evaluating rare senses in existing WSD datasets. In this paper, we introduce FEWS (Few-shot Examples of Word Senses), a new low-shot WSD dataset automatically extracted from example sentences in Wikti… ▽ More

    Submitted 16 February, 2021; originally announced February 2021.

    Comments: EACL 2021

  32. arXiv:2011.03713  [pdf, other]

    cs.CL

    Naturalization of Text by the Insertion of Pauses and Filler Words

    Authors: Richa Sharma, Parth Vipul Shah, Ashwini M. Joshi

    Abstract: In this article, we introduce a set of methods to naturalize text based on natural human speech. Voice-based interactions provide a natural way of interfacing with electronic systems and are seeing a widespread adaptation of late. These computerized voices can be naturalized to some degree by inserting pauses and filler words at appropriate positions. The first proposed text transformation method… ▽ More

    Submitted 7 November, 2020; originally announced November 2020.

    Comments: Keywords: Text transformation, natural speech, bigram, RNN, filler words

  33. arXiv:2010.03378  [pdf]

    eess.IV cs.CV cs.LG

    Descriptive analysis of computational methods for automating mammograms with practical applications

    Authors: Aparna Bhale, Manish Joshi

    Abstract: Mammography is a vital screening technique for early revealing and identification of breast cancer in order to assist to decrease mortality rate. Practical applications of mammograms are not limited to breast cancer revealing and identification, but include task based lens design, image compression, image classification, content based image retrieval and a host of others. Mammography computational an… ▽ More

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: 33 pages and 2 Figures. A review paper of the research work related to mammography

  34. arXiv:2009.11032  [pdf, other]

    cs.CL

    Streamlining Cross-Document Coreference Resolution: Evaluation and Modeling

    Authors: Arie Cattan, Alon Eirew, Gabriel Stanovsky, Mandar Joshi, Ido Dagan

    Abstract: Recent evaluation protocols for Cross-document (CD) coreference resolution have often been inconsistent or lenient, leading to incomparable results across works and overestimation of performance. To facilitate proper future research on this task, our primary contribution is proposing a pragmatic evaluation methodology which assumes access to only raw text -- rather than assuming gold mentions, dis… ▽ More

    Submitted 23 October, 2020; v1 submitted 23 September, 2020; originally announced September 2020.

  35. arXiv:2008.12905  [pdf, other]

    cs.DB cs.DS

    Batching and Matching for Food Delivery in Dynamic Road Networks

    Authors: Manas Joshi, Arshdeep Singh, Sayan Ranu, Amitabha Bagchi, Priyank Karia, Puneet Kala

    Abstract: Given a stream of food orders and available delivery vehicles, how should orders be assigned to vehicles so that the delivery time is minimized? Several decisions have to be made: (1) assignment of orders to vehicles, (2) grouping orders into batches to cope with limited vehicle availability, and (3) adapting to dynamic positions of delivery vehicles. We show that the minimization problem is not o… ▽ More

    Submitted 28 August, 2020; originally announced August 2020.

    Comments: 12 pages, 9 figures, Accepted in ICDE 2021 as Short Paper

  36. arXiv:2005.00652  [pdf, other]

    cs.CL cs.LG

    An Information Bottleneck Approach for Controlling Conciseness in Rationale Extraction

    Authors: Bhargavi Paranjape, Mandar Joshi, John Thickstun, Hannaneh Hajishirzi, Luke Zettlemoyer

    Abstract: Decisions of complex language understanding models can be rationalized by limiting their inputs to a relevant subsequence of the original text. A rationale should be as concise as possible without significantly degrading task performance, but this balance can be difficult to achieve in practice. In this paper, we show that it is possible to better manage this trade-off by optimizing a bound on the… ▽ More

    Submitted 2 November, 2020; v1 submitted 1 May, 2020; originally announced May 2020.

    Comments: EMNLP 2020 main track accepted paper

  37. arXiv:2004.12006  [pdf, other]

    cs.CL

    Contextualized Representations Using Textual Encyclopedic Knowledge

    Authors: Mandar Joshi, Kenton Lee, Yi Luan, Kristina Toutanova

    Abstract: We present a method to represent input texts by contextualizing them jointly with dynamically retrieved textual encyclopedic background knowledge from multiple documents. We apply our method to reading comprehension tasks by encoding questions and passages together with background sentences about the entities they mention. We show that integrating background knowledge from text is effective for ta… ▽ More

    Submitted 13 July, 2021; v1 submitted 24 April, 2020; originally announced April 2020.

    Comments: Added experiments comparing linkers

  38. arXiv:1911.03052  [pdf, ps, other]

    cs.CV

    A Novel Approach for Partial Fingerprint Identification to Mitigate MasterPrint Generation

    Authors: Mahesh Joshi, Bodhisatwa Mazumdar, Somnath Dey

    Abstract: Partial fingerprint recognition is a method to recognize an individual when the sensor size has a small form factor in accepting a full fingerprint. It is also used in forensic research to identify the partial fingerprints collected from the crime scenes. But the distinguishing features in the partial fingerprint are relatively low due to small fingerprint captured by the sensor. Hence, the unique… ▽ More

    Submitted 8 November, 2019; originally announced November 2019.

  39. arXiv:1910.07233  [pdf]

    cs.IR

    Rule based Approach for Word Normalization by resolving Transcription Ambiguity in Transliterated Search Queries

    Authors: Varsha Pathak, Manish Joshi

    Abstract: Query term matching with document term matching is the basic function of any best effort Information Retrieval models like Vector Space Model. In our problem of SMS based Information Systems we expect common people to participate in information search. Our system allows mobile users to formulate their queries in their own words, own transliteration style and spelling formation. To achieve this fle… ▽ More

    Submitted 16 October, 2019; originally announced October 2019.

    Comments: 11 pages, 2 figures, 2 tables, Unpublished

    MSC Class: 68T35

  40. arXiv:1908.09091  [pdf, ps, other]

    cs.CL

    BERT for Coreference Resolution: Baselines and Analysis

    Authors: Mandar Joshi, Omer Levy, Daniel S. Weld, Luke Zettlemoyer

    Abstract: We apply BERT to coreference resolution, achieving strong improvements on the OntoNotes (+3.9 F1) and GAP (+11.5 F1) benchmarks. A qualitative analysis of model predictions indicates that, compared to ELMo and BERT-base, BERT-large is particularly better at distinguishing between related but distinct entities (e.g., President and CEO). However, there is still room for improvement in modeling docum… ▽ More

    Submitted 22 December, 2019; v1 submitted 24 August, 2019; originally announced August 2019.

    Comments: Fix test set numbers for e2e-coref on GAP

  41. arXiv:1907.11692  [pdf, ps, other]

    cs.CL

    RoBERTa: A Robustly Optimized BERT Pretraining Approach

    Authors: Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov

    Abstract: Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging. Training is computationally expensive, often done on private datasets of different sizes, and, as we will show, hyperparameter choices have significant impact on the final results. We present a replication study of BERT pretraining (Devlin et al., 2019) that caref… ▽ More

    Submitted 26 July, 2019; originally announced July 2019.

  42. arXiv:1907.10529  [pdf, other]

    cs.CL cs.LG

    SpanBERT: Improving Pre-training by Representing and Predicting Spans

    Authors: Mandar Joshi, Danqi Chen, Yinhan Liu, Daniel S. Weld, Luke Zettlemoyer, Omer Levy

    Abstract: We present SpanBERT, a pre-training method that is designed to better represent and predict spans of text. Our approach extends BERT by (1) masking contiguous random spans, rather than random tokens, and (2) training the span boundary representations to predict the entire content of the masked span, without relying on the individual token representations within it. SpanBERT consistently outperform…

    Submitted 17 January, 2020; v1 submitted 24 July, 2019; originally announced July 2019.

    Comments: Accepted at TACL

  43. arXiv:1907.02014  [pdf, other]

    cs.OH cs.CV cs.CY

    Using AI for Economic Upliftment of Handicraft Industry

    Authors: Nitya Raviprakash, Sonam Damani, Ankush Chatterjee, Meghana Joshi, Puneet Agrawal

    Abstract: The handicraft industry is a strong pillar of Indian economy which provides large-scale employment opportunities to artisans in rural and underprivileged communities. However, in this era of globalization, diverse modern designs have rendered traditional designs old and monotonous, causing an alarming decline of handicraft sales. For this age-old industry to survive the global competition, it is i…

    Submitted 31 May, 2019; originally announced July 2019.

  44. arXiv:1906.00606  [pdf, other]

    cs.HC cs.AI

    An Extensive Review of Computational Dance Automation Techniques and Applications

    Authors: Manish Joshi, Sangeeta Jadhav

    Abstract: Dance is an art and when technology meets this kind of art, it's a novel attempt in itself. Several researchers have attempted to automate several aspects of dance, right from dance notation to choreography. Furthermore, we have encountered several applications of dance automation like e-learning, heritage preservation, etc. Despite several attempts by researchers for more than two decades in vari…

    Submitted 3 June, 2019; originally announced June 2019.

    Comments: 15 pages, 6 figures

  45. arXiv:1810.12097  [pdf, other]

    cs.CL

    Ruuh: A Deep Learning Based Conversational Social Agent

    Authors: Sonam Damani, Nitya Raviprakash, Umang Gupta, Ankush Chatterjee, Meghana Joshi, Khyatti Gupta, Kedhar Nath Narahari, Puneet Agrawal, Manoj Kumar Chinnakotla, Sneha Magapu, Abhishek Mathur

    Abstract: Dialogue systems and conversational agents are becoming increasingly popular in the modern society but building an agent capable of holding intelligent conversation with its users is a challenging problem for artificial intelligence. In this demo, we demonstrate a deep learning based conversational social agent called "Ruuh" (facebook.com/Ruuh) designed by a team at Microsoft India to converse on…

    Submitted 22 October, 2018; originally announced October 2018.

    Comments: 2 pages, 1 figure

  46. arXiv:1810.08854  [pdf, other]

    cs.CL

    pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference

    Authors: Mandar Joshi, Eunsol Choi, Omer Levy, Daniel S. Weld, Luke Zettlemoyer

    Abstract: Reasoning about implied relationships (e.g., paraphrastic, common sense, encyclopedic) between pairs of words is crucial for many cross-sentence inference problems. This paper proposes new methods for learning and using embeddings of word pairs that implicitly represent background knowledge about such relationships. Our pairwise embeddings are computed as a compositional function on word represent…

    Submitted 5 April, 2019; v1 submitted 20 October, 2018; originally announced October 2018.

    Comments: NAACL camera ready

  47. arXiv:1805.07116  [pdf, other]

    cs.CR

    Security Vulnerabilities Against Fingerprint Biometric System

    Authors: Mahesh Joshi, Bodhisatwa Mazumdar, Somnath Dey

    Abstract: The biometric system is an automatic identification and authentication system that uses unique biological traits, such as fingerprint, face, iris, voice, retina, etc. of an individual. Of all these systems, fingerprint biometric system is the most widely used because of its low cost, high matching speed, and relatively high matching accuracy. Due to the high efficiency of fingerprint biometric sys…

    Submitted 18 May, 2018; originally announced May 2018.

  48. arXiv:1705.06338  [pdf, other]

    cs.IR cs.AI

    Distributed Vector Representation Of Shopping Items, The Customer And Shopping Cart To Build A Three Fold Recommendation System

    Authors: Bibek Behera, Manoj Joshi, Abhilash KK, Mohammad Ansari Ismail

    Abstract: The main idea of this paper is to represent shopping items through vectors because these vectors act as the base for building embeddings for customers and shopping carts. Also, these vectors are input to the mathematical models that act as either a recommendation engine or help in targeting potential customers. We have used exponential family embeddings as the tool to construct two basic vectors…

    Submitted 17 May, 2017; originally announced May 2017.

    Comments: Cicling 2017

  49. arXiv:1705.03551  [pdf, other]

    cs.CL

    TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension

    Authors: Mandar Joshi, Eunsol Choi, Daniel S. Weld, Luke Zettlemoyer

    Abstract: We present TriviaQA, a challenging reading comprehension dataset containing over 650K question-answer-evidence triples. TriviaQA includes 95K question-answer pairs authored by trivia enthusiasts and independently gathered evidence documents, six per question on average, that provide high quality distant supervision for answering the questions. We show that, in comparison to other recently introduc…

    Submitted 13 May, 2017; v1 submitted 9 May, 2017; originally announced May 2017.

    Comments: Added references, fixed typos, minor baseline update

  50. arXiv:1512.01755  [pdf, ps, other]

    physics.comp-ph cs.CE math.NA

    A post-processing technique for stabilizing the discontinuous pressure projection operator in marginally-resolved incompressible inviscid flow

    Authors: Sumedh M. Joshi, Peter J. Diamessis, Derek T. Steinmoeller, Marek Stastna, Greg N. Thomsen

    Abstract: A method for post-processing the velocity after a pressure projection is developed that helps to maintain stability in an under-resolved, inviscid, discontinuous element-based simulation for use in environmental fluid mechanics process studies. The post-processing method is needed because of spurious divergence growth at element interfaces due to the discontinuous nature of the discretization used…

    Submitted 6 December, 2015; originally announced December 2015.