Showing 1–20 of 20 results for author: Mitamura, T

  1. arXiv:2406.19236  [pdf, other]

    cs.AI cs.CV cs.RO

    Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions

    Authors: Minghan Li, Heng Li, Zhi-Qi Cheng, Yifei Dong, Yuxuan Zhou, Jun-Yan He, Qi Dai, Teruko Mitamura, Alexander G. Hauptmann

    Abstract: Vision-and-Language Navigation (VLN) aims to develop embodied agents that navigate based on human instructions. However, current VLN frameworks often rely on static environments and optimal expert supervision, limiting their real-world applicability. To address this, we introduce Human-Aware Vision-and-Language Navigation (HA-VLN), extending traditional VLN by incorporating dynamic human activitie…

    Submitted 4 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: 30 pages, 18 figures, Project Page: https://lpercc.github.io/HA3D_simulator/

  2. arXiv:2405.13954  [pdf, other]

    cs.LG cs.AI cs.CL

    What is Your Data Worth to GPT? LLM-Scale Data Valuation with Influence Functions

    Authors: Sang Keun Choe, Hwijeen Ahn, Juhan Bae, Kewen Zhao, Minsoo Kang, Youngseog Chung, Adithya Pratapa, Willie Neiswanger, Emma Strubell, Teruko Mitamura, Jeff Schneider, Eduard Hovy, Roger Grosse, Eric Xing

    Abstract: Large language models (LLMs) are trained on a vast amount of human-written data, but data providers often remain uncredited. In response to this issue, data valuation (or data attribution), which quantifies the contribution or value of each data to the model output, has been discussed as a potential solution. Nevertheless, applying existing data valuation methods to recent LLMs and their vast trai…

    Submitted 22 May, 2024; originally announced May 2024.

  3. arXiv:2403.00990  [pdf, other]

    cs.CL

    Formulation Comparison for Timeline Construction using LLMs

    Authors: Kimihiro Hasegawa, Nikhil Kandukuri, Susan Holm, Yukari Yamakawa, Teruko Mitamura

    Abstract: Constructing a timeline requires identifying the chronological order of events in an article. In prior timeline construction datasets, temporal orders are typically annotated by either event-to-time anchoring or event-to-event pairwise ordering, both of which suffer from missing temporal information. To mitigate the issue, we develop a new evaluation dataset, TimeSET, consisting of single-document…

    Submitted 1 March, 2024; originally announced March 2024.

  4. arXiv:2304.02173  [pdf, other]

    cs.CV cs.AI cs.MM

    ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules

    Authors: Zhi-Qi Cheng, Qi Dai, Siyao Li, Jingdong Sun, Teruko Mitamura, Alexander G. Hauptmann

    Abstract: Charts are a powerful tool for visually conveying complex data, but their comprehension poses a challenge due to the diverse chart types and intricate components. Existing chart comprehension methods suffer from either heuristic rules or an over-reliance on OCR systems, resulting in suboptimal performance. To address these issues, we present ChartReader, a unified framework that seamlessly integra…

    Submitted 4 April, 2023; originally announced April 2023.

  5. arXiv:2302.04197  [pdf, ps, other]

    cs.CL

    Hierarchical Event Grounding

    Authors: Jiefu Ou, Adithya Pratapa, Rishubh Gupta, Teruko Mitamura

    Abstract: Event grounding aims at linking mention references in text corpora to events from a knowledge base (KB). Previous work on this task focused primarily on linking to a single KB event, thereby overlooking the hierarchical aspects of events. Events in documents are typically described at various levels of spatio-temporal granularity (Glavas et al. 2014). These hierarchical relations are utilized in d…

    Submitted 8 February, 2023; originally announced February 2023.

    Comments: Accepted to AAAI 2023

  6. arXiv:2208.08965  [pdf, other]

    cs.CV cs.AI cs.CL cs.MM

    GSRFormer: Grounded Situation Recognition Transformer with Alternate Semantic Attention Refinement

    Authors: Zhi-Qi Cheng, Qi Dai, Siyao Li, Teruko Mitamura, Alexander G. Hauptmann

    Abstract: Grounded Situation Recognition (GSR) aims to generate structured semantic summaries of images for "human-like" event understanding. Specifically, the GSR task not only detects the salient activity verb (e.g. buying), but also predicts all corresponding semantic roles (e.g. agent and goods). Inspired by object detection and image captioning tasks, existing methods typically employ a two-stage framework…

    Submitted 28 November, 2022; v1 submitted 18 August, 2022; originally announced August 2022.

    Comments: ACM Multimedia 2022 (Oral), Code: https://github.com/zhiqic/GSRFormer

  7. arXiv:2204.06535  [pdf, other]

    cs.CL

    Multilingual Event Linking to Wikidata

    Authors: Adithya Pratapa, Rishubh Gupta, Teruko Mitamura

    Abstract: We present a task of multilingual linking of events to a knowledge base. We automatically compile a large-scale dataset for this task, comprising 1.8M mentions across 44 languages referring to over 10.9K events from Wikidata. We propose two variants of the event linking task: 1) multilingual, where event descriptions are from the same language as the mention, and 2) crosslingual, where all even…

    Submitted 16 July, 2022; v1 submitted 13 April, 2022; originally announced April 2022.

    Comments: Camera-ready for Multilingual Information Access workshop at NAACL 2022

  8. arXiv:2109.06417  [pdf, other]

    cs.CL

    Cross-document Event Identity via Dense Annotation

    Authors: Adithya Pratapa, Zhengzhong Liu, Kimihiro Hasegawa, Linwei Li, Yukari Yamakawa, Shikun Zhang, Teruko Mitamura

    Abstract: In this paper, we study the identity of textual events from different documents. While the complex nature of event identity has been studied previously (Hovy et al., 2013), the case of events across documents is unclear. Prior work on cross-document event coreference has two main drawbacks. First, they restrict the annotations to a limited set of event types. Second, they insufficiently tackle the conce…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: CoNLL 2021 camera-ready

  9. arXiv:2109.03892  [pdf, other]

    cs.CL cs.AI cs.LG

    Retrieve, Caption, Generate: Visual Grounding for Enhancing Commonsense in Text Generation Models

    Authors: Steven Y. Feng, Kevin Lu, Zhuofu Tao, Malihe Alikhani, Teruko Mitamura, Eduard Hovy, Varun Gangal

    Abstract: We investigate the use of multimodal information contained in images as an effective method for enhancing the commonsense of Transformer models for text generation. We perform experiments using BART and T5 on concept-to-text generation, specifically the task of generative commonsense reasoning, or CommonGen. We call our approach VisCTG: Visually Grounded Concept-to-Text Generation. VisCTG involves…

    Submitted 25 March, 2022; v1 submitted 8 September, 2021; originally announced September 2021.

    Comments: Accepted to AAAI 2022. Code at https://github.com/styfeng/VisCTG

  10. arXiv:2105.03075  [pdf, other]

    cs.CL cs.AI cs.LG

    A Survey of Data Augmentation Approaches for NLP

    Authors: Steven Y. Feng, Varun Gangal, Jason Wei, Sarath Chandar, Soroush Vosoughi, Teruko Mitamura, Eduard Hovy

    Abstract: Data augmentation has recently seen increased interest in NLP due to more work in low-resource domains, new tasks, and the popularity of large-scale neural networks that require large amounts of training data. Despite this recent upsurge, this area is still relatively underexplored, perhaps due to the challenges posed by the discrete nature of language data. In this paper, we present a comprehensi…

    Submitted 1 December, 2021; v1 submitted 7 May, 2021; originally announced May 2021.

    Comments: Accepted to ACL 2021 Findings. GitHub repo with paper list at https://github.com/styfeng/DataAug4NLP ; Talk at https://www.youtube.com/watch?v=kNBVesKUZCk&ab_channel=StevenFeng ; Podcast at https://www.youtube.com/watch?v=qmqyT_97Poc&ab_channel=GradientFlow and https://thedataexchange.media/data-augmentation-in-natural-language-processing

  11. arXiv:2104.06669  [pdf, other]

    cs.CL cs.AI

    NAREOR: The Narrative Reordering Problem

    Authors: Varun Gangal, Steven Y. Feng, Malihe Alikhani, Teruko Mitamura, Eduard Hovy

    Abstract: Many implicit inferences exist in text depending on how it is structured that can critically impact the text's interpretation and meaning. One such structural aspect present in text with chronology is the order of its presentation. For narratives or stories, this is known as the narrative order. Reordering a narrative can impact the temporal, causal, event-based, and other inferences readers draw…

    Submitted 27 March, 2022; v1 submitted 14 April, 2021; originally announced April 2021.

    Comments: Accepted to AAAI 2022; Code at https://github.com/vgtomahawk/NAREORCamReady

  12. arXiv:2103.01834  [pdf, other]

    cs.CL cs.AI

    A Data-Centric Framework for Composable NLP Workflows

    Authors: Zhengzhong Liu, Guanxiong Ding, Avinash Bukkittu, Mansi Gupta, Pengzhi Gao, Atif Ahmed, Shikun Zhang, Xin Gao, Swapnil Singhavi, Linwei Li, Wei Wei, Zecong Hu, Haoran Shi, Haoying Zhang, Xiaodan Liang, Teruko Mitamura, Eric P. Xing, Zhiting Hu

    Abstract: Empirical natural language processing (NLP) systems in application domains (e.g., healthcare, finance, education) involve interoperation among multiple components, ranging from data ingestion, human annotation, to text retrieval, analysis, generation, and visualization. We establish a unified open-source framework to support fast development of such sophisticated NLP workflows in a composable mann…

    Submitted 1 September, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

    Comments: 8 pages, 4 figures, EMNLP 2020

  13. arXiv:2101.05479  [pdf, other]

    cs.CV cs.LG

    Understanding the Role of Scene Graphs in Visual Question Answering

    Authors: Vinay Damodaran, Sharanya Chakravarthy, Akshay Kumar, Anjana Umapathy, Teruko Mitamura, Yuta Nakashima, Noa Garcia, Chenhui Chu

    Abstract: Visual Question Answering (VQA) is of tremendous interest to the research community with important applications such as aiding visually impaired users and image-based search. In this work, we explore the use of scene graphs for solving the VQA task. We conduct experiments on the GQA dataset which presents a challenging set of questions requiring counting, compositionality and advanced reasoning ca…

    Submitted 16 January, 2021; v1 submitted 14 January, 2021; originally announced January 2021.

  14. arXiv:2010.01794  [pdf, other]

    cs.CL cs.AI cs.LG

    GenAug: Data Augmentation for Finetuning Text Generators

    Authors: Steven Y. Feng, Varun Gangal, Dongyeop Kang, Teruko Mitamura, Eduard Hovy

    Abstract: In this paper, we investigate data augmentation for text generation, which we call GenAug. Text generation and language modeling are important tasks within natural language processing, and are especially challenging for low-data regimes. We propose and evaluate various augmentation methods, including some that incorporate external knowledge, for finetuning GPT-2 on a subset of Yelp Reviews. We als…

    Submitted 10 October, 2020; v1 submitted 5 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020 Deep Learning Inside Out (DeeLIO) Workshop; Code available at https://github.com/styfeng/GenAug

  15. arXiv:2008.12520  [pdf, other]

    cs.CV cs.CL

    A Dataset and Baselines for Visual Question Answering on Art

    Authors: Noa Garcia, Chentao Ye, Zihua Liu, Qingtao Hu, Mayu Otani, Chenhui Chu, Yuta Nakashima, Teruko Mitamura

    Abstract: Answering questions related to art pieces (paintings) is a difficult task, as it implies the understanding of not only the visual information that is shown in the picture, but also the contextual knowledge that is acquired through the study of the history of art. In this work, we introduce our first attempt towards building a new dataset, coined AQUA (Art QUestion Answering). The question-answer (…

    Submitted 28 August, 2020; originally announced August 2020.

  16. arXiv:1907.10136  [pdf, other]

    cs.CL

    Dr.Quad at MEDIQA 2019: Towards Textual Inference and Question Entailment using contextualized representations

    Authors: Vinayshekhar Bannihatti Kumar, Ashwin Srinivasan, Aditi Chaudhary, James Route, Teruko Mitamura, Eric Nyberg

    Abstract: This paper presents the submissions by Team Dr.Quad to the ACL-BioNLP 2019 shared task on Textual Inference and Question Entailment in the Medical Domain. Our system is based on the prior work of Liu et al. (2019), which uses a multi-task objective function for textual entailment. In this work, we explore different strategies for generalizing state-of-the-art language understanding models to the speci…

    Submitted 23 July, 2019; originally announced July 2019.

    Comments: Accepted in ACL challenge MediQA as part of the BioNLP workshop

  17. arXiv:1907.01643  [pdf, other]

    cs.IR cs.CL cs.LG stat.ML

    Pentagon at MEDIQA 2019: Multi-task Learning for Filtering and Re-ranking Answers using Language Inference and Question Entailment

    Authors: Hemant Pugaliya, Karan Saxena, Shefali Garg, Sheetal Shalini, Prashant Gupta, Eric Nyberg, Teruko Mitamura

    Abstract: Parallel deep learning architectures like fine-tuned BERT and MT-DNN have quickly become the state of the art, bypassing previous deep and shallow learning methods by a large margin. More recently, pre-trained models from large related datasets have been able to perform well on many downstream tasks by just fine-tuning on domain-specific datasets. However, using powerful models on non-trivial ta…

    Submitted 1 July, 2019; originally announced July 2019.

  18. arXiv:1902.08899  [pdf, other]

    cs.CL

    The ARIEL-CMU Systems for LoReHLT18

    Authors: Aditi Chaudhary, Siddharth Dalmia, Junjie Hu, Xinjian Li, Austin Matthews, Aldrian Obaja Muis, Naoki Otani, Shruti Rijhwani, Zaid Sheikh, Nidhi Vyas, Xinyi Wang, Jiateng Xie, Ruochen Xu, Chunting Zhou, Peter J. Jansen, Yiming Yang, Lori Levin, Florian Metze, Teruko Mitamura, David R. Mortensen, Graham Neubig, Eduard Hovy, Alan W Black, Jaime Carbonell, Graham V. Horwood, et al. (5 additional authors not shown)

    Abstract: This paper describes the ARIEL-CMU submissions to the Low Resource Human Language Technologies (LoReHLT) 2018 evaluations for the tasks Machine Translation (MT), Entity Discovery and Linking (EDL), and detection of Situation Frames in Text and Speech (SF Text and Speech).

    Submitted 24 February, 2019; originally announced February 2019.

  19. arXiv:1809.00647  [pdf, other]

    cs.CL cs.AI

    Automatic Event Salience Identification

    Authors: Zhengzhong Liu, Chenyan Xiong, Teruko Mitamura, Eduard Hovy

    Abstract: Identifying the salience (i.e. importance) of discourse units is an important task in language understanding. While events play important roles in text documents, little research exists on analyzing their saliency status. This paper empirically studies the Event Salience task and proposes two salience detection models based on content similarities and discourse relations. The first is a feature ba…

    Submitted 3 September, 2018; originally announced September 2018.

    Comments: EMNLP 2018, 11 pages. Datasets, models and codes: https://github.com/hunterhector/EventSalience

  20. arXiv:1806.05099  [pdf, other]

    cs.CL

    Graph-Based Decoding for Event Sequencing and Coreference Resolution

    Authors: Zhengzhong Liu, Teruko Mitamura, Eduard Hovy

    Abstract: Events in text documents are interrelated in complex ways. In this paper, we study two types of relation: Event Coreference and Event Sequencing. We show that the popular tree-like decoding structure for automated Event Coreference is not suitable for Event Sequencing. To this end, we propose a graph-based decoding algorithm that is applicable to both tasks. The new decoding algorithm supports fle…

    Submitted 13 June, 2018; originally announced June 2018.

    Comments: 13 pages. COLING 2018