Showing 1–17 of 17 results for author: Iso, H

  1. arXiv:2406.00584  [pdf, other]

    cs.DB cs.AI

    A Blueprint Architecture of Compound AI Systems for Enterprise

    Authors: Eser Kandogan, Sajjadur Rahman, Nikita Bhutani, Dan Zhang, Rafael Li Chen, Kushan Mitra, Sairam Gurajada, Pouya Pezeshkpour, Hayate Iso, Yanlin Feng, Hannah Kim, Chen Shen, Jin Wang, Estevam Hruschka

    Abstract: Large Language Models (LLMs) have showcased remarkable capabilities surpassing conventional NLP challenges, creating opportunities for use in production use cases. Towards this goal, there is a notable shift to building compound AI systems, wherein LLMs are integrated into an expansive software infrastructure with many components like models, retrievers, databases and tools. In this paper, we intr…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: Compound AI Systems Workshop at the Data+AI Summit 2024

  2. arXiv:2404.04399  [pdf, other]

    stat.ML cs.AI cs.LG stat.AP stat.ME

    Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer

    Authors: Toru Shirakawa, Yi Li, Yulun Wu, Sky Qiu, Yuxuan Li, Mingduo Zhao, Hiroyasu Iso, Mark van der Laan

    Abstract: We propose Deep Longitudinal Targeted Minimum Loss-based Estimation (Deep LTMLE), a novel approach to estimate the counterfactual mean of outcome under dynamic treatment policies in longitudinal problem settings. Our approach utilizes a transformer architecture with heterogeneous type embedding trained using temporal-difference learning. After obtaining an initial estimate using the transformer, f…

    Submitted 5 April, 2024; originally announced April 2024.

  3. arXiv:2402.17717  [pdf, other]

    cs.CL

    AmbigNLG: Addressing Task Ambiguity in Instruction for NLG

    Authors: Ayana Niwa, Hayate Iso

    Abstract: In this study, we introduce AmbigNLG, a new task designed to tackle the challenge of task ambiguity in instructions for Natural Language Generation (NLG) tasks. Despite the impressive capabilities of Large Language Models (LLMs) in understanding and executing a wide range of tasks through natural language interaction, their performance is significantly hindered by the ambiguity present in real-wor…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: work in progress

  4. arXiv:2402.13492  [pdf, other]

    cs.CL

    Retrieval Helps or Hurts? A Deeper Dive into the Efficacy of Retrieval Augmentation to Language Models

    Authors: Seiji Maekawa, Hayate Iso, Sairam Gurajada, Nikita Bhutani

    Abstract: While large language models (LMs) demonstrate remarkable performance, they encounter challenges in providing accurate responses when queried for information beyond their pre-trained memorization. Although augmenting them with relevant external information can mitigate these issues, failure to consider the necessity of retrieval may adversely affect overall performance. Previous research has primar…

    Submitted 27 March, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: NAACL2024 (main)

  5. arXiv:2311.06383  [pdf, other]

    cs.CL cs.LG

    Distilling Large Language Models using Skill-Occupation Graph Context for HR-Related Tasks

    Authors: Pouya Pezeshkpour, Hayate Iso, Thom Lake, Nikita Bhutani, Estevam Hruschka

    Abstract: Numerous HR applications are centered around resumes and job descriptions. While they can benefit from advancements in NLP, particularly large language models, their real-world adoption faces challenges due to absence of comprehensive benchmarks for various HR tasks, and lack of smaller models with competitive capabilities. In this paper, we aim to bridge this gap by introducing the Resume-Job Des…

    Submitted 10 November, 2023; originally announced November 2023.

  6. arXiv:2309.11063  [pdf, other]

    cs.CL

    XATU: A Fine-grained Instruction-based Benchmark for Explainable Text Updates

    Authors: Haopeng Zhang, Hayate Iso, Sairam Gurajada, Nikita Bhutani

    Abstract: Text editing is a crucial task of modifying text to better align with user intents. However, existing text editing benchmark datasets contain only coarse-grained instructions and lack explainability, thus resulting in outputs that deviate from the intended changes outlined in the gold reference. To comprehensively investigate the text editing capabilities of large language models (LLMs), this pape…

    Submitted 14 March, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: LREC-COLING 2024

  7. arXiv:2309.07382  [pdf, other]

    cs.CL

    Less is More for Long Document Summary Evaluation by LLMs

    Authors: Yunshu Wu, Hayate Iso, Pouya Pezeshkpour, Nikita Bhutani, Estevam Hruschka

    Abstract: Large Language Models (LLMs) have shown promising performance in summary evaluation tasks, yet they face challenges such as high computational costs and the Lost-in-the-Middle problem where important information in the middle of long documents is often overlooked. To address these issues, this paper introduces a novel approach, Extract-then-Evaluate, which involves extracting key sentences from a…

    Submitted 18 January, 2024; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: EACL (main)

  8. arXiv:2212.10708  [pdf, other]

    cs.CL

    Zero-shot Triplet Extraction by Template Infilling

    Authors: Bosung Kim, Hayate Iso, Nikita Bhutani, Estevam Hruschka, Ndapa Nakashole, Tom Mitchell

    Abstract: The task of triplet extraction aims to extract pairs of entities and their corresponding relations from unstructured text. Most existing methods train an extraction model on training data involving specific target relations, and are incapable of extracting new relations that were not observed at training time. Generalizing the model to unseen relations typically requires fine-tuning on synthetic t…

    Submitted 20 September, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: IJCNLP-AACL 2023 (main)

  9. arXiv:2211.08723  [pdf, other]

    cs.CL

    Noisy Pairing and Partial Supervision for Opinion Summarization

    Authors: Hayate Iso, Xiaolan Wang, Yoshi Suhara

    Abstract: Current opinion summarization systems simply generate summaries reflecting important opinions from customer reviews, but the generated summaries may not attract the reader's attention. Although it is helpful to automatically generate professional reviewer-like summaries from customer reviews, collecting many training pairs of customer and professional reviews is generally tricky. We propose a weak…

    Submitted 16 November, 2022; originally announced November 2022.

  10. arXiv:2211.08387  [pdf, other]

    cs.CL

    AutoTemplate: A Simple Recipe for Lexically Constrained Text Generation

    Authors: Hayate Iso

    Abstract: Lexically constrained text generation is one of the constrained text generation tasks, which aims to generate text that covers all the given constraint lexicons. While the existing approaches tackle this problem using a lexically constrained beam search algorithm or dedicated model using non-autoregressive decoding, there is a trade-off between the generated text quality and the hard constraint sa…

    Submitted 15 November, 2022; originally announced November 2022.

  11. arXiv:2110.07520  [pdf, other]

    cs.CL

    Comparative Opinion Summarization via Collaborative Decoding

    Authors: Hayate Iso, Xiaolan Wang, Stefanos Angelidis, Yoshihiko Suhara

    Abstract: Opinion summarization focuses on generating summaries that reflect popular subjective information expressed in multiple online reviews. While generated summaries offer general and concise information about a particular hotel or product, the information may be insufficient to help the user compare multiple different choices. Thus, the user may still struggle with the question "Which one should I pi…

    Submitted 15 April, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: Findings of ACL 2022

  12. arXiv:2106.07583  [pdf, ps, other]

    cs.CL

    Biomedical Entity Linking with Contrastive Context Matching

    Authors: Shogo Ujiie, Hayate Iso, Eiji Aramaki

    Abstract: We introduce BioCoM, a contrastive learning framework for biomedical entity linking that uses only two resources: a small-sized dictionary and a large number of raw biomedical articles. Specifically, we build the training instances from raw PubMed articles by dictionary matching and use them to train a context-aware entity linking model with contrastive learning. We predict the normalized biomedic…

    Submitted 15 June, 2021; v1 submitted 14 June, 2021; originally announced June 2021.

  13. arXiv:2104.10493  [pdf, other]

    cs.CL

    End-to-end Biomedical Entity Linking with Span-based Dictionary Matching

    Authors: Shogo Ujiie, Hayate Iso, Shuntaro Yada, Shoko Wakamiya, Eiji Aramaki

    Abstract: Disease name recognition and normalization, which is generally called biomedical entity linking, is a fundamental process in biomedical text mining. Recently, neural joint learning of both tasks has been proposed to utilize the mutual benefits. While this approach achieves high performance, disease concepts that do not appear in the training dataset cannot be accurately predicted. This study intro…

    Submitted 21 April, 2021; originally announced April 2021.

  14. arXiv:2104.01371  [pdf, other]

    cs.CL

    Convex Aggregation for Opinion Summarization

    Authors: Hayate Iso, Xiaolan Wang, Yoshihiko Suhara, Stefanos Angelidis, Wang-Chiew Tan

    Abstract: Recent advances in text autoencoders have significantly improved the quality of the latent space, which enables models to generate grammatical and consistent text from aggregated latent vectors. As a successful application of this property, unsupervised opinion summarization models generate a summary by decoding the aggregated latent vectors of inputs. More specifically, they perform the aggregati…

    Submitted 16 November, 2021; v1 submitted 3 April, 2021; originally announced April 2021.

    Comments: Findings of EMNLP 2021

  15. Fact-based Text Editing

    Authors: Hayate Iso, Chao Qiao, Hang Li

    Abstract: We propose a novel text editing task, referred to as fact-based text editing, in which the goal is to revise a given document to better describe the facts in a knowledge base (e.g., several triples). The task is important in practice because reflecting the truth is a common requirement in text editing. First, we propose a method for automatically generating a dataset for research on fact-…

    Submitted 2 July, 2020; originally announced July 2020.

    Comments: ACL 2020

  16. Learning to Select, Track, and Generate for Data-to-Text

    Authors: Hayate Iso, Yui Uehara, Tatsuya Ishigaki, Hiroshi Noji, Eiji Aramaki, Ichiro Kobayashi, Yusuke Miyao, Naoaki Okazaki, Hiroya Takamura

    Abstract: We propose a data-to-text generation model with two modules, one for tracking and the other for text generation. Our tracking module selects and keeps track of salient information and memorizes which record has been mentioned. Our generation module generates a summary conditioned on the state of tracking module. Our model is considered to simulate the human-like writing process that gradually sele…

    Submitted 23 July, 2019; originally announced July 2019.

    Comments: ACL 2019

  17. arXiv:1705.02750  [pdf, other]

    cs.CL

    Density Estimation for Geolocation via Convolutional Mixture Density Network

    Authors: Hayate Iso, Shoko Wakamiya, Eiji Aramaki

    Abstract: Nowadays, geographic information related to Twitter is crucially important for fine-grained applications. However, the amount of geographic information available on Twitter is low, which makes the pursuit of many applications challenging. Under such circumstances, estimating the location of a tweet is an important goal of the study. Unlike most previous studies that estimate the pre-defined dist…

    Submitted 8 May, 2017; originally announced May 2017.

    Comments: 8 pages