Showing 1–50 of 180 results for author: Choi, E

  1. arXiv:2407.06249  [pdf, other]

    cs.CL cs.SE

    CodeUpdateArena: Benchmarking Knowledge Editing on API Updates

    Authors: Zeyu Leo Liu, Shrey Pandit, Xi Ye, Eunsol Choi, Greg Durrett

    Abstract: Large language models (LLMs) are increasingly being used to synthesize and reason about source code. However, the static nature of these models' knowledge does not reflect the fact that libraries and API functions they invoke are continuously evolving, with functionality being added or changing. While numerous benchmarks evaluate how LLMs can generate code, no prior work has studied how an LLM's k…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Under Review

  2. arXiv:2406.19188  [pdf, other]

    cs.LG

    Averaging log-likelihoods in direct alignment

    Authors: Nathan Grinsztajn, Yannis Flet-Berliac, Mohammad Gheshlaghi Azar, Florian Strub, Bill Wu, Eugene Choi, Chris Cremer, Arash Ahmadian, Yash Chandak, Olivier Pietquin, Matthieu Geist

    Abstract: To better align Large Language Models (LLMs) with human judgment, Reinforcement Learning from Human Feedback (RLHF) learns a reward model and then optimizes it using regularized RL. Recently, direct alignment methods were introduced to learn such a fine-tuned model directly from a preference dataset without computing a proxy reward function. These methods are built upon contrastive losses involvin…

    Submitted 27 June, 2024; originally announced June 2024.

  3. arXiv:2406.19185  [pdf, other]

    cs.LG

    Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion

    Authors: Yannis Flet-Berliac, Nathan Grinsztajn, Florian Strub, Eugene Choi, Chris Cremer, Arash Ahmadian, Yash Chandak, Mohammad Gheshlaghi Azar, Olivier Pietquin, Matthieu Geist

    Abstract: Reinforcement Learning (RL) has been used to finetune Large Language Models (LLMs) using a reward model trained from preference data, to better align with human judgment. The recently introduced direct alignment methods, which are often simpler, more stable, and computationally lighter, can more directly achieve this. However, these approaches cannot optimize arbitrary rewards, and the preference-…

    Submitted 27 June, 2024; originally announced June 2024.

  4. arXiv:2406.17761  [pdf, other]

    cs.CL cs.AI cs.LG

    CaLMQA: Exploring culturally specific long-form question answering across 23 languages

    Authors: Shane Arora, Marzena Karpinska, Hung-Ting Chen, Ipsita Bhattacharjee, Mohit Iyyer, Eunsol Choi

    Abstract: Large language models (LLMs) are used for long-form question answering (LFQA), which requires them to generate paragraph-length answers to complex questions. While LFQA has been well-studied in English, this research has not been extended to other languages. To bridge this gap, we introduce CaLMQA, a collection of 1.5K complex culturally specific questions spanning 23 languages and 51 culturally a…

    Submitted 3 July, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

    Comments: 39 pages, 17 figures. Code and data available at https://github.com/2015aroras/CaLMQA. Revised argument in section 4, results unchanged

  5. arXiv:2406.17692  [pdf, other]

    cs.CL cs.LG

    From Distributional to Overton Pluralism: Investigating Large Language Model Alignment

    Authors: Thom Lake, Eunsol Choi, Greg Durrett

    Abstract: The alignment process changes several properties of a large language model's (LLM's) output distribution. We analyze two aspects of post-alignment distributional shift of LLM responses. First, we re-examine previously reported reductions in response diversity post-alignment. Our analysis suggests that an apparent drop in the diversity of responses is largely explained by quality control and inform…

    Submitted 25 June, 2024; originally announced June 2024.

  6. arXiv:2406.16341  [pdf, other]

    cs.CL

    EHRCon: Dataset for Checking Consistency between Unstructured Notes and Structured Tables in Electronic Health Records

    Authors: Yeonsu Kwon, Jiho Kim, Gyubok Lee, Seongsu Bae, Daeun Kyung, Wonchul Cha, Tom Pollard, Alistair Johnson, Edward Choi

    Abstract: Electronic Health Records (EHRs) are integral for storing comprehensive patient medical records, combining structured data (e.g., medications) with detailed clinical notes (e.g., physician notes). These elements are essential for straightforward data retrieval and provide deep, contextual insights into patient care. However, they often suffer from discrepancies due to unintuitive EHR system design…

    Submitted 24 June, 2024; originally announced June 2024.

  7. arXiv:2406.14670  [pdf, other]

    cs.CL cs.AI cs.LG

    Exploring Design Choices for Building Language-Specific LLMs

    Authors: Atula Tejaswi, Nilesh Gupta, Eunsol Choi

    Abstract: Despite rapid progress in large language models (LLMs), their performance on a vast majority of languages remains unsatisfactory. In this paper, we study building language-specific LLMs by adapting monolingual and multilingual LLMs. We conduct systematic experiments on how design choices (base model selection, vocabulary extension, and continued fine-tuning) impact the adapted LLM, both in terms of…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 15 pages, 6 figures, 11 tables

  8. arXiv:2406.13144  [pdf, other]

    cs.CL cs.AI

    DialSim: A Real-Time Simulator for Evaluating Long-Term Dialogue Understanding of Conversational Agents

    Authors: Jiho Kim, Woosog Chay, Hyeonji Hwang, Daeun Kyung, Hyunseung Chung, Eunbyeol Cho, Yohan Jo, Edward Choi

    Abstract: Recent advancements in Large Language Models (LLMs) have significantly enhanced the capabilities of conversational agents, making them applicable to various fields (e.g., education). Despite their progress, the evaluation of the agents often overlooks the complexities of real-world conversations, such as real-time interactions, multi-party dialogues, and extended contextual dependencies. To bridge…

    Submitted 18 June, 2024; originally announced June 2024.

  9. arXiv:2406.01660  [pdf, other]

    cs.LG cs.AI stat.ML

    Self-Improving Robust Preference Optimization

    Authors: Eugene Choi, Arash Ahmadian, Matthieu Geist, Olivier Pietquin, Mohammad Gheshlaghi Azar

    Abstract: Both online and offline RLHF methods such as PPO and DPO have been extremely successful in aligning AI with human preferences. Despite their success, the existing methods suffer from a fundamental problem that their optimal solution is highly task-dependent (i.e., not robust to out-of-distribution (OOD) tasks). Here we address this challenge by proposing Self-Improving Robust Preference Optimizati…

    Submitted 7 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  10. arXiv:2406.00019  [pdf, other]

    cs.CL cs.AI cs.DB cs.IR

    EHR-SeqSQL: A Sequential Text-to-SQL Dataset For Interactively Exploring Electronic Health Records

    Authors: Jaehee Ryu, Seonhee Cho, Gyubok Lee, Edward Choi

    Abstract: In this paper, we introduce EHR-SeqSQL, a novel sequential text-to-SQL dataset for Electronic Health Record (EHR) databases. EHR-SeqSQL is designed to address critical yet underexplored aspects in text-to-SQL parsing: interactivity, compositionality, and efficiency. To the best of our knowledge, EHR-SeqSQL is not only the largest but also the first medical text-to-SQL dataset benchmark to include…

    Submitted 23 May, 2024; originally announced June 2024.

    Comments: ACL 2024 (Findings)

  11. arXiv:2405.19597  [pdf, other]

    cs.LG cs.AI cs.CL

    SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors

    Authors: Vijay Lingam, Atula Tejaswi, Aditya Vavre, Aneesh Shetty, Gautham Krishna Gudur, Joydeep Ghosh, Alex Dimakis, Eunsol Choi, Aleksandar Bojchevski, Sujay Sanghavi

    Abstract: Popular parameter-efficient fine-tuning (PEFT) methods, such as LoRA and its variants, freeze pre-trained model weights \(W\) and inject learnable matrices \(ΔW\). These \(ΔW\) matrices are structured for efficient parameterization, often using techniques like low-rank approximations or scaling vectors. However, these methods typically show a performance gap compared to full fine-tuning. Although…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 17 pages, 5 figures, 14 tables

  12. arXiv:2405.11855  [pdf, other]

    cs.RO

    Salience-guided Ground Factor for Robust Localization of Delivery Robots in Complex Urban Environments

    Authors: Jooyong Park, Jungwoo Lee, Euncheol Choi, Younggun Cho

    Abstract: In urban environments for delivery robots, particularly in areas such as campuses and towns, many custom features defy standard road semantic categorizations. Addressing this challenge, our paper introduces a method leveraging Salient Object Detection (SOD) to extract these unique features, employing them as pivotal factors for enhanced robot loop closure and localization. Traditional geometric fe…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 8 pages, 9 figures, 2024 IEEE International Conference on Robotics and Automation (ICRA 2024)

  13. arXiv:2405.06673  [pdf, other]

    cs.CL cs.AI

    Overview of the EHRSQL 2024 Shared Task on Reliable Text-to-SQL Modeling on Electronic Health Records

    Authors: Gyubok Lee, Sunjun Kweon, Seongsu Bae, Edward Choi

    Abstract: Electronic Health Records (EHRs) are relational databases that store the entire medical histories of patients within hospitals. They record numerous aspects of patients' medical care, from hospital admission and diagnosis to treatment and discharge. While EHRs are vital sources of clinical data, exploring them beyond a predefined set of queries requires skills in query languages like SQL. To make…

    Submitted 23 May, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

    Comments: The 6th Clinical Natural Language Processing Workshop at NAACL 2024; Minor Change from Camera-Ready

  14. arXiv:2405.01588  [pdf, other]

    cs.CL cs.AI

    Towards Unbiased Evaluation of Detecting Unanswerable Questions in EHRSQL

    Authors: Yongjin Yang, Sihyeon Kim, SangMook Kim, Gyubok Lee, Se-Young Yun, Edward Choi

    Abstract: Incorporating unanswerable questions into EHR QA systems is crucial for testing the trustworthiness of a system, as providing non-existent responses can mislead doctors in their diagnoses. The EHRSQL dataset stands out as a promising benchmark because it is the only dataset that incorporates unanswerable questions in the EHR QA system alongside practical questions. However, in this work, we identi…

    Submitted 28 April, 2024; originally announced May 2024.

    Comments: DPFM Workshop, ICLR 2024

  15. arXiv:2404.13318  [pdf, other]

    cs.LG

    EHRFL: Federated Learning Framework for Heterogeneous EHRs and Precision-guided Selection of Participating Clients

    Authors: Jiyoun Kim, Junu Kim, Kyunghoon Hur, Edward Choi

    Abstract: In this study, we provide solutions to two practical yet overlooked scenarios in federated learning for electronic health records (EHRs): firstly, we introduce EHRFL, a framework that facilitates federated learning across healthcare institutions with distinct medical coding systems and database schemas using text-based linearization of EHRs. Secondly, we focus on a scenario where a single healthca…

    Submitted 20 April, 2024; originally announced April 2024.

  16. arXiv:2404.13272  [pdf, other]

    cs.HC

    DinAR: Augmenting Reality for Sustainable Dining

    Authors: MJ Johns, Eunsol Sol Choi, Derusha Baskaran

    Abstract: Sustainable food is among the many challenges associated with climate change. The resources required to grow or gather the food and the distance it travels to reach the consumer are two key factors of an ingredient's sustainability. Food that is grown locally and is currently "in-season" will have a lower carbon footprint, but when dining out these details unfortunately may not affect one's orderi…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: Presented at CHI 2024 (arXiv:2404.05889), 5 pages, and 4 figures

    Report number: ARSJ/2024/10

  17. arXiv:2404.12447  [pdf, other]

    cs.CL

    AmbigDocs: Reasoning across Documents on Different Entities under the Same Name

    Authors: Yoonsang Lee, Xi Ye, Eunsol Choi

    Abstract: Different entities with the same name can be difficult to distinguish. Handling confusing entity mentions is a crucial skill for language models (LMs). For example, given the question "Where was Michael Jordan educated?" and a set of documents discussing different people named Michael Jordan, can LMs distinguish entity mentions to generate a cohesive answer to the question? To test this ability, w…

    Submitted 26 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

  18. arXiv:2404.02581  [pdf, other]

    cs.CL cs.IR

    Multi-Granularity Guided Fusion-in-Decoder

    Authors: Eunseong Choi, Hyeri Lee, Jongwuk Lee

    Abstract: In Open-domain Question Answering (ODQA), it is essential to discern relevant contexts as evidence and avoid spurious ones among retrieved results. The model architecture that uses concatenated multiple contexts in the decoding phase, i.e., Fusion-in-Decoder, demonstrates promising performance but generates incorrect outputs from seemingly plausible contexts. To address this problem, we propose th…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: Findings of the Association for Computational Linguistics: NAACL 2024; 12 pages; 8 figures and 5 tables. Code and data available at http://github.com/eunseongc/MGFiD

  19. Contextual AI Journaling: Integrating LLM and Time Series Behavioral Sensing Technology to Promote Self-Reflection and Well-being using the MindScape App

    Authors: Subigya Nepal, Arvind Pillai, William Campbell, Talie Massachi, Eunsol Soul Choi, Orson Xu, Joanna Kuc, Jeremy Huckins, Jason Holden, Colin Depp, Nicholas Jacobson, Mary Czerwinski, Eric Granholm, Andrew T. Campbell

    Abstract: MindScape aims to study the benefits of integrating time series behavioral patterns (e.g., conversational engagement, sleep, location) with Large Language Models (LLMs) to create a new form of contextual AI journaling, promoting self-reflection and well-being. We argue that integrating behavioral sensing in LLMs will likely lead to a new frontier in AI. In this Late-Breaking Work paper, we discuss…

    Submitted 30 March, 2024; originally announced April 2024.

    ACM Class: H.5.0; H.5.3; H.5.m; J.0

  20. arXiv:2403.15879  [pdf, other]

    cs.AI

    TrustSQL: Benchmarking Text-to-SQL Reliability with Penalty-Based Scoring

    Authors: Gyubok Lee, Woosog Chay, Seonhee Cho, Edward Choi

    Abstract: Text-to-SQL enables users to interact with databases using natural language, simplifying the retrieval and synthesis of information. Despite the remarkable success of large language models (LLMs) in translating natural language questions into SQL queries, widespread deployment remains limited due to two primary challenges. First, the effective use of text-to-SQL models depends on users' understand…

    Submitted 2 July, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

    Comments: under review

  21. arXiv:2403.12481  [pdf, other]

    cs.LG cs.CV

    TT-BLIP: Enhancing Fake News Detection Using BLIP and Tri-Transformer

    Authors: Eunjee Choi, Jong-Kook Kim

    Abstract: Detecting fake news has received a lot of attention. Many previous methods concatenate independently encoded unimodal data, ignoring the benefits of integrated multimodal information. Also, the absence of specialized feature extraction for text and images further limits these methods. This paper introduces an end-to-end model called TT-BLIP that applies the bootstrapping language-image pretraining…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 8 pages, submitted to conference

  22. arXiv:2403.06537  [pdf, other]

    cs.CL

    On the Consideration of AI Openness: Can Good Intent Be Abused?

    Authors: Yeeun Kim, Eunkyung Choi, Hyunjun Kim, Hongseok Oh, Hyunseo Shin, Wonseok Hwang

    Abstract: Openness is critical for the advancement of science. In particular, recent rapid progress in AI has been made possible only by various open-source models, datasets, and libraries. However, this openness also means that technologies can be freely used for socially harmful purposes. Can open-source models or datasets be used for malicious purposes? If so, how easy is it to adapt technology for such…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: 10 pages

  23. arXiv:2403.03866  [pdf, other]

    cs.CL

    KIWI: A Dataset of Knowledge-Intensive Writing Instructions for Answering Research Questions

    Authors: Fangyuan Xu, Kyle Lo, Luca Soldaini, Bailey Kuehl, Eunsol Choi, David Wadden

    Abstract: Large language models (LLMs) adapted to follow user instructions are now widely deployed as conversational agents. In this work, we examine one increasingly common instruction-following task: providing writing assistance to compose a long-form answer. To evaluate the capabilities of current LLMs on this task, we construct KIWI, a dataset of knowledge-intensive writing instructions in the scientifi…

    Submitted 6 March, 2024; originally announced March 2024.

  24. arXiv:2403.01628  [pdf, ps, other]

    cs.LG

    Recent Advances, Applications, and Open Challenges in Machine Learning for Health: Reflections from Research Roundtables at ML4H 2023 Symposium

    Authors: Hyewon Jeong, Sarah Jabbour, Yuzhe Yang, Rahul Thapta, Hussein Mozannar, William Jongwon Han, Nikita Mehandru, Michael Wornow, Vladislav Lialin, Xin Liu, Alejandro Lozano, Jiacheng Zhu, Rafal Dariusz Kocielnik, Keith Harrigian, Haoran Zhang, Edward Lee, Milos Vukadinovic, Aparna Balagopalan, Vincent Jeanselme, Katherine Matton, Ilker Demirel, Jason Fries, Parisa Rashidi, Brett Beaulieu-Jones, Xuhai Orson Xu, et al. (18 additional authors not shown)

    Abstract: The third ML4H symposium was held in person on December 10, 2023, in New Orleans, Louisiana, USA. The symposium included research roundtable sessions to foster discussions between participants and senior researchers on timely and relevant topics for the ML4H community. Encouraged by the successful virtual roundtables in the previous year, we organized eleven in-person roundtables and four vir…

    Submitted 5 April, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: ML4H 2023, Research Roundtables

  25. arXiv:2403.01469  [pdf, other]

    cs.CL

    KorMedMCQA: Multi-Choice Question Answering Benchmark for Korean Healthcare Professional Licensing Examinations

    Authors: Sunjun Kweon, Byungjin Choi, Minkyu Kim, Rae Woong Park, Edward Choi

    Abstract: We introduce KorMedMCQA, the first Korean multiple-choice question answering (MCQA) benchmark derived from Korean healthcare professional licensing examinations, covering the years 2012 to 2023. This dataset consists of a selection of questions from the license examinations for doctors, nurses, and pharmacists, featuring a diverse array of subjects. We conduct baseline experiments on vari…

    Submitted 5 March, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

  26. arXiv:2402.16040  [pdf, other]

    cs.CL

    EHRNoteQA: An LLM Benchmark for Real-World Clinical Practice Using Discharge Summaries

    Authors: Sunjun Kweon, Jiyoun Kim, Heeyoung Kwak, Dongchul Cha, Hangyul Yoon, Kwanghyun Kim, Jeewon Yang, Seunghyun Won, Edward Choi

    Abstract: Discharge summaries in Electronic Health Records (EHRs) are crucial for clinical decision-making, but their length and complexity make information extraction challenging, especially when dealing with accumulated summaries across multiple patient admissions. Large Language Models (LLMs) show promise in addressing this challenge by efficiently analyzing vast and complex data. Existing benchmarks, ho…

    Submitted 27 June, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

    Comments: Under Review

  27. arXiv:2402.15838  [pdf, other]

    cs.IR

    ListT5: Listwise Reranking with Fusion-in-Decoder Improves Zero-shot Retrieval

    Authors: Soyoung Yoon, Eunbi Choi, Jiyeon Kim, Hyeongu Yun, Yireun Kim, Seung-won Hwang

    Abstract: We propose ListT5, a novel reranking approach based on Fusion-in-Decoder (FiD) that handles multiple candidate passages at both train and inference time. We also introduce an efficient inference framework for listwise ranking based on m-ary tournament sort with output caching. We evaluate and compare our model on the BEIR benchmark for zero-shot retrieval task, demonstrating that ListT5 (1) outper…

    Submitted 6 June, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

    Comments: Accepted to ACL 2024 main (long)

  28. arXiv:2402.15096  [pdf, other]

    cs.LG cs.CV cs.MM

    Multimodal Transformer With a Low-Computational-Cost Guarantee

    Authors: Sungjin Park, Edward Choi

    Abstract: Transformer-based models have significantly improved performance across a range of multimodal understanding tasks, such as visual question answering and action recognition. However, multimodal Transformers significantly suffer from a quadratic complexity of the multi-head attention with the input sequence length, especially as the number of modalities increases. To address this, we introduce Low-C…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: Accepted to ICASSP 2024 (5 pages)

  29. arXiv:2402.13605  [pdf, other]

    cs.CL

    KorNAT: LLM Alignment Benchmark for Korean Social Values and Common Knowledge

    Authors: Jiyoung Lee, Minwoo Kim, Seungho Kim, Junghwan Kim, Seunghyun Won, Hwaran Lee, Edward Choi

    Abstract: For Large Language Models (LLMs) to be effectively deployed in a specific country, they must possess an understanding of the nation's culture and basic knowledge. To this end, we introduce National Alignment, which measures an alignment between an LLM and a targeted country from two aspects: social value alignment and common knowledge alignment. Social value alignment evaluates how well the model…

    Submitted 5 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: Accepted at ACL 2024 Findings (35 pages, 7 figures, 16 tables)

  30. arXiv:2402.05904  [pdf, other]

    cs.CL cs.CY cs.HC cs.SI

    FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs

    Authors: Eun Cheol Choi, Emilio Ferrara

    Abstract: Our society is facing rampant misinformation harming public health and trust. To address the societal challenge, we introduce FACT-GPT, a system leveraging Large Language Models (LLMs) to automate the claim matching stage of fact-checking. FACT-GPT, trained on a synthetic dataset, identifies social media content that aligns with, contradicts, or is irrelevant to previously debunked claims. Our eva…

    Submitted 8 February, 2024; originally announced February 2024.

  31. arXiv:2402.02023  [pdf, other]

    cs.LG cs.AI

    Self-Supervised Contrastive Learning for Long-term Forecasting

    Authors: Junwoo Park, Daehoon Gwak, Jaegul Choo, Edward Choi

    Abstract: Long-term forecasting presents unique challenges due to the time and memory complexity of handling long sequences. Existing methods, which rely on sliding windows to process long sequences, struggle to effectively capture long-term variations that are partially caught within the short window (i.e., outer-window variations). In this paper, we introduce a novel approach that overcomes this limitatio…

    Submitted 24 March, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: Accepted at International Conference on Learning Representations (ICLR) 2024

  32. arXiv:2402.01591  [pdf, other]

    eess.AS cs.AI cs.CL cs.SD

    BAT: Learning to Reason about Spatial Sounds with Large Language Models

    Authors: Zhisheng Zheng, Puyuan Peng, Ziyang Ma, Xie Chen, Eunsol Choi, David Harwath

    Abstract: Spatial sound reasoning is a fundamental human skill, enabling us to navigate and interpret our surroundings based on sound. In this paper we present BAT, which combines the spatial sound perception ability of a binaural acoustic scene analysis model with the natural language reasoning capabilities of a large language model (LLM) to replicate this innate ability. To address the lack of existing da…

    Submitted 25 May, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: Accepted to ICML 2024. Our demo, dataset, code and model weights are available at: https://zhishengzheng.com/BAT

  33. arXiv:2401.14107  [pdf, other]

    cs.LG eess.SP

    Learning under Label Noise through Few-Shot Human-in-the-Loop Refinement

    Authors: Aaqib Saeed, Dimitris Spathis, Jungwoo Oh, Edward Choi, Ali Etemad

    Abstract: Wearable technologies enable continuous monitoring of various health metrics, such as physical activity, heart rate, sleep, and stress levels. A key challenge with wearable data is obtaining quality labels. Unlike modalities like video where the videos themselves can be effectively used to label objects or events, wearable data do not contain obvious cues about the physical manifestation of the us…

    Submitted 25 January, 2024; originally announced January 2024.

  34. arXiv:2401.03835  [pdf, other]

    cs.CV eess.IV

    Limitations of Data-Driven Spectral Reconstruction -- Optics-Aware Analysis and Mitigation

    Authors: Qiang Fu, Matheus Souza, Eunsue Choi, Suhyun Shin, Seung-Hwan Baek, Wolfgang Heidrich

    Abstract: Hyperspectral imaging empowers machine vision systems with the distinct capability of identifying materials through recording their spectral signatures. Recent efforts in data-driven spectral reconstruction aim at extracting spectral information from RGB images captured by cost-effective RGB cameras, instead of dedicated hardware. In this paper we systematically analyze the performance of such m…

    Submitted 2 April, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

    Comments: 12 pages, 7 figures, 8 tables

  35. arXiv:2312.09424  [pdf, other]

    cs.CL cs.AI

    Open Domain Knowledge Extraction for Knowledge Graphs

    Authors: Kun Qian, Anton Belyi, Fei Wu, Samira Khorshidi, Azadeh Nikfarjam, Rahul Khot, Yisi Sang, Katherine Luna, Xianqi Chu, Eric Choi, Yash Govind, Chloe Seivwright, Yiwen Sun, Ahmed Fakhry, Theo Rekatsinas, Ihab Ilyas, Xiaoguang Qi, Yunyao Li

    Abstract: The quality of a knowledge graph directly impacts the quality of downstream applications (e.g. the number of answerable questions using the graph). One ongoing challenge when building a knowledge graph is to ensure completeness and freshness of the graph's entities and facts. In this paper, we introduce ODKE, a scalable and extensible framework that sources high-quality entities and facts from ope…

    Submitted 30 October, 2023; originally announced December 2023.

    Comments: 7 pages, 7 figures, 5 tables, preprint technical report, no code or data is released

    MSC Class: 68T30 (primary)

    ACM Class: F.4.1; I.2.4

  36. arXiv:2311.17396  [pdf, other]

    cs.CV eess.IV

    Spectral and Polarization Vision: Spectro-polarimetric Real-world Dataset

    Authors: Yujin Jeon, Eunsue Choi, Youngchan Kim, Yunseong Moon, Khalid Omer, Felix Heide, Seung-Hwan Baek

    Abstract: Image datasets are essential not only in validating existing methods in computer vision but also in developing new methods. Most existing image datasets focus on trichromatic intensity images to mimic human vision. However, polarization and spectrum, the wave properties of light that animals in harsh environments and with limited brain capacity often rely on, remain underrepresented in existing da…

    Submitted 30 November, 2023; v1 submitted 29 November, 2023; originally announced November 2023.

  37. arXiv:2311.09579  [pdf, other]

    cs.CL

    Crafting In-context Examples according to LMs' Parametric Knowledge

    Authors: Yoonsang Lee, Pranav Atreya, Xi Ye, Eunsol Choi

    Abstract: In-context learning can improve performance on knowledge-rich tasks such as question answering. In such scenarios, in-context examples trigger a language model (LM) to surface information stored in its parametric knowledge. We study how to better construct in-context example sets, based on whether the model is aware of the in-context examples. We identify 'known' examples, where models can co…

    Submitted 3 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

  38. arXiv:2311.09469  [pdf, other]

    cs.CL

    Clarify When Necessary: Resolving Ambiguity Through Interaction with LMs

    Authors: Michael J. Q. Zhang, Eunsol Choi

    Abstract: Resolving ambiguities through interaction is a hallmark of natural language, and modeling this behavior is a core challenge in crafting AI assistants. In this work, we study such behavior in LMs by proposing a task-agnostic framework for resolving ambiguity by asking users clarifying questions. Our framework breaks down this objective into three subtasks: (1) determining when clarification is need…

    Submitted 15 November, 2023; originally announced November 2023.

  39. arXiv:2310.20204  [pdf, other]

    cs.LG cs.CL

    General-Purpose Retrieval-Enhanced Medical Prediction Model Using Near-Infinite History

    Authors: Junu Kim, Chaeeun Shim, Bosco Seong Kyu Yang, Chami Im, Sung Yoon Lim, Han-Gil Jeong, Edward Choi

    Abstract: Developing clinical prediction models (e.g., mortality prediction) based on electronic health records (EHRs) typically relies on expert opinion for feature selection and adjusting observation window size. This burdens experts and creates a bottleneck in the development process. We propose Retrieval-Enhanced Medical prediction model (REMed) to address such challenges. REMed can essentially evaluate…

    Submitted 20 March, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

    Comments: The source codes corresponding to this paper are available at: https://github.com/starmpcc/REMed

  40. arXiv:2310.19276  [pdf, other]

    hep-th cs.LG math-ph math.AG

    Machine Learning Regularization for the Minimum Volume Formula of Toric Calabi-Yau 3-folds

    Authors: Eugene Choi, Rak-Kyeong Seong

    Abstract: We present a collection of explicit formulas for the minimum volume of Sasaki-Einstein 5-manifolds. The cone over these 5-manifolds is a toric Calabi-Yau 3-fold. These toric Calabi-Yau 3-folds are associated with an infinite class of 4d N=1 supersymmetric gauge theories, which are realized as worldvolume theories of D3-branes probing the toric Calabi-Yau 3-folds. Under the AdS/CFT correspondence,…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: 15 pages, 9 figures, 4 tables

    Report number: UNIST-MTH-23-RS-05

    Journal ref: Phys. Rev. D 109, 046015 (2024)

  41. arXiv:2310.18652  [pdf, other]

    cs.CL cs.AI cs.CV

    EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images

    Authors: Seongsu Bae, Daeun Kyung, Jaehee Ryu, Eunbyeol Cho, Gyubok Lee, Sunjun Kweon, Jungwoo Oh, Lei Ji, Eric I-Chao Chang, Tackeun Kim, Edward Choi

    Abstract: Electronic Health Records (EHRs), which contain patients' medical histories in various multi-modal formats, often overlook the potential for joint reasoning across imaging and table modalities underexplored in current EHR Question Answering (QA) systems. In this paper, we introduce EHRXQA, a novel multi-modal question answering dataset combining structured EHRs and chest X-ray images. To develop o…

    Submitted 25 December, 2023; v1 submitted 28 October, 2023; originally announced October 2023.

    Comments: Accepted at NeurIPS 2023 Datasets and Benchmarks Track (10 pages for main text, 4 pages for references, 39 pages for supplementary materials)

  42. arXiv:2310.12150  [pdf, other]

    cs.CL

    Understanding Retrieval Augmentation for Long-Form Question Answering

    Authors: Hung-Ting Chen, Fangyuan Xu, Shane Arora, Eunsol Choi

    Abstract: We present a study of retrieval-augmented language models (LMs) on long-form question answering. We analyze how retrieval augmentation impacts different LMs, by comparing answers generated from models while using the same evidence documents, and how differing quality of retrieval document set impacts the answers generated from the same LM. We study various attributes of generated answers (e.g., fl…

    Submitted 18 October, 2023; originally announced October 2023.

  43. arXiv:2310.11220  [pdf, other]

    cs.CL

    KG-GPT: A General Framework for Reasoning on Knowledge Graphs Using Large Language Models

    Authors: Jiho Kim, Yeonsu Kwon, Yohan Jo, Edward Choi

    Abstract: While large language models (LLMs) have made considerable advancements in understanding and generating unstructured text, their application in structured data remains underexplored. Particularly, using LLMs for complex reasoning tasks on knowledge graphs (KGs) remains largely untouched. To address this, we propose KG-GPT, a multi-purpose framework leveraging LLMs for tasks employing KGs. KG-GPT co…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 Findings

  44. arXiv:2310.09223  [pdf, other]

    cs.CL cs.CY cs.HC

    Automated Claim Matching with Large Language Models: Empowering Fact-Checkers in the Fight Against Misinformation

    Authors: Eun Cheol Choi, Emilio Ferrara

    Abstract: In today's digital era, the rapid spread of misinformation poses threats to public well-being and societal trust. As online misinformation proliferates, manual verification by fact checkers becomes increasingly challenging. We introduce FACT-GPT (Fact-checking Augmentation with Claim matching Task-oriented Generative Pre-trained Transformer), a framework designed to automate the claim matching pha…

    Submitted 13 October, 2023; originally announced October 2023.

  45. arXiv:2310.04408  [pdf, other]

    cs.CL

    RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation

    Authors: Fangyuan Xu, Weijia Shi, Eunsol Choi

    Abstract: Retrieving documents and prepending them in-context at inference time improves performance of language models (LMs) on a wide range of tasks. However, these documents, often spanning hundreds of words, make inference substantially more expensive. We propose compressing the retrieved documents into textual summaries prior to in-context integration. This not only reduces the computational costs but a…

    Submitted 6 October, 2023; originally announced October 2023.

  46. Forgetting-aware Linear Bias for Attentive Knowledge Tracing

    Authors: Yoonjin Im, Eunseong Choi, Heejin Kook, Jongwuk Lee

    Abstract: Knowledge Tracing (KT) aims to track proficiency based on a question-solving history, allowing us to offer a streamlined curriculum. Recent studies actively utilize attention-based mechanisms to capture the correlation between questions and combine it with the learner's characteristics for responses. However, our empirical study shows that existing attention-based KT models neglect the learner's f…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (CIKM'23), 5 pages, 3 figures, 2 tables

  47. arXiv:2309.06248  [pdf, other]

    cs.LG

    Rethinking Evaluation Metric for Probability Estimation Models Using Esports Data

    Authors: Euihyeon Choi, Jooyoung Kim, Wonkyung Lee

    Abstract: Probability estimation models play an important role in various fields, such as weather forecasting, recommendation systems, and sports analysis. Among several models estimating probabilities, it is difficult to evaluate which model gives reliable probabilities since the ground-truth probabilities are not available. The win probability estimation model for esports, which calculates the win probabi…

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: 7 pages

  48. arXiv:2309.00237  [pdf, other]

    cs.CL cs.AI

    Publicly Shareable Clinical Large Language Model Built on Synthetic Clinical Notes

    Authors: Sunjun Kweon, Junu Kim, Jiyoun Kim, Sujeong Im, Eunbyeol Cho, Seongsu Bae, Jungwoo Oh, Gyubok Lee, Jong Hak Moon, Seng Chan You, Seungjin Baek, Chang Hoon Han, Yoon Bin Jung, Yohan Jo, Edward Choi

    Abstract: The development of large language models tailored for handling patients' clinical notes is often hindered by the limited accessibility and usability of these notes due to strict privacy regulations. To address these challenges, we first create synthetic large-scale clinical notes using publicly available case reports extracted from biomedical literature. We then use these synthetic notes to train…

    Submitted 13 June, 2024; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: ACL 2024 (Findings)

  49. arXiv:2308.07407  [pdf, other]

    cs.CL cs.HC

    Development and Evaluation of Three Chatbots for Postpartum Mood and Anxiety Disorders

    Authors: Xuewen Yao, Miriam Mikhelson, S. Craig Watkins, Eunsol Choi, Edison Thomaz, Kaya de Barbaro

    Abstract: In collaboration with Postpartum Support International (PSI), a non-profit organization dedicated to supporting caregivers with postpartum mood and anxiety disorders, we developed three chatbots to provide context-specific empathetic support to postpartum caregivers, leveraging both rule-based and generative models. We present and evaluate the performance of our chatbots using both machine-based m…

    Submitted 14 August, 2023; originally announced August 2023.

  50. arXiv:2308.02596  [pdf, other]

    physics.soc-ph cond-mat.dis-nn cs.DM stat.CO

    Revisiting small-world network models: Exploring technical realizations and the equivalence of the Newman-Watts and Harary models

    Authors: Seora Son, Eun Ji Choi, Sang Hoon Lee

    Abstract: We address the relatively less known facts on the equivalence and technical realizations surrounding two network models showing the "small-world" property, namely the Newman-Watts and the Harary models. We provide the most accurate (in terms of faithfulness to the original literature) versions of these models to clarify the deviation from them existing in their variants adopted in one of the most…

    Submitted 12 December, 2023; v1 submitted 3 August, 2023; originally announced August 2023.

    Comments: 11 pages, 5 figures, 1 table

    Journal ref: J. Korean Phys. Soc. 83, 879 (2023)