
Showing 1–50 of 113 results for author: Seo, M

  1. arXiv:2406.19502  [pdf, other]

    cs.CL cs.AI

    Investigating How Large Language Models Leverage Internal Knowledge to Perform Complex Reasoning

    Authors: Miyoung Ko, Sue Hyun Park, Joonsuk Park, Minjoon Seo

    Abstract: Despite significant advancements, there is a limited understanding of how large language models (LLMs) utilize knowledge for reasoning. To address this, we propose a method that deconstructs complex real-world questions into a graph, representing each question as a node with parent nodes of background knowledge needed to solve the question. We develop the DepthQA dataset, deconstructing questions…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Work in progress; code is available at https://github.com/kaistAI/knowledge-reasoning

  2. arXiv:2406.15275  [pdf, other]

    cs.CL

    Cognitive Map for Language Models: Optimal Planning via Verbally Representing the World Model

    Authors: Doyoung Kim, Jongwon Lee, Jinho Park, Minjoon Seo

    Abstract: Language models have demonstrated impressive capabilities across various natural language processing tasks, yet they struggle with planning tasks requiring multi-step simulations. Inspired by human cognitive processes, this paper investigates the optimal planning power of language models that can construct a cognitive map of a given environment. Our experiments demonstrate that cognitive map signi…

    Submitted 21 June, 2024; originally announced June 2024.

  3. arXiv:2406.11823  [pdf, other]

    cs.CV cs.CL

    On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning

    Authors: Geewook Kim, Minjoon Seo

    Abstract: Recent advancements in language and vision assistants have showcased impressive capabilities but suffer from a lack of transparency, limiting broader research and reproducibility. While open-source models handle general image tasks effectively, they face challenges with the high computational demands of complex visually-situated text understanding. Such tasks often require increased token inputs a…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 17 pages, 8 figures

  4. arXiv:2406.11813  [pdf, other]

    cs.CL

    How Do Large Language Models Acquire Factual Knowledge During Pretraining?

    Authors: Hoyeon Chang, Jinho Park, Seonghyeon Ye, Sohee Yang, Youngkyung Seo, Du-Seong Chang, Minjoon Seo

    Abstract: Despite the recent observation that large language models (LLMs) can store substantial factual knowledge, there is a limited understanding of the mechanisms of how they acquire factual knowledge through pretraining. This work addresses this gap by studying how LLMs acquire factual knowledge during pretraining. The findings reveal several important insights into the dynamics of factual knowledge ac…

    Submitted 17 June, 2024; originally announced June 2024.

    ACM Class: I.2.7

  5. arXiv:2406.05761  [pdf, other]

    cs.CL

    The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

    Authors: Seungone Kim, Juyoung Suk, Ji Yong Cho, Shayne Longpre, Chaeeun Kim, Dongkeun Yoon, Guijin Son, Yejin Cho, Sheikh Shafayat, Jinheon Baek, Sue Hyun Park, Hyeonbin Hwang, Jinkyung Jo, Hyowon Cho, Haebin Shin, Seongyun Lee, Hanseok Oh, Noah Lee, Namgyu Ho, Se June Joo, Miyoung Ko, Yoonjoo Lee, Hyungjoo Chae, Jamin Shin, Joel Jang, et al. (7 additional authors not shown)

    Abstract: As language models (LMs) become capable of handling a wide range of tasks, their evaluation is becoming as challenging as their development. Most generation benchmarks currently assess LMs using abstract evaluation criteria like helpfulness and harmlessness, which often lack the flexibility and granularity of human assessment. Additionally, these benchmarks tend to focus disproportionately on spec…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Work in Progress

  6. arXiv:2406.04678  [pdf, other]

    cs.CV

    ACE Metric: Advection and Convection Evaluation for Accurate Weather Forecasting

    Authors: Doyi Kim, Minseok Seo, Yeji Choi

    Abstract: Recently, data-driven weather forecasting methods have received significant attention for surpassing the RMSE performance of traditional NWP (Numerical Weather Prediction)-based methods. However, data-driven models are tuned to minimize the loss between forecasted data and ground truths, often using pixel-wise loss. This can lead to models that produce blurred outputs, which, despite being signifi…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 9 pages

  7. arXiv:2405.17977  [pdf, other]

    cs.CL

    Aligning to Thousands of Preferences via System Message Generalization

    Authors: Seongyun Lee, Sue Hyun Park, Seungone Kim, Minjoon Seo

    Abstract: Although humans inherently have diverse values, current large language model (LLM) alignment methods often assume that aligning LLMs with the general public's preferences is optimal. A major challenge in adopting a more individualized approach to LLM alignment is its lack of scalability, as it involves repeatedly acquiring preference data and training new reward models and LLMs for each individual…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Work in progress

  8. arXiv:2405.11162  [pdf, other]

    cs.CL

    LG AI Research & KAIST at EHRSQL 2024: Self-Training Large Language Models with Pseudo-Labeled Unanswerable Questions for a Reliable Text-to-SQL System on EHRs

    Authors: Yongrae Jo, Seongyun Lee, Minju Seo, Sung Ju Hwang, Moontae Lee

    Abstract: Text-to-SQL models are pivotal for making Electronic Health Records (EHRs) accessible to healthcare professionals without SQL knowledge. With the advancements in large language models, these systems have become more adept at translating complex questions into SQL queries. Nonetheless, the critical need for reliability in healthcare necessitates these models to accurately identify unanswerable ques…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: NAACL 2024 Clinical NLP Workshop

  9. arXiv:2405.01535  [pdf, other]

    cs.CL

    Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

    Authors: Seungone Kim, Juyoung Suk, Shayne Longpre, Bill Yuchen Lin, Jamin Shin, Sean Welleck, Graham Neubig, Moontae Lee, Kyungjae Lee, Minjoon Seo

    Abstract: Proprietary LMs such as GPT-4 are often employed to assess the quality of responses from various LMs. However, concerns including transparency, controllability, and affordability strongly motivate the development of open-source LMs specialized in evaluations. On the other hand, existing open evaluator LMs exhibit critical shortcomings: 1) they issue scores that significantly diverge from those ass…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: Work in Progress

  10. arXiv:2404.14687  [pdf, other]

    cs.MM cs.AI cs.CL cs.CV

    Pegasus-v1 Technical Report

    Authors: Raehyuk Jung, Hyojun Go, Jaehyuk Yi, Jiho Jang, Daniel Kim, Jay Suh, Aiden Lee, Cooper Han, Jae Lee, Jeff Kim, Jin-Young Kim, Junwan Kim, Kyle Park, Lucas Lee, Mars Ha, Minjoon Seo, Abraham Jo, Ed Park, Hassan Kianinejad, SJ Kim, Tony Moon, Wade Jeong, Andrei Popescu, Esther Kim, EK Yoon, et al. (19 additional authors not shown)

    Abstract: This technical report introduces Pegasus-1, a multimodal language model specialized in video content understanding and interaction through natural language. Pegasus-1 is designed to address the unique challenges posed by video data, such as interpreting spatiotemporal information, to offer nuanced video content comprehension across various lengths. This technical report overviews Pegasus-1's archi…

    Submitted 22 April, 2024; originally announced April 2024.

  11. arXiv:2404.13081  [pdf, other]

    cs.CL cs.AI cs.LG

    SuRe: Summarizing Retrievals using Answer Candidates for Open-domain QA of LLMs

    Authors: Jaehyung Kim, Jaehyun Nam, Sangwoo Mo, Jongjin Park, Sang-Woo Lee, Minjoon Seo, Jung-Woo Ha, Jinwoo Shin

    Abstract: Large language models (LLMs) have made significant advancements in various natural language processing tasks, including question answering (QA) tasks. While incorporating new information with the retrieval of relevant passages is a promising way to improve QA with LLMs, the existing methods often require additional fine-tuning which becomes infeasible with recent LLMs. Augmenting retrieved passage…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: Accepted at ICLR 2024

  12. arXiv:2404.10346  [pdf, other]

    cs.CL

    Self-Explore to Avoid the Pit: Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards

    Authors: Hyeonbin Hwang, Doyoung Kim, Seungone Kim, Seonghyeon Ye, Minjoon Seo

    Abstract: Training on large amounts of rationales (i.e., CoT Fine-tuning) is effective at improving the reasoning capabilities of large language models (LLMs). However, acquiring human-authored rationales or augmenting rationales from proprietary models is costly and not scalable. In this paper, we study the problem of whether LLMs could self-improve their reasoning capabilities. To this end, we propose Sel…

    Submitted 16 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: Preprint Under Review

  13. arXiv:2404.01954  [pdf, other]

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  14. arXiv:2404.01628  [pdf, other]

    cs.CV cs.AI cs.LG

    Learning Equi-angular Representations for Online Continual Learning

    Authors: Minhyuk Seo, Hyunseo Koh, Wonje Jeung, Minjae Lee, San Kim, Hankook Lee, Sungjun Cho, Sungik Choi, Hyunwoo Kim, Jonghyun Choi

    Abstract: Online continual learning suffers from an underfitted solution due to insufficient training for prompt model update (e.g., single-epoch training). To address the challenge, we propose an efficient online continual learning method using the neural collapse phenomenon. In particular, we induce neural collapse to form a simplex equiangular tight frame (ETF) structure in the representation space so th…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  15. arXiv:2403.10853  [pdf, other]

    cs.LG cs.AI cs.CV

    Just Say the Name: Online Continual Learning with Category Names Only via Data Generation

    Authors: Minhyuk Seo, Diganta Misra, Seongwon Cho, Minjae Lee, Jonghyun Choi

    Abstract: In real-world scenarios, extensive manual annotation for continual learning is impractical due to prohibitive costs. Although prior arts, influenced by large-scale webly supervised training, suggest leveraging web-scraped data in continual learning, this poses challenges such as data imbalance, usage restrictions, and privacy concerns. Addressing the risks of continual webly supervised training, w…

    Submitted 30 April, 2024; v1 submitted 16 March, 2024; originally announced March 2024.

  16. arXiv:2403.09024  [pdf, other]

    cs.CL cs.AI

    Semiparametric Token-Sequence Co-Supervision

    Authors: Hyunji Lee, Doyoung Kim, Jihoon Jun, Sejune Joo, Joel Jang, Kyoung-Woon On, Minjoon Seo

    Abstract: In this work, we introduce a semiparametric token-sequence co-supervision training method. It trains a language model by simultaneously leveraging supervision from the traditional next token prediction loss which is calculated over the parametric token embedding space and the next sequence prediction loss which is calculated over the nonparametric sequence embedding space. The nonparametric sequen…

    Submitted 13 March, 2024; originally announced March 2024.

  17. arXiv:2403.07548  [pdf, other]

    cs.AI cs.LG cs.RO

    Online Continual Learning For Interactive Instruction Following Agents

    Authors: Byeonghwi Kim, Minhyuk Seo, Jonghyun Choi

    Abstract: In learning an embodied agent executing daily tasks via language directives, the literature largely assumes that the agent learns all training data at the beginning. We argue that such a learning scenario is less realistic since a robotic agent is supposed to learn the world continuously as it explores and perceives it. To take a step towards a more realistic embodied agent learning scenario, we p…

    Submitted 12 March, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: ICLR 2024 (Project page: https://bhkim94.github.io/projects/CL-ALFRED)

  18. arXiv:2402.14334  [pdf, other]

    cs.CL

    INSTRUCTIR: A Benchmark for Instruction Following of Information Retrieval Models

    Authors: Hanseok Oh, Hyunji Lee, Seonghyeon Ye, Haebin Shin, Hansol Jang, Changwook Jun, Minjoon Seo

    Abstract: Despite the critical need to align search targets with users' intention, retrievers often only prioritize query information without delving into the users' intended search context. Enhancing the capability of retrievers to understand intentions and preferences of users, akin to language model instructions, has the potential to yield more aligned search targets. Prior studies restrict the applicati…

    Submitted 22 February, 2024; originally announced February 2024.

  19. arXiv:2402.13482  [pdf, other]

    cs.CL cs.AI cs.LG

    Retrieval-Augmented Data Augmentation for Low-Resource Domain Tasks

    Authors: Minju Seo, Jinheon Baek, James Thorne, Sung Ju Hwang

    Abstract: Despite large successes of recent language models on diverse tasks, they suffer from severe performance degeneration in low-resource settings with limited training data available. Many existing works tackle this problem by generating synthetic data from the training data and then training models on them, recently using Large Language Models (LLMs). However, in low-resource settings, the amount of…

    Submitted 20 February, 2024; originally announced February 2024.

  20. arXiv:2402.11253  [pdf, other]

    cs.LG cs.AI cs.CL

    Aligning Large Language Models by On-Policy Self-Judgment

    Authors: Sangkyu Lee, Sungdong Kim, Ashkan Yousefpour, Minjoon Seo, Kang Min Yoo, Youngjae Yu

    Abstract: Existing approaches for aligning large language models with human preferences face a trade-off that requires a separate reward model (RM) for on-policy learning. In this paper, we present a novel alignment framework, SELF-JUDGE, that (1) does on-policy learning and (2) is parameter efficient, as it does not require an additional RM for evaluating the samples for on-policy learning. To this end, we p…

    Submitted 25 June, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

    Comments: Published as a main conference paper at ACL 2024

  21. arXiv:2402.03469  [pdf, other]

    cs.LG cs.AI cs.CL

    Rethinking the Role of Proxy Rewards in Language Model Alignment

    Authors: Sungdong Kim, Minjoon Seo

    Abstract: Learning from human feedback via proxy reward modeling has been studied to align Large Language Models (LLMs) with human values. However, achieving reliable training through that proxy reward model (RM) is not a trivial problem, and its behavior has remained a black box. In this paper, we study the role of proxy rewards in the LLM alignment via 'reverse reward engineering' by composing interpretabl…

    Submitted 28 April, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: Under review; Preprint

  22. arXiv:2401.15726  [pdf, other]

    cs.CV

    Long-Term Typhoon Trajectory Prediction: A Physics-Conditioned Approach Without Reanalysis Data

    Authors: Young-Jae Park, Minseok Seo, Doyi Kim, Hyeri Kim, Sanghoon Choi, Beomkyu Choi, Jeongwon Ryu, Sohee Son, Hae-Gon Jeon, Yeji Choi

    Abstract: In the face of escalating climate changes, typhoon intensities and their ensuing damage have surged. Accurate trajectory prediction is crucial for effective damage control. Traditional physics-based models, while comprehensive, are computationally intensive and rely heavily on the expertise of forecasters. Contemporary data-driven methods often rely on reanalysis data, which can be considered to b…

    Submitted 28 January, 2024; originally announced January 2024.

    Comments: This paper was accepted for a Spotlight presentation at ICLR 2024

  23. arXiv:2401.10695  [pdf, other]

    cs.CL

    LangBridge: Multilingual Reasoning Without Multilingual Supervision

    Authors: Dongkeun Yoon, Joel Jang, Sungdong Kim, Seungone Kim, Sheikh Shafayat, Minjoon Seo

    Abstract: We introduce LangBridge, a zero-shot approach to adapt language models for multilingual reasoning tasks without multilingual supervision. LangBridge operates by bridging two models, each specialized in different aspects: (1) one specialized in understanding multiple languages (e.g., mT5 encoder) and (2) one specialized in reasoning (e.g., MetaMath). LangBridge connects the two models by introducin…

    Submitted 3 June, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

    Comments: ACL 2024 Main

  24. arXiv:2401.06591  [pdf, other]

    cs.CL

    Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation

    Authors: Seongyun Lee, Seungone Kim, Sue Hyun Park, Geewook Kim, Minjoon Seo

    Abstract: Assessing long-form responses generated by Vision-Language Models (VLMs) is challenging. It not only requires checking whether the VLM follows the given instruction but also verifying whether the text output is properly grounded on the given image. Inspired by the recent approach of evaluating LMs with LMs, in this work, we propose to evaluate VLMs with VLMs. For this purpose, we present a new fee…

    Submitted 12 January, 2024; originally announced January 2024.

    Comments: Work in progress

  25. arXiv:2312.13947  [pdf, other]

    eess.IV cs.LG math.NA physics.med-ph

    PhysRFANet: Physics-Guided Neural Network for Real-Time Prediction of Thermal Effect During Radiofrequency Ablation Treatment

    Authors: Minwoo Shin, Minjee Seo, Seonaeng Cho, Juil Park, Joon Ho Kwon, Deukhee Lee, Kyungho Yoon

    Abstract: Radiofrequency ablation (RFA) is a widely used minimally invasive technique for ablating solid tumors. Achieving precise personalized treatment necessitates feedback information on in situ thermal effects induced by the RFA procedure. While computer simulation facilitates the prediction of electrical and thermal phenomena associated with RFA, its practical implementation in clinical settings is hi…

    Submitted 21 December, 2023; originally announced December 2023.

  26. arXiv:2312.08684  [pdf, other]

    cs.RO

    Stein-MAP: A Sequential Variational Inference Framework for Maximum A Posteriori Estimation

    Authors: Min-Won Seo, Solmaz S. Kia

    Abstract: State estimation poses substantial challenges in robotics, often involving encounters with multimodality in real-world scenarios. To address these challenges, it is essential to calculate Maximum a posteriori (MAP) sequences from joint probability distributions of latent states and observations over time. However, it generally involves a trade-off between approximation errors and computational com…

    Submitted 16 December, 2023; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: 13 pages

  27. arXiv:2312.02819  [pdf, other]

    cs.CV

    Deterministic Guidance Diffusion Model for Probabilistic Weather Forecasting

    Authors: Donggeun Yoon, Minseok Seo, Doyi Kim, Yeji Choi, Donghyeon Cho

    Abstract: Weather forecasting requires not only accuracy but also the ability to perform probabilistic prediction. However, deterministic weather forecasting methods do not support probabilistic predictions, and conversely, probabilistic models tend to be less accurate. To address these challenges, in this paper, we introduce the Deterministic Guidance D…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: 16 pages

  28. arXiv:2311.11602   

    cs.CV cs.AI

    A Multi-In-Single-Out Network for Video Frame Interpolation without Optical Flow

    Authors: Jaemin Lee, Minseok Seo, Sangwoo Lee, Hyobin Park, Dong-Geol Choi

    Abstract: In general, deep learning-based video frame interpolation (VFI) methods have predominantly focused on estimating motion vectors between two input frames and warping them to the target time. While this approach has shown impressive performance for linear motion between two input frames, it exhibits limitations when dealing with occlusions and nonlinear movements. Recently, generative models have be…

    Submitted 4 December, 2023; v1 submitted 20 November, 2023; originally announced November 2023.

    Comments: Discovering a problem with the manuscript

  29. arXiv:2311.09765  [pdf, other]

    cs.IR cs.AI

    Back to Basics: A Simple Recipe for Improving Out-of-Domain Retrieval in Dense Encoders

    Authors: Hyunji Lee, Luca Soldaini, Arman Cohan, Minjoon Seo, Kyle Lo

    Abstract: Prevailing research practice today often relies on training dense retrievers on existing large datasets such as MSMARCO and then experimenting with ways to improve zero-shot generalization capabilities to unseen domains. While prior work has tackled this challenge through resource-intensive steps such as data augmentation, architectural modifications, increasing model size, or even further base mo…

    Submitted 16 November, 2023; originally announced November 2023.

  30. arXiv:2311.09069  [pdf, other]

    cs.CL cs.AI

    How Well Do Large Language Models Truly Ground?

    Authors: Hyunji Lee, Sejune Joo, Chaeeun Kim, Joel Jang, Doyoung Kim, Kyoung-Woon On, Minjoon Seo

    Abstract: To reduce issues like hallucinations and lack of control in Large Language Models (LLMs), a common method is to generate responses by grounding on external contexts given as input, known as knowledge-augmented models. However, previous research often narrowly defines "grounding" as just having the correct answer, which does not ensure the reliability of the entire response. To overcome this, we pr…

    Submitted 29 June, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: published at NAACL 2024

  31. arXiv:2311.08329  [pdf, other]

    cs.CL

    KTRL+F: Knowledge-Augmented In-Document Search

    Authors: Hanseok Oh, Haebin Shin, Miyoung Ko, Hyunji Lee, Minjoon Seo

    Abstract: We introduce a new problem, KTRL+F, a knowledge-augmented in-document search task that necessitates real-time identification of all semantic targets within a document with the awareness of external sources through a single natural query. KTRL+F addresses the following unique challenges for in-document search: 1) utilizing knowledge outside the document for extended use of additional information about ta…

    Submitted 18 April, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

  32. arXiv:2311.07362  [pdf, other]

    cs.CL cs.CV

    Volcano: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision

    Authors: Seongyun Lee, Sue Hyun Park, Yongrae Jo, Minjoon Seo

    Abstract: Large multimodal models suffer from multimodal hallucination, where they provide incorrect responses misaligned with the given visual information. Recent works have conjectured that one of the reasons behind multimodal hallucination is due to the vision encoder failing to ground on the image properly. To mitigate this issue, we propose a novel approach that leverages self-feedback as visual cues.…

    Submitted 2 April, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

  33. arXiv:2310.09759  [pdf, other]

    cs.CV

    Prototype-oriented Unsupervised Change Detection for Disaster Management

    Authors: Youngtack Oh, Minseok Seo, Doyi Kim, Junghoon Seo

    Abstract: Climate change has led to an increased frequency of natural disasters such as floods and cyclones. This emphasizes the importance of effective disaster monitoring. In response, the remote sensing community has explored change detection methods. These methods are primarily categorized into supervised techniques, which yield precise results but come with high labeling costs, and unsupervised techniq…

    Submitted 16 October, 2023; v1 submitted 15 October, 2023; originally announced October 2023.

    Comments: 4 pages, 2 figures

  34. arXiv:2310.08491  [pdf, other]

    cs.CL cs.LG

    Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

    Authors: Seungone Kim, Jamin Shin, Yejin Cho, Joel Jang, Shayne Longpre, Hwaran Lee, Sangdoo Yun, Seongjin Shin, Sungdong Kim, James Thorne, Minjoon Seo

    Abstract: Recently, using a powerful proprietary Large Language Model (LLM) (e.g., GPT-4) as an evaluator for long-form responses has become the de facto standard. However, for practitioners with large-scale evaluation tasks and custom criteria in consideration (e.g., child-readability), using proprietary LLMs as an evaluator is unreliable due to the closed-source nature, uncontrolled versioning, and prohib…

    Submitted 9 March, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: ICLR 2024

  35. arXiv:2309.01952  [pdf, other]

    cs.RO

    Deep Imitation Learning for Humanoid Loco-manipulation through Human Teleoperation

    Authors: Mingyo Seo, Steve Han, Kyutae Sim, Seung Hyeon Bang, Carlos Gonzalez, Luis Sentis, Yuke Zhu

    Abstract: We tackle the problem of developing humanoid loco-manipulation skills with deep imitation learning. The difficulty of collecting task demonstrations and training policies for humanoids with a high degree of freedom presents substantial challenges. We introduce TRILL, a data-efficient framework for training humanoid loco-manipulation policies from human demonstrations. In this framework, we collect…

    Submitted 19 November, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: Accepted to Humanoids 2023

  36. arXiv:2308.13564  [pdf, other]

    econ.EM cs.LG math.ST stat.CO stat.ML

    SGMM: Stochastic Approximation to Generalized Method of Moments

    Authors: Xiaohong Chen, Sokbae Lee, Yuan Liao, Myung Hwan Seo, Youngki Shin, Myunghyun Song

    Abstract: We introduce a new class of algorithms, Stochastic Generalized Method of Moments (SGMM), for estimation and inference on (overidentified) moment restriction models. Our SGMM is a novel stochastic approximation alternative to the popular Hansen (1982) (offline) GMM, and offers fast and scalable implementation with the ability to handle streaming datasets in real time. We establish the almost sure c…

    Submitted 30 October, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

    Comments: 46 pages, 4 tables, 2 figures

  37. arXiv:2308.11839  [pdf, other]

    cs.RO

    Bayesian Online Learning for Human-assisted Target Localization

    Authors: Min-Won Seo, Solmaz S. Kia

    Abstract: We consider a human-assisted autonomy sensor fusion for dynamic target localization in a Bayesian framework. To compensate for the shortcomings of an autonomous tracking system, we propose to collect spatial sensing information from human operators who visually monitor the target and can provide target localization information in the form of free sketches encircling the area where the target is lo…

    Submitted 20 November, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

    Comments: 6 figures

  38. arXiv:2307.10928  [pdf, other]

    cs.CL cs.AI

    FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets

    Authors: Seonghyeon Ye, Doyoung Kim, Sungdong Kim, Hyeonbin Hwang, Seungone Kim, Yongrae Jo, James Thorne, Juho Kim, Minjoon Seo

    Abstract: Evaluation of Large Language Models (LLMs) is challenging because instruction-following necessitates alignment with human values and the required set of skills varies depending on the instruction. However, previous studies have mainly focused on coarse-grained evaluation (i.e. overall preference-based evaluation), which limits interpretability since it does not consider the nature of user instruct…

    Submitted 14 April, 2024; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: ICLR 2024 Spotlight

  39. arXiv:2307.07123  [pdf, other]

    cs.CV eess.IV

    Improved Flood Insights: Diffusion-Based SAR to EO Image Translation

    Authors: Minseok Seo, Youngtack Oh, Doyi Kim, Dongmin Kang, Yeji Choi

    Abstract: Driven by rapid climate change, the frequency and intensity of flood events are increasing. Electro-Optical (EO) satellite imagery is commonly utilized for rapid response. However, its utilities in flood situations are hampered by issues such as cloud cover and limitations during nighttime, making accurate assessment of damage challenging. Several alternative flood detection techniques utilizing S…

    Submitted 13 July, 2023; originally announced July 2023.

    Comments: 10 pages, 6 figures

    Report number: 10

  40. arXiv:2307.02682  [pdf, other]

    cs.CV cs.CL

    Zero-Shot Dense Video Captioning by Jointly Optimizing Text and Moment

    Authors: Yongrae Jo, Seongyun Lee, Aiden SJ Lee, Hyunji Lee, Hanseok Oh, Minjoon Seo

    Abstract: Dense video captioning, a task of localizing meaningful moments and generating relevant captions for videos, often requires a large, expensive corpus of annotated video segments paired with text. In an effort to minimize the annotation cost, we propose ZeroTA, a novel method for dense video captioning in a zero-shot manner. Our method does not require any videos or annotations for training; instea…

    Submitted 11 July, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

  41. arXiv:2306.07052  [pdf, other]

    cs.CL cs.AI

    Gradient Ascent Post-training Enhances Language Model Generalization

    Authors: Dongkeun Yoon, Joel Jang, Sungdong Kim, Minjoon Seo

    Abstract: In this work, we empirically show that updating pretrained LMs (350M, 1.3B, 2.7B) with just a few steps of Gradient Ascent Post-training (GAP) on random, unlabeled text corpora enhances their zero-shot generalization capabilities across diverse NLP tasks. Specifically, we show that GAP can allow LMs to become comparable to 2-3x larger LMs across 12 different NLP tasks. We also show that applyi…

    Submitted 12 June, 2023; originally announced June 2023.

    Comments: ACL 2023 Main Conference (Short Paper)

  42. arXiv:2305.19765  [pdf, other

    cs.LG

    A Bayesian Approach To Analysing Training Data Attribution In Deep Learning

    Authors: Elisa Nguyen, Minjoon Seo, Seong Joon Oh

    Abstract: Training data attribution (TDA) techniques find influential training data for the model's prediction on the test data of interest. They approximate the impact of down- or up-weighting a particular training sample. While conceptually useful, they are hardly applicable to deep models in practice, particularly because of their sensitivity to different model initialisation. In this paper, we introduce…

    Submitted 31 October, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

  43. arXiv:2305.18952  [pdf, other

    cs.IR cs.AI

    Exploring the Practicality of Generative Retrieval on Dynamic Corpora

    Authors: Soyoung Yoon, Chaeeun Kim, Hyunji Lee, Joel Jang, Sohee Yang, Minjoon Seo

    Abstract: Benchmarking of information retrieval (IR) methods is mostly conducted with a fixed set of documents (static corpora); in realistic scenarios, this is rarely the case, and the documents to be retrieved are constantly updated and added. In this paper, we focus on conducting a comprehensive comparison between two categories of contemporary retrieval systems, Dual Encoders (DE) and Gen…

    Submitted 16 November, 2023; v1 submitted 27 May, 2023; originally announced May 2023.

    Comments: Work in progress

  44. arXiv:2305.14877  [pdf, other

    cs.CL

    Improving Probability-based Prompt Selection Through Unified Evaluation and Analysis

    Authors: Sohee Yang, Jonghyeon Kim, Joel Jang, Seonghyeon Ye, Hyunji Lee, Minjoon Seo

    Abstract: Previous works in prompt engineering for large language models have introduced different gradient-free probability-based prompt selection methods that aim to choose the optimal prompt among the candidates for a given task but have failed to provide a comprehensive and fair comparison between each other. In this paper, we propose a unified framework to interpret and evaluate the existing probabilit…

    Submitted 8 March, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: TACL 2024 (Pre-MIT Press publication version)

  45. arXiv:2305.14045  [pdf, other

    cs.CL cs.AI cs.LG

    The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning

    Authors: Seungone Kim, Se June Joo, Doyoung Kim, Joel Jang, Seonghyeon Ye, Jamin Shin, Minjoon Seo

    Abstract: Language models (LMs) with less than 100B parameters are known to perform poorly on chain-of-thought (CoT) reasoning in contrast to large LMs when solving unseen tasks. In this work, we aim to equip smaller LMs with the step-by-step reasoning capability by instruction tuning with CoT rationales. In order to achieve this goal, we first introduce a new instruction-tuning dataset called the CoT Colle…

    Submitted 14 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023 (Main Conference)

  46. arXiv:2305.13973  [pdf, other

    cs.CL

    Effortless Integration of Memory Management into Open-Domain Conversation Systems

    Authors: Eunbi Choi, Kyoung-Woon On, Gunsoo Han, Sungwoong Kim, Daniel Wontae Nam, Daejin Jo, Seung Eun Rho, Taehwan Kwon, Minjoon Seo

    Abstract: Open-domain conversation systems integrate multiple conversation skills into a single system through a modular approach. One of the limitations of the system, however, is the absence of management capability for external memory. In this paper, we propose a simple method to improve BlenderBot3 by integrating memory management ability into it. Since no training data exists for this purpose, we propo…

    Submitted 23 May, 2023; originally announced May 2023.

  47. arXiv:2305.13735  [pdf, other

    cs.CL cs.AI cs.LG

    Aligning Large Language Models through Synthetic Feedback

    Authors: Sungdong Kim, Sanghwan Bae, Jamin Shin, Soyoung Kang, Donghyun Kwak, Kang Min Yoo, Minjoon Seo

    Abstract: Aligning large language models (LLMs) to human values has become increasingly important as it enables sophisticated steering of LLMs. However, it requires significant human demonstrations and feedback or distillation from proprietary LLMs such as ChatGPT. In this work, we propose a novel alignment learning framework with synthetic feedback not dependent on extensive human annotations and proprieta…

    Submitted 20 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted to EMNLP 2023 main conference

  48. arXiv:2303.14828  [pdf, other

    cs.CV

    VisDA 2022 Challenge: Domain Adaptation for Industrial Waste Sorting

    Authors: Dina Bashkirova, Samarth Mishra, Diala Lteif, Piotr Teterwak, Donghyun Kim, Fadi Alladkani, James Akl, Berk Calli, Sarah Adel Bargal, Kate Saenko, Daehan Kim, Minseok Seo, YoungJin Jeon, Dong-Geol Choi, Shahaf Ettedgui, Raja Giryes, Shady Abu-Hussein, Binhui Xie, Shuang Li

    Abstract: Label-efficient and reliable semantic segmentation is essential for many real-life applications, especially for industrial settings with high visual diversity, such as waste sorting. In industrial waste sorting, one of the biggest challenges is the extreme diversity of the input stream depending on factors like the location of the sorting facility, the equipment available in the facility, and the…

    Submitted 26 March, 2023; originally announced March 2023.

    Comments: Proceedings of Machine Learning Research

  49. arXiv:2303.11606  [pdf, other

    cs.CV

    CAFS: Class Adaptive Framework for Semi-Supervised Semantic Segmentation

    Authors: Jingi Ju, Hyeoncheol Noh, Yooseung Wang, Minseok Seo, Dong-Geol Choi

    Abstract: Semi-supervised semantic segmentation learns a model for classifying pixels into specific classes using a few labeled samples and numerous unlabeled images. The recent leading approach is consistency regularization by self-training with pseudo-labeling pixels having high confidences for unlabeled images. However, using only high-confidence pixels for self-training may result in losing much of the in…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: 13 pages, 9 figures

  50. arXiv:2303.09779  [pdf, other

    cs.CV

    Bidirectional Domain Mixup for Domain Adaptive Semantic Segmentation

    Authors: Daehan Kim, Minseok Seo, Kwanyong Park, Inkyu Shin, Sanghyun Woo, In-So Kweon, Dong-Geol Choi

    Abstract: Mixup provides interpolated training samples and allows the model to obtain smoother decision boundaries for better generalization. The idea can be naturally applied to the domain adaptation task, where we can mix the source and target samples to obtain domain-mixed samples for better adaptation. However, the extension of the idea from classification to segmentation (i.e., structured output) is no…

    Submitted 17 March, 2023; originally announced March 2023.

    Comments: 10 pages, 3 figures, Accepted at AAAI 2023