Showing 1–50 of 114 results for author: Seo, S

  1. arXiv:2407.11057  [pdf, other]

    cs.LG cs.AI q-bio.BM

    SPIN: SE(3)-Invariant Physics Informed Network for Binding Affinity Prediction

    Authors: Seungyeon Choi, Sangmin Seo, Sanghyun Park

    Abstract: Accurate prediction of protein-ligand binding affinity is crucial for rapid and efficient drug development. Recently, the importance of predicting binding affinity has led to increased attention on research that models the three-dimensional structure of protein-ligand complexes using graph neural networks to predict binding affinity. However, traditional methods often fail to accurately model the…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Accepted to ECAI 2024

  2. arXiv:2406.13214  [pdf, other]

    cs.LG

    Self-Explainable Temporal Graph Networks based on Graph Information Bottleneck

    Authors: Sangwoo Seo, Sungwon Kim, Jihyeong Jung, Yoonho Lee, Chanyoung Park

    Abstract: Temporal Graph Neural Networks (TGNN) have the ability to capture both the graph topology and dynamic dependencies of interactions within a graph over time. There has been a growing need to explain the predictions of TGNN models due to the difficulty in identifying how past events influence their predictions. Since the explanation model for a static graph cannot be readily applied to temporal grap…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: KDD 2024

  3. arXiv:2406.09170  [pdf, other]

    cs.CL

    Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning

    Authors: Bahare Fatemi, Mehran Kazemi, Anton Tsitsulin, Karishma Malkan, Jinyeong Yim, John Palowitch, Sungyong Seo, Jonathan Halcrow, Bryan Perozzi

    Abstract: Large language models (LLMs) have showcased remarkable reasoning capabilities, yet they remain susceptible to errors, particularly in temporal reasoning tasks involving complex temporal logic. Existing research has explored LLM performance on temporal reasoning using diverse datasets and benchmarks. However, these studies often rely on real-world data that LLMs may have encountered during pre-trai…

    Submitted 13 June, 2024; originally announced June 2024.

  4. arXiv:2405.16751  [pdf, other]

    cs.AI cs.CL cs.CV cs.MA

    LLM-Based Cooperative Agents using Information Relevance and Plan Validation

    Authors: SeungWon Seo, Junhyeok Lee, SeongRae Noh, HyeongYeop Kang

    Abstract: We address the challenge of multi-agent cooperation, where agents achieve a common goal by interacting with a 3D scene and cooperating with decentralized agents under complex partial observations. This involves managing communication costs and optimizing interaction trajectories in dynamic environments. Our research focuses on three primary limitations of existing cooperative agent systems. Firstl…

    Submitted 26 May, 2024; originally announced May 2024.

  5. arXiv:2405.13345  [pdf, other]

    cs.RO cs.LG

    Autonomous Algorithm for Training Autonomous Vehicles with Minimal Human Intervention

    Authors: Sang-Hyun Lee, Daehyeok Kwon, Seung-Woo Seo

    Abstract: Reinforcement learning (RL) provides a compelling framework for enabling autonomous vehicles to continue to learn and improve diverse driving behaviors on their own. However, training real-world autonomous vehicles with current RL algorithms presents several challenges. One critical challenge, often overlooked in these algorithms, is the need to reset a driving environment between every episode. W…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 8 pages, 6 figures, 2 tables, conference

  6. arXiv:2404.16989  [pdf, other]

    cs.LG cs.AI cs.RO

    IDIL: Imitation Learning of Intent-Driven Expert Behavior

    Authors: Sangwon Seo, Vaibhav Unhelkar

    Abstract: When faced with accomplishing a task, human experts exhibit intentional behavior. Their unique intents shape their plans and decisions, resulting in experts demonstrating diverse behaviors to accomplish the same task. Due to the uncertainties encountered in the real world and their bounded rationality, experts sometimes adjust their intents, which in turn influences their behaviors during task exe…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Extended version of an identically-titled paper accepted at AAMAS 2024

  7. Traversability-aware Adaptive Optimization for Path Planning and Control in Mountainous Terrain

    Authors: Se-Wook Yoo, E In Son, Seung-Woo Seo

    Abstract: Autonomous navigation in extreme mountainous terrains poses challenges due to the presence of mobility-stressing elements and undulating surfaces, making it particularly difficult compared to conventional off-road driving scenarios. In such environments, estimating traversability solely based on exteroceptive sensors often leads to the inability to reach the goal due to a high prevalence of non-tr…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: 8 pages, 7 figures, accepted 2024 RA-L

    Journal ref: IEEE Robotics and Automation Letters 2024

  8. arXiv:2404.01954  [pdf, other]

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  9. arXiv:2404.01914  [pdf, other]

    cs.CL cs.AI

    SCANNER: Knowledge-Enhanced Approach for Robust Multi-modal Named Entity Recognition of Unseen Entities

    Authors: Hyunjong Ok, Taeho Kil, Sukmin Seo, Jaeho Lee

    Abstract: Recent advances in named entity recognition (NER) have pushed the boundary of the task to incorporate visual signals, leading to many variants, including multi-modal NER (MNER) or grounded MNER (GMNER). A key challenge to these tasks is that the model should be able to generalize to the entities unseen during the training, and should be able to handle the training samples with noisy annotations. T…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 13 pages, 7 figures, NAACL 2024

  10. arXiv:2404.01745  [pdf, other]

    cs.CV cs.AI

    Unleash the Potential of CLIP for Video Highlight Detection

    Authors: Donghoon Han, Seunghyeon Seo, Eunhwan Park, Seong-Uk Nam, Nojun Kwak

    Abstract: Multimodal and large language models (LLMs) have revolutionized the utilization of open-world knowledge, unlocking novel potentials across various tasks and applications. Among these domains, the video domain has notably benefited from their capabilities. In this paper, we present Highlight-CLIP (HL-CLIP), a method designed to excel in the video highlight detection task by leveraging the pre-train…

    Submitted 2 April, 2024; originally announced April 2024.

  11. arXiv:2403.15048  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    Cartoon Hallucinations Detection: Pose-aware In Context Visual Learning

    Authors: Bumsoo Kim, Wonseop Shin, Kyuchul Lee, Sanghyun Seo

    Abstract: Large-scale Text-to-Image (TTI) models have become a common approach for generating training data in various generative fields. However, visual hallucinations, which contain perceptually critical defects, remain a concern, especially in non-photorealistic styles like cartoon characters. We propose a novel visual hallucination detection system for cartoon character images generated by TTI models. O…

    Submitted 24 March, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

    Comments: 11 pages, 12 figures, 1 table, Project page: https://gh-bumsookim.github.io/Cartoon-Hallucinations-Detection/

  12. arXiv:2403.10906  [pdf, other]

    cs.CV

    HourglassNeRF: Casting an Hourglass as a Bundle of Rays for Few-shot Neural Rendering

    Authors: Seunghyeon Seo, Yeonjin Chang, Jayeon Yoo, Seungwoo Lee, Hojun Lee, Nojun Kwak

    Abstract: Recent advancements in the Neural Radiance Field (NeRF) have bolstered its capabilities for novel view synthesis, yet its reliance on dense multi-view training images poses a practical challenge. Addressing this, we propose HourglassNeRF, an effective regularization-based approach with a novel hourglass casting strategy. Our proposed hourglass is conceptualized as a bundle of additional rays withi…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: 21 pages, 11 figures

  13. arXiv:2403.01594  [pdf, other]

    cs.HC

    Never Tell the Trick: Covert Interactive Mixed Reality System for Immersive Storytelling

    Authors: Chanwoo Lee, Kyubeom Shim, Sanggyo Seo, Gwonu Ryu, Yongsoon Choi

    Abstract: This study explores the integration of Ultra-Wideband (UWB) technology into Mixed Reality (MR) Systems for immersive storytelling. Addressing the limitations of existing technologies like Microsoft Kinect and HTC Vive, the research focuses on overcoming challenges in robustness to occlusion, tracking volume, and cost efficiency in props tracking. Utilizing UWB technology, the interactive MR system…

    Submitted 3 March, 2024; originally announced March 2024.

    Comments: To be presented in IEEE VR 2024

  14. arXiv:2403.01233  [pdf, other]

    cs.RO

    Results and Lessons Learned from Autonomous Driving Transportation Services in Airfield, Crowded Indoor, and Urban Environments

    Authors: Doosan Baek, Sanghyun Kim, Seung-Woo Seo, Sang-Hyun Lee

    Abstract: Autonomous vehicles have been actively investigated over the past few decades. Several recent works show the potential of autonomous vehicles in urban environments with impressive experimental results. However, these works note that autonomous vehicles are still occasionally inferior to expert drivers in complex scenarios. Furthermore, they do not focus on the possibilities of autonomous driving t…

    Submitted 20 March, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

    Comments: 8 pages, 7 figures, 4 tables

  15. arXiv:2402.16092  [pdf, other]

    cs.CV

    StochCA: A Novel Approach for Exploiting Pretrained Models with Cross-Attention

    Authors: Seungwon Seo, Suho Lee, Sangheum Hwang

    Abstract: Utilizing large-scale pretrained models is a well-known strategy to enhance performance on various target tasks. It is typically achieved through fine-tuning pretrained models on target tasks. However, naïve fine-tuning may not fully leverage knowledge embedded in pretrained models. In this study, we introduce a novel fine-tuning method, called stochastic cross-attention (StochCA), specific to Tra…

    Submitted 25 February, 2024; originally announced February 2024.

    Comments: The first two authors contributed equally

  16. arXiv:2402.15363  [pdf, other]

    cs.RO

    Follow the Footprints: Self-supervised Traversability Estimation for Off-road Vehicle Navigation based on Geometric and Visual Cues

    Authors: Yurim Jeon, E In Son, Seung-Woo Seo

    Abstract: In this study, we address the off-road traversability estimation problem, that predicts areas where a robot can navigate in off-road environments. An off-road environment is an unstructured environment comprising a combination of traversable and non-traversable spaces, which presents a challenge for estimating traversability. This study highlights three primary factors that affect a robot's traver…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: Accepted to IEEE International Conference on Robotics and Automation (ICRA) 2024

  17. arXiv:2402.05706  [pdf, other]

    cs.CL cs.SD eess.AS

    Unified Speech-Text Pretraining for Spoken Dialog Modeling

    Authors: Heeseung Kim, Soonshin Seo, Kyeongseok Jeong, Ohsung Kwon, Jungwhan Kim, Jaehong Lee, Eunwoo Song, Myungwoo Oh, Sungroh Yoon, Kang Min Yoo

    Abstract: While recent work shows promising results in expanding the capabilities of large language models (LLM) to directly understand and synthesize speech, an LLM-based strategy for modeling spoken dialogs remains elusive and calls for further investigation. This work proposes an extensive speech-text LLM framework, named the Unified Spoken Dialog Model (USDM), to generate coherent spoken responses with…

    Submitted 8 February, 2024; originally announced February 2024.

  18. arXiv:2402.05448  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG cs.MM

    Minecraft-ify: Minecraft Style Image Generation with Text-guided Image Editing for In-Game Application

    Authors: Bumsoo Kim, Sanghyun Byun, Yonghoon Jung, Wonseop Shin, Sareer UI Amin, Sanghyun Seo

    Abstract: In this paper, we first present the character texture generation system Minecraft-ify, specified to Minecraft video game toward in-game application. Ours can generate face-focused image for texture mapping tailored to 3D virtual character having cube manifold. While existing projects or works only generate texture, proposed system can inverse the user-provided real image, or generate aver…

    Submitted 3 March, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: 2 pages, 2 figures. Accepted as Spotlight to NeurIPS 2023 Workshop on Machine Learning for Creativity and Design

  19. arXiv:2402.02733  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG cs.MM

    ToonAging: Face Re-Aging upon Artistic Portrait Style Transfer

    Authors: Bumsoo Kim, Abdul Muqeet, Kyuchul Lee, Sanghyun Seo

    Abstract: Face re-aging is a prominent field in computer vision and graphics, with significant applications in photorealistic domains such as movies, advertising, and live streaming. Recently, the need to apply face re-aging to non-photorealistic images, like comics, illustrations, and animations, has emerged as an extension in various entertainment sectors. However, the lack of a network that can seamlessl…

    Submitted 28 May, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Accepted at CVPR 2024 AI4CC Workshop, Project Page: https://gh-bumsookim.github.io/ToonAging/

  20. arXiv:2401.12624  [pdf, other]

    cs.AI cs.IT cs.LG cs.NI

    Knowledge Distillation from Language-Oriented to Emergent Communication for Multi-Agent Remote Control

    Authors: Yongjun Kim, Sejin Seo, Jihong Park, Mehdi Bennis, Seong-Lyun Kim, Junil Choi

    Abstract: In this work, we compare emergent communication (EC) built upon multi-agent deep reinforcement learning (MADRL) and language-oriented semantic communication (LSC) empowered by a pre-trained large language model (LLM) using human language. In a multi-agent remote navigation task, with multimodal input data comprising location and channel maps, it is shown that EC incurs high training cost and strug…

    Submitted 3 March, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

  21. arXiv:2401.04928  [pdf, other]

    cs.LG

    Relaxed Contrastive Learning for Federated Learning

    Authors: Seonguk Seo, Jinkyu Kim, Geeho Kim, Bohyung Han

    Abstract: We propose a novel contrastive learning framework to effectively address the challenges of data heterogeneity in federated learning. We first analyze the inconsistency of gradient updates across clients during local training and establish its dependence on the distribution of feature representations, leading to the derivation of the supervised contrastive learning (SCL) objective to mitigate local…

    Submitted 31 May, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

  22. arXiv:2401.03240  [pdf, other]

    cs.LG math.OC

    Interpreting Adaptive Gradient Methods by Parameter Scaling for Learning-Rate-Free Optimization

    Authors: Min-Kook Suh, Seung-Woo Seo

    Abstract: We address the challenge of estimating the learning rate for adaptive gradient methods used in training deep neural networks. While several learning-rate-free approaches have been proposed, they are typically tailored for steepest descent. However, although steepest descent methods offer an intuitive approach to finding minima, many deep learning applications require adaptive gradient methods to a…

    Submitted 6 January, 2024; originally announced January 2024.

    Comments: Preprint

  23. arXiv:2401.02656  [pdf, other]

    cs.CV cs.LG

    GTA: Guided Transfer of Spatial Attention from Object-Centric Representations

    Authors: SeokHyun Seo, Jinwoo Hong, JungWoo Chae, Kyungyul Kim, Sangheum Hwang

    Abstract: Utilizing well-trained representations in transfer learning often results in superior performance and faster convergence compared to training from scratch. However, even if such good representations are transferred, a model can easily overfit the limited training dataset and lose the valuable properties of the transferred representations. This phenomenon is more severe in ViT due to its low induct…

    Submitted 5 January, 2024; originally announced January 2024.

  24. arXiv:2311.10309  [pdf, other]

    cs.LG cs.RO

    Imagination-Augmented Hierarchical Reinforcement Learning for Safe and Interactive Autonomous Driving in Urban Environments

    Authors: Sang-Hyun Lee, Yoonjae Jung, Seung-Woo Seo

    Abstract: Hierarchical reinforcement learning (HRL) incorporates temporal abstraction into reinforcement learning (RL) by explicitly taking advantage of hierarchical structure. Modern HRL typically designs a hierarchical agent composed of a high-level policy and low-level policies. The high-level policy selects which low-level policy to activate at a lower frequency and the activated low-level policy select…

    Submitted 23 January, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: 15 pages, 9 figures; corrected typos, added references, revised experiments (results unchanged)

  25. arXiv:2311.09195  [pdf, other]

    cs.LG cs.RO

    Self-Supervised Curriculum Generation for Autonomous Reinforcement Learning without Task-Specific Knowledge

    Authors: Sang-Hyun Lee, Seung-Woo Seo

    Abstract: A significant bottleneck in applying current reinforcement learning algorithms to real-world scenarios is the need to reset the environment between every episode. This reset process demands substantial human intervention, making it difficult for the agent to learn continuously and autonomously. Several recent works have introduced autonomous reinforcement learning (ARL) algorithms that generate cu…

    Submitted 18 February, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: 8 pages, 8 figures

  26. arXiv:2311.03965  [pdf, other]

    cs.CV

    Fast Sun-aligned Outdoor Scene Relighting based on TensoRF

    Authors: Yeonjin Chang, Yearim Kim, Seunghyeon Seo, Jung Yi, Nojun Kwak

    Abstract: In this work, we introduce our method of outdoor scene relighting for Neural Radiance Fields (NeRF) named Sun-aligned Relighting TensoRF (SR-TensoRF). SR-TensoRF offers a lightweight and rapid pipeline aligned with the sun, thereby achieving a simplified workflow that eliminates the need for environment maps. Our sun-alignment strategy is motivated by the insight that shadows, unlike viewpoint-dep…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: WACV 2024

  27. arXiv:2311.03651  [pdf, other]

    cs.LG cs.AI cs.RO

    SeRO: Self-Supervised Reinforcement Learning for Recovery from Out-of-Distribution Situations

    Authors: Chan Kim, Jaekyung Cho, Christophe Bobda, Seung-Woo Seo, Seong-Woo Kim

    Abstract: Robotic agents trained using reinforcement learning have the problem of taking unreliable actions in an out-of-distribution (OOD) state. Agents can easily become OOD in real-world environments because it is almost impossible for them to visit and learn the entire state space during training. Unfortunately, unreliable actions do not ensure that agents perform their original tasks successfully. Ther…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: 9 pages, 5 figures. Proceedings of the 32nd International Joint Conference on Artificial Intelligence, 2023

  28. ExPECA: An Experimental Platform for Trustworthy Edge Computing Applications

    Authors: Samie Mostafavi, Vishnu Narayanan Moothedath, Stefan Rönngren, Neelabhro Roy, Gourav Prateek Sharma, Sangwon Seo, Manuel Olguín Muñoz, James Gross

    Abstract: This paper presents ExPECA, an edge computing and wireless communication research testbed designed to tackle two pressing challenges: comprehensive end-to-end experimentation and high levels of experimental reproducibility. Leveraging OpenStack-based Chameleon Infrastructure (CHI) framework for its proven flexibility and ease of operation, ExPECA is located in a unique, isolated underground facili…

    Submitted 2 November, 2023; originally announced November 2023.

  29. arXiv:2310.19906  [pdf, other]

    cs.LG cs.AI

    Interpretable Prototype-based Graph Information Bottleneck

    Authors: Sangwoo Seo, Sungwon Kim, Chanyoung Park

    Abstract: The success of Graph Neural Networks (GNNs) has led to a need for understanding their decision-making process and providing explanations for their predictions, which has given rise to explainable AI (XAI) that offers transparent explanations for black-box models. Recently, the use of prototypes has successfully improved the explainability of models by learning prototypes to imply training graphs t…

    Submitted 20 February, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023

  30. arXiv:2310.03223  [pdf, other]

    cs.LG

    TacoGFN: Target-conditioned GFlowNet for Structure-based Drug Design

    Authors: Tony Shen, Seonghwan Seo, Grayson Lee, Mohit Pandey, Jason R Smith, Artem Cherkasov, Woo Youn Kim, Martin Ester

    Abstract: Searching the vast chemical space for drug-like and synthesizable molecules with high binding affinity to a protein pocket is a challenging task in drug discovery. Recently, molecular deep generative models have been introduced which promise to be more efficient than exhaustive virtual screening, by directly generating molecules based on the protein structure. However, since they learn the distrib…

    Submitted 7 April, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: Accepted at NeurIPS 2023 AID3 and at NeurIPS 2023 GenBio as Spotlight

    Journal ref: NeurIPS 2023 Generative AI and Biology (GenBio) Workshop

  31. arXiv:2310.00681  [pdf, other]

    q-bio.BM cs.LG

    PharmacoNet: Accelerating Large-Scale Virtual Screening by Deep Pharmacophore Modeling

    Authors: Seonghwan Seo, Woo Youn Kim

    Abstract: As the size of accessible compound libraries expands to over 10 billion, the need for more efficient structure-based virtual screening methods is emerging. Different pre-screening methods have been developed for rapid screening, but there is still a lack of structure-based methods applicable to various proteins that perform protein-ligand binding conformation prediction and scoring in an extremely…

    Submitted 18 December, 2023; v1 submitted 1 October, 2023; originally announced October 2023.

    Comments: 21 pages, 5 figures

    Journal ref: NeurIPS 2023 Workshop on New Frontiers of AI for Drug Discovery and Development

  32. arXiv:2308.11199  [pdf, other]

    cs.CV cs.AI cs.LG

    ConcatPlexer: Additional Dim1 Batching for Faster ViTs

    Authors: Donghoon Han, Seunghyeon Seo, Donghyeon Jeon, Jiho Jang, Chaerin Kong, Nojun Kwak

    Abstract: Transformers have demonstrated tremendous success not only in the natural language processing (NLP) domain but also the field of computer vision, igniting various creative approaches and applications. Yet, the superior performance and modeling flexibility of transformers came with a severe increase in computation costs, and hence several works have proposed methods to reduce this burden. Inspired…

    Submitted 31 January, 2024; v1 submitted 22 August, 2023; originally announced August 2023.

  33. arXiv:2307.16312  [pdf, other]

    math.CO cs.DM

    Open-locating-dominating sets with error correction

    Authors: Devin Jean, Suk Seo

    Abstract: An open-locating-dominating set of a graph models a detection system for a facility with a possible "intruder" or a multiprocessor network with a possible malfunctioning processor. A "sensor" or "detector" is assumed to be installed at a subset of vertices where it can detect an intruder or a malfunctioning processor in their neighborhood, but not at itself. We consider a fault-tolerant variant of…

    Submitted 30 July, 2023; originally announced July 2023.

    Comments: arXiv admin note: text overlap with arXiv:2306.12583

    MSC Class: 05C69

  34. arXiv:2306.17723  [pdf, other]

    cs.CV

    FlipNeRF: Flipped Reflection Rays for Few-shot Novel View Synthesis

    Authors: Seunghyeon Seo, Yeonjin Chang, Nojun Kwak

    Abstract: Neural Radiance Field (NeRF) has been a mainstream in novel view synthesis with its remarkable quality of rendered images and simple architecture. Although NeRF has been developed in various directions improving continuously its performance, the necessity of a dense set of multi-view images still exists as a stumbling block to progress for practical application. In this work, we propose FlipNeRF,…

    Submitted 14 August, 2023; v1 submitted 30 June, 2023; originally announced June 2023.

    Comments: ICCV 2023. Project Page: https://shawn615.github.io/flipnerf/

  35. arXiv:2306.15217  [pdf, other]

    cs.LG cs.AI

    Unsupervised Episode Generation for Graph Meta-learning

    Authors: Jihyeong Jung, Sangwoo Seo, Sungwon Kim, Chanyoung Park

    Abstract: We propose Unsupervised Episode Generation method called Neighbors as Queries (NaQ) to solve the Few-Shot Node-Classification (FSNC) task by unsupervised Graph Meta-learning. Doing so enables full utilization of the information of all nodes in a graph, which is not possible in current supervised meta-learning methods for FSNC due to the label-scarcity problem. In addition, unlike unsupervised Grap…

    Submitted 21 May, 2024; v1 submitted 27 June, 2023; originally announced June 2023.

    Comments: 24 pages, 15 figures, 16 Tables; accepted at ICML 2024

  36. arXiv:2306.12583  [pdf, other]

    math.CO cs.DM

    On Error-detecting Open-locating-dominating sets

    Authors: Devin Jean, Suk Seo

    Abstract: An open-dominating set S for a graph G is a subset of vertices where every vertex has a neighbor in S. An open-locating-dominating set S for a graph G is an open-dominating set such that each pair of distinct vertices in G have distinct set of open-neighbors in S. We consider a type of a fault-tolerant open-locating dominating set called error-detecting open-locating-dominating sets. We present mo…

    Submitted 21 June, 2023; originally announced June 2023.

    MSC Class: 05C69

  37. arXiv:2306.10989  [pdf, other]

    cs.LG

    Scaling of Class-wise Training Losses for Post-hoc Calibration

    Authors: Seungjin Jung, Seungmo Seo, Yonghyun Jeong, Jongwon Choi

    Abstract: The class-wise training losses often diverge as a result of the various levels of intra-class and inter-class appearance variation, and we find that the diverging class-wise training losses cause the uncalibrated prediction with its reliability. To resolve the issue, we propose a new calibration method to synchronize the class-wise training losses. We design a new training loss to alleviate the va…

    Submitted 19 June, 2023; originally announced June 2023.

    Comments: Published at ICML 2023. Camera ready version

  38. Improved Training for End-to-End Streaming Automatic Speech Recognition Model with Punctuation

    Authors: Hanbyul Kim, Seunghyun Seo, Lukas Lee, Seolki Baek

    Abstract: Punctuated text prediction is crucial for automatic speech recognition as it enhances readability and impacts downstream natural language processing tasks. In streaming scenarios, the ability to predict punctuation in real-time is particularly desirable but presents a difficult technical challenge. In this work, we propose a method for predicting punctuated text from input speech using a chunk-bas…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: Accepted at INTERSPEECH 2023

    Journal ref: Proc. INTERSPEECH 2023, 1653-1657

  39. arXiv:2306.00680  [pdf, other]

    cs.SD cs.AI eess.AS

    Encoder-decoder multimodal speaker change detection

    Authors: Jee-weon Jung, Soonshin Seo, Hee-Soo Heo, Geonmin Kim, You Jin Kim, Young-ki Kwon, Minjae Lee, Bong-Jin Lee

    Abstract: The task of speaker change detection (SCD), which detects points where speakers change in an input, is essential for several applications. Several studies solved the SCD task using audio inputs only and have shown limited performance. Recently, multimodal SCD (MMSCD) models, which utilise text modality in addition to audio, have shown improved performance. In this study, the proposed model are bui…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: 5 pages, accepted for presentation at INTERSPEECH 2023

  40. arXiv:2305.18425  [pdf, other]

    cs.LG cs.AI

    Efficient Storage of Fine-Tuned Models via Low-Rank Approximation of Weight Residuals

    Authors: Simo Ryu, Seunghyun Seo, Jaejun Yoo

    Abstract: In this paper, we present an efficient method for storing fine-tuned models by leveraging the low-rank properties of weight residuals. Our key observation is that weight residuals in large overparameterized models exhibit even stronger low-rank characteristics. Based on this insight, we propose Efficient Residual Encoding (ERE), a novel approach that achieves efficient storage of fine-tuned model…

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: 16 pages, 8 figures

    ACM Class: I.2.6

  41. arXiv:2305.01160  [pdf, other]

    cs.LG cs.CV

    Long-Tailed Recognition by Mutual Information Maximization between Latent Features and Ground-Truth Labels

    Authors: Min-Kook Suh, Seung-Woo Seo

    Abstract: Although contrastive learning methods have shown prevailing performance on a variety of representation learning tasks, they encounter difficulty when the training dataset is long-tailed. Many researchers have combined contrastive learning and a logit adjustment technique to address this problem, but the combinations are done ad-hoc and a theoretical background has not yet been provided. The goal o…

    Submitted 8 August, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

    Comments: ICML 2023 camera-ready

  42. arXiv:2304.05303  [pdf, other]

    cs.CV cs.CL

    ELVIS: Empowering Locality of Vision Language Pre-training with Intra-modal Similarity

    Authors: Sumin Seo, JaeWoong Shin, Jaewoo Kang, Tae Soo Kim, Thijs Kooi

    Abstract: Deep learning has shown great potential in assisting radiologists in reading chest X-ray (CXR) images, but its need for expensive annotations for improving performance prevents widespread clinical application. Visual language pre-training (VLP) can alleviate the burden and cost of annotation by leveraging routinely generated reports for radiographs, which exist in large quantities as well as in pa…

    Submitted 23 July, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

    Comments: Under review

  43. arXiv:2304.03435  [pdf, other]

    cs.CV

    Towards Unified Scene Text Spotting based on Sequence Generation

    Authors: Taeho Kil, Seonghyeon Kim, Sukmin Seo, Yoonsik Kim, Daehee Kim

    Abstract: Sequence generation models have recently made significant progress in unifying various vision tasks. Although some auto-regressive models have demonstrated promising results in end-to-end text spotting, they use specific detection formats while ignoring various text shapes and are limited in the maximum number of text instances that can be detected. To overcome these limitations, we propose a UNIf…

    Submitted 6 April, 2023; originally announced April 2023.

    Comments: Accepted to CVPR 2023

  44. arXiv:2304.00792  [pdf, other]

    cs.CV cs.LG

    Few-shot Fine-tuning is All You Need for Source-free Domain Adaptation

    Authors: Suho Lee, Seungwon Seo, Jihyo Kim, Yejin Lee, Sangheum Hwang

    Abstract: Recently, source-free unsupervised domain adaptation (SFUDA) has emerged as a more practical and feasible approach compared to unsupervised domain adaptation (UDA) which assumes that labeled source data are always accessible. However, significant limitations associated with SFUDA approaches are often overlooked, which limits their practicality in real-world applications. These limitations include…

    Submitted 24 April, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

    Comments: The first two authors contributed equally

  45. arXiv:2303.09827  [pdf, other]

    cs.CL cs.AI

    DORIC : Domain Robust Fine-Tuning for Open Intent Clustering through Dependency Parsing

    Authors: Jihyun Lee, Seungyeon Seo, Yunsu Kim, Gary Geunbae Lee

    Abstract: We present our work on Track 2 in the Dialog System Technology Challenges 11 (DSTC11). DSTC11-Track2 aims to provide a benchmark for zero-shot, cross-domain, intent-set induction. In the absence of in-domain training dataset, robust utterance representation that can be used across domains is necessary to induce users' intentions. To achieve this, we leveraged a multi-domain dialogue dataset to fin…

    Submitted 17 March, 2023; originally announced March 2023.

  46. arXiv:2303.00413  [pdf, other]

    cs.AI cs.LG cs.MA

    Automated Task-Time Interventions to Improve Teamwork using Imitation Learning

    Authors: Sangwon Seo, Bing Han, Vaibhav Unhelkar

    Abstract: Effective human-human and human-autonomy teamwork is critical but often challenging to perfect. The challenge is particularly relevant in time-critical domains, such as healthcare and disaster response, where the time pressures can make coordination increasingly difficult to achieve and the consequences of imperfect coordination can be severe. To improve teamwork in these and other domains, we pre…

    Submitted 2 March, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

    Comments: Extended version of an identically-titled paper accepted at AAMAS 2023

  47. arXiv:2302.09422  [pdf, other]

    cs.LG

    Neural Attention Memory

    Authors: Hyoungwook Nam, Seung Byum Seo

    Abstract: We propose a novel perspective of the attention mechanism by reinventing it as a memory architecture for neural networks, namely Neural Attention Memory (NAM). NAM is a memory structure that is both readable and writable via differentiable linear algebra operations. We explore three use cases of NAM: memory-augmented neural network (MANN), few-shot learning, and efficient long-range attention. Fir…

    Submitted 14 October, 2023; v1 submitted 18 February, 2023; originally announced February 2023.

    Comments: Preprint. Under review

  48. arXiv:2302.08788  [pdf, other]

    cs.CV

    MixNeRF: Modeling a Ray with Mixture Density for Novel View Synthesis from Sparse Inputs

    Authors: Seunghyeon Seo, Donghoon Han, Yeonjin Chang, Nojun Kwak

    Abstract: Neural Radiance Field (NeRF) has broken new ground in the novel view synthesis due to its simple concept and state-of-the-art quality. However, it suffers from severe performance degradation unless trained with a dense set of images with different camera poses, which hinders its practical applications. Although previous methods addressing this problem achieved promising results, they relied heavil…

    Submitted 12 April, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

    Comments: CVPR 2023. Project Page: https://shawn615.github.io/mixnerf/

  49. arXiv:2302.08751  [pdf, other]

    cs.CV

    MDPose: Real-Time Multi-Person Pose Estimation via Mixture Density Model

    Authors: Seunghyeon Seo, Jaeyoung Yoo, Jihye Hwang, Nojun Kwak

    Abstract: One of the major challenges in multi-person pose estimation is instance-aware keypoint estimation. Previous methods address this problem by leveraging an off-the-shelf detector, heuristic post-grouping process or explicit instance identification process, hindering further improvements in the inference speed which is an important factor for practical applications. From the statistical point of view…

    Submitted 8 May, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

    Comments: UAI 2023

  50. arXiv:2301.03767  [pdf, other]

    cs.CV cs.IR cs.LG

    Online Backfilling with No Regret for Large-Scale Image Retrieval

    Authors: Seonguk Seo, Mustafa Gokhan Uzunbas, Bohyung Han, Sara Cao, Joena Zhang, Taipeng Tian, Ser-Nam Lim

    Abstract: Backfilling is the process of re-extracting all gallery embeddings from upgraded models in image retrieval systems. It inevitably requires a prohibitively large amount of computational cost and even entails the downtime of the service. Although backward-compatible learning sidesteps this challenge by tackling query-side representations, this leads to suboptimal solutions in principle because galle…

    Submitted 9 January, 2023; originally announced January 2023.