Showing 1–50 of 636 results for author: Yun, S

  1. arXiv:2407.13078  [pdf, other]

    cs.CV cs.AI

    Enhancing Temporal Action Localization: Advanced S6 Modeling with Recurrent Mechanism

    Authors: Sangyoun Lee, Juho Jung, Changdae Oh, Sunghee Yun

    Abstract: Temporal Action Localization (TAL) is a critical task in video analysis, identifying precise start and end times of actions. Existing methods like CNNs, RNNs, GCNs, and Transformers have limitations in capturing long-range dependencies and temporal causality. To address these challenges, we propose a novel TAL architecture leveraging the Selective State Space Model (S6). Our approach integrates th…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: 8 pages, 3 figures, Preprint

  2. arXiv:2407.08245  [pdf, other]

    cs.LG cs.CV

    Feature Diversification and Adaptation for Federated Domain Generalization

    Authors: Seunghan Yang, Seokeon Choi, Hyunsin Park, Sungha Choi, Simyung Chang, Sungrack Yun

    Abstract: Federated learning, a distributed learning paradigm, utilizes multiple clients to build a robust global model. In real-world applications, local clients often operate within their limited domains, leading to a 'domain shift' across clients. Privacy concerns limit each client's learning to its own domain data, which increases the risk of overfitting. Moreover, the process of aggregating models train…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  3. arXiv:2407.06123  [pdf, other]

    cs.HC

    Investigating User Perceptions of Collaborative Agenda Setting in Virtual Health Counseling Session

    Authors: Mina Fallah, Farnaz Nouraei, Hye Sun Yun, Timothy Bickmore

    Abstract: Virtual health counselors offer the potential to provide users with information and counseling in complex areas such as disease management and health education. However, ensuring user engagement is challenging, particularly when the volume of information and length of counseling sessions increase. Agenda setting is a clinical counseling technique where a patient and clinician collaboratively decide o…

    Submitted 8 July, 2024; originally announced July 2024.

  4. arXiv:2407.03563  [pdf, other]

    eess.AS cs.CL cs.LG eess.IV

    Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition

    Authors: Sungnyun Kim, Kangwook Jang, Sangmin Bae, Hoirin Kim, Se-Young Yun

    Abstract: Audio-visual speech recognition (AVSR) aims to transcribe human speech using both audio and video modalities. In practical environments with noise-corrupted audio, the role of video information becomes crucial. However, prior works have primarily focused on enhancing audio features in AVSR, overlooking the importance of video features. In this study, we strengthen the video features by learning th…

    Submitted 3 July, 2024; originally announced July 2024.

  5. arXiv:2407.01639  [pdf, other]

    cs.LG cs.SE

    ModelVerification.jl: a Comprehensive Toolbox for Formally Verifying Deep Neural Networks

    Authors: Tianhao Wei, Luca Marzari, Kai S. Yun, Hanjiang Hu, Peizhi Niu, Xusheng Luo, Changliu Liu

    Abstract: Deep Neural Networks (DNN) are crucial in approximating nonlinear functions across diverse applications, ranging from image classification to control. Verifying specific input-output properties can be a highly challenging task due to the lack of a single, self-contained framework that allows a complete range of verification types. To this end, we present ModelVerification.jl (MV), the fir…

    Submitted 30 June, 2024; originally announced July 2024.

  6. arXiv:2407.01624  [pdf, other]

    cs.LG cs.AI

    Guided Trajectory Generation with Diffusion Models for Offline Model-based Optimization

    Authors: Taeyoung Yun, Sujin Yun, Jaewoo Lee, Jinkyoo Park

    Abstract: Optimizing complex and high-dimensional black-box functions is ubiquitous in science and engineering fields. Unfortunately, the online evaluation of these functions is restricted due to time and safety constraints in most cases. In offline model-based optimization (MBO), we aim to find a design that maximizes the target function using only a pre-existing offline dataset. While prior methods consid…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: 29 pages, 11 figures, 17 tables

  7. arXiv:2407.00693  [pdf, other]

    cs.AI cs.CL cs.LG

    BAPO: Base-Anchored Preference Optimization for Personalized Alignment in Large Language Models

    Authors: Gihun Lee, Minchan Jeong, Yujin Kim, Hojung Jung, Jaehoon Oh, Sangmook Kim, Se-Young Yun

    Abstract: While learning to align Large Language Models (LLMs) with human preferences has shown remarkable success, aligning these models to meet the diverse user preferences presents further challenges in preserving previous knowledge. This paper examines the impact of personalized preference optimization on LLMs, revealing that the extent of knowledge loss varies significantly with preference heterogeneit…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: under review

  8. arXiv:2406.20098  [pdf, other]

    cs.CV cs.AI cs.CL

    Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs

    Authors: Sukmin Yun, Haokun Lin, Rusiru Thushara, Mohammad Qazim Bhat, Yongxin Wang, Zutao Jiang, Mingkai Deng, Jinhong Wang, Tianhua Tao, Junbo Li, Haonan Li, Preslav Nakov, Timothy Baldwin, Zhengzhong Liu, Eric P. Xing, Xiaodan Liang, Zhiqiang Shen

    Abstract: Multimodal large language models (MLLMs) have shown impressive success across modalities such as image, video, and audio in a variety of understanding and generation tasks. However, current MLLMs are surprisingly poor at understanding webpage screenshots and generating their corresponding HTML code. To address this problem, we propose Web2Code, a benchmark consisting of a new large-scale webpage-t…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: Website at https://mbzuai-llm.github.io/webpage2code/

  9. arXiv:2406.18815  [pdf, other]

    cs.LG

    MissionGNN: Hierarchical Multimodal GNN-based Weakly Supervised Video Anomaly Recognition with Mission-Specific Knowledge Graph Generation

    Authors: Sanggeon Yun, Ryozo Masukawa, Minhyoung Na, Mohsen Imani

    Abstract: In the context of escalating safety concerns across various domains, the tasks of Video Anomaly Detection (VAD) and Video Anomaly Recognition (VAR) have emerged as critically important for applications in intelligent surveillance, evidence investigation, violence alerting, etc. These tasks, aimed at identifying and classifying deviations from normal behavior in video data, face significant challen…

    Submitted 26 June, 2024; originally announced June 2024.

  10. arXiv:2406.16758  [pdf, other]

    cs.CL

    Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters

    Authors: Euiin Yi, Taehyeon Kim, Hongseok Jeung, Du-Seong Chang, Se-Young Yun

    Abstract: Large language models (LLMs) have revolutionized natural language processing and broadened their applicability across diverse commercial applications. However, the deployment of these models is constrained by high inference time in multilingual settings. To mitigate this challenge, this paper explores a training recipe of an assistant model in speculative decoding, which is leveraged to draft and…

    Submitted 24 June, 2024; originally announced June 2024.

  11. arXiv:2406.07975  [pdf, other]

    astro-ph.IM

    FINER: Far-Infrared Nebular Emission Receiver for the Large Millimeter Telescope

    Authors: Yoichi Tamura, Takeshi Sakai, Ryohei Kawabe, Takafumi Kojima, Akio Taniguchi, Tatsuya Takekoshi, Haoran Kang, Wenlei Shan, Masato Hagimoto, Norika Okauchi, Airi Tetsuka, Akio K. Inoue, Kotaro Kohno, Kunihiko Tanaka, Tom J. L. C. Bakx, Yoshinobu Fudamoto, Kazuyuki Fujita, Yuichi Harikane, Takuya Hashimoto, Bunyo Hatsukade, David H. Hughes, Takahiro Iino, Yuki Kimura, Hiroyuki Maezawa, Yuichi Matsuda, et al. (12 additional authors not shown)

    Abstract: Unveiling the emergence and prevalence of massive/bright galaxies during the epoch of reionization and beyond, within the first 600 million years of the Universe, stands as a pivotal pursuit in astronomy. Remarkable progress has been made by JWST in identifying an immense population of bright galaxies, which hints at exceptionally efficient galaxy assembly processes. However, the underlying physic…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 12 pages, 8 figures, and 3 tables. Proceedings paper presented at SPIE Astronomical Telescope and Instrumentation 2024

  12. arXiv:2406.02657  [pdf, other]

    cs.CL cs.AI cs.LG

    Block Transformer: Global-to-Local Language Modeling for Fast Inference

    Authors: Namgyu Ho, Sangmin Bae, Taehyeon Kim, Hyunjik Jo, Yireun Kim, Tal Schuster, Adam Fisch, James Thorne, Se-Young Yun

    Abstract: This paper presents the Block Transformer architecture which adopts hierarchical global-to-local modeling to autoregressive transformers to mitigate the inference bottlenecks of self-attention. To apply self-attention, the key-value (KV) cache of all previous sequences must be retrieved from memory at every decoding step. Thereby, this KV cache IO becomes a significant bottleneck in batch inferenc…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 30 pages, 21 figures, 5 tables

  13. arXiv:2406.02355  [pdf, other]

    cs.CV cs.AI cs.DC cs.LG

    FedDr+: Stabilizing Dot-regression with Global Feature Distillation for Federated Learning

    Authors: Seongyoon Kim, Minchan Jeong, Sungnyun Kim, Sungwoo Cho, Sumyeong Ahn, Se-Young Yun

    Abstract: Federated Learning (FL) has emerged as a pivotal framework for the development of effective global models (global FL) or personalized models (personalized FL) across clients with heterogeneous, non-iid data distribution. A key challenge in FL is client drift, where data heterogeneity impedes the aggregation of scattered knowledge. Recent studies have tackled the client drift issue by identifying s…

    Submitted 4 June, 2024; originally announced June 2024.

  14. arXiv:2406.02021  [pdf, other]

    cs.CV cs.AI cs.LG

    MetaMixer Is All You Need

    Authors: Seokju Yun, Dongheon Lee, Youngmin Ro

    Abstract: Transformer, composed of self-attention and Feed-Forward Network, has revolutionized the landscape of network design across various vision tasks. FFN is a versatile operator seamlessly integrated into nearly all AI models to effectively harness rich representations. Recent works also show that FFN functions like key-value memories. Thus, akin to the query-key-value mechanism within self-attention,…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Code: https://github.com/ysj9909/FFNet

  15. arXiv:2405.19806  [pdf, other]

    cs.LG

    Preference Alignment with Flow Matching

    Authors: Minu Kim, Yongsik Lee, Sehyeok Kang, Jihwan Oh, Song Chong, Seyoung Yun

    Abstract: We present Preference Flow Matching (PFM), a new framework for preference-based reinforcement learning (PbRL) that streamlines the integration of preferences into an arbitrary class of pre-trained models. Existing PbRL methods require fine-tuning pre-trained models, which presents challenges such as scalability, inefficiency, and the need for model modifications, especially with black-box APIs lik…

    Submitted 30 May, 2024; originally announced May 2024.

  16. arXiv:2405.18027  [pdf, other]

    cs.CL

    TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models

    Authors: Jaewoo Ahn, Taehyun Lee, Junyoung Lim, Jin-Hwa Kim, Sangdoo Yun, Hwaran Lee, Gunhee Kim

    Abstract: While Large Language Models (LLMs) can serve as agents to simulate human behaviors (i.e., role-playing agents), we emphasize the importance of point-in-time role-playing. This situates characters at specific moments in the narrative progression for three main reasons: (i) enhancing users' narrative immersion, (ii) avoiding spoilers, and (iii) fostering engagement in fandom role-playing. To accurat…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: ACL 2024 Findings. Code and dataset are released at https://ahnjaewoo.github.io/timechara

  17. arXiv:2405.17995  [pdf, other]

    cs.CV cs.AI cs.LG eess.IV

    DMT-JEPA: Discriminative Masked Targets for Joint-Embedding Predictive Architecture

    Authors: Shentong Mo, Sukmin Yun

    Abstract: The joint-embedding predictive architecture (JEPA) recently has shown impressive results in extracting visual representations from unlabeled imagery under a masking strategy. However, we reveal its disadvantages, notably its insufficient understanding of local semantics. This deficiency originates from masked modeling in the embedding space, resulting in a reduction of discriminative power and can…

    Submitted 28 May, 2024; originally announced May 2024.

  18. arXiv:2405.16907  [pdf, other]

    cs.AI cs.LG

    GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning

    Authors: Jaewoo Lee, Sujin Yun, Taeyoung Yun, Jinkyoo Park

    Abstract: Offline Reinforcement Learning (Offline RL) presents challenges of learning effective decision-making policies from static datasets without any online interactions. Data augmentation techniques, such as noise injection and data synthesizing, aim to improve Q-function approximation by smoothing the learned state-action region. However, these methods often fall short of directly improving the qualit…

    Submitted 12 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted (Spotlight) to ICLR 2024 Workshop on Generative Models for Decision Making. Jaewoo Lee and Sujin Yun are equal contribution authors

  19. arXiv:2405.13396  [pdf, other]

    cs.LG stat.ML

    Why In-Context Learning Transformers are Tabular Data Classifiers

    Authors: Felix den Breejen, Sangmin Bae, Stephen Cha, Se-Young Yun

    Abstract: The recently introduced TabPFN pretrains an In-Context Learning (ICL) transformer on synthetic data to perform tabular data classification. As synthetic data does not share features or labels with real-world data, the underlying mechanism that contributes to the success of this method remains unclear. This study provides an explanation by demonstrating that ICL-transformers acquire the ability to…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 9 pages main body, 22 pages total. Preprint under review

  20. arXiv:2405.07986  [pdf, other]

    astro-ph.GA astro-ph.CO

    JWST's PEARLS: resolved study of the stellar and dust components in starburst galaxies at cosmic noon

    Authors: M. Polletta, B. L. Frye, N. Garuda, S. P. Willner, S. Berta, R. Kneissl, H. Dole, R. A. Jansen, M. D. Lehnert, S. H. Cohen, J. Summers, R. A. Windhorst, J. C. J. D'Silva, A. M. Koekemoer, D. Coe, C. J. Conselice, S. P. Driver, N. A. Grogin, M. A. Marshall, M. Nonino, R. Ortiz III, N. Pirzkal, A. Robotham, R. E. Ryan, Jr., C. N. A. Willmer, et al. (13 additional authors not shown)

    Abstract: Dusty star-forming galaxies (DSFGs) contribute significantly to the stellar buildup at cosmic noon. Major mergers and gas accretion are often invoked to explain DSFGs' prodigious star-formation rates (SFRs) and large stellar masses. We conducted a spatially-resolved morphological analysis of the rest-frame UV/NIR emission in three DSFGs at z~2.5. Initially discovered as CO emitters by NOEMA observ…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 24 pages, 21 figures + appendix. Submitted to A&A. Comments welcome!

  21. arXiv:2405.07857  [pdf, other]

    cs.CV cs.AI

    Synergistic Integration of Coordinate Network and Tensorial Feature for Improving Neural Radiance Fields from Sparse Inputs

    Authors: Mingyu Kim, Jun-Seong Kim, Se-Young Yun, Jin-Hwa Kim

    Abstract: The multi-plane representation has been highlighted for its fast training and inference across static and dynamic neural radiance fields. This approach constructs relevant features via projection onto learnable grids and interpolating adjacent vertices. However, it has limitations in capturing low-frequency details and tends to overuse parameters for low-frequency features due to its bias toward f…

    Submitted 5 June, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

    Comments: ICML 2024; Project page is accessible at https://mingyukim87.github.io/SynergyNeRF ; Code is available at https://github.com/MingyuKim87/SynergyNeRF

  22. arXiv:2405.04819  [pdf, other]

    cs.CL cs.AI

    DALK: Dynamic Co-Augmentation of LLMs and KG to answer Alzheimer's Disease Questions with Scientific Literature

    Authors: Dawei Li, Shu Yang, Zhen Tan, Jae Young Baik, Sukwon Yun, Joseph Lee, Aaron Chacko, Bojian Hou, Duy Duong-Tran, Ying Ding, Huan Liu, Li Shen, Tianlong Chen

    Abstract: Recent advancements in large language models (LLMs) have achieved promising performances across various applications. Nonetheless, the ongoing challenge of integrating long-tail knowledge continues to impede the seamless adoption of LLMs in specialized domains. In this work, we introduce DALK, a.k.a. Dynamic Co-Augmentation of LLMs and KG, to address this limitation and demonstrate its ability on…

    Submitted 12 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: Under Review; Incorrect author name revised

  23. arXiv:2405.04497  [pdf, other]

    cs.HC

    Unveiling Disparities in Web Task Handling Between Human and Web Agent

    Authors: Kihoon Son, Jinhyeon Kwon, DaEun Choi, Tae Soo Kim, Young-Ho Kim, Sangdoo Yun, Juho Kim

    Abstract: With the advancement of Large-Language Models (LLMs) and Large Vision-Language Models (LVMs), agents have shown significant capabilities in various tasks, such as data analysis, gaming, or code generation. Recently, there has been a surge in research on web agents, capable of performing tasks within the web environment. However, the web poses unforeseeable scenarios, challenging the generalizabili…

    Submitted 8 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  24. arXiv:2405.01686  [pdf, other]

    cs.CL cs.AI

    Automatically Extracting Numerical Results from Randomized Controlled Trials with Large Language Models

    Authors: Hye Sun Yun, David Pogrebitskiy, Iain J. Marshall, Byron C. Wallace

    Abstract: Meta-analyses statistically aggregate the findings of different randomized controlled trials (RCTs) to assess treatment effectiveness. Because this yields robust estimates of treatment effectiveness, results from meta-analyses are considered the strongest form of evidence. However, rigorous evidence syntheses are time-consuming and labor-intensive, requiring manual extraction of data from individu…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 24 pages, 7 figures, 6 tables

  25. arXiv:2405.01588  [pdf, other]

    cs.CL cs.AI

    Towards Unbiased Evaluation of Detecting Unanswerable Questions in EHRSQL

    Authors: Yongjin Yang, Sihyeon Kim, SangMook Kim, Gyubok Lee, Se-Young Yun, Edward Choi

    Abstract: Incorporating unanswerable questions into EHR QA systems is crucial for testing the trustworthiness of a system, as providing non-existent responses can mislead doctors in their diagnoses. The EHRSQL dataset stands out as a promising benchmark because it is the only dataset that incorporates unanswerable questions in the EHR QA system alongside practical questions. However, in this work, we identi…

    Submitted 28 April, 2024; originally announced May 2024.

    Comments: DPFM Workshop, ICLR 2024

  26. arXiv:2404.17507  [pdf, other]

    cs.CV

    HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts

    Authors: Wonjae Kim, Sanghyuk Chun, Taekyung Kim, Dongyoon Han, Sangdoo Yun

    Abstract: In an era where the volume of data drives the effectiveness of self-supervised learning, the specificity and clarity of data semantics play a crucial role in model training. Addressing this, we introduce HYPerbolic Entailment filtering (HYPE), a novel methodology designed to meticulously extract modality-wise meaningful and well-aligned data from extensive, noisy image-text pair datasets. Our appr…

    Submitted 16 July, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

    Comments: ECCV 2024; 33 pages, 4.5MB

  27. arXiv:2404.14202  [pdf, other]

    cs.LG stat.ML

    An Adaptive Approach for Infinitely Many-armed Bandits under Generalized Rotting Constraints

    Authors: Jung-hun Kim, Milan Vojnovic, Se-Young Yun

    Abstract: In this study, we consider the infinitely many-armed bandit problems in a rested rotting setting, where the mean reward of an arm may decrease with each pull, while otherwise, it remains unchanged. We explore two scenarios regarding the rotting of rewards: one in which the cumulative amount of rotting is bounded by $V_T$, referred to as the slow-rotting case, and the other in which the cumulative…

    Submitted 24 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  28. arXiv:2404.13949  [pdf, other]

    cs.CV cs.RO

    PeLiCal: Targetless Extrinsic Calibration via Penetrating Lines for RGB-D Cameras with Limited Co-visibility

    Authors: Jaeho Shin, Seungsang Yun, Ayoung Kim

    Abstract: RGB-D cameras are crucial in robotic perception, given their ability to produce images augmented with depth data. However, their limited FOV often requires multiple cameras to cover a broader area. In multi-camera RGB-D setups, the goal is typically to reduce camera overlap, optimizing spatial coverage with as few cameras as possible. The extrinsic calibration of these systems introduces additiona…

    Submitted 23 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  29. arXiv:2404.11848  [pdf, other]

    cs.CV

    Partial Large Kernel CNNs for Efficient Super-Resolution

    Authors: Dongheon Lee, Seokju Yun, Youngmin Ro

    Abstract: Recently, in the super-resolution (SR) domain, transformers have outperformed CNNs with fewer FLOPs and fewer parameters since they can deal with long-range dependency and adaptively adjust weights based on instance. In this paper, we demonstrate that CNNs, although less focused on in the current SR domain, surpass Transformers in direct efficiency measures. By incorporating the advantages of Tran…

    Submitted 17 April, 2024; originally announced April 2024.

  30. arXiv:2404.11025  [pdf, other]

    cs.CV

    NeuroHash: A Hyperdimensional Neuro-Symbolic Framework for Spatially-Aware Image Hashing and Retrieval

    Authors: Sanggeon Yun, Ryozo Masukawa, SungHeon Jeong, Mohsen Imani

    Abstract: Customizable image retrieval from large datasets remains a critical challenge, particularly when preserving spatial relationships within images. Traditional hashing methods, primarily based on deep learning, often fail to capture spatial information adequately and lack transparency. In this paper, we introduce NeuroHash, a novel neuro-symbolic framework leveraging Hyperdimensional Computing (HDC)…

    Submitted 22 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

  31. PEARLS: Discovery of Point-Source Features Within Galaxies in the North Ecliptic Pole Time Domain Field

    Authors: Rafael Ortiz III, Rogier A. Windhorst, Seth H. Cohen, S. P. Willner, Rolf A. Jansen, Timothy Carleton, Patrick S. Kamieneski, Michael J. Rutkowski, Brent Smith, Jake Summers, Tyler J. McCabe, Rosalia O'Brien, Jose M. Diego, Min S. Yun, Jordan C. J. D'Silva, Juno Li, Hansung B. Gim, Nimish P. Hathi, Benne W. Holwerda, Adi Zitrin, Cheng Cheng, Noah J. McLeod, Christopher J. Conselice, Simon P. Driver, Haojing Yan, et al. (9 additional authors not shown)

    Abstract: The first public 0.9-4.4 $μ$m NIRCam images of the North Ecliptic Pole (NEP) Time Domain Field (TDF) uncovered many galaxies that display point-source features in their cores as seen in the longer wavelength filters. We visually identified a sample of 66 galaxies ($\sim$1 galaxy per arcmin$^2…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: 11 pages, 6 figures, 1 table

  32. arXiv:2404.10308  [pdf, other]

    cs.LG cs.AI

    Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs

    Authors: Woomin Song, Seunghyuk Oh, Sangwoo Mo, Jaehyung Kim, Sukmin Yun, Jung-Woo Ha, Jinwoo Shin

    Abstract: Large language models (LLMs) have shown remarkable performance in various natural language processing tasks. However, a primary constraint they face is the context limit, i.e., the maximum number of tokens they can process. Previous works have explored architectural changes and modifications in positional encoding to relax the constraint, but they often require expensive training or do not address…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: Accepted to ICLR 2024. The first two authors contributed equally

  33. arXiv:2404.09207  [pdf, other]

    cs.LG

    DEGNN: Dual Experts Graph Neural Network Handling Both Edge and Node Feature Noise

    Authors: Tai Hasegawa, Sukwon Yun, Xin Liu, Yin Jun Phua, Tsuyoshi Murata

    Abstract: Graph Neural Networks (GNNs) have achieved notable success in various applications over graph data. However, recent research has revealed that real-world graphs often contain noise, and GNNs are susceptible to noise in the graph. To address this issue, several Graph Structure Learning (GSL) models have been introduced. While GSL models are tailored to enhance robustness against edge noise through…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: PAKDD 2024, the code is available at https://github.com/TaiHasegawa/DEGNN

  34. arXiv:2404.08058  [pdf, other]

    astro-ph.GA

    Birds of a Feather: Resolving Stellar Mass Assembly With JWST/NIRCam in a Pair of Kindred $z \sim 2$ Dusty Star-forming Galaxies Lensed by the PLCK G165.7+67.0 Cluster

    Authors: Patrick S. Kamieneski, Brenda L. Frye, Rogier A. Windhorst, Kevin C. Harrington, Min S. Yun, Allison Noble, Massimo Pascale, Nicholas Foo, Seth H. Cohen, Rolf A. Jansen, Timothy Carleton, Anton M. Koekemoer, Christopher N. A. Willmer, Jake S. Summers, Nikhil Garuda, Reagen Leimbach, Benne W. Holwerda, Justin D. R. Pierel, Eric F. Jimenez-Andrade, S. P. Willner, Belen Alcalde Pampliega, Amit Vishwas, William C. Keel, Q. Daniel Wang, Cheng Cheng, et al. (16 additional authors not shown)

    Abstract: We present a new parametric lens model for the G165.7+67.0 galaxy cluster, which was discovered with $Planck$ through its bright submillimeter flux, originating from a pair of extraordinary dusty star-forming galaxies (DSFGs) at $z\approx 2.2$. Using JWST and interferometric mm/radio observations, we characterize the intrinsic physical properties of the DSFGs, which are separated by only…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: 47 pages, 21 figures, 5 tables. Submitted to ApJ, comments welcome!

  35. arXiv:2403.19997  [pdf, other]

    cond-mat.soft physics.app-ph

    Size-dependent fracture in elastomers: experiments and continuum modeling

    Authors: Jaehee Lee, Jeongun Lee, Seounghee Yun, Sanha Kim, Shawn A. Chester, Hansohl Cho

    Abstract: Elastomeric materials display a complicated set of stretchability and fracture properties that strongly depend on the flaw size, which has long been of interest to engineers and materials scientists. Here, we combine experiments and numerical simulations for a comprehensive understanding of the nonlocal, size-dependent features of fracture in elastomers. We show the size-dependent fracture behavio…

    Submitted 29 March, 2024; originally announced March 2024.

  36. arXiv:2403.19522  [pdf, other]

    cs.LG cs.CV

    Model Stock: All we need is just a few fine-tuned models

    Authors: Dong-Hwan Jang, Sangdoo Yun, Dongyoon Han

    Abstract: This paper introduces an efficient fine-tuning method for large pre-trained models, offering strong in-distribution (ID) and out-of-distribution (OOD) performance. Breaking away from traditional practices that need a multitude of fine-tuned models for averaging, our approach employs significantly fewer models to achieve final weights yet yields superior accuracy. Drawing from key insights in the we…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: Code at https://github.com/naver-ai/model-stock

  37. arXiv:2403.18260  [pdf, other]

    cs.CV cs.CL

    Toward Interactive Regional Understanding in Vision-Large Language Models

    Authors: Jungbeom Lee, Sanghyuk Chun, Sangdoo Yun

    Abstract: Recent Vision-Language Pre-training (VLP) models have demonstrated significant advancements. Nevertheless, these models heavily rely on image-text pairs that capture only coarse and global information of an image, leading to a limitation in their regional understanding ability. In this work, we introduce RegionVLM, equipped with explicit regional modeling capabilities, allowing them to un…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: NAACL 2024 Main Conference

  38. arXiv:2403.14027  [pdf, other]

    cs.CV

    EcoSense: Energy-Efficient Intelligent Sensing for In-Shore Ship Detection through Edge-Cloud Collaboration

    Authors: Wenjun Huang, Hanning Chen, Yang Ni, Arghavan Rezvani, Sanggeon Yun, Sungheon Jeon, Eric Pedley, Mohsen Imani

    Abstract: Detecting marine objects inshore presents challenges owing to algorithmic intricacies and complexities in system deployment. We propose a difficulty-aware edge-cloud collaborative sensing system that splits the task into object localization and fine-grained classification. Objects are classified either at the edge or within the cloud, based on their estimated difficulty. The framework comprises a…

    Submitted 26 March, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  39. arXiv:2403.13298  [pdf, other]

    cs.CV cs.LG

    Rotary Position Embedding for Vision Transformer

    Authors: Byeongho Heo, Song Park, Dongyoon Han, Sangdoo Yun

    Abstract: Rotary Position Embedding (RoPE) performs remarkably on language models, especially for length extrapolation of Transformers. However, the impacts of RoPE on computer vision domains have been underexplored, even though RoPE appears capable of enhancing Vision Transformer (ViT) performance in a way similar to the language domain. This study provides a comprehensive analysis of RoPE when applied to…

    Submitted 16 July, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: Accepted to ECCV 2024

  40. arXiv:2403.08108  [pdf, other]

    cs.CV

    TaskCLIP: Extend Large Vision-Language Model for Task Oriented Object Detection

    Authors: Hanning Chen, Wenjun Huang, Yang Ni, Sanggeon Yun, Fei Wen, Hugo Latapie, Mohsen Imani

    Abstract: Task-oriented object detection aims to find objects suitable for accomplishing specific tasks. As a challenging task, it requires simultaneous visual data processing and reasoning under ambiguous semantics. Recent solutions are mainly all-in-one models. However, the object detection backbones are pre-trained without text supervision. Thus, to incorporate task requirements, their intricate models u…

    Submitted 12 March, 2024; originally announced March 2024.

  41. arXiv:2403.07145   

    physics.optics physics.app-ph

    Electrically Programmable Pixelated Graphene-Integrated Plasmonic Metasurfaces for Coherent Mid-Infrared Emission

    Authors: Xiu Liu, Yibai Zhong, Zexiao Wang, Tianyi Huang, Sen Lin, Jingyi Zou, Haozhe Wang, Zhien Wang, Zhuo Li, Xiao Luo, Rui Cheng, Jiayu Li, Hyeong Seok Yun, Han Wang, Jing Kong, Xu Zhang, Sheng Shen

    Abstract: Active metasurfaces have recently emerged as compact, lightweight, and efficient platforms for dynamic control of electromagnetic fields and optical responses. However, the complexities associated with their post-fabrication tunability significantly hinder their widespread applications, especially for the mid-infrared range due to material scarcity and design intricacy. Here, we experimentally dem…

    Submitted 6 May, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: Needs more updates for the experiments

  42. arXiv:2403.06342  [pdf, other]

    math.NA cs.LG

    Separable Physics-informed Neural Networks for Solving the BGK Model of the Boltzmann Equation

    Authors: Jaemin Oh, Seung Yeon Cho, Seok-Bae Yun, Eunbyung Park, Youngjoon Hong

    Abstract: In this study, we introduce a method based on Separable Physics-Informed Neural Networks (SPINNs) for effectively solving the BGK model of the Boltzmann equation. While the mesh-free nature of PINNs offers significant advantages in handling high-dimensional partial differential equations (PDEs), challenges arise when applying quadrature rules for accurate integral evaluation in the BGK operator, w…

    Submitted 10 March, 2024; originally announced March 2024.

    MSC Class: 68T20; 35R09

  43. arXiv:2403.05973  [pdf, other]

    cs.CL cs.AI cs.LG

    Calibrating Large Language Models Using Their Generations Only

    Authors: Dennis Ulmer, Martin Gubri, Hwaran Lee, Sangdoo Yun, Seong Joon Oh

    Abstract: As large language models (LLMs) are increasingly deployed in user-facing applications, building trust and maintaining safety by accurately quantifying a model's confidence in its prediction becomes even more important. However, finding effective ways to calibrate LLMs - especially when the only interface to the models is their generated text - remains a challenge. We propose APRICOT (auxiliary pre…

    Submitted 9 March, 2024; originally announced March 2024.

  44. arXiv:2403.05763  [pdf, other]

    cs.AR cs.AI cs.LG

    HDReason: Algorithm-Hardware Codesign for Hyperdimensional Knowledge Graph Reasoning

    Authors: Hanning Chen, Yang Ni, Ali Zakeri, Zhuowen Zou, Sanggeon Yun, Fei Wen, Behnam Khaleghi, Narayan Srinivasa, Hugo Latapie, Mohsen Imani

    Abstract: In recent times, a plethora of hardware accelerators have been put forth for graph learning applications such as vertex classification and graph classification. However, previous works have paid little attention to Knowledge Graph Completion (KGC), a task that is well-known for its significantly higher algorithm complexity. The state-of-the-art KGC solutions based on graph convolution neural netwo…

    Submitted 8 March, 2024; originally announced March 2024.

  45. arXiv:2402.19321  [pdf, ps, other]

    astro-ph.SR astro-ph.EP

    Connections between Planetary Populations and the Chemical Characteristics of their Host Stars

    Authors: Sol Yun, Young Sun Lee, Young Kwang Kim, Timothy C. Beers, Togay Berfin, Dongwook Lim

    Abstract: Chemical anomalies in planet-hosting stars (PHSs) are studied in order to assess how the planetary nature and multiplicity affect the atmospheric chemical abundances of their host stars. We employ APOGEE DR17 to select thin-disk stars of the Milky Way, and cross-match them with the Kepler Input Catalog to identify confirmed PHSs, which results in 227 PHSs with available chemical-abundance ratios f…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 14 pages, 6 figures

  46. arXiv:2402.12991  [pdf, other]

    cs.LG cs.AI cs.CL cs.CR

    TRAP: Targeted Random Adversarial Prompt Honeypot for Black-Box Identification

    Authors: Martin Gubri, Dennis Ulmer, Hwaran Lee, Sangdoo Yun, Seong Joon Oh

    Abstract: Large Language Model (LLM) services and models often come with legal rules on who can use them and how they must use them. Assessing the compliance of the released LLMs is crucial, as these rules protect the interests of the LLM contributor and prevent misuse. In this context, we describe the novel fingerprinting problem of Black-box Identity Verification (BBIV). The goal is to determine whether a…

    Submitted 6 June, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: Accepted at ACL 2024 (findings)

  47. Bayesian Multi-Task Transfer Learning for Soft Prompt Tuning

    Authors: Haeju Lee, Minchan Jeong, Se-Young Yun, Kee-Eung Kim

    Abstract: Prompt tuning, in which prompts are optimized to adapt large-scale pre-trained language models to downstream tasks instead of fine-tuning the full model parameters, has been shown to be particularly effective when the prompts are trained in a multi-task transfer learning setting. These methods generally involve individually training prompts for each source task and then aggregating them to provide…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: The first two authors equally contributed to this work. Findings of EMNLP 2023

  48. arXiv:2402.06974  [pdf, other]

    cs.LG

    Hypernetwork-Driven Model Fusion for Federated Domain Generalization

    Authors: Marc Bartholet, Taehyeon Kim, Ami Beuret, Se-Young Yun, Joachim M. Buhmann

    Abstract: Federated Learning (FL) faces significant challenges with domain shifts in heterogeneous data, degrading performance. Traditional domain generalization aims to learn domain-invariant features, but the federated nature of model averaging often limits this due to its linear aggregation of local learning. To address this, we propose a robust framework, coined as hypernetwork-based Federated Fusion (h…

    Submitted 28 May, 2024; v1 submitted 10 February, 2024; originally announced February 2024.

  49. arXiv:2402.05353  [pdf, other]

    cs.LG cs.DC

    Revisiting Early-Learning Regularization When Federated Learning Meets Noisy Labels

    Authors: Taehyeon Kim, Donggyu Kim, Se-Young Yun

    Abstract: In the evolving landscape of federated learning (FL), addressing label noise presents unique challenges due to the decentralized and diverse nature of data collection across clients. Traditional centralized learning approaches to mitigate label noise are constrained in FL by privacy concerns and the heterogeneity of client data. This paper revisits early-learning regularization, introducing an inn…

    Submitted 7 February, 2024; originally announced February 2024.

  50. arXiv:2402.03898  [pdf, other]

    cs.CL cs.AI cs.LG

    DistiLLM: Towards Streamlined Distillation for Large Language Models

    Authors: Jongwoo Ko, Sungnyun Kim, Tianyi Chen, Se-Young Yun

    Abstract: Knowledge distillation (KD) is widely used for compressing a teacher model to a smaller student model, reducing its inference cost and memory footprint while preserving model capabilities. However, current KD methods for auto-regressive sequence models (e.g., large language models) suffer from missing a standardized objective function. Moreover, the recent use of student-generated outputs to addre…

    Submitted 3 July, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: ICML 2024; Code is available at https://github.com/jongwooko/distillm