Showing 1–23 of 23 results for author: Go, H

  1. arXiv:2405.17825  [pdf, other]

    cs.CV cs.AI

    Diffusion Model Patching via Mixture-of-Prompts

    Authors: Seokil Ham, Sangmin Woo, Jin-Young Kim, Hyojun Go, Byeongjun Park, Changick Kim

    Abstract: We present Diffusion Model Patching (DMP), a simple method to boost the performance of pre-trained diffusion models that have already reached convergence, with a negligible increase in parameters. DMP inserts a small, learnable set of prompts into the model's input space while keeping the original model frozen. The effectiveness of DMP is not merely due to the addition of parameters but stems from…

    Submitted 30 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: Project page: https://sangminwoo.github.io/DMP/

  2. arXiv:2404.16111  [pdf, other]

    cs.LO

    Forcing, Transition Algebras, and Calculi

    Authors: Hashimoto Go, Daniel Găină, Ionuţ Ţuţu

    Abstract: We bring forward a logical system of transition algebras that enhances many-sorted first-order logic using features from dynamic logics. The sentences we consider include compositions, unions, and transitive closures of transition relations, which are treated similarly to the actions used in dynamic logics in order to define necessity and possibility operators. This leads to a higher degree of exp…

    Submitted 24 April, 2024; originally announced April 2024.

  3. arXiv:2404.14687  [pdf, other]

    cs.MM cs.AI cs.CL cs.CV

    Pegasus-v1 Technical Report

    Authors: Raehyuk Jung, Hyojun Go, Jaehyuk Yi, Jiho Jang, Daniel Kim, Jay Suh, Aiden Lee, Cooper Han, Jae Lee, Jeff Kim, Jin-Young Kim, Junwan Kim, Kyle Park, Lucas Lee, Mars Ha, Minjoon Seo, Abraham Jo, Ed Park, Hassan Kianinejad, SJ Kim, Tony Moon, Wade Jeong, Andrei Popescu, Esther Kim, EK Yoon, et al. (19 additional authors not shown)

    Abstract: This technical report introduces Pegasus-1, a multimodal language model specialized in video content understanding and interaction through natural language. Pegasus-1 is designed to address the unique challenges posed by video data, such as interpreting spatiotemporal information, to offer nuanced video content comprehension across various lengths. This technical report overviews Pegasus-1's archi…

    Submitted 22 April, 2024; originally announced April 2024.

  4. arXiv:2403.10348  [pdf, other]

    cs.CV cs.LG

    Denoising Task Difficulty-based Curriculum for Training Diffusion Models

    Authors: Jin-Young Kim, Hyojun Go, Soonwoo Kwon, Hyun-Gyoon Kim

    Abstract: Diffusion-based generative models have emerged as powerful tools in the realm of generative modeling. Despite extensive research on denoising across various timesteps and noise levels, a conflict persists regarding the relative difficulties of the denoising tasks. While various studies argue that lower timesteps present more challenging tasks, others contend that higher timesteps are more difficul…

    Submitted 15 July, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

  5. arXiv:2403.09176  [pdf, other]

    cs.CV

    Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts

    Authors: Byeongjun Park, Hyojun Go, Jin-Young Kim, Sangmin Woo, Seokil Ham, Changick Kim

    Abstract: Diffusion models have achieved remarkable success across a range of generative tasks. Recent efforts to enhance diffusion model architectures have reimagined them as a form of multi-task learning, where each task corresponds to a denoising task at a specific noise level. While these efforts have focused on parameter isolation and task routing, they fall short of capturing detailed inter-task relat…

    Submitted 10 July, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: Project Page: https://byeongjun-park.github.io/Switch-DiT/

  6. arXiv:2312.15980  [pdf, other]

    cs.CV cs.AI

    HarmonyView: Harmonizing Consistency and Diversity in One-Image-to-3D

    Authors: Sangmin Woo, Byeongjun Park, Hyojun Go, Jin-Young Kim, Changick Kim

    Abstract: Recent progress in single-image 3D generation highlights the importance of multi-view coherency, leveraging 3D priors from large-scale diffusion models pretrained on Internet-scale images. However, the aspect of novel-view diversity remains underexplored within the research landscape due to the ambiguity in converting a 2D image into 3D content, where numerous potential shapes can emerge. Here, we…

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: Project page: https://byeongjun-park.github.io/HarmonyView/

  7. arXiv:2310.07138  [pdf, other]

    cs.CV cs.AI

    Denoising Task Routing for Diffusion Models

    Authors: Byeongjun Park, Sangmin Woo, Hyojun Go, Jin-Young Kim, Changick Kim

    Abstract: Diffusion models generate highly realistic images by learning a multi-step denoising process, naturally embodying the principles of multi-task learning (MTL). Despite the inherent connection between diffusion models and MTL, there remains an unexplored area in designing neural architectures that explicitly incorporate MTL into the framework of diffusion models. In this paper, we present Denoising…

    Submitted 20 February, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: ICLR 2024

  8. arXiv:2306.04990  [pdf, other]

    cs.CV

    Multi-Architecture Multi-Expert Diffusion Models

    Authors: Yunsung Lee, Jin-Young Kim, Hyojun Go, Myeongho Jeong, Shinhyeok Oh, Seungtaek Choi

    Abstract: In this paper, we address the performance degradation of efficient diffusion models by introducing Multi-architecturE Multi-Expert diffusion models (MEME). We identify the need for tailored operations at different time-steps in diffusion processes and leverage this insight to create compact yet high-performing models. MEME assigns distinct architectures to different time-step intervals, balancing…

    Submitted 27 December, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: To be published in the AAAI 2024 Proceedings Main Track

  9. arXiv:2306.04175  [pdf, other]

    cs.CV

    ScoreCL: Augmentation-Adaptive Contrastive Learning via Score-Matching Function

    Authors: Jin-Young Kim, Soonwoo Kwon, Hyojun Go, Yunsung Lee, Seungtaek Choi, Hyun-Gyoon Kim

    Abstract: Self-supervised contrastive learning (CL) has achieved state-of-the-art performance in representation learning by minimizing the distance between positive pairs while maximizing that of negative ones. Recently, it has been verified that the model learns better representation with diversely augmented positive pairs because they enable the model to be more view-invariant. However, only a few studies…

    Submitted 15 July, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

  10. arXiv:2306.00354  [pdf, other]

    cs.CV cs.AI cs.LG

    Addressing Negative Transfer in Diffusion Models

    Authors: Hyojun Go, JinYoung Kim, Yunsung Lee, Seunghyun Lee, Shinhyeok Oh, Hyeongdon Moon, Seungtaek Choi

    Abstract: Diffusion-based generative models have achieved remarkable success in various domains. It trains a shared model on denoising tasks that encompass different noise levels simultaneously, representing a form of multi-task learning (MTL). However, analyzing and improving diffusion models from an MTL perspective remains under-explored. In particular, MTL can sometimes lead to the well-known phenomenon…

    Submitted 30 December, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023. Project page: https://gohyojun15.github.io/ANT_diffusion/

  11. arXiv:2305.18977  [pdf, other]

    cs.CL

    Cross Encoding as Augmentation: Towards Effective Educational Text Classification

    Authors: Hyun Seung Lee, Seungtaek Choi, Yunsung Lee, Hyeongdon Moon, Shinhyeok Oh, Myeongho Jeong, Hyojun Go, Christian Wallraven

    Abstract: Text classification in education, usually called auto-tagging, is the automated process of assigning relevant tags to educational content, such as questions and textbooks. However, auto-tagging suffers from a data scarcity problem, which stems from two major challenges: 1) it possesses a large tag space and 2) it is multi-label. Though a retrieval approach is reportedly good at low-resource scenar…

    Submitted 30 May, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted to Findings of ACL 2023

  12. arXiv:2305.16626  [pdf, other]

    cs.CL cs.AI

    Evaluation of Question Generation Needs More References

    Authors: Shinhyeok Oh, Hyojun Go, Hyeongdon Moon, Yunsung Lee, Myeongho Jeong, Hyun Seung Lee, Seungtaek Choi

    Abstract: Question generation (QG) is the task of generating a valid and fluent question based on a given context and the target answer. According to various purposes, even given the same context, instructors can ask questions about different concepts, and even the same concept can be written in different ways. However, the evaluation for QG usually depends on single reference-based similarity metrics, such…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted to Findings of ACL 2023

    ACM Class: I.2.7

  13. arXiv:2304.08204  [pdf, other]

    cs.CV

    Learning Geometry-aware Representations by Sketching

    Authors: Hyundo Lee, Inwoo Hwang, Hyunsung Go, Won-Seok Choi, Kibeom Kim, Byoung-Tak Zhang

    Abstract: Understanding geometric concepts, such as distance and shape, is essential for understanding the real world and also for many vision tasks. To incorporate such information into a visual representation of a scene, we propose learning to represent the scene by sketching, inspired by human behavior. Our method, coined Learning by Sketching (LBS), learns to convert an image into a set of colored strok…

    Submitted 17 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  14. arXiv:2212.05973  [pdf, other]

    cs.CV

    Towards Practical Plug-and-Play Diffusion Models

    Authors: Hyojun Go, Yunsung Lee, Jin-Young Kim, Seunghyun Lee, Myeongho Jeong, Hyun Seung Lee, Seungtaek Choi

    Abstract: Diffusion-based generative models have achieved remarkable success in image generation. Their guidance formulation allows an external model to plug-and-play control the generation process for various tasks without finetuning the diffusion model. However, the direct use of publicly available off-the-shelf models for guidance fails due to their poor performance on noisy inputs. For that, the existin…

    Submitted 27 March, 2023; v1 submitted 12 December, 2022; originally announced December 2022.

    Comments: CVPR 2023 camera-ready

  15. arXiv:2210.01370  [pdf, other]

    cs.CV

    Towards Flexible Inductive Bias via Progressive Reparameterization Scheduling

    Authors: Yunsung Lee, Gyuseong Lee, Kwangrok Ryoo, Hyojun Go, Jihye Park, Seungryong Kim

    Abstract: There are two de facto standard architectures in recent computer vision: Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). Strong inductive biases of convolutions help the model learn sample effectively, but such strong biases also limit the upper bound of CNNs when sufficient data are available. On the contrary, ViT is inferior to CNNs for small data but superior for sufficient…

    Submitted 4 October, 2022; originally announced October 2022.

    Comments: Accepted at VIPriors ECCVW 2022, camera-ready version

  16. arXiv:2209.07105  [pdf, other]

    cs.CV cs.AI

    Bridging Implicit and Explicit Geometric Transformation for Single-Image View Synthesis

    Authors: Byeongjun Park, Hyojun Go, Changick Kim

    Abstract: Creating novel views from a single image has achieved tremendous strides with advanced autoregressive models, as unseen regions have to be inferred from the visible scene contents. Although recent methods generate high-quality novel views, synthesizing with only one explicit or implicit 3D geometry has a trade-off between two objectives that we call the "seesaw" problem: 1) preserving reprojected…

    Submitted 15 March, 2024; v1 submitted 15 September, 2022; originally announced September 2022.

    Comments: TPAMI 2024

  17. arXiv:2206.10878  [pdf, other]

    cs.CV

    Feature Re-calibration based Multiple Instance Learning for Whole Slide Image Classification

    Authors: Philip Chikontwe, Soo Jeong Nam, Heounjeong Go, Meejeong Kim, Hyun Jung Sung, Sang Hyun Park

    Abstract: Whole slide image (WSI) classification is a fundamental task for the diagnosis and treatment of diseases; but, curation of accurate labels is time-consuming and limits the application of fully-supervised methods. To address this, multiple instance learning (MIL) is a popular method that poses classification as a weakly supervised learning task with slide-level labels only. While current MIL method…

    Submitted 21 July, 2022; v1 submitted 22 June, 2022; originally announced June 2022.

    Comments: MICCAI 2022

  18. arXiv:2205.07039  [pdf, other]

    cs.LG

    Fake News Quick Detection on Dynamic Heterogeneous Information Networks

    Authors: Jin Ho Go, Alina Sari, Jiaojiao Jiang, Shuiqiao Yang, Sanjay Jha

    Abstract: The spread of fake news has caused great harm to society in recent years. So the quick detection of fake news has become an important task. Some current detection methods often model news articles and other related components as a static heterogeneous information network (HIN) and use expensive message-passing algorithms. However, in the real-world, quickly identifying fake news is of great signif…

    Submitted 14 May, 2022; originally announced May 2022.

  19. arXiv:2111.04371  [pdf, other]

    cs.CV cs.CR cs.LG

    Geometrically Adaptive Dictionary Attack on Face Recognition

    Authors: Junyoung Byun, Hyojun Go, Changick Kim

    Abstract: CNN-based face recognition models have brought remarkable performance improvement, but they are vulnerable to adversarial perturbations. Recent studies have shown that adversaries can fool the models even if they can only access the models' hard-label output. However, since many queries are needed to find imperceptible adversarial noise, reducing the number of queries is crucial for these attacks.…

    Submitted 8 November, 2021; originally announced November 2021.

    Comments: Accepted at WACV 2022

  20. arXiv:2111.04310  [pdf, other]

    cs.CV

    Residual-Guided Learning Representation for Self-Supervised Monocular Depth Estimation

    Authors: Byeongjun Park, Taekyung Kim, Hyojun Go, Changick Kim

    Abstract: Photometric consistency loss is one of the representative objective functions commonly used for self-supervised monocular depth estimation. However, this loss often causes unstable depth predictions in textureless or occluded regions due to incorrect guidance. Recent self-supervised learning approaches tackle this issue by utilizing feature representations explicitly learned from auto-encoders, ex…

    Submitted 8 November, 2021; originally announced November 2021.

    Comments: 5 pages, 2 figures

  21. arXiv:2109.02259  [pdf, other]

    cs.CV

    CTRL-C: Camera calibration TRansformer with Line-Classification

    Authors: Jinwoo Lee, Hyunsung Go, Hyunjoon Lee, Sunghyun Cho, Minhyuk Sung, Junho Kim

    Abstract: Single image camera calibration is the task of estimating the camera parameters from a single input image, such as the vanishing points, focal length, and horizon line. In this work, we propose Camera calibration TRansformer with Line-Classification (CTRL-C), an end-to-end neural network-based approach to single image camera calibration, which directly estimates the camera parameters from an image…

    Submitted 6 September, 2021; originally announced September 2021.

    Comments: Accepted to ICCV 2021

  22. arXiv:2101.04829  [pdf, other]

    cs.CR cs.CV cs.LG

    On the Effectiveness of Small Input Noise for Defending Against Query-based Black-Box Attacks

    Authors: Junyoung Byun, Hyojun Go, Changick Kim

    Abstract: While deep neural networks show unprecedented performance in various tasks, the vulnerability to adversarial examples hinders their deployment in safety-critical systems. Many studies have shown that attacks are also possible even in a black-box setting where an adversary cannot access the target model's internal information. Most black-box attacks are based on queries, each of which obtains the t…

    Submitted 8 November, 2021; v1 submitted 12 January, 2021; originally announced January 2021.

    Comments: Accepted at WACV 2022

  23. arXiv:2012.06441  [pdf, other]

    cs.NE hep-th

    Deep learning architecture for decrypting information on the event horizon

    Authors: Hyunju Go

    Abstract: According to 't Hooft, to recover the invariance under the Poincaré group in a holographic setting, the evolution law for the direction orthogonal to the given surface and the time evolution law must commute. The condition of commutativity assumes that the time-evolution law on the given surface and the target surface is the same. Meanwhile, the AdS/CFT correspondence implies that there exists a m…

    Submitted 14 July, 2024; v1 submitted 10 December, 2020; originally announced December 2020.

    Comments: 10 pages, 3 figures