Showing 1–26 of 26 results for author: Jo, D

  1. arXiv:2406.12311  [pdf, other]

    cs.LG

    Mixture of Scales: Memory-Efficient Token-Adaptive Binarization for Large Language Models

    Authors: Dongwon Jo, Taesu Kim, Yulhwa Kim, Jae-Joon Kim

    Abstract: Binarization, which converts weight parameters to binary values, has emerged as an effective strategy to reduce the size of large language models (LLMs). However, typical binarization techniques significantly diminish linguistic effectiveness of LLMs. To address this issue, we introduce a novel binarization technique called Mixture of Scales (BinaryMoS). Unlike conventional methods, BinaryMoS empl…

    Submitted 18 June, 2024; originally announced June 2024.

  2. arXiv:2406.11674  [pdf, other]

    cs.CL

    Endor: Hardware-Friendly Sparse Format for Offloaded LLM Inference

    Authors: Donghyeon Joo, Ramyad Hadidi, Soheil Feizi, Bahar Asgari

    Abstract: The increasing size of large language models (LLMs) challenges their usage on resource-constrained platforms. For example, memory on modern GPUs is insufficient to hold LLMs that are hundreds of Gigabytes in size. Offloading is a popular method to escape this constraint by storing weights of an LLM model to host CPU memory and SSD, then loading each weight to GPU before every use. In our case stud…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 14 pages, 16 figures

  3. arXiv:2405.16301  [pdf, other]

    cs.CV cs.LG

    Active Learning for Finely-Categorized Image-Text Retrieval by Selecting Hard Negative Unpaired Samples

    Authors: Dae Ung Jo, Kyuewang Lee, JaeHo Chung, Jin Young Choi

    Abstract: Securing a sufficient amount of paired data is important to train an image-text retrieval (ITR) model, but collecting paired data is very expensive. To address this issue, in this paper, we propose an active learning algorithm for ITR that can collect paired data cost-efficiently. Previous studies assume that image-text pairs are given and their category labels are asked to the annotator. However,…

    Submitted 25 May, 2024; originally announced May 2024.

  4. arXiv:2310.06404  [pdf, other]

    cs.CL cs.AI cs.LG

    Hexa: Self-Improving for Knowledge-Grounded Dialogue System

    Authors: Daejin Jo, Daniel Wontae Nam, Gunsoo Han, Kyoung-Woon On, Taehwan Kwon, Seungeun Rho, Sungwoong Kim

    Abstract: A common practice in knowledge-grounded dialogue generation is to explicitly utilize intermediate steps (e.g., web-search, memory retrieval) with modular approaches. However, data for such steps are often inaccessible compared to those of dialogue responses as they are unobservable in an ordinary dialogue. To fill in the absence of these data, we develop a self-improving method to improve the gene…

    Submitted 2 April, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  5. arXiv:2307.01193  [pdf, other]

    cs.LG cs.AI

    Squeezing Large-Scale Diffusion Models for Mobile

    Authors: Jiwoong Choi, Minkyu Kim, Daehyun Ahn, Taesu Kim, Yulhwa Kim, Dongwon Jo, Hyesung Jeon, Jae-Joon Kim, Hyungjun Kim

    Abstract: The emergence of diffusion models has greatly broadened the scope of high-fidelity image synthesis, resulting in notable advancements in both practical implementation and academic research. With the active adoption of the model in various real-world applications, the need for on-device deployment has grown considerably. However, deploying large diffusion models such as Stable Diffusion with more t…

    Submitted 3 July, 2023; originally announced July 2023.

    Comments: 7 pages, 8 figures, ICML 2023 Workshop on Challenges in Deployable Generative AI

  6. arXiv:2305.13973  [pdf, other]

    cs.CL

    Effortless Integration of Memory Management into Open-Domain Conversation Systems

    Authors: Eunbi Choi, Kyoung-Woon On, Gunsoo Han, Sungwoong Kim, Daniel Wontae Nam, Daejin Jo, Seung Eun Rho, Taehwan Kwon, Minjoon Seo

    Abstract: Open-domain conversation systems integrate multiple conversation skills into a single system through a modular approach. One of the limitations of the system, however, is the absence of management capability for external memory. In this paper, we propose a simple method to improve BlenderBot3 by integrating memory management ability into it. Since no training data exists for this purpose, we propo…

    Submitted 23 May, 2023; originally announced May 2023.

  7. arXiv:2303.12208  [pdf, other]

    cs.CV cs.CL cs.LG

    MAGVLT: Masked Generative Vision-and-Language Transformer

    Authors: Sungwoong Kim, Daejin Jo, Donghoon Lee, Jongmin Kim

    Abstract: While generative modeling on multimodal image-text data has been actively developed with large-scale paired datasets, there have been limited attempts to generate both image and text data by a single model rather than a generation of one fixed modality conditioned on the other modality. In this paper, we explore a unified generative vision-and-language (VL) model that can produce both images and t…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: CVPR 2023

  8. arXiv:2210.05409  [pdf, other]

    cs.LG cs.AI

    LECO: Learnable Episodic Count for Task-Specific Intrinsic Reward

    Authors: Daejin Jo, Sungwoong Kim, Daniel Wontae Nam, Taehwan Kwon, Seungeun Rho, Jongmin Kim, Donghoon Lee

    Abstract: Episodic count has been widely used to design a simple yet effective intrinsic motivation for reinforcement learning with a sparse reward. However, the use of episodic count in a high-dimensional state space as well as over a long episode time requires a thorough state compression and fast hashing, which hinders rigorous exploitation of it in such hard and complex exploration environments. Moreove…

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: Accepted to NeurIPS 2022

  9. arXiv:2209.08206  [pdf, other]

    cs.CL cs.LG

    Selective Token Generation for Few-shot Natural Language Generation

    Authors: Daejin Jo, Taehwan Kwon, Eun-Sol Kim, Sungwoong Kim

    Abstract: Natural language modeling with limited training data is a challenging problem, and many algorithms make use of large-scale pretrained language models (PLMs) for this due to its great generalization ability. Among them, additive learning that incorporates a task-specific adapter on top of the fixed large-scale PLM has been popularly used in the few-shot setting. However, this added adapter is still…

    Submitted 16 September, 2022; originally announced September 2022.

    Comments: COLING 2022

  10. arXiv:2208.13427  [pdf, other]

    cs.LG math.AT

    The PWLR Graph Representation: A Persistent Weisfeiler-Lehman scheme with Random Walks for Graph Classification

    Authors: Sun Woo Park, Yun Young Choi, Dosang Joe, U Jin Choi, Youngho Woo

    Abstract: This paper presents the Persistent Weisfeiler-Lehman Random walk scheme (abbreviated as PWLR) for graph representations, a novel mathematical framework which produces a collection of explainable low-dimensional representations of graphs with discrete and continuous node features. The proposed scheme effectively incorporates normalized Weisfeiler-Lehman procedure, random walks on graphs, and persis…

    Submitted 29 August, 2022; originally announced August 2022.

    Comments: Accepted to the ICML 2022 Workshop on Topology, Algebra, and Geometry in Machine Learning

  11. arXiv:2203.11889  [pdf, other]

    cs.LG cs.AI cs.NE cs.SC stat.ML

    Insights From the NeurIPS 2021 NetHack Challenge

    Authors: Eric Hambro, Sharada Mohanty, Dmitrii Babaev, Minwoo Byeon, Dipam Chakraborty, Edward Grefenstette, Minqi Jiang, Daejin Jo, Anssi Kanervisto, Jongmin Kim, Sungwoong Kim, Robert Kirk, Vitaly Kurin, Heinrich Küttler, Taehwon Kwon, Donghoon Lee, Vegard Mella, Nantas Nardelli, Ivan Nazarov, Nikita Ovsov, Jack Parker-Holder, Roberta Raileanu, Karolis Ramanauskas, Tim Rocktäschel, Danielle Rothermel, et al. (4 additional authors not shown)

    Abstract: In this report, we summarize the takeaways from the first NeurIPS 2021 NetHack Challenge. Participants were tasked with developing a program or agent that can win (i.e., 'ascend' in) the popular dungeon-crawler game of NetHack by interacting with the NetHack Learning Environment (NLE), a scalable, procedurally generated, and challenging Gym environment for reinforcement learning (RL). The challeng…

    Submitted 22 March, 2022; originally announced March 2022.

    Comments: Under review at PMLR for the NeurIPS 2021 Competition Workshop Track, 10 pages + 10 in appendices

  12. arXiv:2201.07436  [pdf, other]

    cs.CV

    Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth

    Authors: Doyeon Kim, Woonghyun Ka, Pyungwhan Ahn, Donggyu Joo, Sehwan Chun, Junmo Kim

    Abstract: Depth estimation from a single image is an important task that can be applied to various fields in computer vision, and has grown rapidly with the development of convolutional neural networks. In this paper, we propose a novel structure and training strategy for monocular depth estimation to further improve the prediction accuracy of the network. We deploy a hierarchical transformer encoder to cap…

    Submitted 29 October, 2022; v1 submitted 19 January, 2022; originally announced January 2022.

    Comments: 11 pages, 5 figures

  13. arXiv:2106.11825

    cs.NI

    Beyond 5G URLLC Evolution: New Service Modes and Practical Considerations

    Authors: Hirley Alves, Gweon Do Jo, JaeSheung Shin, Choongil Yeh, Nurul Huda Mahmood, Carlos Lima, Chanho Yoon, Nandana Rahatheva, Ok-Sun Park, Seokki Kim, Eunah Kim, Ville Niemelä, Hyeon Woo Lee, Ari Pouttu, Hyun Kyu Chung, Matti Latva-aho

    Abstract: Ultra-reliable low latency communications (URLLC) arose to serve industrial IoT (IIoT) use cases within the 5G. Currently, it has inherent limitations to support future services. Based on state-of-the-art research and practical deployment experience, in this article, we introduce and advocate for three variants: broadband, scalable and extreme URLLC. We discuss use cases and key performance indica…

    Submitted 16 June, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: The manuscript is undergoing extensive review

  14. arXiv:2106.07217  [pdf, other]

    cs.CV cs.AI

    Influential Rank: A New Perspective of Post-training for Robust Model against Noisy Labels

    Authors: Seulki Park, Hwanjun Song, Daeho Um, Dae Ung Jo, Sangdoo Yun, Jin Young Choi

    Abstract: Deep neural network can easily overfit to even noisy labels due to its high capacity, which degrades the generalization performance of a model. To overcome this issue, we propose a new approach for learning from noisy labels (LNL) via post-training, which can significantly improve the generalization performance of any pre-trained model on noisy label data. To this end, we rather exploit the overfi…

    Submitted 19 April, 2023; v1 submitted 14 June, 2021; originally announced June 2021.

    Comments: 15 pages

  15. arXiv:2009.02018  [pdf, other]

    cs.CV

    TiVGAN: Text to Image to Video Generation with Step-by-Step Evolutionary Generator

    Authors: Doyeon Kim, Donggyu Joo, Junmo Kim

    Abstract: Advances in technology have led to the development of methods that can create desired visual multimedia. In particular, image generation using deep learning has been extensively studied across diverse fields. In comparison, video generation, especially on conditional inputs, remains a challenging and less explored area. To narrow this gap, we aim to train our model to produce a video corresponding…

    Submitted 27 June, 2021; v1 submitted 4 September, 2020; originally announced September 2020.

    Comments: IEEE Access

  16. arXiv:2006.10222  [pdf, other]

    cs.LG stat.ML

    Class-Attentive Diffusion Network for Semi-Supervised Classification

    Authors: Jongin Lim, Daeho Um, Hyung Jin Chang, Dae Ung Jo, Jin Young Choi

    Abstract: Recently, graph neural networks for semi-supervised classification have been widely studied. However, existing methods only use the information of limited neighbors and do not deal with the inter-class connections in graphs. In this paper, we propose Adaptive aggregation with Class-Attentive Diffusion (AdaCAD), a new aggregation scheme that adaptively aggregates nodes probably of the same class am…

    Submitted 29 December, 2020; v1 submitted 17 June, 2020; originally announced June 2020.

    Comments: Accepted to AAAI 2021

  17. arXiv:2006.00703  [pdf, other]

    eess.AS cs.CL cs.SD

    Streaming Language Identification using Combination of Acoustic Representations and ASR Hypotheses

    Authors: Chander Chandak, Zeynab Raeesy, Ariya Rastrow, Yuzong Liu, Xiangyang Huang, Siyu Wang, Dong Kwon Joo, Roland Maas

    Abstract: This paper presents our modeling and architecture approaches for building a highly accurate low-latency language identification system to support multilingual spoken queries for voice assistants. A common approach to solve multilingual speech recognition is to run multiple monolingual ASR systems in parallel and rely on a language identification (LID) component that detects the input language. Con…

    Submitted 1 June, 2020; originally announced June 2020.

    Comments: 5 pages, 2 figures

  18. arXiv:2005.02794  [pdf]

    stat.ML cs.AI cs.LG

    Token Manipulation Generative Adversarial Network for Text Generation

    Authors: DaeJin Jo

    Abstract: MaskGAN opens the query for the conditional language model by filling in the blanks between the given tokens. In this paper, we focus on addressing the limitations caused by having to specify blanks to be filled. We decompose conditional text generation problem into two tasks, make-a-blank and fill-in-the-blank, and extend the former to handle more complex manipulations on the given tokens. We cas…

    Submitted 11 May, 2020; v1 submitted 6 May, 2020; originally announced May 2020.

    Comments: 5 pages, 2 figures

  19. arXiv:2004.07507  [pdf, other]

    cs.LG cs.CV stat.ML

    Continual Learning with Extended Kronecker-factored Approximate Curvature

    Authors: Janghyeon Lee, Hyeong Gwon Hong, Donggyu Joo, Junmo Kim

    Abstract: We propose a quadratic penalty method for continual learning of neural networks that contain batch normalization (BN) layers. The Hessian of a loss function represents the curvature of the quadratic penalty function, and a Kronecker-factored approximate curvature (K-FAC) is used widely to practically compute the Hessian of a neural network. However, the approximation is not valid if there is depen…

    Submitted 16 April, 2020; originally announced April 2020.

    Comments: CVPR 2020

  20. arXiv:2002.06774  [pdf, other]

    cs.LG stat.ML

    Residual Continual Learning

    Authors: Janghyeon Lee, Donggyu Joo, Hyeong Gwon Hong, Junmo Kim

    Abstract: We propose a novel continual learning method called Residual Continual Learning (ResCL). Our method can prevent the catastrophic forgetting phenomenon in sequential learning of multiple tasks, without any source task information except the original network. ResCL reparameterizes network parameters by linearly combining each layer of the original network and a fine-tuned network; therefore, the siz…

    Submitted 17 February, 2020; originally announced February 2020.

    Comments: AAAI 2020

  21. arXiv:1905.12867  [pdf, other]

    cs.LG stat.ML

    Cross-modal Variational Auto-encoder with Distributed Latent Spaces and Associators

    Authors: Dae Ung Jo, ByeongJu Lee, Jongwon Choi, Haanju Yoo, Jin Young Choi

    Abstract: In this paper, we propose a novel structure for a cross-modal data association, which is inspired by the recent research on the associative learning structure of the brain. We formulate the cross-modal association in Bayesian inference framework realized by a deep neural network with multiple variational auto-encoders and variational associators. The variational associators transfer the latent spa…

    Submitted 30 May, 2019; originally announced May 2019.

    Comments: 10 pages, 6 figures

  22. arXiv:1901.06140  [pdf, other]

    cs.CV

    Backbone Can Not be Trained at Once: Rolling Back to Pre-trained Network for Person Re-Identification

    Authors: Youngmin Ro, Jongwon Choi, Dae Ung Jo, Byeongho Heo, Jongin Lim, Jin Young Choi

    Abstract: In person re-identification (ReID) task, because of its shortage of trainable dataset, it is common to utilize fine-tuning method using a classification network pre-trained on a large dataset. However, it is relatively difficult to sufficiently fine-tune the low-level layers of the network due to the gradient vanishing problem. In this work, we propose a novel fine-tuning strategy that allows low-…

    Submitted 18 January, 2019; originally announced January 2019.

    Comments: Accepted to AAAI 2019

  23. arXiv:1812.09666  [pdf, ps, other]

    cs.IT cs.CR

    A Proof of the Beierle-Kranz-Leander Conjecture related to Lightweight Multiplication in $\mathds{F}_{2^n}$

    Authors: Sihem Mesnager, Kwang Ho Kim, Dujin Jo, Junyop Choe, Munhyon Han, Dok Nam Lee

    Abstract: Lightweight cryptography is a key tool for building strong security solutions for pervasive devices with limited resources. Due to the stringent cost constraints inherent in extremely large applications (ranging from RFIDs and smart cards to mobile devices), the efficient implementation of cryptographic hardware and software algorithms is of utmost importance to realize the vision of generalized c…

    Submitted 23 December, 2018; originally announced December 2018.

  24. arXiv:1804.07455  [pdf, other]

    cs.CV

    Generating a Fusion Image: One's Identity and Another's Shape

    Authors: Donggyu Joo, Doyeon Kim, Junmo Kim

    Abstract: Generating a novel image by manipulating two input images is an interesting research problem in the study of generative adversarial networks (GANs). We propose a new GAN-based network that generates a fusion image with the identity of input image x and the shape of input image y. Our network can simultaneously train on more than two image datasets in an unsupervised manner. We define an identity l…

    Submitted 25 January, 2022; v1 submitted 20 April, 2018; originally announced April 2018.

    Comments: CVPR 2018

  25. arXiv:1706.05781  [pdf, other]

    cs.SD cs.LG cs.MM

    Kapre: On-GPU Audio Preprocessing Layers for a Quick Implementation of Deep Neural Network Models with Keras

    Authors: Keunwoo Choi, Deokjin Joo, Juho Kim

    Abstract: We introduce Kapre, Keras layers for audio and music signal preprocessing. Music research using deep neural networks requires a heavy and tedious preprocessing stage, for which audio processing parameters are often ignored in parameter optimisation. To solve this problem, Kapre implements time-frequency conversions, normalisation, and data augmentation as Keras layers. We report simple benchmark r…

    Submitted 19 June, 2017; originally announced June 2017.

    Comments: ICML 2017 machine learning for music discovery

  26. arXiv:1704.07528  [pdf, other]

    cs.GR

    Automatic Content-aware Projection for 360° Videos

    Authors: Yeong Won Kim, Dae-Yong Jo, Chang-Ryeol Lee, Hyeok-Jae Choi, Yong Hoon Kwon, Kuk-Jin Yoon

    Abstract: To watch 360° videos on normal 2D displays, we need to project the selected part of the 360° image onto the 2D display plane. In this paper, we propose a fully-automated framework for generating content-aware 2D normal-view perspective videos from 360° videos. Especially, we focus on the projection step preserving important image contents and reducing image distortion. Basically, our projection me…

    Submitted 10 September, 2017; v1 submitted 24 April, 2017; originally announced April 2017.

    Comments: Accepted to International Conference on Computer Vision (ICCV), 2017