Showing 1–17 of 17 results for author: Ko, D

  1. arXiv:2406.06134  [pdf, other]

    cs.CV cs.AI cs.LG

    DiffInject: Revisiting Debias via Synthetic Data Generation using Diffusion-based Style Injection

    Authors: Donggeun Ko, Sangwoo Jo, Dongjun Lee, Namjun Park, Jaekwang Kim

    Abstract: Dataset bias is a significant challenge in machine learning, where specific attributes, such as texture or color of the images are unintentionally learned resulting in detrimental performance. To address this, previous efforts have focused on debiasing models either by developing novel debiasing algorithms or by generating synthetic data to mitigate the prevalent dataset biases. However, generativ…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 10 pages (including supplementary), 3 figures, SynData4CV@CVPR 24 (Workshop)

  2. arXiv:2406.05606  [pdf, other]

    cs.CL

    GrowOVER: How Can LLMs Adapt to Growing Real-World Knowledge?

    Authors: Dayoon Ko, Jinyoung Kim, Hahyeon Choi, Gunhee Kim

    Abstract: In the real world, knowledge is constantly evolving, which can render existing knowledge-based datasets outdated. This unreliability highlights the critical need for continuous updates to ensure both accuracy and relevance in knowledge-intensive tasks. To address this, we propose GrowOVER-QA and GrowOVER-Dialogue, dynamic open-domain QA and dialogue benchmarks that undergo a continuous cycle of up…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: ACL 2024 Main

  3. arXiv:2404.01954  [pdf, other]

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  4. arXiv:2310.15747  [pdf, other]

    cs.CV

    Large Language Models are Temporal and Causal Reasoners for Video Question Answering

    Authors: Dohwan Ko, Ji Soo Lee, Wooyoung Kang, Byungseok Roh, Hyunwoo J. Kim

    Abstract: Large Language Models (LLMs) have shown remarkable performances on a wide range of natural language understanding and generation tasks. We observe that the LLMs provide effective priors in exploiting $\textit{linguistic shortcuts}$ for temporal and causal reasoning in Video Question Answering (VideoQA). However, such priors often cause suboptimal results on VideoQA by leading the model to over-rel…

    Submitted 6 November, 2023; v1 submitted 24 October, 2023; originally announced October 2023.

    Comments: Accepted paper at EMNLP 2023 Main

  5. arXiv:2310.14159  [pdf, other]

    cs.CL cs.CV

    Can Language Models Laugh at YouTube Short-form Videos?

    Authors: Dayoon Ko, Sangho Lee, Gunhee Kim

    Abstract: As short-form funny videos on social networks are gaining popularity, it becomes demanding for AI models to understand them for better communication with humans. Unfortunately, previous video humor datasets target specific domains, such as speeches or sitcoms, and mostly focus on verbal cues. We curate a user-generated dataset of 10K multimodal funny videos from YouTube, called ExFunTube. Using a…

    Submitted 31 March, 2024; v1 submitted 21 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023; references added

  6. arXiv:2308.09363  [pdf, other]

    cs.CV

    Open-vocabulary Video Question Answering: A New Benchmark for Evaluating the Generalizability of Video Question Answering Models

    Authors: Dohwan Ko, Ji Soo Lee, Miso Choi, Jaewon Chu, Jihwan Park, Hyunwoo J. Kim

    Abstract: Video Question Answering (VideoQA) is a challenging task that entails complex multi-modal reasoning. In contrast to multiple-choice VideoQA which aims to predict the answer given several options, the goal of open-ended VideoQA is to answer questions without restricting candidate answers. However, the majority of previous VideoQA models formulate open-ended VideoQA as a classification task to class…

    Submitted 18 August, 2023; originally announced August 2023.

    Comments: Accepted paper at ICCV 2023

  7. arXiv:2308.03400  [pdf, other]

    cs.IR

    Hierarchical Contrastive Learning with Multiple Augmentation for Sequential Recommendation

    Authors: Dongjun Lee, Donggeun Ko, Jaekwang Kim

    Abstract: Sequential recommendation addresses the issue of preference drift by predicting the next item based on the user's previous behaviors. Recently, a promising approach using contrastive learning has emerged, demonstrating its effectiveness in recommending items under sparse user-item interactions. Significantly, the effectiveness of combinations of various augmentation methods has been demonstrated i…

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: 10 pages, 4 figures

  8. arXiv:2303.13009  [pdf, other]

    cs.CV

    MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models

    Authors: Dohwan Ko, Joonmyung Choi, Hyeong Kyu Choi, Kyoung-Woon On, Byungseok Roh, Hyunwoo J. Kim

    Abstract: Foundation models have shown outstanding performance and generalization capabilities across domains. Since most studies on foundation models mainly focus on the pretraining phase, a naive strategy to minimize a single task-specific loss is adopted for fine-tuning. However, such fine-tuning methods do not fully leverage other losses that are potentially beneficial for the target task. Therefore, we…

    Submitted 22 March, 2023; originally announced March 2023.

    Comments: Accepted paper at CVPR 2023

  9. arXiv:2212.10504  [pdf, other]

    cs.CL

    Can Current Task-oriented Dialogue Models Automate Real-world Scenarios in the Wild?

    Authors: Sang-Woo Lee, Sungdong Kim, Donghyeon Ko, Donghoon Ham, Youngki Hong, Shin Ah Oh, Hyunhoon Jung, Wangkyo Jung, Kyunghyun Cho, Donghyun Kwak, Hyungsuk Noh, Woomyoung Park

    Abstract: Task-oriented dialogue (TOD) systems are mainly based on the slot-filling-based TOD (SF-TOD) framework, in which dialogues are broken down into smaller, controllable units (i.e., slots) to fulfill a specific task. A series of approaches based on this framework achieved remarkable success on various TOD benchmarks. However, we argue that the current TOD benchmarks are limited to surrogate real-worl…

    Submitted 24 May, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

  10. arXiv:2209.08945  [pdf, other]

    cs.LG cs.IT stat.ML

    A novel approach for wafer defect pattern classification based on topological data analysis

    Authors: Seungchan Ko, Dowan Koo

    Abstract: In semiconductor manufacturing, wafer map defect pattern provides critical information for facility maintenance and yield management, so the classification of defect patterns is one of the most important tasks in the manufacturing process. In this paper, we propose a novel way to represent the shape of the defect pattern as a finite-dimensional vector, which will be used as an input for a neural n…

    Submitted 19 September, 2022; originally announced September 2022.

  11. arXiv:2203.16784  [pdf, other]

    cs.CV

    Video-Text Representation Learning via Differentiable Weak Temporal Alignment

    Authors: Dohwan Ko, Joonmyung Choi, Juyeon Ko, Shinyeong Noh, Kyoung-Woon On, Eun-Sol Kim, Hyunwoo J. Kim

    Abstract: Learning generic joint representations for video and text by a supervised method requires a prohibitively substantial amount of manually annotated video datasets. As a practical alternative, a large-scale but uncurated and narrated video dataset, HowTo100M, has recently been introduced. But it is still challenging to learn joint embeddings of video and text in a self-supervised manner, due to its…

    Submitted 31 March, 2022; originally announced March 2022.

  12. arXiv:2202.11359   

    cs.CV cs.AI cs.LG

    Deepfake Detection for Facial Images with Facemasks

    Authors: Donggeun Ko, Sangjun Lee, Jinyong Park, Saebyeol Shin, Donghee Hong, Simon S. Woo

    Abstract: Hyper-realistic face image generation and manipulation have given rise to numerous unethical social issues, e.g., invasion of privacy, threat of security, and malicious political maneuvering, which resulted in the development of recent deepfake detection methods with the rising demands of deepfake forensics. Proposed deepfake detection methods to date have shown remarkable detection performance and…

    Submitted 23 February, 2022; originally announced February 2022.

    Comments: This submission has been removed by arXiv administrators because the submitter did not have the authority to grant the license at the time of submission

  13. arXiv:2201.11331  [pdf]

    cs.AI

    Epistemic AI platform accelerates innovation by connecting biomedical knowledge

    Authors: Da Chen Emily Koo, Heather Bowling, Kenneth Ashworth, David J. Heeger, Stefano Pacifico

    Abstract: Epistemic AI accelerates biomedical discovery by finding hidden connections in the network of biomedical knowledge. The Epistemic AI web-based software platform embodies the concept of knowledge mapping, an interactive process that relies on a knowledge graph in combination with natural language processing (NLP), information retrieval, relevance feedback, and network analysis. Knowledge mapping re…

    Submitted 31 March, 2022; v1 submitted 27 January, 2022; originally announced January 2022.

    Comments: 12 pages, 2 main figures

  14. arXiv:2102.05182  [pdf, other]

    astro-ph.GA cs.LG

    A Deep Learning Approach for Characterizing Major Galaxy Mergers

    Authors: Skanda Koppula, Victor Bapst, Marc Huertas-Company, Sam Blackwell, Agnieszka Grabska-Barwinska, Sander Dieleman, Andrea Huber, Natasha Antropova, Mikolaj Binkowski, Hannah Openshaw, Adria Recasens, Fernando Caro, Avishai Deke, Yohan Dubois, Jesus Vega Ferrero, David C. Koo, Joel R. Primack, Trevor Back

    Abstract: Fine-grained estimation of galaxy merger stages from observations is a key problem useful for validation of our current theoretical understanding of galaxy formation. To this end, we demonstrate a CNN-based regression model that is able to predict, for the first time, using a single image, the merger stage relative to the first perigee passage with a median error of 38.3 million years (Myrs) over…

    Submitted 9 February, 2021; originally announced February 2021.

    Comments: Third Workshop on Machine Learning and the Physical Sciences (NeurIPS 2020), Vancouver, Canada

  15. arXiv:1907.10326  [pdf, other]

    cs.CV

    From Big to Small: Multi-Scale Local Planar Guidance for Monocular Depth Estimation

    Authors: Jin Han Lee, Myung-Kyu Han, Dong Wook Ko, Il Hong Suh

    Abstract: Estimating accurate depth from a single image is challenging because it is an ill-posed problem as infinitely many 3D scenes can be projected to the same 2D scene. However, recent works based on deep convolutional neural networks show great progress with plausible results. The convolutional neural networks are generally composed of two parts: an encoder for dense feature extraction and a decoder f…

    Submitted 23 September, 2021; v1 submitted 24 July, 2019; originally announced July 2019.

  16. Sequential Image-based Attention Network for Inferring Force Estimation without Haptic Sensor

    Authors: Hochul Shin, Hyeon Cho, Dongyi Kim, Daekwan Ko, Soochul Lim, Wonjun Hwang

    Abstract: Humans can infer approximate interaction force between objects from only vision information because we already have learned it through experiences. Based on this idea, we propose a recurrent convolutional neural network-based method using sequential images for inferring interaction force without using a haptic sensor. For training and validating deep learning methods, we collected a large number o…

    Submitted 20 October, 2019; v1 submitted 17 November, 2018; originally announced November 2018.

    Comments: Accepted by IEEE Access on Oct. 08, 2019

  17. arXiv:1008.4938  [pdf]

    q-bio.QM cs.SC

    Towards Solving the Inverse Protein Folding Problem

    Authors: Yoojin Hong, Kyung Dae Ko, Gaurav Bhardwaj, Zhenhai Zhang, Damian B. van Rossum, Randen L. Patterson

    Abstract: Accurately assigning folds for divergent protein sequences is a major obstacle to structural studies and underlies the inverse protein folding problem. Herein, we outline our theories for fold-recognition in the "twilight-zone" of sequence similarity (<25% identity). Our analyses demonstrate that structural sequence profiles built using Position-Specific Scoring Matrices (PSSMs) significantly outp…

    Submitted 29 August, 2010; originally announced August 2010.

    Comments: 22 pages, 11 figures