Showing 1–30 of 30 results for author: Oh, S W

  1. arXiv:2404.16035

    cs.CV cs.AI

    MaGGIe: Masked Guided Gradual Human Instance Matting

    Authors: Chuong Huynh, Seoung Wug Oh, Abhinav Shrivastava, Joon-Young Lee

    Abstract: Human matting is a foundation task in image and video processing, where human foreground pixels are extracted from the input. Prior works either improve the accuracy by additional guidance or improve the temporal consistency of a single instance across frames. We propose a new framework MaGGIe, Masked Guided Gradual Human Instance Matting, which predicts alpha mattes progressively for each human i…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: CVPR 2024. Project link: https://maggie-matt.github.io

  2. arXiv:2403.16509

    cs.LG

    Human Understanding AI Paper Challenge 2024 -- Dataset Design

    Authors: Se Won Oh, Hyuntae Jeong, Jeong Mook Lim, Seungeun Chung, Kyoung Ju Noh

    Abstract: In 2024, we will hold a research paper competition (the third Human Understanding AI Paper Challenge) for the research and development of artificial intelligence technologies to understand human daily life. This document introduces the datasets that will be provided to participants in the competition, and summarizes the issues to consider in data processing and learning model development.

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 7 pages, 3 figures

    ACM Class: J.7; E.m

  3. arXiv:2312.04885

    cs.CV

    VISAGE: Video Instance Segmentation with Appearance-Guided Enhancement

    Authors: Hanjung Kim, Jaehyun Kang, Miran Heo, Sukjun Hwang, Seoung Wug Oh, Seon Joo Kim

    Abstract: In recent years, online Video Instance Segmentation (VIS) methods have shown remarkable advancement with their powerful query-based detectors. Utilizing the output queries of the detector at the frame-level, these methods achieve high accuracy on challenging benchmarks. However, our observations demonstrate that these methods heavily rely on location information, which often causes incorrect assoc…

    Submitted 8 March, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

    Comments: Technical report

  4. arXiv:2310.12982

    cs.CV

    Putting the Object Back into Video Object Segmentation

    Authors: Ho Kei Cheng, Seoung Wug Oh, Brian Price, Joon-Young Lee, Alexander Schwing

    Abstract: We present Cutie, a video object segmentation (VOS) network with object-level memory reading, which puts the object representation from memory back into the video object segmentation result. Recent works on VOS employ bottom-up pixel-level memory reading which struggles due to matching noise, especially in the presence of distractors, resulting in lower performance in more challenging data. In con…

    Submitted 11 April, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: CVPR 2024 Highlight. Project page: https://hkchengrex.github.io/Cutie

  5. arXiv:2309.03903

    cs.CV

    Tracking Anything with Decoupled Video Segmentation

    Authors: Ho Kei Cheng, Seoung Wug Oh, Brian Price, Alexander Schwing, Joon-Young Lee

    Abstract: Training data for video segmentation are expensive to annotate. This impedes extensions of end-to-end algorithms to new video segmentation tasks, especially in large-vocabulary settings. To 'track anything' without training on video data for every individual task, we develop a decoupled video segmentation approach (DEVA), composed of task-specific image-level segmentation and class/task-agnostic b…

    Submitted 7 September, 2023; originally announced September 2023.

    Comments: Accepted to ICCV 2023. Project page: https://hkchengrex.github.io/Tracking-Anything-with-DEVA

  6. arXiv:2302.04871

    cs.CV

    In-N-Out: Faithful 3D GAN Inversion with Volumetric Decomposition for Face Editing

    Authors: Yiran Xu, Zhixin Shu, Cameron Smith, Seoung Wug Oh, Jia-Bin Huang

    Abstract: 3D-aware GANs offer new capabilities for view synthesis while preserving the editing functionalities of their 2D counterparts. GAN inversion is a crucial step that seeks the latent code to reconstruct input images or videos, subsequently enabling diverse editing tasks through manipulation of this latent code. However, a model pre-trained on a particular dataset (e.g., FFHQ) often has difficulty re…

    Submitted 14 April, 2024; v1 submitted 9 February, 2023; originally announced February 2023.

    Comments: Project page: https://in-n-out-3d.github.io/

  7. arXiv:2212.10149

    cs.CV

    Tracking by Associating Clips

    Authors: Sanghyun Woo, Kwanyong Park, Seoung Wug Oh, In So Kweon, Joon-Young Lee

    Abstract: The tracking-by-detection paradigm today has become the dominant method for multi-object tracking and works by detecting objects in each frame and then performing data association across frames. However, its sequential frame-wise matching property fundamentally suffers from the intermediate interruptions in a video, such as object occlusions, fast camera movements, and abrupt light changes. Moreov…

    Submitted 20 December, 2022; originally announced December 2022.

    Comments: ECCV 2022

  8. arXiv:2212.10147

    cs.CV

    Bridging Images and Videos: A Simple Learning Framework for Large Vocabulary Video Object Detection

    Authors: Sanghyun Woo, Kwanyong Park, Seoung Wug Oh, In So Kweon, Joon-Young Lee

    Abstract: Scaling object taxonomies is one of the important steps toward a robust real-world deployment of recognition systems. We have faced remarkable progress in images since the introduction of the LVIS benchmark. To continue this success in videos, a new video benchmark, TAO, was recently presented. Given the recent encouraging results from both detection and tracking communities, we are interested in…

    Submitted 20 December, 2022; originally announced December 2022.

    Comments: ECCV 2022

  9. arXiv:2211.08834

    cs.CV

    A Generalized Framework for Video Instance Segmentation

    Authors: Miran Heo, Sukjun Hwang, Jeongseok Hyun, Hanjung Kim, Seoung Wug Oh, Joon-Young Lee, Seon Joo Kim

    Abstract: The handling of long videos with complex and occluded sequences has recently emerged as a new challenge in the video instance segmentation (VIS) community. However, existing methods have limitations in addressing this challenge. We argue that the biggest bottleneck in current approaches is the discrepancy between training and inference. To effectively bridge this gap, we propose a Generalized fram…

    Submitted 24 March, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

    Comments: CVPR 2023

  10. arXiv:2208.01924

    cs.CV

    Per-Clip Video Object Segmentation

    Authors: Kwanyong Park, Sanghyun Woo, Seoung Wug Oh, In So Kweon, Joon-Young Lee

    Abstract: Recently, memory-based approaches show promising results on semi-supervised video object segmentation. These methods predict object masks frame-by-frame with the help of frequently updated memory of the previous mask. Different from this per-frame inference, we investigate an alternative perspective by treating video object segmentation as clip-wise mask propagation. In this per-clip inference sch…

    Submitted 3 August, 2022; originally announced August 2022.

    Comments: CVPR 2022; Code is available at https://github.com/pkyong95/PCVOS

  11. arXiv:2207.13353

    cs.CV

    One-Trimap Video Matting

    Authors: Hongje Seong, Seoung Wug Oh, Brian Price, Euntai Kim, Joon-Young Lee

    Abstract: Recent studies made great progress in video matting by extending the success of trimap-based image matting to the video domain. In this paper, we push this task toward a more practical setting and propose One-Trimap Video Matting network (OTVM) that performs video matting robustly using only one user-annotated trimap. A key of OTVM is the joint modeling of trimap propagation and alpha prediction.…

    Submitted 27 July, 2022; originally announced July 2022.

    Comments: Accepted to ECCV 2022

  12. arXiv:2207.10391

    cs.CV cs.LG

    Error Compensation Framework for Flow-Guided Video Inpainting

    Authors: Jaeyeon Kang, Seoung Wug Oh, Seon Joo Kim

    Abstract: The key to video inpainting is to use correlation information from as many reference frames as possible. Existing flow-based propagation methods split the video synthesis process into multiple steps: flow completion -> pixel propagation -> synthesis. However, there is a significant drawback that the errors in each step continue to accumulate and amplify in the next step. To this end, we propose an…

    Submitted 21 July, 2022; originally announced July 2022.

    Comments: ECCV2022 accepted

  13. arXiv:2206.04403

    cs.CV

    VITA: Video Instance Segmentation via Object Token Association

    Authors: Miran Heo, Sukjun Hwang, Seoung Wug Oh, Joon-Young Lee, Seon Joo Kim

    Abstract: We introduce a novel paradigm for offline Video Instance Segmentation (VIS), based on the hypothesis that explicit object-oriented information can be a strong clue for understanding the context of the entire sequence. To this end, we propose VITA, a simple structure built on top of an off-the-shelf Transformer-based image instance segmentation model. Specifically, we use an image object detector a…

    Submitted 20 October, 2022; v1 submitted 9 June, 2022; originally announced June 2022.

  14. arXiv:2206.02116

    cs.CV

    Cannot See the Forest for the Trees: Aggregating Multiple Viewpoints to Better Classify Objects in Videos

    Authors: Sukjun Hwang, Miran Heo, Seoung Wug Oh, Seon Joo Kim

    Abstract: Recently, both long-tailed recognition and object tracking have made great advances individually. TAO benchmark presented a mixture of the two, long-tailed object tracking, in order to further reflect the aspect of the real-world. To date, existing solutions have adopted detectors showing robustness in long-tailed distributions, which derive per-frame results. Then, they used tracking algorithms t…

    Submitted 5 June, 2022; originally announced June 2022.

    Comments: Accepted to CVPR 2022

  15. arXiv:2112.04177

    cs.CV

    VISOLO: Grid-Based Space-Time Aggregation for Efficient Online Video Instance Segmentation

    Authors: Su Ho Han, Sukjun Hwang, Seoung Wug Oh, Yeonchool Park, Hyunwoo Kim, Min-Jung Kim, Seon Joo Kim

    Abstract: For online video instance segmentation (VIS), fully utilizing the information from previous frames in an efficient manner is essential for real-time applications. Most previous methods follow a two-stage approach requiring additional computations such as RPN and RoIAlign, and do not fully exploit the available information in the video for all subtasks in VIS. In this paper, we propose a novel sing…

    Submitted 30 March, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

  16. arXiv:2109.11404

    cs.CV

    Hierarchical Memory Matching Network for Video Object Segmentation

    Authors: Hongje Seong, Seoung Wug Oh, Joon-Young Lee, Seongwon Lee, Suhyeon Lee, Euntai Kim

    Abstract: We present Hierarchical Memory Matching Network (HMMN) for semi-supervised video object segmentation. Based on a recent memory-based method [33], we propose two advanced memory read modules that enable us to perform memory reading in multiple scales while exploiting temporal smoothness. We first propose a kernel guided memory matching module that replaces the non-local dense memory read, commonly…

    Submitted 23 September, 2021; originally announced September 2021.

    Comments: Accepted to ICCV 2021

  17. arXiv:2106.03299

    cs.CV

    Video Instance Segmentation using Inter-Frame Communication Transformers

    Authors: Sukjun Hwang, Miran Heo, Seoung Wug Oh, Seon Joo Kim

    Abstract: We propose a novel end-to-end solution for video instance segmentation (VIS) based on transformers. Recently, the per-clip pipeline shows superior performance over per-frame methods leveraging richer information from multiple frames. However, previous per-clip models require heavy computation and memory usage to achieve frame-to-frame communications, limiting practicality. In this work, we propose…

    Submitted 6 June, 2021; originally announced June 2021.

  18. arXiv:2105.14584

    cs.CV cs.AI cs.LG

    Polygonal Point Set Tracking

    Authors: Gunhee Nam, Miran Heo, Seoung Wug Oh, Joon-Young Lee, Seon Joo Kim

    Abstract: In this paper, we propose a novel learning-based polygonal point set tracking method. Compared to existing video object segmentation (VOS) methods that propagate pixel-wise object mask information, we propagate a polygonal point set over frames. Specifically, the set is defined as a subset of points in the target contour, and our goal is to track corresponding points on the target contour. Those…

    Submitted 30 May, 2021; originally announced May 2021.

    Comments: 14 pages, 10 figures, 6 tables

  19. arXiv:2105.08336

    cs.CV

    Exemplar-Based Open-Set Panoptic Segmentation Network

    Authors: Jaedong Hwang, Seoung Wug Oh, Joon-Young Lee, Bohyung Han

    Abstract: We extend panoptic segmentation to the open-world and introduce an open-set panoptic segmentation (OPS) task. This task requires performing panoptic segmentation for not only known classes but also unknown ones that have not been acknowledged during training. We investigate the practical challenges of the task and construct a benchmark on top of an existing dataset, COCO. In addition, we propose a…

    Submitted 18 May, 2021; v1 submitted 18 May, 2021; originally announced May 2021.

    Comments: CVPR 2021

  20. arXiv:2012.01632

    cs.CV

    Single-shot Path Integrated Panoptic Segmentation

    Authors: Sukjun Hwang, Seoung Wug Oh, Seon Joo Kim

    Abstract: Panoptic segmentation, which is a novel task of unifying instance segmentation and semantic segmentation, has attracted a lot of attention lately. However, most of the previous methods are composed of multiple pathways with each pathway specialized to a designated segmentation task. In this paper, we propose to resolve panoptic segmentation in single-shot by integrating the execution flows. With t…

    Submitted 3 December, 2020; v1 submitted 2 December, 2020; originally announced December 2020.

    Comments: 10 pages, 5 figures

  21. arXiv:2007.08786

    cs.CV

    Cross-Identity Motion Transfer for Arbitrary Objects through Pose-Attentive Video Reassembling

    Authors: Subin Jeon, Seonghyeon Nam, Seoung Wug Oh, Seon Joo Kim

    Abstract: We propose an attention-based networks for transferring motions between arbitrary objects. Given a source image(s) and a driving video, our networks animate the subject in the source images according to the motion in the driving video. In our attention mechanism, dense similarities between the learned keypoints in the source and the driving images are computed in order to retrieve the appearance i…

    Submitted 17 July, 2020; originally announced July 2020.

    Comments: ECCV 2020

  22. arXiv:2004.02432

    cs.CV

    Deep Space-Time Video Upsampling Networks

    Authors: Jaeyeon Kang, Younghyun Jo, Seoung Wug Oh, Peter Vajda, Seon Joo Kim

    Abstract: Video super-resolution (VSR) and frame interpolation (FI) are traditional computer vision problems, and the performance have been improving by incorporating deep learning recently. In this paper, we investigate the problem of jointly upsampling videos both in space and time, which is becoming more important with advances in display systems. One solution for this is to run VSR and FI, one by one, i…

    Submitted 9 August, 2020; v1 submitted 6 April, 2020; originally announced April 2020.

    Comments: ECCV2020 accepted

  23. arXiv:2003.09171

    cs.CV cs.LG eess.IV

    DMV: Visual Object Tracking via Part-level Dense Memory and Voting-based Retrieval

    Authors: Gunhee Nam, Seoung Wug Oh, Joon-Young Lee, Seon Joo Kim

    Abstract: We propose a novel memory-based tracker via part-level dense memory and voting-based retrieval, called DMV. Since deep learning techniques have been introduced to the tracking field, Siamese trackers have attracted many researchers due to the balance between speed and accuracy. However, most of them are based on a single template matching, which limits the performance as it restricts the accessibl…

    Submitted 20 March, 2020; originally announced March 2020.

    Comments: 19 pages, 9 figures

  24. arXiv:2003.09124

    eess.IV cs.CV

    Learning the Loss Functions in a Discriminative Space for Video Restoration

    Authors: Younghyun Jo, Jaeyeon Kang, Seoung Wug Oh, Seonghyeon Nam, Peter Vajda, Seon Joo Kim

    Abstract: With more advanced deep network architectures and learning schemes such as GANs, the performance of video restoration algorithms has greatly improved recently. Meanwhile, the loss functions for optimizing deep neural networks remain relatively unchanged. To this end, we propose a new framework for building effective loss functions by learning a discriminative space specific to a video restoration…

    Submitted 20 March, 2020; originally announced March 2020.

    Comments: 24 pages

  25. arXiv:1908.11587

    cs.CV

    Copy-and-Paste Networks for Deep Video Inpainting

    Authors: Sungho Lee, Seoung Wug Oh, DaeYeun Won, Seon Joo Kim

    Abstract: We present a novel deep learning based algorithm for video inpainting. Video inpainting is a process of completing corrupted or missing regions in videos. Video inpainting has additional challenges compared to image inpainting due to the extra temporal information as well as the need for maintaining the temporal coherency. We propose a novel DNN-based framework called the Copy-and-Paste Networks f…

    Submitted 30 August, 2019; originally announced August 2019.

    Comments: ICCV 2019

  26. arXiv:1908.08718

    cs.CV

    Onion-Peel Networks for Deep Video Completion

    Authors: Seoung Wug Oh, Sungho Lee, Joon-Young Lee, Seon Joo Kim

    Abstract: We propose the onion-peel networks for video completion. Given a set of reference images and a target image with holes, our network fills the hole by referring the contents in the reference images. Our onion-peel network progressively fills the hole from the hole boundary enabling it to exploit richer contextual information for the missing regions every step. Given a sufficient number of recurrenc…

    Submitted 23 August, 2019; originally announced August 2019.

    Comments: ICCV 2019

  27. arXiv:1904.09791

    cs.CV

    Fast User-Guided Video Object Segmentation by Interaction-and-Propagation Networks

    Authors: Seoung Wug Oh, Joon-Young Lee, Ning Xu, Seon Joo Kim

    Abstract: We present a deep learning method for the interactive video object segmentation. Our method is built upon two core operations, interaction and propagation, and each operation is conducted by Convolutional Neural Networks. The two networks are connected both internally and externally so that the networks are trained jointly and interact with each other to solve the complex video object segmentation…

    Submitted 2 May, 2019; v1 submitted 22 April, 2019; originally announced April 2019.

    Comments: CVPR 2019

  28. arXiv:1904.00607

    cs.CV

    Video Object Segmentation using Space-Time Memory Networks

    Authors: Seoung Wug Oh, Joon-Young Lee, Ning Xu, Seon Joo Kim

    Abstract: We propose a novel solution for semi-supervised video object segmentation. By the nature of the problem, available cues (e.g. video frame(s) with object masks) become richer with the intermediate predictions. However, the existing methods are unable to fully exploit this rich source of information. We resolve the issue by leveraging memory networks and learn to read relevant information from all a…

    Submitted 12 August, 2019; v1 submitted 1 April, 2019; originally announced April 2019.

    Comments: ICCV 2019

  29. arXiv:1701.03416

    cs.NI

    Guaranteeing QoS using Unlicensed TV White Spaces for Smart Grid Applications

    Authors: Naveed Ul Hassan, Wayes Tushar, Chau Yuen, See Gim Kerk, Ser Wah Oh

    Abstract: In this paper, we consider the utilization of TV White Spaces (TVWS) by small Cognitive Radio (CR) network operators to support the communication needs of various smart grid applications. We first propose a multi-tier communication network architecture for smart metering applications in dense urban environments. Our measurement campaign, without any competition from other CR operators, reveals tha…

    Submitted 6 December, 2016; originally announced January 2017.

    Comments: Journal paper

  30. Approaching the Computational Color Constancy as a Classification Problem through Deep Learning

    Authors: Seoung Wug Oh, Seon Joo Kim

    Abstract: Computational color constancy refers to the problem of computing the illuminant color so that the images of a scene under varying illumination can be normalized to an image under the canonical illumination. In this paper, we adopt a deep learning framework for the illumination estimation problem. The proposed method works under the assumption of uniform illumination over the scene and aims for the…

    Submitted 29 August, 2016; originally announced August 2016.

    Comments: This is a preprint of an article accepted for publication in Pattern Recognition, ELSEVIER

    Journal ref: Pattern Recognition, Volume 61, January 2017, Pages 405 to 416