Showing 51–100 of 186 results for author: Woo, S

  1. arXiv:2303.11793  [pdf, other]

    cs.CV

    Bridging Optimal Transport and Jacobian Regularization by Optimal Trajectory for Enhanced Adversarial Defense

    Authors: Binh M. Le, Shahroz Tariq, Simon S. Woo

    Abstract: Deep neural networks, particularly in vision tasks, are notably susceptible to adversarial perturbations. To overcome this challenge, developing a robust classifier is crucial. In light of the recent advancements in the robustness of classifiers, we delve deep into the intricacies of adversarial training and Jacobian regularization, two pivotal defenses. Our work is the first to carefully analyze an…

    Submitted 12 February, 2024; v1 submitted 21 March, 2023; originally announced March 2023.

  2. arXiv:2303.09779  [pdf, other]

    cs.CV

    Bidirectional Domain Mixup for Domain Adaptive Semantic Segmentation

    Authors: Daehan Kim, Minseok Seo, Kwanyong Park, Inkyu Shin, Sanghyun Woo, In-So Kweon, Dong-Geol Choi

    Abstract: Mixup provides interpolated training samples and allows the model to obtain smoother decision boundaries for better generalization. The idea can be naturally applied to the domain adaptation task, where we can mix the source and target samples to obtain domain-mixed samples for better adaptation. However, the extension of the idea from classification to segmentation (i.e., structured output) is no…

    Submitted 17 March, 2023; originally announced March 2023.

    Comments: 10 pages, 3 figures, Accepted at AAAI 2023

  3. Why Do Facial Deepfake Detectors Fail?

    Authors: Binh Le, Shahroz Tariq, Alsharif Abuadbba, Kristen Moore, Simon Woo

    Abstract: Recent rapid advancements in deepfake technology have allowed the creation of highly realistic fake media, such as video, image, and audio. These materials pose significant challenges to human authentication, such as impersonation, misinformation, or even a threat to national security. To keep pace with these rapid advancements, several deepfake detection algorithms have been proposed, leading to…

    Submitted 10 September, 2023; v1 submitted 25 February, 2023; originally announced February 2023.

    Comments: 5 pages, ACM ASIACCS 2023

  4. arXiv:2301.04333  [pdf, other]

    cs.LG cs.AI

    Learnable Path in Neural Controlled Differential Equations

    Authors: Sheo Yon Jhin, Minju Jo, Seungji Kook, Noseong Park, Sungpil Woo, Sunhwan Lim

    Abstract: Neural controlled differential equations (NCDEs), which are continuous analogues to recurrent neural networks (RNNs), are a specialized model in (irregular) time-series processing. In comparison with similar models, e.g., neural ordinary differential equations (NODEs), the key distinctive characteristics of NCDEs are i) the adoption of the continuous path created by an interpolation algorithm from…

    Submitted 11 January, 2023; originally announced January 2023.

    Comments: Accepted by AAAI 2023

  5. arXiv:2301.00808  [pdf, other]

    cs.CV

    ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders

    Authors: Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon, Saining Xie

    Abstract: Driven by improved architectures and better representation learning frameworks, the field of visual recognition has enjoyed rapid modernization and performance boost in the early 2020s. For example, modern ConvNets, represented by ConvNeXt, have demonstrated strong performance in various scenarios. While these models were originally designed for supervised learning with ImageNet labels, they can a…

    Submitted 2 January, 2023; originally announced January 2023.

    Comments: Code and models available at https://github.com/facebookresearch/ConvNeXt-V2

  6. arXiv:2212.11895  [pdf, other]

    cond-mat.mes-hall cond-mat.mtrl-sci

    Excitonic Absorption Signatures of Twisted Bilayer WSe$_{2}$ by Electron Energy-Loss Spectroscopy

    Authors: Steffi Y. Woo, Alberto Zobelli, Robert Schneider, Ashish Arora, Johann A. Preuß, Benjamin J. Carey, Steffen Michaelis de Vasconcellos, Maurizia Palummo, Rudolf Bratschitsch, Luiz H. G. Tizei

    Abstract: Moiré twist angle underpins the interlayer interaction of excitons in twisted van der Waals hetero- and homo-structures. The influence of twist angle on the excitonic absorption of twisted bilayer tungsten diselenide (WSe$_{2}$) has been investigated using electron energy-loss spectroscopy. Atomic-resolution imaging by scanning transmission electron microscopy was used to determine key structural…

    Submitted 22 December, 2022; originally announced December 2022.

  7. arXiv:2212.10149  [pdf, other]

    cs.CV

    Tracking by Associating Clips

    Authors: Sanghyun Woo, Kwanyong Park, Seoung Wug Oh, In So Kweon, Joon-Young Lee

    Abstract: The tracking-by-detection paradigm today has become the dominant method for multi-object tracking and works by detecting objects in each frame and then performing data association across frames. However, its sequential frame-wise matching property fundamentally suffers from the intermediate interruptions in a video, such as object occlusions, fast camera movements, and abrupt light changes. Moreov…

    Submitted 20 December, 2022; originally announced December 2022.

    Comments: ECCV 2022

  8. arXiv:2212.10147  [pdf, other]

    cs.CV

    Bridging Images and Videos: A Simple Learning Framework for Large Vocabulary Video Object Detection

    Authors: Sanghyun Woo, Kwanyong Park, Seoung Wug Oh, In So Kweon, Joon-Young Lee

    Abstract: Scaling object taxonomies is one of the important steps toward a robust real-world deployment of recognition systems. We have faced remarkable progress in images since the introduction of the LVIS benchmark. To continue this success in videos, a new video benchmark, TAO, was recently presented. Given the recent encouraging results from both detection and tracking communities, we are interested in…

    Submitted 20 December, 2022; originally announced December 2022.

    Comments: ECCV 2022

  9. arXiv:2212.08356  [pdf, other]

    cs.CV

    Test-time Adaptation in the Dynamic World with Compound Domain Knowledge Management

    Authors: Junha Song, Kwanyong Park, InKyu Shin, Sanghyun Woo, Chaoning Zhang, In So Kweon

    Abstract: Prior to the deployment of robotic systems, pre-training the deep-recognition models on all potential visual cases is infeasible in practice. Hence, test-time adaptation (TTA) allows the model to adapt itself to novel environments and improve its performance during test time (i.e., lifelong adaptation). Several works for TTA have shown promising adaptation performances in continuously changing env…

    Submitted 15 April, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

    Comments: 8 pages

  10. arXiv:2212.08355  [pdf, other]

    cs.CV

    Learning Classifiers of Prototypes and Reciprocal Points for Universal Domain Adaptation

    Authors: Sungsu Hur, Inkyu Shin, Kwanyong Park, Sanghyun Woo, In So Kweon

    Abstract: Universal Domain Adaptation aims to transfer the knowledge between the datasets by handling two shifts: domain-shift and category-shift. The main challenge is correctly distinguishing the unknown target samples while adapting the distribution of known class knowledge from source to target. Most existing methods approach this problem by first training the target adapted known classifier and then re…

    Submitted 16 December, 2022; originally announced December 2022.

    Comments: Accepted at WACV 2023

  11. arXiv:2212.04761  [pdf, other]

    cs.CV

    Leveraging Spatio-Temporal Dependency for Skeleton-Based Action Recognition

    Authors: Jungho Lee, Minhyeok Lee, Suhwan Cho, Sungmin Woo, Sungjun Jang, Sangyoun Lee

    Abstract: Skeleton-based action recognition has attracted considerable attention due to its compact representation of the human body's skeletal structure. Many recent methods have achieved remarkable performance using graph convolutional networks (GCNs) and convolutional neural networks (CNNs), which extract spatial and temporal features, respectively. Although spatial and temporal dependencies in the human…

    Submitted 18 July, 2023; v1 submitted 9 December, 2022; originally announced December 2022.

    Comments: Accepted by ICCV 2023

  12. arXiv:2212.04548  [pdf, other]

    cs.LG

    STLGRU: Spatio-Temporal Lightweight Graph GRU for Traffic Flow Prediction

    Authors: Kishor Kumar Bhaumik, Fahim Faisal Niloy, Saif Mahmud, Simon Woo

    Abstract: Reliable forecasting of traffic flow requires efficient modeling of traffic data. Indeed, different correlations and influences arise in a dynamic traffic network, making modeling a complicated task. Existing literature has proposed many different methods to capture traffic networks' complex underlying spatial-temporal relations. However, given the heterogeneity of traffic data, consistently captu…

    Submitted 19 February, 2024; v1 submitted 8 December, 2022; originally announced December 2022.

    Comments: PAKDD 2024 (Oral)

  13. arXiv:2211.15926  [pdf, other]

    cs.CR cs.CV cs.LG

    Interpretations Cannot Be Trusted: Stealthy and Effective Adversarial Perturbations against Interpretable Deep Learning

    Authors: Eldor Abdukhamidov, Mohammed Abuhamad, Simon S. Woo, Eric Chan-Tin, Tamer Abuhmed

    Abstract: Deep learning methods have gained increased attention in various applications due to their outstanding performance. For exploring how this high performance relates to the proper use of data artifacts and the accurate problem formulation of a given task, interpretation models have become a crucial component in developing deep learning-based systems. Interpretation models enable the understanding of…

    Submitted 28 November, 2022; originally announced November 2022.

  14. arXiv:2211.13916  [pdf, other]

    cs.CV cs.AI cs.LG

    Towards Good Practices for Missing Modality Robust Action Recognition

    Authors: Sangmin Woo, Sumin Lee, Yeonju Park, Muhammad Adi Nugroho, Changick Kim

    Abstract: Standard multi-modal models assume the use of the same modalities in training and inference stages. However, in practice, the environment in which multi-modal models operate may not satisfy such an assumption. As such, their performances degrade drastically if any modality is missing in the inference stage. We ask: how can we train a model that is robust to missing modalities? This paper seeks a set…

    Submitted 30 March, 2023; v1 submitted 25 November, 2022; originally announced November 2022.

    Comments: AAAI 2023 (Oral); Code: https://github.com/sangminwoo/ActionMAE

  15. arXiv:2210.07817  [pdf]

    cs.IR cs.AI

    Discussion about Attacks and Defenses for Fair and Robust Recommendation System Design

    Authors: Mirae Kim, Simon Woo

    Abstract: Information has exploded on the Internet and mobile with the advent of the big data era. In particular, recommendation systems are widely used to help consumers who struggle to select the best products among such a large amount of information. However, recommendation systems are vulnerable to malicious user biases, such as fake reviews to promote or demote specific products, as well as attacks tha…

    Submitted 28 September, 2022; originally announced October 2022.

  16. arXiv:2210.02182  [pdf, other]

    cs.CV

    CFL-Net: Image Forgery Localization Using Contrastive Learning

    Authors: Fahim Faisal Niloy, Kishor Kumar Bhaumik, Simon S. Woo

    Abstract: Conventional forgery localizing methods usually rely on different forgery footprints such as JPEG artifacts, edge inconsistency, camera noise, etc., with cross-entropy loss to locate manipulated regions. However, these methods have the disadvantage of over-fitting and focusing on only a few specific forgery footprints. On the other hand, real-life manipulated images are generated via a wide variet…

    Submitted 4 October, 2022; originally announced October 2022.

    Comments: WACV 2023

  17. arXiv:2209.12107  [pdf, other]

    eess.SY cs.LG

    Valuation of Public Bus Electrification with Open Data

    Authors: Upadhi Vijay, Soomin Woo, Scott J. Moura, Akshat Jain, David Rodriguez, Sergio Gambacorta, Giuseppe Ferrara, Luigi Lanuzza, Christian Zulberti, Erika Mellekas, Carlo Papa

    Abstract: This research provides a novel framework to estimate the economic, environmental, and social values of electrifying public transit buses, for cities across the world, based on open-source data. Electric buses are a compelling candidate to replace diesel buses for the environmental and social benefits. However, the state-of-art models to evaluate the value of bus electrification are limited in appl…

    Submitted 24 September, 2022; originally announced September 2022.

  18. arXiv:2208.14625  [pdf, other]

    cs.CV cs.AI

    Temporal Flow Mask Attention for Open-Set Long-Tailed Recognition of Wild Animals in Camera-Trap Images

    Authors: Jeongsoo Kim, Sangmin Woo, Byeongjun Park, Changick Kim

    Abstract: Camera traps, unmanned observation devices, and deep learning-based image recognition systems have greatly reduced human effort in collecting and analyzing wildlife images. However, data collected via the above apparatus exhibits 1) long-tailed and 2) open-ended distribution problems. To tackle the open-set long-tailed recognition problem, we propose the Temporal Flow Mask Attention Network that compr…

    Submitted 31 August, 2022; originally announced August 2022.

    Comments: ICIP 2022

  19. arXiv:2208.11314  [pdf, other]

    cs.CV

    Modality Mixer for Multi-modal Action Recognition

    Authors: Sumin Lee, Sangmin Woo, Yeonju Park, Muhammad Adi Nugroho, Changick Kim

    Abstract: In multi-modal action recognition, it is important to consider not only the complementary nature of different modalities but also global action content. In this paper, we propose a novel network, named Modality Mixer (M-Mixer) network, to leverage complementary information across modalities and temporal context of an action for multi-modal action recognition. We also introduce a simple yet effecti…

    Submitted 21 February, 2023; v1 submitted 24 August, 2022; originally announced August 2022.

    Comments: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023

  20. Towards an Awareness of Time Series Anomaly Detection Models' Adversarial Vulnerability

    Authors: Shahroz Tariq, Binh M. Le, Simon S. Woo

    Abstract: Time series anomaly detection is extensively studied in statistics, economics, and computer science. Over the years, numerous methods have been proposed for time series anomaly detection using deep learning-based methods. Many of these methods demonstrate state-of-the-art performance on benchmark datasets, giving the false impression that these systems are robust and deployable in many practical a…

    Submitted 23 August, 2022; originally announced August 2022.

    Comments: Part of Proceedings of the 31st ACM International Conference on Information and Knowledge Management (CIKM '22)

  21. arXiv:2208.01924  [pdf, other]

    cs.CV

    Per-Clip Video Object Segmentation

    Authors: Kwanyong Park, Sanghyun Woo, Seoung Wug Oh, In So Kweon, Joon-Young Lee

    Abstract: Recently, memory-based approaches show promising results on semi-supervised video object segmentation. These methods predict object masks frame-by-frame with the help of frequently updated memory of the previous mask. Different from this per-frame inference, we investigate an alternative perspective by treating video object segmentation as clip-wise mask propagation. In this per-clip inference sch…

    Submitted 3 August, 2022; originally announced August 2022.

    Comments: CVPR 2022; Code is available at https://github.com/pkyong95/PCVOS

  22. L2Fuzz: Discovering Bluetooth L2CAP Vulnerabilities Using Stateful Fuzz Testing

    Authors: Haram Park, Carlos Kayembe Nkuba, Seunghoon Woo, Heejo Lee

    Abstract: Bluetooth Basic Rate/Enhanced Data Rate (BR/EDR) is a wireless technology used in billions of devices. Recently, several Bluetooth fuzzing studies have been conducted to detect vulnerabilities in Bluetooth devices, but they fall short of effectively generating malformed packets. In this paper, we propose L2FUZZ, a stateful fuzzer to detect vulnerabilities in Bluetooth BR/EDR Logical Link Control a…

    Submitted 29 July, 2022; originally announced August 2022.

    Comments: Updated version (2022.07.30)

    Journal ref: 2022 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)

  23. SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech

    Authors: Hyunjae Cho, Wonbin Jung, Junhyeok Lee, Sang Hoon Woo

    Abstract: In this paper, we present SANE-TTS, a stable and natural end-to-end multilingual TTS model. Owing to the difficulty of obtaining a multilingual corpus for a given speaker, training a multilingual TTS model with monolingual corpora is unavoidable. We introduce speaker regularization loss that improves speech naturalness during cross-lingual synthesis as well as domain adversarial training, which is applied in…

    Submitted 24 June, 2022; originally announced June 2022.

    Comments: Accepted to Interspeech 2022

  24. arXiv:2205.06421  [pdf, other]

    cs.CV cs.AI

    Talking Face Generation with Multilingual TTS

    Authors: Hyoung-Kyu Song, Sang Hoon Woo, Junhyeok Lee, Seungmin Yang, Hyunjae Cho, Youseong Lee, Dongho Choi, Kang-wook Kim

    Abstract: In this work, we propose a joint system combining a talking face generation system with a text-to-speech system that can generate multilingual talking face videos from only the text input. Our system can synthesize natural multilingual speeches while maintaining the vocal identity of the speaker, as well as lip movements synchronized to the synthesized speech. We demonstrate the generalization cap…

    Submitted 12 May, 2022; originally announced May 2022.

    Comments: Accepted to CVPR Demo Track (2022)

  25. arXiv:2202.12520  [pdf, other]

    cond-mat.mes-hall physics.optics

    Cathodoluminescence excitation spectroscopy: nanoscale imaging of excitation pathways

    Authors: Nadezda Varkentina, Yves Auad, Steffi Y. Woo, Alberto Zobelli, Jean-Denis Blazit, Xiaoyan Li, Marcel Tencé, Kenji Watanabe, Takashi Taniguchi, Odile Stéphan, Mathieu Kociak, Luiz H. G. Tizei

    Abstract: Following the lifespan of optical excitations from their creation to decay into photons is crucial in understanding materials' optical properties. Macroscopically, techniques such as the photoluminescence excitation spectroscopy provide unique information on the photophysics of materials with applications as diverse as quantum optics or photovoltaics. Materials excitation and emission pathways are…

    Submitted 8 July, 2022; v1 submitted 25 February, 2022; originally announced February 2022.

  26. arXiv:2202.11359   

    cs.CV cs.AI cs.LG

    Deepfake Detection for Facial Images with Facemasks

    Authors: Donggeun Ko, Sangjun Lee, Jinyong Park, Saebyeol Shin, Donghee Hong, Simon S. Woo

    Abstract: Hyper-realistic face image generation and manipulation have given rise to numerous unethical social issues, e.g., invasion of privacy, threat of security, and malicious political maneuvering, which resulted in the development of recent deepfake detection methods with the rising demands of deepfake forensics. Proposed deepfake detection methods to date have shown remarkable detection performance and…

    Submitted 23 February, 2022; originally announced February 2022.

    Comments: This submission has been removed by arXiv administrators because the submitter did not have the authority to grant the license at the time of submission

  27. Substrate influence on transition metal dichalcogenide monolayer exciton absorption linewidth broadening

    Authors: Fuhui Shao, Steffi Y. Woo, Nianjheng Wu, Robert Schneider, Andrew J. Mayne, Steffen Michaelis de Vasconcellos, Ashish Arora, Benjamin J. Carey, Johann A. Preuß, Noémie Bonnet, Cecilia Mattevi, Kenji Watanabe, Takashi Taniguchi, Zhichuan Niu, Rudolf Bratschitsch, Luiz H. G. Tizei

    Abstract: The excitonic states of transition metal dichalcogenide (TMD) monolayers are heavily influenced by their external dielectric environment based on the substrate used. In this work, various wide bandgap dielectric materials, namely hexagonal boron nitride (\textit{h}-BN) and amorphous silicon nitride (Si$_3$N$_4$), under different configurations as support or encapsulation material for WS$_2$ monola…

    Submitted 9 February, 2022; originally announced February 2022.

  28. Semiclassical magnetotransport including the effects of the Berry curvature and Lorentz force

    Authors: Seungchan Woo, Brett Min, Hongki Min

    Abstract: In topological semimetals and insulators, negative longitudinal magnetoresistance and angle-dependent planar Hall effect have been reported arising from the Berry curvature. Using the Boltzmann transport theory, we present a closed-form expression for the nonequilibrium distribution function which includes both the effects of the Berry curvature and Lorentz force. Using this formulation, we obtain…

    Submitted 30 May, 2022; v1 submitted 6 February, 2022; originally announced February 2022.

    Comments: 14 pages, 3 figures

    Journal ref: Phys. Rev. B 105, 205126 (2022)

  29. arXiv:2201.10168  [pdf, other]

    cs.CV

    Explore-And-Match: Bridging Proposal-Based and Proposal-Free With Transformer for Sentence Grounding in Videos

    Authors: Sangmin Woo, Jinyoung Park, Inyong Koo, Sumin Lee, Minki Jeong, Changick Kim

    Abstract: Natural Language Video Grounding (NLVG) aims to localize time segments in an untrimmed video according to sentence queries. In this work, we present a new paradigm named Explore-And-Match for NLVG that seamlessly unifies the strengths of two streams of NLVG methods: proposal-free and proposal-based; the former explores the search space to find time segments directly, and the latter matches the pre…

    Submitted 4 August, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

    Comments: Code: https://github.com/sangminwoo/Explore-And-Match

  30. arXiv:2201.07394  [pdf, other]

    cs.CV

    KappaFace: Adaptive Additive Angular Margin Loss for Deep Face Recognition

    Authors: Chingis Oinar, Binh M. Le, Simon S. Woo

    Abstract: Feature learning is a widely used method employed for large-scale face recognition. Recently, large-margin softmax loss methods have demonstrated significant enhancements on deep face recognition. These methods propose fixed positive margins in order to enforce intra-class compactness and inter-class diversity. However, the majority of the proposed methods do not consider the class imbalance issue…

    Submitted 6 December, 2023; v1 submitted 18 January, 2022; originally announced January 2022.

  31. arXiv:2201.06026  [pdf, other]

    cs.LG cs.AI cs.SE

    Toward Among-Device AI from On-Device AI with Stream Pipelines

    Authors: MyungJoo Ham, Sangjung Woo, Jaeyun Jung, Wook Song, Gichan Jang, Yongjoo Ahn, Hyoung Joo Ahn

    Abstract: Modern consumer electronic devices often provide intelligence services with deep neural networks. We have started migrating the computing locations of intelligence services from cloud servers (traditional AI systems) to the corresponding devices (on-device AI systems). On-device AI systems generally have the advantages of preserving privacy, removing network latency, and saving cloud costs. With t…

    Submitted 16 January, 2022; originally announced January 2022.

    Comments: to appear in ICSE 2022 SEIP (preprint)

  32. arXiv:2112.12001  [pdf, other]

    cs.CV

    DA-FDFtNet: Dual Attention Fake Detection Fine-tuning Network to Detect Various AI-Generated Fake Images

    Authors: Young Oh Bang, Simon S. Woo

    Abstract: Due to the advancement of Generative Adversarial Networks (GAN), Autoencoders, and other AI technologies, it has been much easier to create fake images such as "Deepfakes". More recent research has introduced few-shot learning, which uses a small amount of training data to produce fake images and videos more effectively. Therefore, the ease of generating manipulated images and the difficulty of di…

    Submitted 22 December, 2021; originally announced December 2021.

  33. arXiv:2112.08050  [pdf, other]

    cs.CV cs.CY

    Exploring the Asynchronous of the Frequency Spectra of GAN-generated Facial Images

    Authors: Binh M. Le, Simon S. Woo

    Abstract: The rapid progression of Generative Adversarial Networks (GANs) has raised a concern of their misuse for malicious purposes, especially in creating fake face images. Although many proposed methods succeed in detecting GAN-based synthetic images, they are still limited by the need for large quantities of the training fake image dataset and challenges for the detector's generalizability to unknown f…

    Submitted 15 December, 2021; originally announced December 2021.

    Comments: International Workshop on Safety and Security of Deep Learning IJCAI, 2021

  34. arXiv:2112.03553  [pdf, other]

    cs.CV

    ADD: Frequency Attention and Multi-View based Knowledge Distillation to Detect Low-Quality Compressed Deepfake Images

    Authors: Binh M. Le, Simon S. Woo

    Abstract: Despite significant advancements of deep learning-based forgery detectors for distinguishing manipulated deepfake images, most detection approaches suffer from moderate to significant performance degradation with low-quality compressed deepfake images. Because of the limited information in low-quality images, detecting low-quality deepfake remains an important challenge. In this work, we apply fre…

    Submitted 7 December, 2021; originally announced December 2021.

    Journal ref: Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

  35. arXiv:2110.04111  [pdf, other]

    cs.CV cs.AI cs.LG

    Discover, Hallucinate, and Adapt: Open Compound Domain Adaptation for Semantic Segmentation

    Authors: KwanYong Park, Sanghyun Woo, Inkyu Shin, In So Kweon

    Abstract: Unsupervised domain adaptation (UDA) for semantic segmentation has been attracting attention recently, as it could be beneficial for various label-scarce real-world scenarios (e.g., robot control, autonomous driving, medical imaging, etc.). Despite the significant progress in this field, current works mainly focus on a single-source single-target setting, which cannot handle more practical setting…

    Submitted 8 October, 2021; originally announced October 2021.

    Comments: NeurIPS 2020

  36. arXiv:2109.09456  [pdf]

    physics.soc-ph

    Multi-modal Matching Problem of Shared Mobility

    Authors: Soomin Woo

    Abstract: Rideshare is one way to share mobility in transportation without increasing traffic demand, where travel mobility and usage of vehicle capacity can be improved. However, current literature on rideshare has allowed only one-modal trips and may be limited in the matching efficiency, especially when there is a large gap between the supply and demand of mobility. Therefore, the objectives of this pape…

    Submitted 12 August, 2021; originally announced September 2021.

  37. arXiv:2109.02993  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS eess.IV

    Evaluation of an Audio-Video Multimodal Deepfake Dataset using Unimodal and Multimodal Detectors

    Authors: Hasam Khalid, Minha Kim, Shahroz Tariq, Simon S. Woo

    Abstract: Significant advancements made in the generation of deepfakes have caused security and privacy issues. Attackers can easily impersonate a person's identity in an image by replacing his face with the target person's face. Moreover, a new domain of cloning human voices using deep-learning technologies is also emerging. Now, an attacker can generate realistic cloned voices of humans using only a few s…

    Submitted 7 September, 2021; originally announced September 2021.

    Comments: 2 Figures, 2 Tables, Accepted for publication at the 1st Workshop on Synthetic Multimedia - Audiovisual Deepfake Generation and Detection (ADGD '21) at ACM MM 2021

    ACM Class: I.4.9; I.5.4

  38. arXiv:2109.01486  [pdf, other]

    eess.IV cs.CV

    Studying the Effects of Self-Attention for Medical Image Analysis

    Authors: Adrit Rao, Jongchan Park, Sanghyun Woo, Joon-Young Lee, Oliver Aalami

    Abstract: When the trained physician interprets medical images, they understand the clinical importance of visual features. By applying cognitive attention, they apply greater focus onto clinically relevant regions while disregarding unnecessary features. The use of computer vision to automate the classification of medical images is widely studied. However, the standard convolutional neural network (CNN) do…

    Submitted 2 September, 2021; originally announced September 2021.

    Comments: ICCV 2021 CVAMD

  39. arXiv:2108.09954  [pdf]

    cs.ET cs.AI physics.app-ph

    Pulse-Width Modulation Neuron Implemented by Single Positive-Feedback Device

    Authors: Sung Yun Woo, Dongseok Kwon, Byung-Gook Park, Jong-Ho Lee, Jong-Ho Bae

    Abstract: A positive-feedback (PF) device and its operation scheme to implement a pulse width modulation (PWM) function were proposed and demonstrated, and the device operation mechanism for implementing the PWM function was analyzed. By adjusting the amount of the charge stored in the n- floating body (Qn), the potential of the floating body linearly changes with time. When Qn reaches a threshold value (Qth), th…

    Submitted 23 August, 2021; originally announced August 2021.

  40. arXiv:2108.05570  [pdf, other]

    cs.CV

    LabOR: Labeling Only if Required for Domain Adaptive Semantic Segmentation

    Authors: Inkyu Shin, Dong-jin Kim, Jae Won Cho, Sanghyun Woo, Kwanyong Park, In So Kweon

    Abstract: Unsupervised Domain Adaptation (UDA) for semantic segmentation has been actively studied to mitigate the domain gap between label-rich source data and unlabeled target data. Despite these efforts, UDA still has a long way to go to reach the fully supervised performance. To this end, we propose a Labeling Only if Required strategy, LabOR, where we introduce a human-in-the-loop approach to adaptivel…

    Submitted 12 August, 2021; originally announced August 2021.

    Comments: Accepted to ICCV 2021 (Oral)

  41. arXiv:2108.05530  [pdf, other]

    eess.SY

    Flow-Aware Platoon Formation of Connected Automated Vehicles

    Authors: Soomin Woo, Alexander Skabardonis

    Abstract: Connected Automated Vehicles (CAVs) bring promise of increasing traffic capacity and energy efficiency by forming platoons with short headways on the road. However, at low CAV penetration, the capacity gain will be small because the CAVs that randomly enter the road will be sparsely distributed, diminishing the probability of forming long platoons. Many researchers propose to solve this issue by pl…

    Submitted 12 August, 2021; originally announced August 2021.

  42. arXiv:2108.05080  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset

    Authors: Hasam Khalid, Shahroz Tariq, Minha Kim, Simon S. Woo

    Abstract: While significant advancements have been made in the generation of deepfakes using deep learning technologies, their misuse is now a well-known issue. Deepfakes can cause severe security and privacy issues as they can be used to impersonate a person's identity in a video by replacing his/her face with another person's face. Recently, a new problem of generating synthesized human voice of a person is…

    Submitted 1 March, 2022; v1 submitted 11 August, 2021; originally announced August 2021.

    Comments: Part of Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks (NeurIPS Datasets and Benchmarks 2021)

    ACM Class: I.4.9; I.5.4

  43. MKConv: Multidimensional Feature Representation for Point Cloud Analysis

    Authors: Sungmin Woo, Dogyoon Lee, Sangwon Hwang, Woojin Kim, Sangyoun Lee

    Abstract: Despite the remarkable success of deep learning, an optimal convolution operation on point clouds remains elusive owing to their irregular data structure. Existing methods mainly focus on designing an effective continuous kernel function that can handle an arbitrary point in continuous space. Various approaches exhibiting high performance have been proposed, but we observe that the standard pointw…

    Submitted 17 July, 2023; v1 submitted 27 July, 2021; originally announced July 2021.

    Comments: Accepted by Pattern Recognition 2023

    Journal ref: Pattern Recognition 143C (2023) 109800

  44. arXiv:2107.11052  [pdf, other]

    cs.CV

    Unsupervised Domain Adaptation for Video Semantic Segmentation

    Authors: Inkyu Shin, Kwanyong Park, Sanghyun Woo, In So Kweon

    Abstract: Unsupervised Domain Adaptation for semantic segmentation has gained immense popularity since it can transfer knowledge from simulation to real (Sim2Real) by largely cutting out the laborious per pixel labeling efforts at real. In this work, we present a new video extension of this task, namely Unsupervised Domain Adaptation for Video Semantic Segmentation. As it became easy to obtain large-scale v…

    Submitted 13 September, 2021; v1 submitted 23 July, 2021; originally announced July 2021.

  45. arXiv:2107.07154  [pdf, other]

    cs.CV cs.AI

    What and When to Look?: Temporal Span Proposal Network for Video Relation Detection

    Authors: Sangmin Woo, Junhyug Noh, Kangil Kim

    Abstract: Identifying relations between objects is central to understanding the scene. While several works have been proposed for relation modeling in the image domain, there have been many constraints in the video domain due to challenging dynamics of spatio-temporal interactions (e.g., between which objects are there an interaction? when do relations start and end?). To date, two representative methods ha…

    Submitted 5 October, 2022; v1 submitted 15 July, 2021; originally announced July 2021.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  46. arXiv:2107.02408  [pdf, other]

    cs.CV cs.CR cs.LG cs.MM

    CoReD: Generalizing Fake Media Detection with Continual Representation using Distillation

    Authors: Minha Kim, Shahroz Tariq, Simon S. Woo

    Abstract: Over the last few decades, artificial intelligence research has made tremendous strides, but it still heavily relies on fixed datasets in stationary environments. Continual learning is a growing field of research that examines how AI systems can learn sequentially from a continuous stream of linked data in the same way that biological systems do. Simultaneously, fake media such as deepfakes and sy…

    Submitted 5 August, 2021; v1 submitted 6 July, 2021; originally announced July 2021.

    Comments: 13 pages, 7 Figures, 13 Tables, Accepted for publication in the 29th ACM International Conference on Multimedia (ACMMM '21)

    ACM Class: I.4.9; I.5.4

  47. arXiv:2106.09453  [pdf, other]

    cs.CV

    Learning to Associate Every Segment for Video Panoptic Segmentation

    Authors: Sanghyun Woo, Dahun Kim, Joon-Young Lee, In So Kweon

    Abstract: Temporal correspondence - linking pixels or objects across frames - is a fundamental supervisory signal for the video models. For the panoptic understanding of dynamic scenes, we further extend this concept to every segment. Specifically, we aim to learn coarse segment-level matching and fine pixel-level matching together. We implement this idea by designing two novel learning objectives. To valid…

    Submitted 17 June, 2021; originally announced June 2021.

    Comments: Accepted to CVPR2021

  48. Tackling the Challenges in Scene Graph Generation with Local-to-Global Interactions

    Authors: Sangmin Woo, Junhyug Noh, Kangil Kim

    Abstract: In this work, we seek new insights into the underlying challenges of the Scene Graph Generation (SGG) task. Quantitative and qualitative analysis of the Visual Genome dataset implies -- 1) Ambiguity: even if inter-object relationship contains the same object (or predicate), they may not be visually or semantically similar, 2) Asymmetry: despite the nature of the relationship that embodied the dire…

    Submitted 12 April, 2022; v1 submitted 15 June, 2021; originally announced June 2021.

    Comments: IEEE Transactions on Neural Networks and Learning Systems (TNNLS)

  49. arXiv:2105.13617  [pdf, other]

    cs.CV

    FReTAL: Generalizing Deepfake Detection using Knowledge Distillation and Representation Learning

    Authors: Minha Kim, Shahroz Tariq, Simon S. Woo

    Abstract: As GAN-based video and image manipulation technologies become more sophisticated and easily accessible, there is an urgent need for effective deepfake detection technologies. Moreover, various deepfake generation techniques have emerged over the past few years. While many deepfake detection methods have been proposed, their performance suffers from new types of deepfake methods on which they are n…

    Submitted 28 May, 2021; originally announced May 2021.

    Comments: 12 pages, 2 figures, 5 tables, accepted for publication at the Workshop on Media Forensics 2021

    ACM Class: I.4.9; I.5.4

  50. arXiv:2105.06117  [pdf, other]

    cs.CV

    TAR: Generalized Forensic Framework to Detect Deepfakes using Weakly Supervised Learning

    Authors: Sangyup Lee, Shahroz Tariq, Junyaup Kim, Simon S. Woo

    Abstract: Deepfakes have become a critical social problem, and detecting them is of utmost importance. Also, deepfake generation methods are advancing, and it is becoming harder to detect. While many deepfake detection models can detect different types of deepfakes separately, they perform poorly on generalizing the detection performance over multiple types of deepfake. This motivates us to develop a genera…

    Submitted 13 May, 2021; originally announced May 2021.

    Comments: 16 pages, 3 figures, to be published in IFIP-SEC 2021