Showing 1–6 of 6 results for author: Nugroho, M A

  1. arXiv:2405.18012

    cs.CV eess.IV

    Flow-Assisted Motion Learning Network for Weakly-Supervised Group Activity Recognition

    Authors: Muhammad Adi Nugroho, Sangmin Woo, Sumin Lee, Jinyoung Park, Yooseung Wang, Donguk Kim, Changick Kim

    Abstract: Weakly-Supervised Group Activity Recognition (WSGAR) aims to understand the activity performed together by a group of individuals with the video-level label and without actor-level labels. We propose Flow-Assisted Motion Learning Network (Flaming-Net) for WSGAR, which consists of the motion-aware actor encoder to extract actor features and the two-pathways relation module to infer the interaction…

    Submitted 28 May, 2024; originally announced May 2024.

  2. arXiv:2311.12344

    cs.CV

    Modality Mixer Exploiting Complementary Information for Multi-modal Action Recognition

    Authors: Sumin Lee, Sangmin Woo, Muhammad Adi Nugroho, Changick Kim

    Abstract: Due to the distinctive characteristics of sensors, each modality exhibits unique physical properties. For this reason, in the context of multi-modal action recognition, it is important to consider not only the overall action content but also the complementary nature of different modalities. In this paper, we propose a novel network, named Modality Mixer (M-Mixer) network, which effectively leverag…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2208.11314

  3. arXiv:2308.09322

    cs.CV cs.AI cs.MM

    Audio-Visual Glance Network for Efficient Video Recognition

    Authors: Muhammad Adi Nugroho, Sangmin Woo, Sumin Lee, Changick Kim

    Abstract: Deep learning has made significant strides in video understanding tasks, but the computation required to classify lengthy and massive videos using clip-level video classifiers remains impractical and prohibitively expensive. To address this issue, we propose Audio-Visual Glance Network (AVGN), which leverages the commonly available audio and visual modalities to efficiently process the spatio-temp…

    Submitted 18 August, 2023; originally announced August 2023.

    Comments: ICCV 2023

  4. arXiv:2211.13916

    cs.CV cs.AI cs.LG

    Towards Good Practices for Missing Modality Robust Action Recognition

    Authors: Sangmin Woo, Sumin Lee, Yeonju Park, Muhammad Adi Nugroho, Changick Kim

    Abstract: Standard multi-modal models assume the use of the same modalities in training and inference stages. However, in practice, the environment in which multi-modal models operate may not satisfy such assumption. As such, their performances degrade drastically if any modality is missing in the inference stage. We ask: how can we train a model that is robust to missing modalities? This paper seeks a set…

    Submitted 30 March, 2023; v1 submitted 25 November, 2022; originally announced November 2022.

    Comments: AAAI 2023 (Oral); Code: https://github.com/sangminwoo/ActionMAE

  5. arXiv:2208.11314

    cs.CV

    Modality Mixer for Multi-modal Action Recognition

    Authors: Sumin Lee, Sangmin Woo, Yeonju Park, Muhammad Adi Nugroho, Changick Kim

    Abstract: In multi-modal action recognition, it is important to consider not only the complementary nature of different modalities but also global action content. In this paper, we propose a novel network, named Modality Mixer (M-Mixer) network, to leverage complementary information across modalities and temporal context of an action for multi-modal action recognition. We also introduce a simple yet effecti…

    Submitted 21 February, 2023; v1 submitted 24 August, 2022; originally announced August 2022.

    Comments: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023

  6. arXiv:2207.02066

    cs.CV

    Test-time Adaptation for Real Image Denoising via Meta-transfer Learning

    Authors: Agus Gunawan, Muhammad Adi Nugroho, Se Jin Park

    Abstract: In recent years, a ton of research has been conducted on real image denoising tasks. However, the efforts are more focused on improving real image denoising through creating a better network architecture. We explore a different direction where we propose to improve real image denoising performance through a better learning strategy that can enable test-time adaptation on the multi-task network. Th…

    Submitted 5 July, 2022; originally announced July 2022.