subscribe to arXiv mailings

SYNAuG: Exploiting Synthetic Data for Data Imbalance Problems

Authors: Moon Ye-Bin, Nam Hyeon-Woo, Wonseok Choi, Nayeong Kim, Suha Kwak, Tae-Hyun Oh

Abstract: Data imbalance in training data often leads to biased predictions from trained models, which in turn causes ethical and social issues. A straightforward solution is to carefully curate training data, but given the enormous scale of modern neural networks, this is prohibitively labor-intensive and thus impractical. Inspired by recent developments in generative models, this paper explores the potent… ▽ More Data imbalance in training data often leads to biased predictions from trained models, which in turn causes ethical and social issues. A straightforward solution is to carefully curate training data, but given the enormous scale of modern neural networks, this is prohibitively labor-intensive and thus impractical. Inspired by recent developments in generative models, this paper explores the potential of synthetic data to address the data imbalance problem. To be specific, our method, dubbed SYNAuG, leverages synthetic data to equalize the unbalanced distribution of training data. Our experiments demonstrate that, although a domain gap between real and synthetic data exists, training with SYNAuG followed by fine-tuning with a few real samples allows to achieve impressive performance on diverse tasks with different data imbalance issues, surpassing existing task-specific methods for the same purpose. △ Less

Submitted 25 April, 2024; v1 submitted 2 August, 2023; originally announced August 2023.

Comments: The paper is under consideration at Pattern Recognition Letters

arXiv:2307.14611 [pdf, other]

TextManiA: Enriching Visual Feature by Text-driven Manifold Augmentation

Authors: Moon Ye-Bin, Jisoo Kim, Hongyeob Kim, Kilho Son, Tae-Hyun Oh

Abstract: We propose TextManiA, a text-driven manifold augmentation method that semantically enriches visual feature spaces, regardless of class distribution. TextManiA augments visual data with intra-class semantic perturbation by exploiting easy-to-understand visually mimetic words, i.e., attributes. This work is built on an interesting hypothesis that general language models, e.g., BERT and GPT, encompas… ▽ More We propose TextManiA, a text-driven manifold augmentation method that semantically enriches visual feature spaces, regardless of class distribution. TextManiA augments visual data with intra-class semantic perturbation by exploiting easy-to-understand visually mimetic words, i.e., attributes. This work is built on an interesting hypothesis that general language models, e.g., BERT and GPT, encompass visual information to some extent, even without training on visual training data. Given the hypothesis, TextManiA transfers pre-trained text representation obtained from a well-established large language encoder to a target visual feature space being learned. Our extensive analysis hints that the language encoder indeed encompasses visual information at least useful to augment visual representation. Our experiments demonstrate that TextManiA is particularly powerful in scarce samples with class imbalance as well as even distribution. We also show compatibility with the label mix-based approaches in evenly distributed scarce data. △ Less

Submitted 11 September, 2023; v1 submitted 26 July, 2023; originally announced July 2023.

Comments: Accepted at ICCV 2023. [Project Pages] https://textmania.github.io/

arXiv:2302.09765 [pdf, other]

ENInst: Enhancing Weakly-supervised Low-shot Instance Segmentation

Authors: Moon Ye-Bin, Dongmin Choi, Yongjin Kwon, Junsik Kim, Tae-Hyun Oh

Abstract: We address a weakly-supervised low-shot instance segmentation, an annotation-efficient training method to deal with novel classes effectively. Since it is an under-explored problem, we first investigate the difficulty of the problem and identify the performance bottleneck by conducting systematic analyses of model components and individual sub-tasks with a simple baseline model. Based on the analy… ▽ More We address a weakly-supervised low-shot instance segmentation, an annotation-efficient training method to deal with novel classes effectively. Since it is an under-explored problem, we first investigate the difficulty of the problem and identify the performance bottleneck by conducting systematic analyses of model components and individual sub-tasks with a simple baseline model. Based on the analyses, we propose ENInst with sub-task enhancement methods: instance-wise mask refinement for enhancing pixel localization quality and novel classifier composition for improving classification accuracy. Our proposed method lifts the overall performance by enhancing the performance of each sub-task. We demonstrate that our ENInst is 7.5 times more efficient in achieving comparable performance to the existing fully-supervised few-shot models and even outperforms them at times. △ Less

Submitted 30 July, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

Comments: Accepted at Pattern Recognition (PR)

arXiv:2208.06787 [pdf, other]

HDR-Plenoxels: Self-Calibrating High Dynamic Range Radiance Fields

Authors: Kim Jun-Seong, Kim Yu-Ji, Moon Ye-Bin, Tae-Hyun Oh

Abstract: We propose high dynamic range (HDR) radiance fields, HDR-Plenoxels, that learn a plenoptic function of 3D HDR radiance fields, geometry information, and varying camera settings inherent in 2D low dynamic range (LDR) images. Our voxel-based volume rendering pipeline reconstructs HDR radiance fields with only multi-view LDR images taken from varying camera settings in an end-to-end manner and has a… ▽ More We propose high dynamic range (HDR) radiance fields, HDR-Plenoxels, that learn a plenoptic function of 3D HDR radiance fields, geometry information, and varying camera settings inherent in 2D low dynamic range (LDR) images. Our voxel-based volume rendering pipeline reconstructs HDR radiance fields with only multi-view LDR images taken from varying camera settings in an end-to-end manner and has a fast convergence speed. To deal with various cameras in real-world scenarios, we introduce a tone mapping module that models the digital in-camera imaging pipeline (ISP) and disentangles radiometric settings. Our tone mapping module allows us to render by controlling the radiometric settings of each novel view. Finally, we build a multi-view dataset with varying camera conditions, which fits our problem setting. Our experiments show that HDR-Plenoxels can express detail and high-quality HDR novel views from only LDR images with various cameras. △ Less

Submitted 18 November, 2022; v1 submitted 14 August, 2022; originally announced August 2022.

Comments: Accepted at ECCV 2022. [Project page] https://hdr-plenoxels.github.io [Code] https://github.com/postech-ami/HDR-Plenoxels

arXiv:2108.06098 [pdf, other]

FedPara: Low-Rank Hadamard Product for Communication-Efficient Federated Learning

Authors: Nam Hyeon-Woo, Moon Ye-Bin, Tae-Hyun Oh

Abstract: In this work, we propose a communication-efficient parameterization, FedPara, for federated learning (FL) to overcome the burdens on frequent model uploads and downloads. Our method re-parameterizes weight parameters of layers using low-rank weights followed by the Hadamard product. Compared to the conventional low-rank parameterization, our FedPara method is not restricted to low-rank constraints… ▽ More In this work, we propose a communication-efficient parameterization, FedPara, for federated learning (FL) to overcome the burdens on frequent model uploads and downloads. Our method re-parameterizes weight parameters of layers using low-rank weights followed by the Hadamard product. Compared to the conventional low-rank parameterization, our FedPara method is not restricted to low-rank constraints, and thereby it has a far larger capacity. This property enables to achieve comparable performance while requiring 3 to 10 times lower communication costs than the model with the original layers, which is not achievable by the traditional low-rank methods. The efficiency of our method can be further improved by combining with other efficient FL optimizers. In addition, we extend our method to a personalized FL application, pFedPara, which separates parameters into global and local ones. We show that pFedPara outperforms competing personalized FL methods with more than three times fewer parameters. △ Less

Submitted 19 January, 2023; v1 submitted 13 August, 2021; originally announced August 2021.

Comments: Accepted at ICLR 2022

Showing 1–5 of 5 results for author: Ye-Bin, M