-
FedUV: Uniformity and Variance for Heterogeneous Federated Learning
Authors:
Ha Min Son,
Moon-Hyun Kim,
Tai-Myoung Chung,
Chao Huang,
Xin Liu
Abstract:
Federated learning is a promising framework to train neural networks with widely distributed data. However, performance degrades heavily with heterogeneously distributed data. Recent work has shown this is due to the final layer of the network being most prone to local bias, with some finding success by freezing the final layer as an orthogonal classifier. We investigate the training dynamics of the classifier by applying SVD to the weights, motivated by the observation that freezing weights results in constant singular values. We find that there are differences when training in IID and non-IID settings. Based on this finding, we introduce two regularization terms for local training to continuously emulate IID settings: (1) variance in the dimension-wise probability distribution of the classifier and (2) hyperspherical uniformity of representations of the encoder. These regularizations encourage local models to act as if they were in an IID setting regardless of the local data distribution, thus offsetting proneness to bias while remaining flexible to the data. In extensive experiments in both label-shift and feature-shift settings, we verify that our method achieves the highest performance by a large margin, especially in highly non-IID cases, in addition to being scalable to larger models and datasets.
Submitted 1 March, 2024; v1 submitted 27 February, 2024;
originally announced February 2024.
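The two regularizers named in the FedUV abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption: the uniformity term uses a commonly seen Gaussian-potential form over pairs of unit-normalized representations, the variance term is the plain per-class variance of predicted probabilities over a batch, and the names `uniformity_loss`, `dimension_variance` and the temperature `t` are hypothetical.

```python
import math


def l2_normalize(v):
    # Project a nonzero vector onto the unit hypersphere.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def uniformity_loss(reps, t=2.0):
    # Log of the mean Gaussian potential over all pairs of normalized
    # representations; lower values mean the points are spread more evenly.
    # (Assumed form, not taken from the paper.)
    z = [l2_normalize(r) for r in reps]
    total, pairs = 0.0, 0
    for i in range(len(z)):
        for j in range(i + 1, len(z)):
            sq_dist = sum((a - b) ** 2 for a, b in zip(z[i], z[j]))
            total += math.exp(-t * sq_dist)
            pairs += 1
    return math.log(total / pairs)


def dimension_variance(probs):
    # Variance of each class probability across the batch. A batch whose
    # predictions all collapse onto one class has zero variance everywhere.
    n = len(probs)
    out = []
    for c in range(len(probs[0])):
        column = [p[c] for p in probs]
        mean = sum(column) / n
        out.append(sum((x - mean) ** 2 for x in column) / n)
    return out
```

In a training loop one would presumably subtract a variance reward and add the uniformity term to the local loss, with weights that the abstract does not specify.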
-
Breaking Temporal Consistency: Generating Video Universal Adversarial Perturbations Using Image Models
Authors:
Hee-Seon Kim,
Minji Son,
Minbeom Kim,
Myung-Joon Kwon,
Changick Kim
Abstract:
As video analysis using deep learning models becomes more widespread, the vulnerability of such models to adversarial attacks is becoming a pressing concern. In particular, Universal Adversarial Perturbation (UAP) poses a significant threat, as a single perturbation can mislead deep learning models on entire datasets. We propose a novel video UAP using image data and image model. This enables us to take advantage of the rich image data and image model-based studies available for video applications. However, there is a challenge that image models are limited in their ability to analyze the temporal aspects of videos, which is crucial for a successful video attack. To address this challenge, we introduce the Breaking Temporal Consistency (BTC) method, which is the first attempt to incorporate temporal information into video attacks using image models. We aim to generate adversarial videos that have opposite patterns to the original. Specifically, BTC-UAP minimizes the feature similarity between neighboring frames in videos. Our approach is simple but effective at attacking unseen video models. Additionally, it is applicable to videos of varying lengths and invariant to temporal shifts. Our approach surpasses existing methods in terms of effectiveness on various datasets, including ImageNet, UCF-101, and Kinetics-400.
Submitted 17 November, 2023;
originally announced November 2023.
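The BTC-UAP abstract states that the perturbation minimizes the feature similarity between neighboring frames. A minimal sketch of such an objective follows, assuming cosine similarity as the similarity measure and per-frame feature vectors that have already been extracted by an image model; the function names are hypothetical and the paper's actual loss may differ.

```python
import math


def cosine(u, v):
    # Cosine similarity between two nonzero feature vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def neighbor_similarity(frame_features):
    # Mean similarity between each frame and the next one. An attack in the
    # spirit of the abstract would drive this value down.
    sims = [cosine(a, b) for a, b in zip(frame_features, frame_features[1:])]
    return sum(sims) / len(sims)
```

Because the score only compares adjacent frames, it is defined for any clip with at least two frames, which is consistent with the abstract's claim of applicability to videos of varying lengths.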
-
Sketch-based Video Object Localization
Authors:
Sangmin Woo,
So-Yeong Jeon,
Jinyoung Park,
Minji Son,
Sumin Lee,
Changick Kim
Abstract:
We introduce Sketch-based Video Object Localization (SVOL), a new task aimed at localizing spatio-temporal object boxes in video queried by the input sketch. We first outline the challenges in the SVOL task and build the Sketch-Video Attention Network (SVANet) with the following design principles: (i) to consider temporal information of video and bridge the domain gap between sketch and video; (ii) to accurately identify and localize multiple objects simultaneously; (iii) to handle various styles of sketches; (iv) to be classification-free. In particular, SVANet is equipped with a Cross-modal Transformer that models the interaction between learnable object tokens, query sketch, and video through attention operations, and learns upon a per-frame set matching strategy that enables frame-wise prediction while utilizing global video context. We evaluate SVANet on a newly curated SVOL dataset. By design, SVANet successfully learns the mapping between the query sketches and video objects, achieving state-of-the-art results on the SVOL benchmark. We further confirm the effectiveness of SVANet via extensive ablation studies and visualizations. Lastly, we demonstrate its transfer capability on unseen datasets and novel categories, suggesting its high scalability in real-world applications.
Submitted 29 November, 2023; v1 submitted 2 April, 2023;
originally announced April 2023.
-
Temporal Interpolation Is All You Need for Dynamic Neural Radiance Fields
Authors:
Sungheon Park,
Minjung Son,
Seokhwan Jang,
Young Chun Ahn,
Ji-Yeon Kim,
Nahyup Kang
Abstract:
Temporal interpolation often plays a crucial role to learn meaningful representations in dynamic scenes. In this paper, we propose a novel method to train spatiotemporal neural radiance fields of dynamic scenes based on temporal interpolation of feature vectors. Two feature interpolation methods are suggested depending on underlying representations, neural networks or grids. In the neural representation, we extract features from space-time inputs via multiple neural network modules and interpolate them based on time frames. The proposed multi-level feature interpolation network effectively captures features of both short-term and long-term time ranges. In the grid representation, space-time features are learned via four-dimensional hash grids, which remarkably reduces training time. The grid representation shows more than 100 times faster training speed than the previous neural-net-based methods while maintaining the rendering quality. Concatenating static and dynamic features and adding a simple smoothness term further improve the performance of our proposed models. Despite the simplicity of the model architectures, our method achieved state-of-the-art performance both in rendering quality for the neural representation and in training speed for the grid representation.
Submitted 29 March, 2023; v1 submitted 18 February, 2023;
originally announced February 2023.
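The core idea in the temporal interpolation abstract, interpolating feature vectors between time frames, can be sketched as follows. This assumes plain linear interpolation between feature vectors stored at integer keyframe indices; the paper's multi-level network and four-dimensional hash grids are not reproduced, and `interpolate_feature` is a hypothetical name.

```python
def interpolate_feature(keyframe_features, t):
    # keyframe_features: feature vectors at integer times 0..K-1 (K >= 2).
    # t: query time, clamped to the range [0, K-1].
    last = len(keyframe_features) - 1
    t = min(max(float(t), 0.0), float(last))
    lo = min(int(t), last - 1)  # left keyframe index
    w = t - lo                  # weight of the right keyframe
    left, right = keyframe_features[lo], keyframe_features[lo + 1]
    return [(1.0 - w) * a + w * b for a, b in zip(left, right)]
```

For example, with keyframes `[[0.0, 0.0], [4.0, 8.0]]`, a query at `t = 0.25` blends a quarter of the way toward the second keyframe.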
-
RainUNet for Super-Resolution Rain Movie Prediction under Spatio-temporal Shifts
Authors:
Jinyoung Park,
Minseok Son,
Seungju Cho,
Inyoung Lee,
Changick Kim
Abstract:
This paper presents a solution to the Weather4cast 2022 Challenge Stage 2. The goal of the challenge is to forecast future high-resolution rainfall events obtained from ground radar using low-resolution multiband satellite images. We suggest a solution that performs data preprocessing appropriate to the challenge and then predicts rainfall movies using a novel RainUNet. RainUNet is a hierarchical U-shaped network with a temporal-wise separable block (TS block) that uses a decoupled large kernel 3D convolution to improve prediction performance. Various evaluation metrics show that our solution is effective compared to the baseline method. The source code is available at https://github.com/jinyxp/Weather4cast-2022
Submitted 7 December, 2022;
originally announced December 2022.
-
SinGRAF: Learning a 3D Generative Radiance Field for a Single Scene
Authors:
Minjung Son,
Jeong Joon Park,
Leonidas Guibas,
Gordon Wetzstein
Abstract:
Generative models have shown great promise in synthesizing photorealistic 3D objects, but they require large amounts of training data. We introduce SinGRAF, a 3D-aware generative model that is trained with a few input images of a single scene. Once trained, SinGRAF generates different realizations of this 3D scene that preserve the appearance of the input while varying scene layout. For this purpose, we build on recent progress in 3D GAN architectures and introduce a novel progressive-scale patch discrimination approach during training. With several experiments, we demonstrate that the results produced by SinGRAF outperform the closest related works in both quality and diversity by a large margin.
Submitted 2 April, 2023; v1 submitted 30 November, 2022;
originally announced November 2022.
-
Compare Where It Matters: Using Layer-Wise Regularization To Improve Federated Learning on Heterogeneous Data
Authors:
Ha Min Son,
Moon Hyun Kim,
Tai-Myoung Chung
Abstract:
Federated Learning is a widely adopted method to train neural networks over distributed data. One main limitation is the performance degradation that occurs when data is heterogeneously distributed. While many works have attempted to address this problem, these methods under-perform because they are founded on a limited understanding of neural networks. In this work, we verify that only certain important layers in a neural network require regularization for effective training. We additionally verify that Centered Kernel Alignment (CKA) most accurately calculates similarity between layers of neural networks trained on different data. By applying CKA-based regularization to important layers during training, we significantly improve performance in heterogeneous settings. We present FedCKA: a simple framework that out-performs previous state-of-the-art methods on various deep learning tasks while also improving efficiency and scalability.
Submitted 1 December, 2021;
originally announced December 2021.
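The FedCKA abstract relies on Centered Kernel Alignment (CKA) to measure similarity between layer representations. A minimal sketch of the linear variant is given below; the abstract does not say which kernel is used, so the linear form is an assumption, as are the function names. Each input is a list of rows, one row per example and one column per feature.

```python
import math


def center_columns(m):
    # Subtract the column mean from every entry.
    n = len(m)
    means = [sum(row[j] for row in m) / n for j in range(len(m[0]))]
    return [[row[j] - means[j] for j in range(len(means))] for row in m]


def cross_frobenius_sq(a, b):
    # Squared Frobenius norm of a^T b, computed entry by entry.
    total = 0.0
    for i in range(len(a[0])):
        for j in range(len(b[0])):
            s = sum(a[r][i] * b[r][j] for r in range(len(a)))
            total += s * s
    return total


def linear_cka(x, y):
    # Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on centered data.
    xc, yc = center_columns(x), center_columns(y)
    denom = math.sqrt(cross_frobenius_sq(xc, xc) * cross_frobenius_sq(yc, yc))
    return cross_frobenius_sq(xc, yc) / denom
```

Linear CKA is invariant to isotropic scaling and to orthogonal transformations of the features, so a layer compared with a scaled or column-permuted copy of itself scores 1.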
-
Personalized Federated Learning with Clustering: Non-IID Heart Rate Variability Data Application
Authors:
Joo Hun Yoo,
Ha Min Son,
Hyejun Jeong,
Eun-Hye Jang,
Ah Young Kim,
Han Young Yu,
Hong Jin Jeon,
Tai-Myoung Chung
Abstract:
While machine learning techniques are being applied to various fields for their exceptional ability to find complex relations in large datasets, the strengthening of regulations on data ownership and privacy is making their application to medical data increasingly difficult. In light of this, Federated Learning has recently been proposed as a solution to train on private data without breach of confidentiality. This conservation of privacy is particularly appealing in the field of healthcare, where patient data is highly confidential. However, many studies have shown that its assumption of Independent and Identically Distributed data is unrealistic for medical data. In this paper, we propose Personalized Federated Cluster Models, a hierarchical clustering-based FL process, to predict Major Depressive Disorder severity from Heart Rate Variability. By allowing clients to receive a more personalized model, we address problems caused by non-IID data, showing an accuracy increase in severity prediction. This increase in performance may be sufficient to use Personalized Federated Cluster Models in many existing Federated Learning scenarios.
Submitted 10 August, 2021; v1 submitted 4 August, 2021;
originally announced August 2021.
-
Fuzzy Approximate Reasoning Method based on Least Common Multiple and its Property Analysis
Authors:
I. M. Son,
S. I. Kwak,
M. O. Choe
Abstract:
This paper presents a novel fuzzy approximate reasoning method based on the least common multiple (LCM). Its fundamental idea is to obtain a new fuzzy reasoning result using an extended distance measure, based on the LCM, between the antecedent fuzzy set and the consequent one in a discrete SISO fuzzy system. The proposed method is called the LCM method. The paper then analyzes some of its properties, i.e., the reductive property, the information loss that occurs in the reasoning process, and the convergence of fuzzy control. Theoretical and experimental results indicate that the proposed method meaningfully improves the reductive property, information loss, and controllability compared with previous fuzzy reasoning methods.
Submitted 5 October, 2020;
originally announced October 2020.
-
NPRportrait 1.0: A Three-Level Benchmark for Non-Photorealistic Rendering of Portraits
Authors:
Paul L. Rosin,
Yu-Kun Lai,
David Mould,
Ran Yi,
Itamar Berger,
Lars Doyle,
Seungyong Lee,
Chuan Li,
Yong-Jin Liu,
Amir Semmo,
Ariel Shamir,
Minjung Son,
Holger Winnemoller
Abstract:
Despite the recent upsurge of activity in image-based non-photorealistic rendering (NPR), and in particular portrait image stylisation, due to the advent of neural style transfer, the state of performance evaluation in this field is limited, especially compared to the norms in the computer vision and machine learning communities. Unfortunately, the task of evaluating image stylisation is thus far not well defined, since it involves subjective, perceptual and aesthetic aspects. To make progress towards a solution, this paper proposes a new structured, three level, benchmark dataset for the evaluation of stylised portrait images. Rigorous criteria were used for its construction, and its consistency was validated by user studies. Moreover, a new methodology has been developed for evaluating portrait stylisation algorithms, which makes use of the different benchmark levels as well as annotations provided by user studies regarding the characteristics of the faces. We perform evaluation for a wide variety of image stylisation methods (both portrait-specific and general purpose, and also both traditional NPR approaches and neural style transfer) using the new benchmark dataset.
Submitted 1 September, 2020;
originally announced September 2020.
-
Deep Convolutional Neural Network for Identifying Seam-Carving Forgery
Authors:
Seung-Hun Nam,
Wonhyuk Ahn,
In-Jae Yu,
Myung-Joon Kwon,
Minseok Son,
Heung-Kyu Lee
Abstract:
Seam carving is a representative content-aware image retargeting approach to adjust the size of an image while preserving its visually prominent content. To maintain visually important content, seam-carving algorithms first calculate the connected path of pixels, referred to as the seam, according to a defined cost function and then adjust the size of an image by removing and duplicating repeatedly calculated seams. Seam carving is actively exploited to overcome diversity in the resolution of images between applications and devices; hence, detecting the distortion caused by seam carving has become important in image forensics. In this paper, we propose a convolutional neural network (CNN)-based approach to classifying seam-carving-based image retargeting for reduction and expansion. To attain the ability to learn low-level features, we designed a CNN architecture comprising five types of network blocks specialized for capturing subtle signals. An ensemble module is further adopted to both enhance performance and comprehensively analyze the features in the local areas of the given image. To validate the effectiveness of our work, extensive experiments based on various CNN-based baselines were conducted. Compared to the baselines, our work exhibits state-of-the-art performance in terms of three-class classification (original, seam inserted, and seam removed). In addition, our model with the ensemble module is robust for various unseen cases. The experimental results also demonstrate that our method can be applied to localize both seam-removed and seam-inserted areas.
Submitted 7 July, 2020; v1 submitted 5 July, 2020;
originally announced July 2020.
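The seam-carving operation that the forgery detector above targets can be sketched with the standard dynamic-programming formulation: accumulate the cost of the cheapest connected path from the top row, then backtrack from the cheapest cell in the bottom row. The energy map is taken as given (the cost function is not specified in the abstract), and the function names are hypothetical.

```python
def min_vertical_seam(energy):
    # energy: 2D list of per-pixel costs. Returns one column index per row,
    # describing the connected top-to-bottom path with minimal total cost.
    rows, cols = len(energy), len(energy[0])
    cost = [row[:] for row in energy]
    for r in range(1, rows):
        for c in range(cols):
            lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
            cost[r][c] += min(cost[r - 1][lo:hi + 1])
    # Backtrack from the cheapest cell in the last row.
    c = min(range(cols), key=lambda j: cost[-1][j])
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
        c = min(range(lo, hi + 1), key=lambda j: cost[r - 1][j])
        seam.append(c)
    seam.reverse()
    return seam


def remove_seam(image, seam):
    # Delete one pixel per row, shrinking the image width by one.
    return [row[:c] + row[c + 1:] for row, c in zip(image, seam)]
```

Seam insertion, the expansion case mentioned in the abstract, would duplicate the pixels along the seam instead of deleting them.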
-
A Novel Fuzzy Approximate Reasoning Method Based on Extended Distance Measure in SISO Fuzzy System
Authors:
I. M. Son,
S. I. Kwak,
U. J. Han,
J. H. Pak,
M. Han,
J. R. Pyon,
U. S. Ryu
Abstract:
This paper presents an original method of fuzzy approximate reasoning that can open a new direction of research in the uncertainty inference of Artificial Intelligence (AI) and Computational Intelligence (CI). Fuzzy modus ponens (FMP) and fuzzy modus tollens (FMT) are two fundamental models of general fuzzy approximate reasoning in various fuzzy systems. The reductive property is one of the essential properties in approximate reasoning theory and has many applications. This paper suggests an extended distance measure (EDM) based approximate reasoning method for the single input single output (SISO) fuzzy system with discrete fuzzy set vectors of different dimensions. The EDM based fuzzy approximate reasoning method consists of two parts, i.e., FMP-EDM and FMT-EDM. The distance measure based fuzzy reasoning method for the case in which the dimension of the antecedent discrete fuzzy set equals that of the consequent discrete fuzzy set has already been solved in another paper. In this paper, discrete fuzzy set vectors of different dimensions mean that the dimension of the antecedent discrete fuzzy set differs from that of the consequent discrete fuzzy set in the SISO fuzzy system. That is, this paper is based on EDM. The experimental results highlight that the proposed approximate reasoning method is comparatively clear and effective with respect to the reductive property, and is more in accordance with human thinking than existing fuzzy reasoning methods.
Submitted 26 March, 2020;
originally announced March 2020.
-
Reductive property of new fuzzy reasoning method based on distance measure
Authors:
Son-il Kwak,
Gum-ju Kim,
Michio Sugeno,
Gwang-chol Li,
Myong-suk Son,
Hyok-chol Kim,
Un-ha Kim
Abstract:
First, in this paper we propose a new criterion function for evaluating the reductive property of the fuzzy reasoning result for fuzzy modus ponens and fuzzy modus tollens. Second, unlike fuzzy reasoning methods based on the similarity measure, we propose a new fuzzy reasoning method based on a distance measure. Third, the reductive property of 5 fuzzy reasoning methods is checked with respect to fuzzy modus ponens and fuzzy modus tollens. Through experiments, we show that the proposed method is more in accordance with human thinking than the previous methods.
Submitted 7 September, 2018;
originally announced September 2018.
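The reductive property discussed in the entry above asks that fuzzy modus ponens return the rule's consequent B when the observed input equals the rule's antecedent A. The sketch below checks this for the classical max-min compositional rule of inference with min implication. This is not the distance-measure method the paper proposes, only a familiar baseline; fuzzy sets are represented as lists of membership degrees, and the function name is hypothetical.

```python
def fuzzy_modus_ponens(a, b, a_observed):
    # Rule: "if x is A then y is B". Observation: "x is A'".
    # B'(y) = max over x of min(A'(x), min(A(x), B(y))).
    return [
        max(min(ao, min(ai, bj)) for ao, ai in zip(a_observed, a))
        for bj in b
    ]


def is_reductive(a, b):
    # The property holds for this rule if observing A itself reproduces B.
    return fuzzy_modus_ponens(a, b, a) == b
```

With this baseline the property holds whenever A reaches membership 1 somewhere, and can fail otherwise: for A = [0.0, 0.5, 0.8] the output is capped at 0.8, so a consequent containing 1.0 is not recovered.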
-
Reflection-Aware Sound Source Localization
Authors:
Inkyu An,
Myungbae Son,
Dinesh Manocha,
Sung-eui Yoon
Abstract:
We present a novel, reflection-aware method for 3D sound localization in indoor environments. Unlike prior approaches, which are mainly based on continuous sound signals from a stationary source, our formulation is designed to localize the position instantaneously from signals within a single frame. We consider direct sound and indirect sound signals that reach the microphones after reflecting off surfaces such as ceilings or walls. We then generate and trace direct and reflected acoustic paths using inverse acoustic ray tracing and utilize these paths with Monte Carlo localization to estimate a 3D sound source position. We have implemented our method on a robot with a cube-shaped microphone array and tested it against different settings with continuous and intermittent sound signals with a stationary or a mobile source. Across different settings, our approach can localize the sound with an average distance error of 0.8m tested in a room of 7m by 7m area with 3m height, including a mobile and non-line-of-sight sound source. We also reveal that the modeling of indirect rays increases the localization accuracy by 40% compared to only using direct acoustic rays.
Submitted 21 November, 2017;
originally announced November 2017.
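Tracing reflected acoustic paths, as described in the sound localization abstract, requires mirroring a ray direction at a wall or ceiling. A minimal sketch of that single step follows, assuming ideal specular reflection and a unit-length surface normal; the Monte Carlo localization stage is not shown, and `reflect` is a hypothetical name.

```python
def reflect(direction, normal):
    # Specular reflection: r = d - 2 (d . n) n, with n a unit normal.
    d_dot_n = sum(d * n for d, n in zip(direction, normal))
    return [d - 2.0 * d_dot_n * n for d, n in zip(direction, normal)]
```

For a ray travelling along `[1.0, -1.0, 0.0]` that hits a floor with normal `[0.0, 1.0, 0.0]`, the vertical component flips while the horizontal component is preserved.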