-
Diffusion Model Patching via Mixture-of-Prompts
Authors:
Seokil Ham,
Sangmin Woo,
Jin-Young Kim,
Hyojun Go,
Byeongjun Park,
Changick Kim
Abstract:
We present Diffusion Model Patching (DMP), a simple method to boost the performance of pre-trained diffusion models that have already reached convergence, with a negligible increase in parameters. DMP inserts a small, learnable set of prompts into the model's input space while keeping the original model frozen. The effectiveness of DMP is not merely due to the addition of parameters but stems from its dynamic gating mechanism, which selects and combines a subset of learnable prompts at every step of the generative process (e.g., reverse denoising steps). This strategy, which we term "mixture-of-prompts", enables the model to draw on the distinct expertise of each prompt, essentially "patching" the model's functionality at every step with minimal yet specialized parameters. Uniquely, DMP enhances the model by further training on the same dataset on which it was originally trained, even in a scenario where significant improvements are typically not expected due to model convergence. Experiments show that DMP significantly enhances the converged FID of DiT-L/2 on FFHQ 256x256 by 10.38%, achieved with only a 1.43% parameter increase and 50K additional training iterations.
Submitted 30 May, 2024; v1 submitted 28 May, 2024;
originally announced May 2024.
-
Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts
Authors:
Byeongjun Park,
Hyojun Go,
Jin-Young Kim,
Sangmin Woo,
Seokil Ham,
Changick Kim
Abstract:
Diffusion models have achieved remarkable success across a range of generative tasks. Recent efforts to enhance diffusion model architectures have reimagined them as a form of multi-task learning, where each task corresponds to a denoising task at a specific noise level. While these efforts have focused on parameter isolation and task routing, they fall short of capturing detailed inter-task relationships and risk losing semantic information, respectively. In response, we introduce Switch Diffusion Transformer (Switch-DiT), which establishes inter-task relationships between conflicting tasks without compromising semantic information. To achieve this, we employ a sparse mixture-of-experts within each transformer block to utilize semantic information and facilitate handling conflicts in tasks through parameter isolation. Additionally, we propose a diffusion prior loss, encouraging similar tasks to share their denoising paths while isolating conflicting ones. Through these, each transformer block contains a shared expert across all tasks, where the common and task-specific denoising paths enable the diffusion model to construct its beneficial way of synergizing denoising tasks. Extensive experiments validate the effectiveness of our approach in improving both image quality and convergence rate, and further analysis demonstrates that Switch-DiT constructs tailored denoising paths across various generation scenarios.
Submitted 10 July, 2024; v1 submitted 14 March, 2024;
originally announced March 2024.
-
NEO-KD: Knowledge-Distillation-Based Adversarial Training for Robust Multi-Exit Neural Networks
Authors:
Seokil Ham,
Jungwuk Park,
Dong-Jun Han,
Jaekyun Moon
Abstract:
While multi-exit neural networks are regarded as a promising solution for making efficient inference via early exits, combating adversarial attacks remains a challenging problem. In multi-exit networks, due to the high dependency among different submodels, an adversarial example targeting a specific exit not only degrades the performance of the target exit but also reduces the performance of all other exits concurrently. This makes multi-exit networks highly vulnerable to simple adversarial attacks. In this paper, we propose NEO-KD, a knowledge-distillation-based adversarial training strategy that tackles this fundamental challenge based on two key contributions. NEO-KD first uses neighbor knowledge distillation to guide the outputs for adversarial examples toward the ensemble outputs of neighbor exits for clean data. NEO-KD also employs exit-wise orthogonal knowledge distillation to reduce adversarial transferability across different submodels. The result is significantly improved robustness against adversarial attacks. Experimental results on various datasets/models show that our method achieves the best adversarial accuracy with reduced computation budgets, compared to the baselines relying on existing adversarial training or knowledge distillation techniques for multi-exit networks.
Submitted 1 November, 2023;
originally announced November 2023.
-
Discovering User Types: Mapping User Traits by Task-Specific Behaviors in Reinforcement Learning
Authors:
L. L. Ankile,
B. S. Ham,
K. Mao,
E. Shin,
S. Swaroop,
F. Doshi-Velez,
W. Pan
Abstract:
When assisting human users in reinforcement learning (RL), we can represent users as RL agents and study key parameters, called user traits, to inform intervention design. We study the relationship between user behaviors (policy classes) and user traits. Given an environment, we introduce an intuitive tool for studying the breakdown of "user types": broad sets of traits that result in the same behavior. We show that seemingly different real-world environments admit the same set of user types and formalize this observation as an equivalence relation defined on environments. By transferring intervention design between environments within the same equivalence class, we can help rapidly personalize interventions.
Submitted 16 July, 2023;
originally announced July 2023.
-
Multitask Learning for Multiple Recognition Tasks: A Framework for Lower-limb Exoskeleton Robot Applications
Authors:
Joonhyun Kim,
Seongmin Ha,
Dongbin Shin,
Seoyeon Ham,
Jaepil Jang,
Wansoo Kim
Abstract:
To control the lower-limb exoskeleton robot effectively, it is essential to accurately recognize user status and environmental conditions. Previous studies have typically addressed these recognition challenges through independent models for each task, resulting in an inefficient model development process. In this study, we propose a multitask learning approach that can address multiple recognition challenges simultaneously. This approach can enhance data efficiency by enabling knowledge sharing between the recognition models. We demonstrate the effectiveness of this approach using gait phase recognition (GPR) and terrain classification (TC), the most conventional recognition tasks in lower-limb exoskeleton robots, as examples. We first created a high-performing GPR model that achieved a root mean square error (RMSE) value of 2.345 ± 0.08 and then utilized its knowledge-sharing backbone feature network to learn a TC model with an extremely limited dataset. Using a limited dataset for the TC model allows us to validate the data efficiency of our proposed multitask learning approach. We compared the accuracy of the proposed TC model against other TC baseline models. The proposed model achieved 99.5 ± 0.044% accuracy with a limited dataset, outperforming the other baseline models and demonstrating its effectiveness in terms of data efficiency. Future research will focus on extending the multitask learning framework to encompass additional recognition tasks.
Submitted 25 June, 2023;
originally announced June 2023.
-
Adversarial Robustness Comparison of Vision Transformer and MLP-Mixer to CNNs
Authors:
Philipp Benz,
Soomin Ham,
Chaoning Zhang,
Adil Karjauv,
In So Kweon
Abstract:
Convolutional Neural Networks (CNNs) have become the de facto gold standard in computer vision applications in recent years. Recently, however, new model architectures have been proposed that challenge the status quo. The Vision Transformer (ViT) relies solely on attention modules, while the MLP-Mixer architecture substitutes the self-attention modules with Multi-Layer Perceptrons (MLPs). Despite their great success, CNNs have been widely known to be vulnerable to adversarial attacks, causing serious concerns for security-sensitive applications. Thus, it is critical for the community to know whether the newly proposed ViT and MLP-Mixer are also vulnerable to adversarial attacks. To this end, we empirically evaluate their adversarial robustness under several adversarial attack setups and benchmark them against the widely used CNNs. Overall, we find that the two architectures, especially ViT, are more robust than their CNN counterparts. Using a toy example, we also provide empirical evidence that the lower adversarial robustness of CNNs can be partially attributed to their shift-invariant property. Our frequency analysis suggests that the most robust ViT architectures tend to rely more on low-frequency features compared with CNNs. Additionally, we have an intriguing finding that MLP-Mixer is extremely vulnerable to universal adversarial perturbations.
Submitted 11 October, 2021; v1 submitted 6 October, 2021;
originally announced October 2021.
-
Analyzing the effect of APOE on Alzheimer's disease progression using an event-based model for stratified populations
Authors:
Vikram Venkatraghavan,
Stefan Klein,
Lana Fani,
Leontine S. Ham,
Henri Vrooman,
M. Kamran Ikram,
Wiro J. Niessen,
Esther E. Bron
Abstract:
Alzheimer's disease (AD) is the most common form of dementia and is phenotypically heterogeneous. APOE is a triallelic gene which correlates with phenotypic heterogeneity in AD. In this work, we determined the effect of APOE alleles on the disease progression timeline of AD using a discriminative event-based model (DEBM). Since DEBM is a data-driven model, stratification into smaller disease subgroups would lead to less accurate models than fitting the model on the entire dataset. Hence our secondary aim is to propose and evaluate novel approaches in which we split the different steps of DEBM into group-aspecific and group-specific parts, where the entire dataset is used to train the group-aspecific parts and only the data from a specific group is used to train the group-specific parts of the DEBM. We performed simulation experiments to benchmark the accuracy of the proposed approaches and to select the optimal approach. Subsequently, the chosen approach was applied to the baseline data of 417 cognitively normal subjects, 235 subjects with mild cognitive impairment who convert to AD within 3 years, and 342 AD patients from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset to gain new insights into the effect of APOE carriership on the disease progression timeline of AD. The presented models could aid in understanding the disease and in selecting a homogeneous group of presymptomatic subjects at risk of developing symptoms for clinical trials.
Submitted 15 September, 2020;
originally announced September 2020.
-
Experimental demonstrations of unconditional security in a purely classical regime
Authors:
Byoung S. Ham
Abstract:
So far, unconditional security in key distribution processes has been confined to quantum key distribution (QKD) protocols based on the no-cloning theorem of nonorthogonal bases. Recently, a completely different approach, the unconditionally secured classical key distribution (USCKD), has been proposed for unconditional security in the purely classical regime. Unlike QKD, both classical channels and orthogonal bases are key ingredients in USCKD, where unconditional security is provided by deterministic randomness via path superposition-based reversible unitary transformations in a coupled Mach-Zehnder interferometer. Here, the first experimental demonstration of the USCKD protocol is presented.
Submitted 13 August, 2020;
originally announced August 2020.
-
QuickTalk: An Association-Free Communication Method for IoT Devices in Proximity
Authors:
Seongmin Ham,
Jihyung Lee,
Kyunghan Lee
Abstract:
IoT devices are in general considered to be straightforward to use. However, we find that there are a number of situations where the usability becomes poor. These situations include, but are not limited to, the following: 1) when initializing an IoT device, 2) when trying to control an IoT device which is initialized and registered by another person, and 3) when trying to control one IoT device out of many of the same type. We tackle these situations by proposing a new association-free communication method, QuickTalk. QuickTalk lets a user device such as a smartphone pinpoint and activate an IoT device with the help of an IR transmitter and communicate with the pinpointed IoT device through the broadcast channel of WiFi. By the nature of its association-free communication, QuickTalk allows a user device to immediately give a command to a specific IoT device in proximity even when the IoT device is uninitialized, unregistered to the control interface of the user, or registered but physically confused with others. Our experiments with QuickTalk implemented on Raspberry Pi 2 devices show that the end-to-end delay of QuickTalk is upper bounded by 2.5 seconds and its median is only about 0.74 seconds. We further confirm that even when an IoT device has ongoing data sessions, QuickTalk can still establish a reliable communication channel to the IoT device with little impact on the ongoing sessions.
Submitted 19 May, 2017;
originally announced May 2017.
-
Traffic Monitoring Using M2M Communication
Authors:
Shiu Kumar,
Eun Sik Ham,
Seong Ro Lee
Abstract:
This paper presents an intelligent traffic monitoring system using a wireless vision sensor network that captures and processes real-time video images to obtain the traffic flow rate and vehicle speeds along different urban roadways. This system will display the traffic states of the roadways ahead, which can guide drivers to select the right route and avoid potential traffic congestion. It will also monitor vehicle speeds and store in its database the details of vehicles that break the roadway speed limits. The real-time traffic data is processed by the Personal Computer (PC) at the sub roadway station, and the traffic flow rate data is transmitted via email to the Arduino 3G at the main roadway station, where the data is extracted and the traffic flow rate is displayed.
Submitted 31 March, 2014;
originally announced April 2014.