Showing 1–2 of 2 results for author: Kan, W

Search v0.5.6 released 2020-02-24

arXiv:2210.12231

cs.LG

Reducing Training Sample Memorization in GANs by Training with Memorization Rejection

Authors: Andrew Bai, Cho-Jui Hsieh, Wendy Kan, Hsuan-Tien Lin

Abstract: Generative adversarial networks (GANs) continue to be a popular research direction due to their high generation quality. It is observed that many state-of-the-art GANs generate samples that are more similar to the training set than to a holdout testing set from the same distribution, hinting that some training samples are implicitly memorized in these models. This memorization behavior is unfavorable in many applications that require the generated samples to be sufficiently distinct from known samples. Nevertheless, it is unclear whether it is possible to reduce memorization without compromising the generation quality. In this paper, we propose memorization rejection, a training scheme that rejects generated samples that are near-duplicates of training samples during training. Our scheme is simple, generic, and can be directly applied to any GAN architecture. Experiments on multiple datasets and GAN models validate that memorization rejection effectively reduces training sample memorization, and in many cases does not sacrifice the generation quality. Code to reproduce the experiment results can be found at https://github.com/jybai/MRGAN.

Submitted 21 October, 2022; originally announced October 2022.

arXiv:2106.03062

cs.LG

doi: 10.1145/3447548.3467198

On Training Sample Memorization: Lessons from Benchmarking Generative Modeling with a Large-scale Competition

Authors: Ching-Yuan Bai, Hsuan-Tien Lin, Colin Raffel, Wendy Chih-wen Kan

Abstract: Many recent developments in generative models for natural images have relied on heuristically motivated metrics that can be easily gamed by memorizing a small sample from the true distribution or by training a model directly to improve the metric. In this work, we critically evaluate the gameability of these metrics by designing and deploying a generative modeling competition. Our competition received over 11000 submitted models. The competitiveness between participants allowed us to investigate both intentional and unintentional memorization in generative modeling. To detect intentional memorization, we propose the "Memorization-Informed Fréchet Inception Distance" (MiFID) as a new memorization-aware metric and design benchmark procedures to ensure that winning submissions made genuine improvements in perceptual quality. Furthermore, we manually inspect the code for the 1000 top-performing models to understand and label different forms of memorization. Our analysis reveals that unintentional memorization is a serious and common issue in popular generative models. The generated images and our memorization labels of those models, as well as code to compute MiFID, are released to facilitate future studies on benchmarking generative models.

Submitted 6 June, 2021; originally announced June 2021.

Comments: In Proceedings of the 27th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), August 2021
