Showing 1–8 of 8 results for author: Suh, N

  1. arXiv:2406.16028  [pdf, other]

    cs.LG cs.AI

    TimeAutoDiff: Combining Autoencoder and Diffusion model for time series tabular data synthesizing

    Authors: Namjoon Suh, Yuning Yang, Din-Yin Hsieh, Qitong Luan, Shirong Xu, Shixiang Zhu, Guang Cheng

    Abstract: In this paper, we leverage the power of latent diffusion models to generate synthetic time series tabular data. Along with the temporal and feature correlations, the heterogeneous nature of the features in the table has been one of the main obstacles in time series tabular data modeling. We tackle this problem by combining the ideas of the variational auto-encoder (VAE) and the denoising diffusion…

    Submitted 15 July, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  2. arXiv:2405.15337  [pdf, other]

    stat.ML cs.LG

    Discriminative Estimation of Total Variation Distance: A Fidelity Auditor for Generative Data

    Authors: Lan Tao, Shirong Xu, Chi-Hua Wang, Namjoon Suh, Guang Cheng

    Abstract: With the proliferation of generative AI and the increasing volume of generative data (also called synthetic data), assessing the fidelity of generative data has become a critical concern. In this paper, we propose a discriminative approach to estimate the total variation (TV) distance between two distributions as an effective measure of generative data fidelity. Our method quantitatively charac…

    Submitted 24 May, 2024; originally announced May 2024.

  3. arXiv:2403.12187  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Approximation of RKHS Functionals by Neural Networks

    Authors: Tian-Yi Zhou, Namjoon Suh, Guang Cheng, Xiaoming Huo

    Abstract: Motivated by the abundance of functional data such as time series and images, there has been a growing interest in integrating such data into neural networks and learning maps from function spaces to R (i.e., functionals). In this paper, we study the approximation of functionals on reproducing kernel Hilbert spaces (RKHS's) using neural networks. We establish the universality of the approximation…

    Submitted 18 March, 2024; originally announced March 2024.

  4. arXiv:2402.00743  [pdf, other]

    cs.LG cs.CL stat.ML

    Theoretical Understanding of In-Context Learning in Shallow Transformers with Unstructured Data

    Authors: Yue Xing, Xiaofeng Lin, Chenheng Xu, Namjoon Suh, Qifan Song, Guang Cheng

    Abstract: Large language models (LLMs) are powerful models that can learn concepts at the inference stage via in-context learning (ICL). While theoretical studies, e.g., \cite{zhang2023trained}, attempt to explain the mechanism of ICL, they assume the input $x_i$ and the output $y_i$ of each demonstration example are in the same token (i.e., structured data). However, in practice, the examples are usua…

    Submitted 18 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  5. arXiv:2401.07187  [pdf, other]

    stat.ML cs.LG math.ST

    A Survey on Statistical Theory of Deep Learning: Approximation, Training Dynamics, and Generative Models

    Authors: Namjoon Suh, Guang Cheng

    Abstract: In this article, we review the literature on statistical theories of neural networks from three perspectives. In the first part, results on excess risks for neural networks are reviewed in the nonparametric framework of regression or classification. These results rely on explicit constructions of neural networks, leading to fast convergence rates of excess risks, in that tools from the approximati…

    Submitted 4 July, 2024; v1 submitted 13 January, 2024; originally announced January 2024.

    Comments: 33 pages, no figures. Invited for review in Annual Review of Statistics and Its Application (in review)

  6. arXiv:2310.15479  [pdf, other]

    stat.ML cs.AI cs.LG

    AutoDiff: combining Auto-encoder and Diffusion model for tabular data synthesizing

    Authors: Namjoon Suh, Xiaofeng Lin, Din-Yin Hsieh, Merhdad Honarkhah, Guang Cheng

    Abstract: Diffusion models have become a main paradigm for synthetic data generation in many subfields of modern machine learning, including computer vision, language modeling, and speech synthesis. In this paper, we leverage the power of diffusion models for generating synthetic tabular data. The heterogeneous features in tabular data have been a main obstacle in tabular data synthesis, and we tackle this problem…

    Submitted 16 November, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

  7. arXiv:2309.15075  [pdf, other]

    stat.ML cs.LG math.ST

    On Excess Risk Convergence Rates of Neural Network Classifiers

    Authors: Hyunouk Ko, Namjoon Suh, Xiaoming Huo

    Abstract: The recent success of neural networks in pattern recognition and classification problems suggests that neural networks possess qualities distinct from other more classical classifiers such as SVMs or boosting classifiers. This paper studies the performance of plug-in classifiers based on neural networks in a binary classification setting as measured by their excess risks. Compared to the typical s…

    Submitted 26 September, 2023; originally announced September 2023.

  8. arXiv:1912.00524  [pdf, other]

    stat.ML cs.LG

    Factor Analysis on Citation, Using a Combined Latent and Logistic Regression Model

    Authors: Namjoon Suh, Xiaoming Huo, Eric Heim, Lee Seversky

    Abstract: We propose a combined model, which integrates the latent factor model and the logistic regression model, for the citation network. We observe that neither a latent factor model nor a logistic regression model alone is sufficient to capture the structure of the data. The proposed model has a latent (i.e., factor analysis) model to represent the main technological trends (a.k.a. factors), and a…

    Submitted 1 December, 2019; originally announced December 2019.

    Comments: Citation network, matrix decomposition, latent variable model, logistic regression model, convex optimization, alternating direction method of multipliers