Showing 1–26 of 26 results for author: Vanden-Eijnden, E

  1. arXiv:2406.07507  [pdf, other]

    cs.LG math.DS

    Flow Map Matching

    Authors: Nicholas M. Boffi, Michael S. Albergo, Eric Vanden-Eijnden

    Abstract: Generative models based on dynamical transport of measure, such as diffusion models, flow matching models, and stochastic interpolants, learn an ordinary or stochastic differential equation whose trajectories push initial conditions from a known base distribution onto the target. While training is cheap, samples are generated via simulation, which is more expensive than one-step models like GANs.…

    Submitted 11 June, 2024; originally announced June 2024.

  2. arXiv:2404.01145  [pdf, ps, other]

    math.NA cs.LG

    Sequential-in-time training of nonlinear parametrizations for solving time-dependent partial differential equations

    Authors: Huan Zhang, Yifan Chen, Eric Vanden-Eijnden, Benjamin Peherstorfer

    Abstract: Sequential-in-time methods solve a sequence of training problems to fit nonlinear parametrizations such as neural networks to approximate solution trajectories of partial differential equations over time. This work shows that sequential-in-time training methods can be understood broadly as either optimize-then-discretize (OtD) or discretize-then-optimize (DtO) schemes, which are well known concept…

    Submitted 1 April, 2024; originally announced April 2024.

  3. arXiv:2403.13724  [pdf, other]

    cs.LG stat.ML

    Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes

    Authors: Yifan Chen, Mark Goldstein, Mengjian Hua, Michael S. Albergo, Nicholas M. Boffi, Eric Vanden-Eijnden

    Abstract: We propose a framework for probabilistic forecasting of dynamical systems based on generative modeling. Given observations of the system state over time, we formulate the forecasting problem as sampling from the conditional distribution of the future system state given its current state. To this end, we leverage the framework of stochastic interpolants, which facilitates the construction of a gene…

    Submitted 20 March, 2024; originally announced March 2024.

  4. arXiv:2401.08740  [pdf, other]

    cs.CV cs.LG

    SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers

    Authors: Nanye Ma, Mark Goldstein, Michael S. Albergo, Nicholas M. Boffi, Eric Vanden-Eijnden, Saining Xie

    Abstract: We present Scalable Interpolant Transformers (SiT), a family of generative models built on the backbone of Diffusion Transformers (DiT). The interpolant framework, which allows for connecting two distributions in a more flexible way than standard diffusion models, makes possible a modular study of various design choices impacting generative models built on dynamical transport: using discrete vs. c…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: Code available: https://github.com/willisma/SiT

  5. arXiv:2310.11232  [pdf, ps, other]

    cs.LG stat.ML

    Learning to Sample Better

    Authors: Michael S. Albergo, Eric Vanden-Eijnden

    Abstract: These lecture notes provide an introduction to recent advances in generative modeling methods based on the dynamical transportation of measures, by means of which samples from a simple base measure are mapped to samples from a target measure of interest. Special emphasis is put on the applications of these methods to Monte-Carlo (MC) sampling techniques, such as importance sampling and Markov Chai…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: Les Houches 2022 Summer School on Statistical Physics and Machine Learning

  6. arXiv:2310.03725  [pdf, other]

    cs.LG stat.ML

    Stochastic interpolants with data-dependent couplings

    Authors: Michael S. Albergo, Mark Goldstein, Nicholas M. Boffi, Rajesh Ranganath, Eric Vanden-Eijnden

    Abstract: Generative models inspired by dynamical transport of measure -- such as flows and diffusions -- construct a continuous-time map between two probability densities. Conventionally, one of these is the target density, only accessible through samples, while the other is taken as a simple base density that is data-agnostic. In this work, using the framework of stochastic interpolants, we formalize how…

    Submitted 15 December, 2023; v1 submitted 5 October, 2023; originally announced October 2023.

  7. arXiv:2310.03695  [pdf, other]

    cs.LG math.PR

    Multimarginal generative modeling with stochastic interpolants

    Authors: Michael S. Albergo, Nicholas M. Boffi, Michael Lindsey, Eric Vanden-Eijnden

    Abstract: Given a set of $K$ probability densities, we consider the multimarginal generative modeling problem of learning a joint distribution that recovers these densities as marginals. The structure of this joint distribution should identify multi-way correspondences among the prescribed marginals. We formalize an approach to this task within a generalization of the stochastic interpolant framework, leadi…

    Submitted 5 October, 2023; originally announced October 2023.

  8. arXiv:2310.03575  [pdf, other]

    stat.ML cs.LG

    Analysis of learning a flow-based generative model from limited sample complexity

    Authors: Hugo Cui, Florent Krzakala, Eric Vanden-Eijnden, Lenka Zdeborová

    Abstract: We study the problem of training a flow-based generative model, parametrized by a two-layer autoencoder, to sample from a high-dimensional Gaussian mixture. We provide a sharp end-to-end analysis of the problem. First, we provide a tight closed-form characterization of the learnt velocity field, when parametrized by a shallow denoising auto-encoder trained on a finite number $n$ of samples from th…

    Submitted 25 June, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

  9. arXiv:2309.12991  [pdf, other]

    cond-mat.stat-mech cond-mat.soft cs.LG math.NA

    Deep learning probability flows and entropy production rates in active matter

    Authors: Nicholas M. Boffi, Eric Vanden-Eijnden

    Abstract: Active matter systems, from self-propelled colloids to motile bacteria, are characterized by the conversion of free energy into useful work at the microscopic scale. They involve physics beyond the reach of equilibrium statistical mechanics, and a persistent challenge has been to understand the nature of their nonequilibrium states. The entropy production rate and the probability current provide q…

    Submitted 17 June, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

  10. arXiv:2306.15630  [pdf, ps, other]

    math.NA cs.LG

    Coupling parameter and particle dynamics for adaptive sampling in Neural Galerkin schemes

    Authors: Yuxiao Wen, Eric Vanden-Eijnden, Benjamin Peherstorfer

    Abstract: Training nonlinear parametrizations such as deep neural networks to numerically approximate solutions of partial differential equations is often based on minimizing a loss that includes the residual, which is analytically available in limited settings only. At the same time, empirically estimating the training loss is challenging because residuals and related quantities can have high variance, esp…

    Submitted 27 June, 2023; originally announced June 2023.

  11. arXiv:2305.19414  [pdf, other]

    cs.LG cond-mat.dis-nn math.NA math.PR

    Efficient Training of Energy-Based Models Using Jarzynski Equality

    Authors: Davide Carbone, Mengjian Hua, Simon Coste, Eric Vanden-Eijnden

    Abstract: Energy-based models (EBMs) are generative models inspired by statistical physics with a wide range of applications in unsupervised learning. Their performance is best measured by the cross-entropy (CE) of the model distribution relative to the data distribution. Using the CE as the objective for training is however challenging because the computation of its gradient with respect to the model param…

    Submitted 11 December, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

  12. arXiv:2303.08797  [pdf, other]

    cs.LG cond-mat.dis-nn math.PR

    Stochastic Interpolants: A Unifying Framework for Flows and Diffusions

    Authors: Michael S. Albergo, Nicholas M. Boffi, Eric Vanden-Eijnden

    Abstract: A class of generative models that unifies flow-based and diffusion-based methods is introduced. These models extend the framework proposed in Albergo & Vanden-Eijnden (2023), enabling the use of a broad class of continuous-time stochastic processes called 'stochastic interpolants' to bridge any two arbitrary probability density functions exactly in finite time. These interpolants are built by comb…

    Submitted 6 November, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

  13. arXiv:2210.16286  [pdf, other]

    cs.LG math.OC math.PR stat.ML

    A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks

    Authors: Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna

    Abstract: To understand the training dynamics of neural networks (NNs), prior studies have considered the infinite-width mean-field (MF) limit of two-layer NN, establishing theoretical guarantees of its convergence under gradient flow training as well as its approximation and generalization capabilities. In this work, we study the infinite-width limit of a type of three-layer NN model whose first layer is r…

    Submitted 28 October, 2022; originally announced October 2022.

  14. arXiv:2209.15571  [pdf, other]

    cs.LG stat.ML

    Building Normalizing Flows with Stochastic Interpolants

    Authors: Michael S. Albergo, Eric Vanden-Eijnden

    Abstract: A generative model based on a continuous-time normalizing flow between any pair of base and target probability densities is proposed. The velocity field of this flow is inferred from the probability current of a time-dependent density that interpolates between the base and the target in finite time. Unlike conventional normalizing flow inference methods based on the maximum likelihood principle, whic…

    Submitted 9 March, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

    Comments: ICLR 2023

  15. arXiv:2206.12314  [pdf, other]

    stat.ML cs.LG

    Learning sparse features can lead to overfitting in neural networks

    Authors: Leonardo Petrini, Francesco Cagnetta, Eric Vanden-Eijnden, Matthieu Wyart

    Abstract: It is widely believed that the success of deep networks lies in their ability to learn a meaningful representation of the features of the data. Yet, understanding when and how this feature learning improves performance remains a challenge: for example, it is beneficial for modern architectures trained to classify images, whereas it is detrimental for fully-connected networks trained for the same t…

    Submitted 12 October, 2022; v1 submitted 24 June, 2022; originally announced June 2022.

  16. arXiv:2206.04642  [pdf, other]

    cs.LG cond-mat.dis-nn cond-mat.stat-mech math.NA math.PR

    Probability flow solution of the Fokker-Planck equation

    Authors: Nicholas M. Boffi, Eric Vanden-Eijnden

    Abstract: The method of choice for integrating the time-dependent Fokker-Planck equation in high dimension is to generate samples from the solution via integration of the associated stochastic differential equation. Here, we study an alternative scheme based on integrating an ordinary differential equation that describes the flow of probability. Acting as a transport map, this equation deterministically pus…

    Submitted 15 February, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

  17. arXiv:2204.10782  [pdf, other]

    cs.LG math.OC math.PR stat.ML

    On Feature Learning in Neural Networks with Global Convergence Guarantees

    Authors: Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna

    Abstract: We study the optimization of wide neural networks (NNs) via gradient flow (GF) in setups that allow feature learning while admitting non-asymptotic global convergence guarantees. First, for wide shallow NNs under the mean-field scaling and with a general class of activation functions, we prove that when the input dimension is no less than the size of the training set, the training loss converges t…

    Submitted 22 April, 2022; originally announced April 2022.

    Comments: Accepted by the 10th International Conference on Learning Representations (ICLR 2022)

  18. arXiv:2203.01360  [pdf, other]

    math.NA cs.LG stat.ML

    Neural Galerkin Schemes with Active Learning for High-Dimensional Evolution Equations

    Authors: Joan Bruna, Benjamin Peherstorfer, Eric Vanden-Eijnden

    Abstract: Deep neural networks have been shown to provide accurate function approximations in high dimensions. However, fitting network parameters requires informative training data that are often challenging to collect in science and engineering applications. This work proposes Neural Galerkin schemes based on deep learning that generate training data with active learning for numerically solving high-dimen…

    Submitted 29 February, 2024; v1 submitted 2 March, 2022; originally announced March 2022.

    Journal ref: Journal of Computational Physics, Volume 496, 2024

  19. arXiv:2107.08001  [pdf, other]

    stat.ML cs.LG physics.data-an

    Efficient Bayesian Sampling Using Normalizing Flows to Assist Markov Chain Monte Carlo Methods

    Authors: Marylou Gabrié, Grant M. Rotskoff, Eric Vanden-Eijnden

    Abstract: Normalizing flows can generate complex target distributions and thus show promise in many applications in Bayesian statistics as an alternative or complement to MCMC for sampling posteriors. Since no data set from the target posterior distribution is available beforehand, the flow is typically trained using the reverse Kullback-Leibler (KL) divergence that only requires samples from a base distrib…

    Submitted 16 July, 2021; originally announced July 2021.

  20. arXiv:2107.05134  [pdf, other]

    cs.LG math.OC stat.ML

    Dual Training of Energy-Based Models with Overparametrized Shallow Neural Networks

    Authors: Carles Domingo-Enrich, Alberto Bietti, Marylou Gabrié, Joan Bruna, Eric Vanden-Eijnden

    Abstract: Energy-based models (EBMs) are generative models that are usually trained via maximum likelihood estimation. This approach becomes challenging in generic situations where the trained energy is non-convex, due to the need to sample the Gibbs distribution associated with this energy. Using general Fenchel duality results, we derive variational principles dual to maximum likelihood EBMs with shallow…

    Submitted 15 February, 2022; v1 submitted 11 July, 2021; originally announced July 2021.

  21. arXiv:2104.07531  [pdf, other]

    cs.LG stat.ML

    On Energy-Based Models with Overparametrized Shallow Neural Networks

    Authors: Carles Domingo-Enrich, Alberto Bietti, Eric Vanden-Eijnden, Joan Bruna

    Abstract: Energy-based models (EBMs) are a simple yet powerful framework for generative modeling. They are based on a trainable energy function which defines an associated Gibbs measure, and they can be trained and sampled from via well-established statistical tools, such as MCMC. Neural networks may be used as energy function approximators, providing both a rich class of expressive models as well as a flex…

    Submitted 5 May, 2021; v1 submitted 15 April, 2021; originally announced April 2021.

  22. arXiv:2008.09623  [pdf, other]

    math.PR cs.LG math.OC stat.ML

    A Dynamical Central Limit Theorem for Shallow Neural Networks

    Authors: Zhengdao Chen, Grant M. Rotskoff, Joan Bruna, Eric Vanden-Eijnden

    Abstract: Recent theoretical works have characterized the dynamics of wide shallow neural networks trained via gradient descent in an asymptotic mean-field limit when the width tends towards infinity. At initialization, the random sampling of the parameters leads to deviations from the mean-field limit dictated by the classical Central Limit Theorem (CLT). However, since gradient descent induces correlation…

    Submitted 26 March, 2022; v1 submitted 21 August, 2020; originally announced August 2020.

    Comments: Appeared in Advances in Neural Information Processing Systems 33 (NeurIPS 2020). An error in Theorem 3.5 has been corrected

  23. arXiv:2006.15459  [pdf, other]

    cs.LG stat.ML

    Optimization and Generalization of Shallow Neural Networks with Quadratic Activation Functions

    Authors: Stefano Sarao Mannelli, Eric Vanden-Eijnden, Lenka Zdeborová

    Abstract: We study the dynamics of optimization and the generalization properties of one-hidden layer neural networks with quadratic activation function in the over-parametrized regime where the layer width $m$ is larger than the input dimension $d$. We consider a teacher-student scenario where the teacher has the same structure as the student with a hidden layer of smaller width $m^*\le m$. We describe…

    Submitted 18 August, 2020; v1 submitted 27 June, 2020; originally announced June 2020.

    Comments: 10 pages, 4 figures + appendix

    Journal ref: Advances in Neural Information Processing Systems, v33, page 13445--13455, 2020

  24. arXiv:1902.01843  [pdf, other]

    stat.ML cs.LG

    Global convergence of neuron birth-death dynamics

    Authors: Grant Rotskoff, Samy Jelassi, Joan Bruna, Eric Vanden-Eijnden

    Abstract: Neural networks with a large number of parameters admit a mean-field description, which has recently served as a theoretical explanation for the favorable training properties of "overparameterized" models. In this regime, gradient descent obeys a deterministic partial differential equation (PDE) that converges to a globally optimal solution for networks with a single hidden layer under appropriate…

    Submitted 27 March, 2019; v1 submitted 5 February, 2019; originally announced February 2019.

  25. arXiv:1805.00915  [pdf, other]

    stat.ML cond-mat.stat-mech cs.LG

    Trainability and Accuracy of Neural Networks: An Interacting Particle System Approach

    Authors: Grant M. Rotskoff, Eric Vanden-Eijnden

    Abstract: Neural networks, a central tool in machine learning, have demonstrated remarkable, high fidelity performance on image recognition and classification tasks. These successes evince an ability to accurately represent high dimensional functions, but rigorous results about the approximation error of neural networks after training are few. Here we establish conditions for global convergence of the stand…

    Submitted 30 July, 2019; v1 submitted 2 May, 2018; originally announced May 2018.

  26. arXiv:1402.1736  [pdf, other]

    cond-mat.stat-mech cs.CE

    Flows in Complex Networks: Theory, Algorithms, and Application to Lennard-Jones Cluster Rearrangement

    Authors: Maria Cameron, Eric Vanden-Eijnden

    Abstract: A set of analytical and computational tools based on transition path theory (TPT) is proposed to analyze flows in complex networks. Specifically, TPT is used to study the statistical properties of the reactive trajectories by which transitions occur between specific groups of nodes on the network. Sampling tools are built upon the outputs of TPT that allow one to generate these reactive trajectories d…

    Submitted 29 January, 2014; originally announced February 2014.

    Comments: 32 pages, 13 figures