Showing 1–11 of 11 results for author: Paquette, E

  1. arXiv:2406.11733  [pdf, other]

    stat.ML cs.LG

    A Clipped Trip: the Dynamics of SGD with Gradient Clipping in High-Dimensions

    Authors: Noah Marshall, Ke Liang Xiao, Atish Agarwala, Elliot Paquette

    Abstract: The success of modern machine learning is due in part to the adaptive optimization methods that have been developed to deal with the difficulties of training large models over complex datasets. One such method is gradient clipping: a practical procedure with limited theoretical underpinnings. In this work, we study clipping in a least squares problem under streaming SGD. We develop a theoretical a…

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2405.15074  [pdf, other]

    stat.ML cs.LG math.OC math.PR math.ST

    4+3 Phases of Compute-Optimal Neural Scaling Laws

    Authors: Elliot Paquette, Courtney Paquette, Lechao Xiao, Jeffrey Pennington

    Abstract: We consider the three parameter solvable neural scaling model introduced by Maloney, Roberts, and Sully. The model has three parameters: data complexity, target complexity, and model-parameter-count. We use this neural scaling model to derive new predictions about the compute-limited, infinite-data scaling law regime. To train the neural scaling model, we run one-pass stochastic gradient descent o…

    Submitted 23 May, 2024; originally announced May 2024.

  3. arXiv:2308.08977  [pdf, other]

    math.OC cs.LG math.PR math.ST stat.ML

    Hitting the High-Dimensional Notes: An ODE for SGD learning dynamics on GLMs and multi-index models

    Authors: Elizabeth Collins-Woodfin, Courtney Paquette, Elliot Paquette, Inbar Seroussi

    Abstract: We analyze the dynamics of streaming stochastic gradient descent (SGD) in the high-dimensional limit when applied to generalized linear models and multi-index models (e.g. logistic regression, phase retrieval) with general data-covariance. In particular, we demonstrate a deterministic equivalent of SGD in the form of a system of ordinary differential equations that describes a wide class of statis…

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: Preliminary version

  4. arXiv:2308.00836  [pdf, other]

    stat.ME cs.CR

    Differentially Private Linear Regression with Linked Data

    Authors: Shurong Lin, Elliot Paquette, Eric D. Kolaczyk

    Abstract: There has been increasing demand for establishing privacy-preserving methodologies for modern statistics and machine learning. Differential privacy, a mathematical notion from computer science, is a rising tool offering robust privacy guarantees. Recent work focuses primarily on developing differentially private versions of individual statistical and machine learning tasks, with nontrivial upstrea…

    Submitted 7 May, 2024; v1 submitted 1 August, 2023; originally announced August 2023.

    MSC Class: 68P27; 62-XX ACM Class: G.3; I.0

  5. arXiv:2307.01181  [pdf, ps, other]

    math.PR cs.DS cs.LG math.ST stat.ML

    Fitting an ellipsoid to a quadratic number of random points

    Authors: Afonso S. Bandeira, Antoine Maillard, Shahar Mendelson, Elliot Paquette

    Abstract: We consider the problem $(\mathrm{P})$ of fitting $n$ standard Gaussian random vectors in $\mathbb{R}^d$ to the boundary of a centered ellipsoid, as $n, d \to \infty$. This problem is conjectured to have a sharp feasibility transition: for any $\varepsilon > 0$, if $n \leq (1 - \varepsilon) d^2 / 4$ then $(\mathrm{P})$ has a solution with high probability, while $(\mathrm{P})$ has no solutions wit…

    Submitted 3 July, 2023; originally announced July 2023.

    Comments: 17 pages

  6. arXiv:2206.07252  [pdf, other]

    stat.ML cs.LG math.OC math.PR math.ST

    Implicit Regularization or Implicit Conditioning? Exact Risk Trajectories of SGD in High Dimensions

    Authors: Courtney Paquette, Elliot Paquette, Ben Adlam, Jeffrey Pennington

    Abstract: Stochastic gradient descent (SGD) is a pillar of modern machine learning, serving as the go-to optimization algorithm for a diverse array of problems. While the empirical success of SGD is often attributed to its computational efficiency and favorable generalization behavior, neither effect is well understood and disentangling them remains an open problem. Even in the simple setting of convex quad…

    Submitted 14 June, 2022; originally announced June 2022.

    Comments: arXiv admin note: text overlap with arXiv:2205.07069

  7. arXiv:2206.01029  [pdf, other]

    math.OC cs.LG math.PR stat.ML

    Trajectory of Mini-Batch Momentum: Batch Size Saturation and Convergence in High Dimensions

    Authors: Kiwon Lee, Andrew N. Cheng, Courtney Paquette, Elliot Paquette

    Abstract: We analyze the dynamics of large batch stochastic gradient descent with momentum (SGD+M) on the least squares problem when both the number of samples and dimensions are large. In this setting, we show that the dynamics of SGD+M converge to a deterministic discrete Volterra equation as dimension increases, which we analyze. We identify a stability measurement, the implicit conditioning ratio (ICR),…

    Submitted 2 June, 2022; originally announced June 2022.

  8. arXiv:2111.03146  [pdf, other]

    cs.LG cs.SD eess.AS

    Generating Diverse Realistic Laughter for Interactive Art

    Authors: M. Mehdi Afsar, Eric Park, Étienne Paquette, Gauthier Gidel, Kory W. Mathewson, Eilif Muller

    Abstract: We propose an interactive art project to make those rendered invisible by the COVID-19 crisis and its concomitant solitude reappear through the welcome melody of laughter, and connections created and explored through advanced laughter synthesis approaches. However, the unconditional generation of the diversity of human emotional responses in high-quality auditory synthesis remains an open problem,…

    Submitted 29 July, 2022; v1 submitted 4 November, 2021; originally announced November 2021.

    Comments: Presented at Machine Learning for Creativity and Design workshop at NeurIPS 2021, 6 pages

  9. arXiv:2106.05143  [pdf, other]

    cs.GR cs.LG

    Neural UpFlow: A Scene Flow Learning Approach to Increase the Apparent Resolution of Particle-Based Liquids

    Authors: Bruno Roy, Pierre Poulin, Eric Paquette

    Abstract: We present a novel up-resing technique for generating high-resolution liquids based on scene flow estimation using deep neural networks. Our approach infers and synthesizes small- and large-scale details solely from a low-resolution particle-based liquid simulation. The proposed network leverages neighborhood contributions to encode inherent liquid properties throughout convolutions. We also propo…

    Submitted 9 June, 2021; originally announced June 2021.

    Comments: 14 pages, 18 figures, and 3 tables

  10. arXiv:2106.03696  [pdf, other]

    math.OC cs.LG math.PR stat.ML

    Dynamics of Stochastic Momentum Methods on Large-scale, Quadratic Models

    Authors: Courtney Paquette, Elliot Paquette

    Abstract: We analyze a class of stochastic gradient algorithms with momentum on a high-dimensional random least squares problem. Our framework, inspired by random matrix theory, provides an exact (deterministic) characterization for the sequence of loss values produced by these algorithms which is expressed only in terms of the eigenvalues of the Hessian. This leads to simple expressions for nearly-optimal…

    Submitted 25 October, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: 39 pages, 7 figures

  11. arXiv:2102.04396  [pdf, other]

    math.OC cs.LG math.PR stat.ML

    SGD in the Large: Average-case Analysis, Asymptotics, and Stepsize Criticality

    Authors: Courtney Paquette, Kiwon Lee, Fabian Pedregosa, Elliot Paquette

    Abstract: We propose a new framework, inspired by random matrix theory, for analyzing the dynamics of stochastic gradient descent (SGD) when both the number of samples and dimensions are large. This framework applies to any fixed stepsize and the finite sum setting. Using this new framework, we show that the dynamics of SGD on a least squares problem with random data become deterministic in the large sample and…

    Submitted 8 February, 2021; originally announced February 2021.