Showing 1–6 of 6 results for author: Seroussi, I

  1. arXiv:2310.03789  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Grokking as a First Order Phase Transition in Two Layer Networks

    Authors: Noa Rubin, Inbar Seroussi, Zohar Ringel

    Abstract: A key property of deep neural networks (DNNs) is their ability to learn new features during training. This intriguing aspect of deep learning stands out most clearly in recently reported Grokking phenomena. While mainly reflected as a sudden increase in test accuracy, Grokking is also believed to be a beyond lazy-learning/Gaussian Process (GP) phenomenon involving feature learning. Here we apply a…

    Submitted 5 May, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

  2. arXiv:2308.08977  [pdf, other]

    math.OC cs.LG math.PR math.ST stat.ML

    Hitting the High-Dimensional Notes: An ODE for SGD learning dynamics on GLMs and multi-index models

    Authors: Elizabeth Collins-Woodfin, Courtney Paquette, Elliot Paquette, Inbar Seroussi

    Abstract: We analyze the dynamics of streaming stochastic gradient descent (SGD) in the high-dimensional limit when applied to generalized linear models and multi-index models (e.g. logistic regression, phase retrieval) with general data-covariance. In particular, we demonstrate a deterministic equivalent of SGD in the form of a system of ordinary differential equations that describes a wide class of statis…

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: Preliminary version

  3. arXiv:2307.14653  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Speed Limits for Deep Learning

    Authors: Inbar Seroussi, Alexander A. Alemi, Moritz Helias, Zohar Ringel

    Abstract: State-of-the-art neural networks require extreme computational power to train. It is therefore natural to wonder whether they are optimally trained. Here we apply a recent advancement in stochastic thermodynamics which allows bounding the speed at which one can go from the initial weight distribution to the final distribution of the fully trained network, based on the ratio of their Wasserstein-2…

    Submitted 27 July, 2023; originally announced July 2023.

  4. arXiv:2307.06362  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Spectral-Bias and Kernel-Task Alignment in Physically Informed Neural Networks

    Authors: Inbar Seroussi, Asaf Miron, Zohar Ringel

    Abstract: Physically informed neural networks (PINNs) are a promising emerging method for solving differential equations. As in many other deep learning approaches, the choice of PINN design and training protocol requires careful craftsmanship. Here, we suggest a comprehensive theoretical framework that sheds light on this important problem. Leveraging an equivalence between infinitely over-parameterized ne…

    Submitted 5 October, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

  5. arXiv:2112.15383  [pdf, other]

    stat.ML cs.LG physics.data-an

    Separation of Scales and a Thermodynamic Description of Feature Learning in Some CNNs

    Authors: Inbar Seroussi, Gadi Naveh, Zohar Ringel

    Abstract: Deep neural networks (DNNs) are powerful tools for compressing and distilling information. Their scale and complexity, often involving billions of inter-dependent parameters, render direct microscopic analysis difficult. Under such circumstances, a common strategy is to identify slow variables that average the erratic behavior of the fast microscopic variables. Here, we identify a similar separati…

    Submitted 22 September, 2022; v1 submitted 31 December, 2021; originally announced December 2021.

  6. arXiv:2103.14723  [pdf, other]

    stat.ML cs.LG

    Lower Bounds on the Generalization Error of Nonlinear Learning Models

    Authors: Inbar Seroussi, Ofer Zeitouni

    Abstract: We study in this paper lower bounds for the generalization error of models derived from multi-layer neural networks, in the regime where the size of the layers is commensurate with the number of samples in the training data. We show that unbiased estimators have unacceptable performance for such nonlinear networks in this regime. We derive explicit generalization lower bounds for general biased es…

    Submitted 6 July, 2022; v1 submitted 26 March, 2021; originally announced March 2021.

    Comments: Minor correction + conference information. To appear in IEEE Trans. Inf. Th