Thermodynamic Natural Gradient Descent

Authors: Kaelan Donatella, Samuel Duffield, Maxwell Aifer, Denis Melanson, Gavin Crooks, Patrick J. Coles

Abstract: Second-order training methods have better convergence properties than gradient descent but are rarely used in practice for large-scale training due to their computational overhead. This can be viewed as a hardware limitation (imposed by digital computers). Here we show that natural gradient descent (NGD), a second-order method, can have a similar computational complexity per iteration to a first-order method, when employing appropriate hardware. We present a new hybrid digital-analog algorithm for training neural networks that is equivalent to NGD in a certain parameter regime but avoids prohibitively costly linear system solves. Our algorithm exploits the thermodynamic properties of an analog system at equilibrium, and hence requires an analog thermodynamic computer. The training occurs in a hybrid digital-analog loop, where the gradient and Fisher information matrix (or any other positive semi-definite curvature matrix) are calculated at given time intervals while the analog dynamics take place. We numerically demonstrate the superiority of this approach over state-of-the-art digital first- and second-order training methods on classification tasks and language model fine-tuning tasks.

Submitted 22 May, 2024; originally announced May 2024.

Comments: 17 pages, 7 figures

arXiv:2401.16231 [pdf, other]

Error Mitigation for Thermodynamic Computing

Authors: Maxwell Aifer, Denis Melanson, Kaelan Donatella, Gavin Crooks, Thomas Ahle, Patrick J. Coles

Abstract: While physics-based computing can offer speed and energy efficiency compared to digital computing, it also is subject to errors that must be mitigated. For example, many error mitigation methods have been proposed for quantum computing. However this error mitigation framework has yet to be applied to other physics-based computing paradigms. In this work, we consider thermodynamic computing, which has recently captured attention due to its relevance to artificial intelligence (AI) applications, such as probabilistic AI and generative AI. A key source of errors in this paradigm is the imprecision of the analog hardware components. Here, we introduce a method that reduces the overall error from a linear to a quadratic dependence (from $ε$ to $ε^2$) on the imprecision $ε$, for Gaussian sampling and linear algebra applications. The method involves sampling from an ensemble of imprecise distributions associated with various rounding events and then merging these samples. We numerically demonstrate the scalability of this method for dimensions greater than 1000. Finally, we implement this method on an actual thermodynamic computer and show $20\%$ error reduction for matrix inversion; the first thermodynamic error mitigation experiment.

Submitted 29 January, 2024; originally announced January 2024.

Comments: 17 pages, 8 figures

arXiv:2312.04836 [pdf, other]

Thermodynamic Computing System for AI Applications

Authors: Denis Melanson, Mohammad Abu Khater, Maxwell Aifer, Kaelan Donatella, Max Hunter Gordon, Thomas Ahle, Gavin Crooks, Antonio J. Martinez, Faris Sbahi, Patrick J. Coles

Abstract: Recent breakthroughs in artificial intelligence (AI) algorithms have highlighted the need for novel computing hardware in order to truly unlock the potential for AI. Physics-based hardware, such as thermodynamic computing, has the potential to provide a fast, low-power means to accelerate AI primitives, especially generative AI and probabilistic AI. In this work, we present the first continuous-variable thermodynamic computer, which we call the stochastic processing unit (SPU). Our SPU is composed of RLC circuits, as unit cells, on a printed circuit board, with 8 unit cells that are all-to-all coupled via switched capacitances. It can be used for either sampling or linear algebra primitives, and we demonstrate Gaussian sampling and matrix inversion on our hardware. The latter represents the first thermodynamic linear algebra experiment. We also illustrate the applicability of the SPU to uncertainty quantification for neural network classification. We envision that this hardware, when scaled up in size, will have significant impact on accelerating various probabilistic AI applications.

Submitted 8 December, 2023; originally announced December 2023.

Comments: 26 pages, 22 figures

arXiv:2311.12759 [pdf, other]

Thermodynamic Matrix Exponentials and Thermodynamic Parallelism

Authors: Samuel Duffield, Maxwell Aifer, Gavin Crooks, Thomas Ahle, Patrick J. Coles

Abstract: Thermodynamic computing exploits fluctuations and dissipation in physical systems to efficiently solve various mathematical problems. For example, it was recently shown that certain linear algebra problems can be solved thermodynamically, leading to an asymptotic speedup scaling with the matrix dimension. The origin of this "thermodynamic advantage" has not yet been fully explained, and it is not clear what other problems might benefit from it. Here we provide a new thermodynamic algorithm for exponentiating a real matrix, with applications in simulating linear dynamical systems. We describe a simple electrical circuit involving coupled oscillators, whose thermal equilibration can implement our algorithm. We also show that this algorithm also provides an asymptotic speedup that is linear in the dimension. Finally, we introduce the concept of thermodynamic parallelism to explain this speedup, stating that thermodynamic noise provides a resource leading to effective parallelization of computations, and we hypothesize this as a mechanism to explain thermodynamic advantage more generally.

Submitted 5 January, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

Comments: 14 pages, 5 figures

arXiv:2308.05660 [pdf, other]

Thermodynamic Linear Algebra

Authors: Maxwell Aifer, Kaelan Donatella, Max Hunter Gordon, Samuel Duffield, Thomas Ahle, Daniel Simpson, Gavin E. Crooks, Patrick J. Coles

Abstract: Linear algebraic primitives are at the core of many modern algorithms in engineering, science, and machine learning. Hence, accelerating these primitives with novel computing hardware would have tremendous economic impact. Quantum computing has been proposed for this purpose, although the resource requirements are far beyond current technological capabilities, so this approach remains long-term in timescale. Here we consider an alternative physics-based computing paradigm based on classical thermodynamics, to provide a near-term approach to accelerating linear algebra. At first sight, thermodynamics and linear algebra seem to be unrelated fields. In this work, we connect solving linear algebra problems to sampling from the thermodynamic equilibrium distribution of a system of coupled harmonic oscillators. We present simple thermodynamic algorithms for (1) solving linear systems of equations, (2) computing matrix inverses, (3) computing matrix determinants, and (4) solving Lyapunov equations. Under reasonable assumptions, we rigorously establish asymptotic speedups for our algorithms, relative to digital methods, that scale linearly in matrix dimension. Our algorithms exploit thermodynamic principles like ergodicity, entropy, and equilibration, highlighting the deep connection between these two seemingly distinct fields, and opening up algebraic applications for thermodynamic computing hardware.

Submitted 10 June, 2024; v1 submitted 10 August, 2023; originally announced August 2023.

Comments: 15+22 pages, 6 figures

arXiv:1911.01968 [pdf]

Thermodynamic Computing

Authors: Tom Conte, Erik DeBenedictis, Natesh Ganesh, Todd Hylton, John Paul Strachan, R. Stanley Williams, Alexander Alemi, Lee Altenberg, Gavin Crooks, James Crutchfield, Lidia del Rio, Josh Deutsch, Michael DeWeese, Khari Douglas, Massimiliano Esposito, Michael Frank, Robert Fry, Peter Harsha, Mark Hill, Christopher Kello, Jeff Krichmar, Suhas Kumar, Shih-Chii Liu, Seth Lloyd, Matteo Marsili, et al. (14 additional authors not shown)

Abstract: The hardware and software foundations laid in the first half of the 20th Century enabled the computing technologies that have transformed the world, but these foundations are now under siege. The current computing paradigm, which is the foundation of much of the current standards of living that we now enjoy, faces fundamental limitations that are evident from several perspectives. In terms of hardware, devices have become so small that we are struggling to eliminate the effects of thermodynamic fluctuations, which are unavoidable at the nanometer scale. In terms of software, our ability to imagine and program effective computational abstractions and implementations are clearly challenged in complex domains. In terms of systems, currently five percent of the power generated in the US is used to run computing systems - this astonishing figure is neither ecologically sustainable nor economically scalable. Economically, the cost of building next-generation semiconductor fabrication plants has soared past $10 billion. All of these difficulties - device scaling, software complexity, adaptability, energy consumption, and fabrication economics - indicate that the current computing paradigm has matured and that continued improvements along this path will be limited. If technological progress is to continue and corresponding social and economic benefits are to continue to accrue, computing must become much more capable, energy efficient, and affordable.
We propose that progress in computing can continue under a united, physically grounded, computational paradigm centered on thermodynamics. Herein we propose a research agenda to extend these thermodynamic foundations into complex, non-equilibrium, self-organizing systems and apply them holistically to future computing systems that will harness nature's innate computational capacity. We call this type of computing "Thermodynamic Computing" or TC.

Submitted 14 November, 2019; v1 submitted 5 November, 2019; originally announced November 2019.

Comments: A Computing Community Consortium (CCC) workshop report, 36 pages

Report number: ccc2019report_6

arXiv:1203.3271 [pdf, other]

doi 10.1103/PhysRevLett.109.120604

The thermodynamics of prediction

Authors: Susanne Still, David A. Sivak, Anthony J. Bell, Gavin E. Crooks

Abstract: A system responding to a stochastic driving signal can be interpreted as computing, by means of its dynamics, an implicit model of the environmental variables. The system's state retains information about past environmental fluctuations, and a fraction of this information is predictive of future ones. The remaining nonpredictive information reflects model complexity that does not improve predictive power, and thus represents the ineffectiveness of the model. We expose the fundamental equivalence between this model inefficiency and thermodynamic inefficiency, measured by dissipation. Our results hold arbitrarily far from thermodynamic equilibrium and are applicable to a wide range of systems, including biomolecular machines. They highlight a profound connection between the effective use of information and efficient thermodynamic operation: any system constructed to keep memory about its environment and to operate with maximal energetic efficiency has to be predictive.

Submitted 5 October, 2012; v1 submitted 15 March, 2012; originally announced March 2012.

Comments: 5 pages, 1 figure

Journal ref: Phys. Rev. Lett. 109, 120604 (2012)

Showing 1–7 of 7 results for author: Crooks, G