-
On Principles of Emergent Organization
Authors:
Adam T. Rupe,
James P. Crutchfield
Abstract:
After more than a century of concerted effort, physics still lacks basic principles of spontaneous self-organization. To appreciate why, we first state the problem, outline historical approaches, and survey the present state of the physics of self-organization. This frames the particular challenges arising from mathematical intractability and the resulting need for computational approaches, as wel…
▽ More
After more than a century of concerted effort, physics still lacks basic principles of spontaneous self-organization. To appreciate why, we first state the problem, outline historical approaches, and survey the present state of the physics of self-organization. This frames the particular challenges arising from mathematical intractability and the resulting need for computational approaches, as well as those arising from a chronic failure to define structure. Then, an overview of two modern mathematical formulations of organization -- intrinsic computation and evolution operators -- lays out a way to overcome these challenges. Together, the vantage point they afford shows how to account for the emergence of structured states via a statistical mechanics of systems arbitrarily far from equilibrium. The result is a constructive path forward to principles of organization that builds on mathematical identification of structure.
△ Less
Submitted 22 November, 2023;
originally announced November 2023.
-
Extracting Equations of Motion from Superconducting Circuits
Authors:
Christian Z. Pratt,
Kyle J. Ray,
James P. Crutchfield
Abstract:
Alternative computing paradigms open the door to exploiting recent innovations in computational hardware to probe the fundamental thermodynamic limits of information processing. One such paradigm employs superconducting quantum interference devices (SQUIDs) to execute classical computations. This, though, requires constructing sufficiently complex superconducting circuits that support a suite of u…
▽ More
Alternative computing paradigms open the door to exploiting recent innovations in computational hardware to probe the fundamental thermodynamic limits of information processing. One such paradigm employs superconducting quantum interference devices (SQUIDs) to execute classical computations. This, though, requires constructing sufficiently complex superconducting circuits that support a suite of useful information processing tasks and storage operations, as well as understanding these circuits' energetics. First-principle circuit design, though, leads to prohibitive algebraic complications when deriving the effective equations of motion -- complications that to date have precluded achieving these goals, let alone doing so efficiently. We circumvent these complications by (i) specializing our class of circuits and physical operating regimes, (ii) synthesizing existing derivation techniques to suit these specializations, and (iii) implementing solution-finding optimizations which facilitate physically interpreting circuit degrees of freedom that respect physically-grounded constraints. This leads to efficient, practical circuit prototyping and access to scalable circuit architectures. The analytical efficiency is demonstrated by reproducing the potential energy landscape generated by the quantum flux parametron (QFP). We then show how inductively coupling two QFPs produces a device that is capable of executing 2-bit computations via its composite potential energy landscape. More generally, the synthesis methods detailed here provide a basis for constructing universal logic gates and investigating their thermodynamic performance.
△ Less
Submitted 2 July, 2024; v1 submitted 4 July, 2023;
originally announced July 2023.
-
Unsupervised Discovery of Extreme Weather Events Using Universal Representations of Emergent Organization
Authors:
Adam Rupe,
Karthik Kashinath,
Nalini Kumar,
James P. Crutchfield
Abstract:
Spontaneous self-organization is ubiquitous in systems far from thermodynamic equilibrium. While organized structures that emerge dominate transport properties, universal representations that identify and describe these key objects remain elusive. Here, we introduce a theoretically-grounded framework for describing emergent organization that, via data-driven algorithms, is constructive in practice…
▽ More
Spontaneous self-organization is ubiquitous in systems far from thermodynamic equilibrium. While organized structures that emerge dominate transport properties, universal representations that identify and describe these key objects remain elusive. Here, we introduce a theoretically-grounded framework for describing emergent organization that, via data-driven algorithms, is constructive in practice. Its building blocks are spacetime lightcones that embody how information propagates across a system through local interactions. We show that predictive equivalence classes of lightcones -- local causal states -- capture organized behaviors and coherent structures in complex spatiotemporal systems. Employing an unsupervised physics-informed machine learning algorithm and a high-performance computing implementation, we demonstrate automatically discovering coherent structures in two real world domain science problems. We show that local causal states identify vortices and track their power-law decay behavior in two-dimensional fluid turbulence. We then show how to detect and track familiar extreme weather events -- hurricanes and atmospheric rivers -- and discover other novel coherent structures associated with precipitation extremes in high-resolution climate data at the grid-cell level.
△ Less
Submitted 28 September, 2023; v1 submitted 25 April, 2023;
originally announced April 2023.
-
Complexity-calibrated Benchmarks for Machine Learning Reveal When Next-Generation Reservoir Computer Predictions Succeed and Mislead
Authors:
Sarah E. Marzen,
Paul M. Riechers,
James P. Crutchfield
Abstract:
Recurrent neural networks are used to forecast time series in finance, climate, language, and from many other domains. Reservoir computers are a particularly easily trainable form of recurrent neural network. Recently, a "next-generation" reservoir computer was introduced in which the memory trace involves only a finite number of previous symbols. We explore the inherent limitations of finite-past…
▽ More
Recurrent neural networks are used to forecast time series in finance, climate, language, and from many other domains. Reservoir computers are a particularly easily trainable form of recurrent neural network. Recently, a "next-generation" reservoir computer was introduced in which the memory trace involves only a finite number of previous symbols. We explore the inherent limitations of finite-past memory traces in this intriguing proposal. A lower bound from Fano's inequality shows that, on highly non-Markovian processes generated by large probabilistic state machines, next-generation reservoir computers with reasonably long memory traces have an error probability that is at least ~ 60% higher than the minimal attainable error probability in predicting the next observation. More generally, it appears that popular recurrent neural networks fall far short of optimally predicting such complex processes. These results highlight the need for a new generation of optimized recurrent neural network architectures. Alongside this finding, we present concentration-of-measure results for randomly-generated but complex processes. One conclusion is that large probabilistic state machines -- specifically, large $ε$-machines -- are key to generating challenging and structurally-unbiased stimuli for ground-truthing recurrent neural network architectures.
△ Less
Submitted 25 March, 2023;
originally announced March 2023.
-
Intrinsic and Measured Information in Separable Quantum Processes
Authors:
David Gier,
James P. Crutchfield
Abstract:
Stationary quantum information sources emit sequences of correlated qudits -- that is, structured quantum stochastic processes. If an observer performs identical measurements on a qudit sequence, the outcomes are a realization of a classical stochastic process. We introduce quantum-information-theoretic properties for separable qudit sequences that serve as bounds on the classical information prop…
▽ More
Stationary quantum information sources emit sequences of correlated qudits -- that is, structured quantum stochastic processes. If an observer performs identical measurements on a qudit sequence, the outcomes are a realization of a classical stochastic process. We introduce quantum-information-theoretic properties for separable qudit sequences that serve as bounds on the classical information properties of subsequent measured processes. For sources driven by hidden Markov dynamics we describe how an observer can temporarily or permanently synchronize to the source's internal state using specific positive operator-valued measures or adaptive measurement protocols. We introduce a method for approximating an information source with an independent and identically-distributed, Markov, or larger memory model through tomographic reconstruction. We identify broad classes of separable processes based on their quantum information properties and the complexity of measurements required to synchronize to and accurately reconstruct them.
△ Less
Submitted 28 February, 2023;
originally announced March 2023.
-
Whale Casting: Remote mobile streaming humpback whale vocalizations to the world
Authors:
James P. Crutchfield,
Alexandra M. Jurgens
Abstract:
Over several days in early August 2021, while at sea in Chatham Strait, Southeast Alaska, aboard M/Y Blue Pearl, an online twitch.tv stream broadcast in real-time humpback whale vocalizations monitored via hydrophone. Dozens on mainland North American and around the planet listened in and chatted via the stream. The webcasts demonstrated a proof-of-concept: only relatively inexpensive commercial-o…
▽ More
Over several days in early August 2021, while at sea in Chatham Strait, Southeast Alaska, aboard M/Y Blue Pearl, an online twitch.tv stream broadcast in real-time humpback whale vocalizations monitored via hydrophone. Dozens on mainland North American and around the planet listened in and chatted via the stream. The webcasts demonstrated a proof-of-concept: only relatively inexpensive commercial-off-the-shelf equipment is required for remote mobile streaming at sea. These notes document what was required and make recommendations for higher-quality and larger-scale deployments. One conclusion is that real-time, automated audio documenting whale acoustic behavior is readily accessible and, using the cloud, it can be directly integrated into behavioral databases -- information sources that now often focus exclusively on nonreal-time visual-sighting narrative reports and photography.
△ Less
Submitted 5 December, 2022;
originally announced December 2022.
-
Exploring Predictive States via Cantor Embeddings and Wasserstein Distance
Authors:
Samuel P. Loomis,
James P. Crutchfield
Abstract:
Predictive states for stochastic processes are a nonparametric and interpretable construct with relevance across a multitude of modeling paradigms. Recent progress on the self-supervised reconstruction of predictive states from time-series data focused on the use of reproducing kernel Hilbert spaces. Here, we examine how Wasserstein distances may be used to detect predictive equivalences in symbol…
▽ More
Predictive states for stochastic processes are a nonparametric and interpretable construct with relevance across a multitude of modeling paradigms. Recent progress on the self-supervised reconstruction of predictive states from time-series data focused on the use of reproducing kernel Hilbert spaces. Here, we examine how Wasserstein distances may be used to detect predictive equivalences in symbolic data. We compute Wasserstein distances between distributions over sequences ("predictions"), using a finite-dimensional embedding of sequences based on the Cantor for the underlying geometry. We show that exploratory data analysis using the resulting geometry via hierarchical clustering and dimension reduction provides insight into the temporal structure of processes ranging from the relatively simple (e.g., finite-state hidden Markov models) to the very complex (e.g., infinite-state indexed grammars).
△ Less
Submitted 8 June, 2022;
originally announced June 2022.
-
Gigahertz Sub-Landauer Momentum Computing
Authors:
Kyle J. Ray,
James P. Crutchfield
Abstract:
We introduce a fast and highly-efficient physically-realizable bit swap. Employing readily available and scalable Josephson junction microtechnology, the design implements the recently introduced paradigm of momentum computing. Its nanosecond speeds and sub-Landauer thermodynamic efficiency arise from dynamically storing memory in momentum degrees of freedom. As such, during the swap, the microsta…
▽ More
We introduce a fast and highly-efficient physically-realizable bit swap. Employing readily available and scalable Josephson junction microtechnology, the design implements the recently introduced paradigm of momentum computing. Its nanosecond speeds and sub-Landauer thermodynamic efficiency arise from dynamically storing memory in momentum degrees of freedom. As such, during the swap, the microstate distribution is never near equilibrium and the memory-state dynamics fall far outside of stochastic thermodynamics that assumes detailed-balanced Markovian dynamics. The device implements a bit-swap operation -- a fundamental operation necessary to build reversible universal computing. Extensive, physically-calibrated simulations demonstrate that device performance is robust and that momentum computing can support thermodynamically-efficient, high-speed, large-scale general-purpose computing that circumvents Landauer's bound.
△ Less
Submitted 18 November, 2022; v1 submitted 14 February, 2022;
originally announced February 2022.
-
Quantum Information Dimension and Geometric Entropy
Authors:
Fabio Anza,
James P. Crutchfield
Abstract:
Geometric quantum mechanics, through its differential-geometric underpinning, provides additional tools of analysis and interpretation that bring quantum mechanics closer to classical mechanics: state spaces in both are equipped with symplectic geometry. This opens the door to revisiting foundational questions and issues, such as the nature of quantum entropy, from a geometric perspective. Central…
▽ More
Geometric quantum mechanics, through its differential-geometric underpinning, provides additional tools of analysis and interpretation that bring quantum mechanics closer to classical mechanics: state spaces in both are equipped with symplectic geometry. This opens the door to revisiting foundational questions and issues, such as the nature of quantum entropy, from a geometric perspective. Central to this is the concept of geometric quantum state -- the probability measure on a system's space of pure states. This space's continuity leads us to introduce two analysis tools, inspired by Renyi's information theory, to characterize and quantify fundamental properties of geometric quantum states: the quantum information dimension that is the rate of geometric quantum state compression and the dimensional geometric entropy that monitors information stored in quantum states. We recount their classical definitions, information-theoretic meanings, and physical interpretations, and adapt them to quantum systems via the geometric approach. We then explicitly compute them in various examples and classes of quantum system. We conclude commenting on future directions for information in geometric quantum mechanics.
△ Less
Submitted 12 March, 2024; v1 submitted 11 November, 2021;
originally announced November 2021.
-
Topology, Convergence, and Reconstruction of Predictive States
Authors:
Samuel P. Loomis,
James P. Crutchfield
Abstract:
Predictive equivalence in discrete stochastic processes have been applied with great success to identify randomness and structure in statistical physics and chaotic dynamical systems and to inferring hidden Markov models. We examine the conditions under which they can be reliably reconstructed from time-series data, showing that convergence of predictive states can be achieved from empirical sampl…
▽ More
Predictive equivalence in discrete stochastic processes have been applied with great success to identify randomness and structure in statistical physics and chaotic dynamical systems and to inferring hidden Markov models. We examine the conditions under which they can be reliably reconstructed from time-series data, showing that convergence of predictive states can be achieved from empirical samples in the weak topology of measures. Moreover, predictive states may be represented in Hilbert spaces that replicate the weak topology. We mathematically explain how these representations are particularly beneficial when reconstructing high-memory processes and connect them to reproducing kernel Hilbert spaces.
△ Less
Submitted 19 September, 2021;
originally announced September 2021.
-
Nonequilibrium Thermodynamics in Measuring Carbon Footprints: Disentangling Structure and Artifact in Input-Output Accounting
Authors:
Samuel P. Loomis,
Mark Cooper,
James P. Crutchfield
Abstract:
Multiregional input-output (MRIO) tables, in conjunction with Leontief analysis, are widely-used to assess the geographical distribution of carbon emissions and the economic activities that cause them. Majorization, a tool originating in economics that has found utility in statistical mechanics, can provide insight into how Leontief analysis links disparities in emissions with global income inequa…
▽ More
Multiregional input-output (MRIO) tables, in conjunction with Leontief analysis, are widely-used to assess the geographical distribution of carbon emissions and the economic activities that cause them. Majorization, a tool originating in economics that has found utility in statistical mechanics, can provide insight into how Leontief analysis links disparities in emissions with global income inequality. We examine Leontief analysis as a model, drawing out similarities with modern nonequilibrium statistical mechanics. Paralleling the physical concept of thermo-majorization, we define the concept of eco-majorization and show it is a sufficient condition to determine the directionality of embodied emission flows. Surprisingly, relatively small trade deficits and a geographically heterogeneous emissions-per-dollar ratio greatly increases the appearance of eco-majorization, regardless of any further content in the MRIO tables used. Our results are bolstered by a statistical analysis of null models of MRIO tables, based on data provided by the Global Trade Aggregation Project9
△ Less
Submitted 12 November, 2021; v1 submitted 7 June, 2021;
originally announced June 2021.
-
Divergent Predictive States: The Statistical Complexity Dimension of Stationary, Ergodic Hidden Markov Processes
Authors:
Alexandra M. Jurgens,
James P. Crutchfield
Abstract:
Even simply-defined, finite-state generators produce stochastic processes that require tracking an uncountable infinity of probabilistic features for optimal prediction. For processes generated by hidden Markov chains the consequences are dramatic. Their predictive models are generically infinite-state. And, until recently, one could determine neither their intrinsic randomness nor structural comp…
▽ More
Even simply-defined, finite-state generators produce stochastic processes that require tracking an uncountable infinity of probabilistic features for optimal prediction. For processes generated by hidden Markov chains the consequences are dramatic. Their predictive models are generically infinite-state. And, until recently, one could determine neither their intrinsic randomness nor structural complexity. The prequel, though, introduced methods to accurately calculate the Shannon entropy rate (randomness) and to constructively determine their minimal (though, infinite) set of predictive features. Leveraging this, we address the complementary challenge of determining how structured hidden Markov processes are by calculating their statistical complexity dimension -- the information dimension of the minimal set of predictive features. This tracks the divergence rate of the minimal memory resources required to optimally predict a broad class of truly complex processes.
△ Less
Submitted 15 March, 2021; v1 submitted 20 February, 2021;
originally announced February 2021.
-
Discovering Causal Structure with Reproducing-Kernel Hilbert Space $ε$-Machines
Authors:
Nicolas Brodu,
James P. Crutchfield
Abstract:
We merge computational mechanics' definition of causal states (predictively-equivalent histories) with reproducing-kernel Hilbert space (RKHS) representation inference. The result is a widely-applicable method that infers causal structure directly from observations of a system's behaviors whether they are over discrete or continuous events or time. A structural representation -- a finite- or infin…
▽ More
We merge computational mechanics' definition of causal states (predictively-equivalent histories) with reproducing-kernel Hilbert space (RKHS) representation inference. The result is a widely-applicable method that infers causal structure directly from observations of a system's behaviors whether they are over discrete or continuous events or time. A structural representation -- a finite- or infinite-state kernel $ε$-machine -- is extracted by a reduced-dimension transform that gives an efficient representation of causal states and their topology. In this way, the system dynamics are represented by a stochastic (ordinary or partial) differential equation that acts on causal states. We introduce an algorithm to estimate the associated evolution operator. Paralleling the Fokker-Plank equation, it efficiently evolves causal-state distributions and makes predictions in the original data space via an RKHS functional mapping. We demonstrate these techniques, together with their predictive abilities, on discrete-time, discrete-value infinite Markov-order processes generated by finite-state hidden Markov models with (i) finite or (ii) uncountably-infinite causal states and (iii) continuous-time, continuous-value processes generated by thermally-driven chaotic flows. The method robustly estimates causal structure in the presence of varying external and measurement noise levels and for very high dimensional data.
△ Less
Submitted 2 December, 2021; v1 submitted 23 November, 2020;
originally announced November 2020.
-
Refining Landauer's Stack: Balancing Error and Dissipation When Erasing Information
Authors:
Gregory W. Wimsatt,
Alexander B. Boyd,
Paul M. Riechers,
James P. Crutchfield
Abstract:
Nonequilibrium information thermodynamics determines the minimum energy dissipation to reliably erase memory under time-symmetric control protocols. We demonstrate that its bounds are tight and so show that the costs overwhelm those implied by Landauer's energy bound on information erasure. Moreover, in the limit of perfect computation, the costs diverge. The conclusion is that time-asymmetric pro…
▽ More
Nonequilibrium information thermodynamics determines the minimum energy dissipation to reliably erase memory under time-symmetric control protocols. We demonstrate that its bounds are tight and so show that the costs overwhelm those implied by Landauer's energy bound on information erasure. Moreover, in the limit of perfect computation, the costs diverge. The conclusion is that time-asymmetric protocols should be developed for efficient, accurate thermodynamic computing. And, that Landauer's Stack -- the full suite of theoretically-predicted thermodynamic costs -- is ready for experimental test and calibration.
△ Less
Submitted 28 November, 2020;
originally announced November 2020.
-
Spacetime Autoencoders Using Local Causal States
Authors:
Adam Rupe,
James P. Crutchfield
Abstract:
Local causal states are latent representations that capture organized pattern and structure in complex spatiotemporal systems. We expand their functionality, framing them as spacetime autoencoders. Previously, they were only considered as maps from observable spacetime fields to latent local causal state fields. Here, we show that there is a stochastic decoding that maps back from the latent field…
▽ More
Local causal states are latent representations that capture organized pattern and structure in complex spatiotemporal systems. We expand their functionality, framing them as spacetime autoencoders. Previously, they were only considered as maps from observable spacetime fields to latent local causal state fields. Here, we show that there is a stochastic decoding that maps back from the latent fields to observable fields. Furthermore, their Markovian properties define a stochastic dynamic in the latent space. Combined with stochastic decoding, this gives a new method for forecasting spacetime fields.
△ Less
Submitted 12 October, 2020;
originally announced October 2020.
-
Non-Markovian Momentum Computing: Universal and Efficient
Authors:
Kyle J. Ray,
Gregory W. Wimsatt,
Alexander B. Boyd,
James P. Crutchfield
Abstract:
All computation is physically embedded. Reflecting this, a growing body of results embraces rate equations as the underlying mechanics of thermodynamic computation and biological information processing. Strictly applying the implied continuous-time Markov chains, however, excludes a universe of natural computing. We show that expanding the toolset to continuous-time hidden Markov chains substantia…
▽ More
All computation is physically embedded. Reflecting this, a growing body of results embraces rate equations as the underlying mechanics of thermodynamic computation and biological information processing. Strictly applying the implied continuous-time Markov chains, however, excludes a universe of natural computing. We show that expanding the toolset to continuous-time hidden Markov chains substantially removes the constraints. The general point is made concrete by our analyzing two eminently-useful computations that are impossible to describe with a set of rate equations over the memory states. We design and analyze a thermodynamically-costless bit flip, providing a first counterexample to rate-equation modeling. We generalize this to a costless Fredkin gate---a key operation in reversible computing that is computation universal. Going beyond rate-equation dynamics is not only possible, but necessary if stochastic thermodynamics is to become part of the paradigm for physical information processing.
△ Less
Submitted 2 October, 2020;
originally announced October 2020.
-
Shannon Entropy Rate of Hidden Markov Processes
Authors:
Alexandra M. Jurgens,
James P. Crutchfield
Abstract:
Hidden Markov chains are widely applied statistical models of stochastic processes, from fundamental physics and chemistry to finance, health, and artificial intelligence. The hidden Markov processes they generate are notoriously complicated, however, even if the chain is finite state: no finite expression for their Shannon entropy rate exists, as the set of their predictive features is genericall…
▽ More
Hidden Markov chains are widely applied statistical models of stochastic processes, from fundamental physics and chemistry to finance, health, and artificial intelligence. The hidden Markov processes they generate are notoriously complicated, however, even if the chain is finite state: no finite expression for their Shannon entropy rate exists, as the set of their predictive features is generically infinite. As such, to date one cannot make general statements about how random they are nor how structured. Here, we address the first part of this challenge by showing how to efficiently and accurately calculate their entropy rates. We also show how this method gives the minimal set of infinite predictive features. A sequel addresses the challenge's second part on structure.
△ Less
Submitted 28 August, 2020;
originally announced August 2020.
-
Thermodynamic Machine Learning through Maximum Work Production
Authors:
A. B. Boyd,
J. P. Crutchfield,
M. Gu
Abstract:
Adaptive systems -- such as a biological organism gaining survival advantage, an autonomous robot executing a functional task, or a motor protein transporting intracellular nutrients -- must model the regularities and stochasticity in their environments to take full advantage of thermodynamic resources. Analogously, but in a purely computational realm, machine learning algorithms estimate models t…
▽ More
Adaptive systems -- such as a biological organism gaining survival advantage, an autonomous robot executing a functional task, or a motor protein transporting intracellular nutrients -- must model the regularities and stochasticity in their environments to take full advantage of thermodynamic resources. Analogously, but in a purely computational realm, machine learning algorithms estimate models to capture predictable structure and identify irrelevant noise in training data. This happens through optimization of performance metrics, such as model likelihood. If physically implemented, is there a sense in which computational models estimated through machine learning are physically preferred? We introduce the thermodynamic principle that work production is the most relevant performance metric for an adaptive physical agent and compare the results to the maximum-likelihood principle that guides machine learning. Within the class of physical agents that most efficiently harvest energy from their environment, we demonstrate that an efficient agent's model explicitly determines its architecture and how much useful work it harvests from the environment. We then show that selecting the maximum-work agent for given environmental data corresponds to finding the maximum-likelihood model. This establishes an equivalence between nonequilibrium thermodynamics and dynamic learning. In this way, work maximization emerges as an organizing principle that underlies learning in adaptive thermodynamic systems.
△ Less
Submitted 12 April, 2021; v1 submitted 27 June, 2020;
originally announced June 2020.
-
Correlated structural evolution within multiplex networks
Authors:
Haochen Wu,
Ryan G. James,
James P. Crutchfield,
Raissa M. D'Souza
Abstract:
Many natural, engineered, and social systems can be represented using the framework of a layered network, where each layer captures a different type of interaction between the same set of nodes. The study of such multiplex networks is a vibrant area of research. Yet, understanding how to quantify the correlations present between pairs of layers, and more so present in their co-evolution, is lackin…
▽ More
Many natural, engineered, and social systems can be represented using the framework of a layered network, where each layer captures a different type of interaction between the same set of nodes. The study of such multiplex networks is a vibrant area of research. Yet, understanding how to quantify the correlations present between pairs of layers, and more so present in their co-evolution, is lacking. Such methods would enable us to address fundamental questions involving issues such as function, redundancy and potential disruptions. Here we show first how the edge-set of a multiplex network can be used to construct an estimator of a joint probability distribution describing edge existence over all layers. We then adapt an information-theoretic measure of general correlation called the conditional mutual information, which uses the estimated joint probability distribution, to quantify the pairwise correlations present between layers. The pairwise comparisons can also be temporal, allowing us to identify if knowledge of a certain layer can provide additional information about the evolution of another layer.
We analyze datasets from three distinct domains---economic, political, and airline networks---to demonstrate how pairwise correlation in structure and dynamical evolution between layers can be identified and show that anomalies can serve as potential indicators of major events such as shocks.
△ Less
Submitted 9 May, 2020;
originally announced May 2020.
-
Inference, Prediction, and Entropy-Rate Estimation of Continuous-time, Discrete-event Processes
Authors:
S. E. Marzen,
J. P. Crutchfield
Abstract:
Inferring models, predicting the future, and estimating the entropy rate of discrete-time, discrete-event processes is well-worn ground. However, a much broader class of discrete-event processes operates in continuous-time. Here, we provide new methods for inferring, predicting, and estimating them. The methods rely on an extension of Bayesian structural inference that takes advantage of neural ne…
▽ More
Inferring models, predicting the future, and estimating the entropy rate of discrete-time, discrete-event processes is well-worn ground. However, a much broader class of discrete-event processes operates in continuous-time. Here, we provide new methods for inferring, predicting, and estimating them. The methods rely on an extension of Bayesian structural inference that takes advantage of neural network's universal approximation power. Based on experiments with complex synthetic data, the methods are competitive with the state-of-the-art for prediction and entropy-rate estimation.
△ Less
Submitted 7 May, 2020;
originally announced May 2020.
-
Variations on a Demonic Theme: Szilard's Other Engines
Authors:
Kyle J. Ray,
James P. Crutchfield
Abstract:
Szilard's now-famous single-molecule engine was only the first of three constructions he introduced in 1929 to resolve several paradoxes arising from Maxwell's demon. We analyze Szilard's remaining two demon models. We show that the second one, though a markedly different implementation employing a population of distinct molecular species and semi-permeable membranes, is informationally and thermo…
▽ More
Szilard's now-famous single-molecule engine was only the first of three constructions he introduced in 1929 to resolve several paradoxes arising from Maxwell's demon. We analyze Szilard's remaining two demon models. We show that the second one, though a markedly different implementation employing a population of distinct molecular species and semi-permeable membranes, is informationally and thermodynamically equivalent to an ideal gas of the single-molecule engines. Since it is a gas of noninteracting particles one concludes, following Boyd and Crutchfield, that (i) it reduces to a chaotic dynamical system---called the Szilard Map, a composite of three piecewise linear maps that implement the thermodynamic transformations of measurement, control, and erasure; (ii) its transitory functioning as an engine that converts disorganized heat energy to work is governed by the Kolmogorov-Sinai entropy rate; (iii) the demon's minimum necessary "intelligence" for optimal functioning is given by the engine's statistical complexity, and (iv) its functioning saturates thermodynamic bounds and so it is a minimal, optimal implementation. We show that Szilard's third model is rather different and addresses the fundamental issue, raised by the first two, of measurement in and by thermodynamic systems and entropy generation. Taken together, Szilard's suite of constructions lays out a range of possible realizations of Maxwellian demons that anticipated by almost two decades Shannon's and Wiener's concept of information as surprise and cybernetics' notion of functional information. This, in turn, gives new insight into engineering implementations of novel nanoscale information engines that leverage microscopic fluctuations and into the diversity of thermodynamic mechanisms and intrinsic computation harnessed in physical, molecular, biochemical, and biological systems.
△ Less
Submitted 22 March, 2020;
originally announced March 2020.
-
Functional Thermodynamics of Maxwellian Ratchets: Constructing and Deconstructing Patterns, Randomizing and Derandomizing Behaviors
Authors:
Alexandra M. Jurgens,
James P. Crutchfield
Abstract:
Maxwellian ratchets are autonomous, finite-state thermodynamic engines that implement input-output informational transformations. Previous studies of these "demons" focused on how they exploit environmental resources to generate work: They randomize ordered inputs, leveraging increased Shannon entropy to transfer energy from a thermal reservoir to a work reservoir while respecting both Liouvillian…
▽ More
Maxwellian ratchets are autonomous, finite-state thermodynamic engines that implement input-output informational transformations. Previous studies of these "demons" focused on how they exploit environmental resources to generate work: They randomize ordered inputs, leveraging increased Shannon entropy to transfer energy from a thermal reservoir to a work reservoir while respecting both Liouvillian state-space dynamics and the Second Law. However, to date, correctly determining such functional thermodynamic operating regimes was restricted to a very few engines for which correlations among their information-bearing degrees of freedom could be calculated exactly and in closed form---a highly restricted set. Additionally, a key second dimension of ratchet behavior was largely ignored---ratchets do not merely change the randomness of environmental inputs, their operation constructs and deconstructs patterns. To address both dimensions, we adapt recent results from dynamical-systems and ergodic theories that efficiently and accurately calculate the entropy rates and the rate of statistical complexity divergence of general hidden Markov processes. In concert with the Information Processing Second Law, these methods accurately determine thermodynamic operating regimes for finite-state Maxwellian demons with arbitrary numbers of states and transitions. In addition, they facilitate analyzing structure versus randomness trade-offs that a given engine makes. The result is a greatly enhanced perspective on the information processing capabilities of information engines. As an application, we give a thorough-going analysis of the Mandal-Jarzynski ratchet, demonstrating that it has an uncountably-infinite effective state space.
△ Less
Submitted 29 May, 2020; v1 submitted 28 February, 2020;
originally announced March 2020.
-
Thermodynamically-Efficient Local Computation and the Inefficiency of Quantum Memory Compression
Authors:
Samuel P. Loomis,
James P. Crutchfield
Abstract:
Modularity dissipation identifies how locally-implemented computation entails costs beyond those required by Landauer's bound on thermodynamic computing. We establish a general theorem for efficient local computation, giving the necessary and sufficient conditions for a local operation to have zero modularity cost. Applied to thermodynamically-generating stochastic processes it confirms a conjectu…
▽ More
Modularity dissipation identifies how locally-implemented computation entails costs beyond those required by Landauer's bound on thermodynamic computing. We establish a general theorem for efficient local computation, giving the necessary and sufficient conditions for a local operation to have zero modularity cost. Applied to thermodynamically-generating stochastic processes it confirms a conjecture that classical generators are efficient if and only if they satisfy retrodiction, which places minimal memory requirements on the generator. This extends immediately to quantum computation: Any quantum simulator that employs quantum memory compression cannot be thermodynamically efficient.
△ Less
Submitted 1 February, 2020; v1 submitted 7 January, 2020;
originally announced January 2020.
-
Thermodynamic Computing
Authors:
Tom Conte,
Erik DeBenedictis,
Natesh Ganesh,
Todd Hylton,
John Paul Strachan,
R. Stanley Williams,
Alexander Alemi,
Lee Altenberg,
Gavin Crooks,
James Crutchfield,
Lidia del Rio,
Josh Deutsch,
Michael DeWeese,
Khari Douglas,
Massimiliano Esposito,
Michael Frank,
Robert Fry,
Peter Harsha,
Mark Hill,
Christopher Kello,
Jeff Krichmar,
Suhas Kumar,
Shih-Chii Liu,
Seth Lloyd,
Matteo Marsili
, et al. (14 additional authors not shown)
Abstract:
The hardware and software foundations laid in the first half of the 20th Century enabled the computing technologies that have transformed the world, but these foundations are now under siege. The current computing paradigm, which is the foundation of much of the current standards of living that we now enjoy, faces fundamental limitations that are evident from several perspectives. In terms of hard…
▽ More
The hardware and software foundations laid in the first half of the 20th Century enabled the computing technologies that have transformed the world, but these foundations are now under siege. The current computing paradigm, which is the foundation of much of the current standards of living that we now enjoy, faces fundamental limitations that are evident from several perspectives. In terms of hardware, devices have become so small that we are struggling to eliminate the effects of thermodynamic fluctuations, which are unavoidable at the nanometer scale. In terms of software, our ability to imagine and program effective computational abstractions and implementations are clearly challenged in complex domains. In terms of systems, currently five percent of the power generated in the US is used to run computing systems - this astonishing figure is neither ecologically sustainable nor economically scalable. Economically, the cost of building next-generation semiconductor fabrication plants has soared past $10 billion. All of these difficulties - device scaling, software complexity, adaptability, energy consumption, and fabrication economics - indicate that the current computing paradigm has matured and that continued improvements along this path will be limited. If technological progress is to continue and corresponding social and economic benefits are to continue to accrue, computing must become much more capable, energy efficient, and affordable. We propose that progress in computing can continue under a united, physically grounded, computational paradigm centered on thermodynamics. Herein we propose a research agenda to extend these thermodynamic foundations into complex, non-equilibrium, self-organizing systems and apply them holistically to future computing systems that will harness nature's innate computational capacity. We call this type of computing "Thermodynamic Computing" or TC.
△ Less
Submitted 14 November, 2019; v1 submitted 5 November, 2019;
originally announced November 2019.
-
Thermal Efficiency of Quantum Memory Compression
Authors:
Samuel P. Loomis,
James P. Crutchfield
Abstract:
Quantum coherence allows for reduced-memory simulators of classical processes. Using recent results in single-shot quantum thermodynamics, we derive a minimal work cost rate for quantum simulators that is quasistatically attainable in the limit of asymptotically-infinite parallel simulation. Comparing this cost with the classical regime reveals that quantizing classical simulators not only results…
▽ More
Quantum coherence allows for reduced-memory simulators of classical processes. Using recent results in single-shot quantum thermodynamics, we derive a minimal work cost rate for quantum simulators that is quasistatically attainable in the limit of asymptotically-infinite parallel simulation. Comparing this cost with the classical regime reveals that quantizing classical simulators not only results in memory compression but also in reduced dissipation. We explore this advantage across a suite of representative examples.
△ Less
Submitted 22 March, 2020; v1 submitted 3 November, 2019;
originally announced November 2019.
-
Probabilistic Deterministic Finite Automata and Recurrent Networks, Revisited
Authors:
S. E. Marzen,
J. P. Crutchfield
Abstract:
Reservoir computers (RCs) and recurrent neural networks (RNNs) can mimic any finite-state automaton in theory, and some workers demonstrated that this can hold in practice. We test the capability of generalized linear models, RCs, and Long Short-Term Memory (LSTM) RNN architectures to predict the stochastic processes generated by a large suite of probabilistic deterministic finite-state automata (…
▽ More
Reservoir computers (RCs) and recurrent neural networks (RNNs) can mimic any finite-state automaton in theory, and some workers demonstrated that this can hold in practice. We test the capability of generalized linear models, RCs, and Long Short-Term Memory (LSTM) RNN architectures to predict the stochastic processes generated by a large suite of probabilistic deterministic finite-state automata (PDFA). PDFAs provide an excellent performance benchmark in that they can be systematically enumerated, the randomness and correlation structure of their generated processes are exactly known, and their optimal memory-limited predictors are easily computed. Unsurprisingly, LSTMs outperform RCs, which outperform generalized linear models. Surprisingly, each of these methods can fall short of the maximal predictive accuracy by as much as 50% after training and, when optimized, tend to fall short of the maximal predictive accuracy by ~5%, even though previously available methods achieve maximal predictive accuracy with orders-of-magnitude less data. Thus, despite the representational universality of RCs and RNNs, using them can engender a surprising predictive gap for simple stimuli. One concludes that there is an important and underappreciated role for methods that infer "causal states" or "predictive state representations".
△ Less
Submitted 16 October, 2019;
originally announced October 2019.
-
DisCo: Physics-Based Unsupervised Discovery of Coherent Structures in Spatiotemporal Systems
Authors:
Adam Rupe,
Nalini Kumar,
Vladislav Epifanov,
Karthik Kashinath,
Oleksandr Pavlyk,
Frank Schlimbach,
Mostofa Patwary,
Sergey Maidanov,
Victor Lee,
Prabhat,
James P. Crutchfield
Abstract:
Extracting actionable insight from complex unlabeled scientific data is an open challenge and key to unlocking data-driven discovery in science. Complementary and alternative to supervised machine learning approaches, unsupervised physics-based methods based on behavior-driven theories hold great promise. Due to computational limitations, practical application on real-world domain science problems…
▽ More
Extracting actionable insight from complex unlabeled scientific data is an open challenge and key to unlocking data-driven discovery in science. Complementary and alternative to supervised machine learning approaches, unsupervised physics-based methods based on behavior-driven theories hold great promise. Due to computational limitations, practical application on real-world domain science problems has lagged far behind theoretical development. We present our first step towards bridging this divide - DisCo - a high-performance distributed workflow for the behavior-driven local causal state theory. DisCo provides a scalable unsupervised physics-based representation learning method that decomposes spatiotemporal systems into their structurally relevant components, which are captured by the latent local causal state variables. Complex spatiotemporal systems are generally highly structured and organize around a lower-dimensional skeleton of coherent structures, and in several firsts we demonstrate the efficacy of DisCo in capturing such structures from observational and simulated scientific data. To the best of our knowledge, DisCo is also the first application software developed entirely in Python to scale to over 1000 machine nodes, providing good performance along with ensuring domain scientists' productivity. We developed scalable, performant methods optimized for Intel many-core processors that will be upstreamed to open-source Python library packages. Our capstone experiment, using newly developed DisCo workflow and libraries, performs unsupervised spacetime segmentation analysis of CAM5.1 climate simulation data, processing an unprecedented 89.5 TB in 6.6 minutes end-to-end using 1024 Intel Haswell nodes on the Cori supercomputer obtaining 91% weak-scaling and 64% strong-scaling efficiency.
△ Less
Submitted 25 September, 2019;
originally announced September 2019.
-
Towards Unsupervised Segmentation of Extreme Weather Events
Authors:
Adam Rupe,
Karthik Kashinath,
Nalini Kumar,
Victor Lee,
Prabhat,
James P. Crutchfield
Abstract:
Extreme weather is one of the main mechanisms through which climate change will directly impact human society. Coping with such change as a global community requires markedly improved understanding of how global warming drives extreme weather events. While alternative climate scenarios can be simulated using sophisticated models, identifying extreme weather events in these simulations requires aut…
▽ More
Extreme weather is one of the main mechanisms through which climate change will directly impact human society. Coping with such change as a global community requires markedly improved understanding of how global warming drives extreme weather events. While alternative climate scenarios can be simulated using sophisticated models, identifying extreme weather events in these simulations requires automation due to the vast amounts of complex high-dimensional data produced. Atmospheric dynamics, and hydrodynamic flows more generally, are highly structured and largely organize around a lower dimensional skeleton of coherent structures. Indeed, extreme weather events are a special case of more general hydrodynamic coherent structures. We present a scalable physics-based representation learning method that decomposes spatiotemporal systems into their structurally relevant components, which are captured by latent variables known as local causal states. For complex fluid flows we show our method is capable of capturing known coherent structures, and with promising segmentation results on CAM5.1 water vapor data we outline the path to extreme weather identification from unlabeled climate model simulation data.
△ Less
Submitted 16 September, 2019;
originally announced September 2019.
-
Balancing Error and Dissipation in Computing
Authors:
P. M. Riechers,
A. B. Boyd,
G. W. Wimsatt,
J. P. Crutchfield
Abstract:
Modern digital electronics support remarkably reliable computing, especially given the challenge of controlling nanoscale logical components that interact in fluctuating environments. However, we demonstrate that the high-reliability limit is subject to a fundamental error-energy-efficiency tradeoff that arises from time-symmetric control: Requiring a low probability of error causes energy consump…
▽ More
Modern digital electronics support remarkably reliable computing, especially given the challenge of controlling nanoscale logical components that interact in fluctuating environments. However, we demonstrate that the high-reliability limit is subject to a fundamental error-energy-efficiency tradeoff that arises from time-symmetric control: Requiring a low probability of error causes energy consumption to diverge as logarithm of the inverse error rate for nonreciprocal logical transitions. The reciprocity (self-invertibility) of a computation is a stricter condition for thermodynamic efficiency than logical reversibility (invertibility), the latter being the root of Landauer's work bound on erasing information. Beyond engineered computation, the results identify a generic error-dissipation tradeoff in steady-state transformations of genetic information carried out by biological organisms. The lesson is that computation under time-symmetric control cannot reach, and is often far above, the Landauer limit. In this way, time-asymmetry becomes a design principle for thermodynamically efficient computing.
△ Less
Submitted 2 June, 2020; v1 submitted 14 September, 2019;
originally announced September 2019.
-
Fraudulent White Noise: Flat power spectra belie arbitrarily complex processes
Authors:
P. M. Riechers,
J. P. Crutchfield
Abstract:
Power spectral densities are a common, convenient, and powerful way to analyze signals. So much so that they are now broadly deployed across the sciences and engineering---from quantum physics to cosmology, and from crystallography to neuroscience to speech recognition. The features they reveal not only identify prominent signal-frequencies but also hint at mechanisms that generate correlation and…
▽ More
Power spectral densities are a common, convenient, and powerful way to analyze signals. So much so that they are now broadly deployed across the sciences and engineering---from quantum physics to cosmology, and from crystallography to neuroscience to speech recognition. The features they reveal not only identify prominent signal-frequencies but also hint at mechanisms that generate correlation and lead to resonance. Despite their near-centuries-long run of successes in signal analysis, here we show that flat power spectra can be generated by highly complex processes, effectively hiding all inherent structure in complex signals. Historically, this circumstance has been widely misinterpreted, being taken as the renowned signature of "structureless" white noise---the benchmark of randomness. We argue, in contrast, to the extent that most real-world complex systems exhibit correlations beyond pairwise statistics their structures evade power spectra and other pairwise statistical measures. As concrete physical examples, we demonstrate that fraudulent white noise hides the predictable structure of both entangled quantum systems and chaotic crystals.
To make these words of warning operational, we present constructive results that explore how this situation comes about and the high toll it takes in understanding complex mechanisms. First, we give the closed-form solution for the power spectrum of a very broad class of structurally-complex signal generators. Second, we demonstrate the close relationship between eigen-spectra of evolution operators and power spectra. Third, we characterize the minimal generative structure implied by any power spectrum. Fourth, we show how to construct arbitrarily complex processes with flat power spectra. Finally, leveraging this diagnosis of the problem, we point the way to developing more incisive tools for discovering structure in complex signals.
△ Less
Submitted 6 July, 2020; v1 submitted 29 August, 2019;
originally announced August 2019.
-
Harnessing Fluctuations in Thermodynamic Computing via Time-Reversal Symmetries
Authors:
Gregory Wimsatt,
Olli-Pentti Saira,
Alexander B. Boyd,
Matthew H. Matheny,
Siyuan Han,
Michael L. Roukes,
James P. Crutchfield
Abstract:
We experimentally demonstrate that highly structured distributions of work emerge during even the simple task of erasing a single bit. These are signatures of a refined suite of time-reversal symmetries in distinct functional classes of microscopic trajectories. As a consequence, we introduce a broad family of conditional fluctuation theorems that the component work distributions must satisfy. Sin…
▽ More
We experimentally demonstrate that highly structured distributions of work emerge during even the simple task of erasing a single bit. These are signatures of a refined suite of time-reversal symmetries in distinct functional classes of microscopic trajectories. As a consequence, we introduce a broad family of conditional fluctuation theorems that the component work distributions must satisfy. Since they identify entropy production, the component work distributions encode both the frequency of various mechanisms of success and failure during computing, as well giving improved estimates of the total irreversibly-dissipated heat. This new diagnostic tool provides strong evidence that thermodynamic computing at the nanoscale can be constructively harnessed. We experimentally verify this functional decomposition and the new class of fluctuation theorems by measuring transitions between flux states in a superconducting circuit.
△ Less
Submitted 27 June, 2019;
originally announced June 2019.
-
Spacetime Symmetries, Invariant Sets, and Additive Subdynamics of Cellular Automata
Authors:
Adam Rupe,
James P. Crutchfield
Abstract:
Cellular automata are fully-discrete, spatially-extended dynamical systems that evolve by simultaneously applying a local update function. Despite their simplicity, the induced global dynamic produces a stunning array of richly-structured, complex behaviors. These behaviors present a challenge to traditional closed-form analytic methods. In certain cases, specifically when the local update is addi…
▽ More
Cellular automata are fully-discrete, spatially-extended dynamical systems that evolve by simultaneously applying a local update function. Despite their simplicity, the induced global dynamic produces a stunning array of richly-structured, complex behaviors. These behaviors present a challenge to traditional closed-form analytic methods. In certain cases, specifically when the local update is additive, powerful techniques may be brought to bear, including characteristic polynomials, the ergodic theorem with Fourier analysis, and endomorphisms of compact Abelian groups. For general dynamics, though, where such analytics generically do not apply, behavior-driven analysis shows great promise in directly monitoring the emergence of structure and complexity in cellular automata. Here we detail a surprising connection between generalized symmetries in the spacetime fields of configuration orbits as revealed by the behavior-driven local causal states, invariant sets of spatial configurations, and additive subdynamics which allow for closed-form analytic methods.
△ Less
Submitted 30 December, 2018;
originally announced December 2018.
-
Shortcuts to Thermodynamic Computing: The Cost of Fast and Faithful Erasure
Authors:
A. B. Boyd,
A. Patra,
C. Jarzynski,
J. P. Crutchfield
Abstract:
Landauer's Principle states that the energy cost of information processing must exceed the product of the temperature and the change in Shannon entropy of the information-bearing degrees of freedom. However, this lower bound is achievable only for quasistatic, near-equilibrium computations -- that is, only over infinite time. In practice, information processing takes place in finite time, resultin…
▽ More
Landauer's Principle states that the energy cost of information processing must exceed the product of the temperature and the change in Shannon entropy of the information-bearing degrees of freedom. However, this lower bound is achievable only for quasistatic, near-equilibrium computations -- that is, only over infinite time. In practice, information processing takes place in finite time, resulting in dissipation and potentially unreliable logical outcomes. For overdamped Langevin dynamics, we show that counterdiabatic potentials can be crafted to guide systems rapidly and accurately along desired computational paths, providing shortcuts that allows for the precise design of finite-time computations. Such shortcuts require additional work, beyond Landauer's bound, that is irretrievably dissipated into the environment. We show that this dissipated work is proportional to the computation rate as well as the square of the information-storing system's length scale. As a paradigmatic example, we design shortcuts to erase a bit of information metastably stored in a double-well potential. Though dissipated work generally increases with erasure fidelity, we show that it is possible perform perfect erasure in finite time with finite work. We also show that the robustness of information storage affects the energetic cost of erasure---specifically, the dissipated work scales as the information lifetime of the bistable system. Our analysis exposes a rich and nuanced relationship between work, speed, size of the information-bearing degrees of freedom, storage robustness, and the difference between initial and final informational statistics.
△ Less
Submitted 28 December, 2018;
originally announced December 2018.
-
Unique Information and Secret Key Agreement
Authors:
Ryan G. James,
Jeffrey Emenheiser,
James P. Crutchfield
Abstract:
The partial information decomposition (PID) is a promising framework for decomposing a joint random variable into the amount of influence each source variable Xi has on a target variable Y, relative to the other sources. For two sources, influence breaks down into the information that both X0 and X1 redundantly share with Y, what X0 uniquely shares with Y, what X1 uniquely shares with Y, and final…
▽ More
The partial information decomposition (PID) is a promising framework for decomposing a joint random variable into the amount of influence each source variable Xi has on a target variable Y, relative to the other sources. For two sources, influence breaks down into the information that both X0 and X1 redundantly share with Y, what X0 uniquely shares with Y, what X1 uniquely shares with Y, and finally what X0 and X1 synergistically share with Y. Unfortunately, considerable disagreement has arisen as to how these four components should be quantified. Drawing from cryptography, we consider the secret key agreement rate as an operational method of quantifying unique informations. Secret key agreement rate comes in several forms, depending upon which parties are permitted to communicate. We demonstrate that three of these four forms are inconsistent with the PID. The remaining form implies certain interpretations as to the PID's meaning---interpretations not present in PID's definition but that, we argue, need to be explicit. These reveal an inconsistency between third-order connected information, two-way secret key agreement rate, and synergy. Similar difficulties arise with a popular PID measure in light the results here as well as from a maximum entropy viewpoint. We close by reviewing the challenges facing the PID.
△ Less
Submitted 1 November, 2018;
originally announced November 2018.
-
Strong and Weak Optimizations in Classical and Quantum Models of Stochastic Processes
Authors:
Samuel Loomis,
James P. Crutchfield
Abstract:
Among the predictive hidden Markov models that describe a given stochastic process, the ε-machine is strongly minimal in that it minimizes every Rényi-based memory measure. Quantum models can be smaller still. In contrast with the ε-machine's unique role in the classical setting, however, among the class of processes described by pure-state hidden quantum Markov models, there are those for which t…
▽ More
Among the predictive hidden Markov models that describe a given stochastic process, the ε-machine is strongly minimal in that it minimizes every Rényi-based memory measure. Quantum models can be smaller still. In contrast with the ε-machine's unique role in the classical setting, however, among the class of processes described by pure-state hidden quantum Markov models, there are those for which there does not exist any strongly minimal model. Quantum memory optimization then depends on which memory measure best matches a given problem circumstance.
△ Less
Submitted 26 August, 2018;
originally announced August 2018.
-
A Perspective on Unique Information: Directionality, Intuitions, and Secret Key Agreement
Authors:
Ryan G. James,
Jeffrey Emenheiser,
James P. Crutchfield
Abstract:
Recently, the partial information decomposition emerged as a promising framework for identifying the meaningful components of the information contained in a joint distribution. Its adoption and practical application, however, have been stymied by the lack of a generally-accepted method of quantifying its components. Here, we briefly discuss the bivariate (two-source) partial information decomposit…
▽ More
Recently, the partial information decomposition emerged as a promising framework for identifying the meaningful components of the information contained in a joint distribution. Its adoption and practical application, however, have been stymied by the lack of a generally-accepted method of quantifying its components. Here, we briefly discuss the bivariate (two-source) partial information decomposition and two implicitly directional interpretations used to intuitively motivate alternative component definitions. Drawing parallels with secret key agreement rates from information-theoretic cryptography, we demonstrate that these intuitions are mutually incompatible and suggest that this underlies the persistence of competing definitions and interpretations. Having highlighted this hitherto unacknowledged issue, we outline several possible solutions.
△ Less
Submitted 26 August, 2018;
originally announced August 2018.
-
Modes of Information Flow
Authors:
Ryan G. James,
Blanca Daniella Mansante Ayala,
Bahti Zakirov,
James P. Crutchfield
Abstract:
Information flow between components of a system takes many forms and is key to understanding the organization and functioning of large-scale, complex systems. We demonstrate three modalities of information flow from time series X to time series Y. Intrinsic information flow exists when the past of X is individually predictive of the present of Y, independent of Y's past; this is most commonly cons…
▽ More
Information flow between components of a system takes many forms and is key to understanding the organization and functioning of large-scale, complex systems. We demonstrate three modalities of information flow from time series X to time series Y. Intrinsic information flow exists when the past of X is individually predictive of the present of Y, independent of Y's past; this is most commonly considered information flow. Shared information flow exists when X's past is predictive of Y's present in the same manner as Y's past; this occurs due to synchronization or common driving, for example. Finally, synergistic information flow occurs when neither X's nor Y's pasts are predictive of Y's present on their own, but taken together they are. The two most broadly-employed information-theoretic methods of quantifying information flow---time-delayed mutual information and transfer entropy---are both sensitive to a pair of these modalities: time-delayed mutual information to both intrinsic and shared flow, and transfer entropy to both intrinsic and synergistic flow. To quantify each mode individually we introduce our cryptographic flow ansatz, positing that intrinsic flow is synonymous with secret key agreement between X and Y. Based on this, we employ an easily-computed secret-key-agreement bound---intrinsic mutual information&mdashto quantify the three flow modalities in a variety of systems including asymmetric flows and financial markets.
△ Less
Submitted 20 August, 2018;
originally announced August 2018.
-
Optimized Bacteria are Environmental Prediction Engines
Authors:
Sarah E. Marzen,
James P. Crutchfield
Abstract:
Experimentalists have observed phenotypic variability in isogenic bacteria populations. We explore the hypothesis that in fluctuating environments this variability is tuned to maximize a bacterium's expected log growth rate, potentially aided by epigenetic markers that store information about past environments. We show that, in a complex, memoryful environment, the maximal expected log growth rate…
▽ More
Experimentalists have observed phenotypic variability in isogenic bacteria populations. We explore the hypothesis that in fluctuating environments this variability is tuned to maximize a bacterium's expected log growth rate, potentially aided by epigenetic markers that store information about past environments. We show that, in a complex, memoryful environment, the maximal expected log growth rate is linear in the instantaneous predictive information---the mutual information between a bacterium's epigenetic markers and future environmental states. Hence, under resource constraints, optimal epigenetic markers are causal states---the minimal sufficient statistics for prediction. This is the minimal amount of information about the past needed to predict the future as well as possible. We suggest new theoretical investigations into and new experiments on bacteria phenotypic bet-hedging in fluctuating complex environments.
△ Less
Submitted 8 February, 2018;
originally announced February 2018.
-
Local Causal States and Discrete Coherent Structures
Authors:
Adam Rupe,
James P. Crutchfield
Abstract:
Coherent structures form spontaneously in nonlinear spatiotemporal systems and are found at all spatial scales in natural phenomena from laboratory hydrodynamic flows and chemical reactions to ocean, atmosphere, and planetary climate dynamics. Phenomenologically, they appear as key components that organize the macroscopic behaviors in such systems. Despite a century of effort, they have eluded rig…
▽ More
Coherent structures form spontaneously in nonlinear spatiotemporal systems and are found at all spatial scales in natural phenomena from laboratory hydrodynamic flows and chemical reactions to ocean, atmosphere, and planetary climate dynamics. Phenomenologically, they appear as key components that organize the macroscopic behaviors in such systems. Despite a century of effort, they have eluded rigorous analysis and empirical prediction, with progress being made only recently. As a step in this, we present a formal theory of coherent structures in fully-discrete dynamical field theories. It builds on the notion of structure introduced by computational mechanics, generalizing it to a local spatiotemporal setting. The analysis' main tool employs the \localstates, which are used to uncover a system's hidden spatiotemporal symmetries and which identify coherent structures as spatially-localized deviations from those symmetries. The approach is behavior-driven in the sense that it does not rely on directly analyzing spatiotemporal equations of motion, rather it considers only the spatiotemporal fields a system generates. As such, it offers an unsupervised approach to discover and describe coherent structures. We illustrate the approach by analyzing coherent structures generated by elementary cellular automata, comparing the results with an earlier, dynamic-invariant-set approach that decomposes fields into domains, particles, and particle interactions.
△ Less
Submitted 1 January, 2018;
originally announced January 2018.
-
The Origins of Computational Mechanics: A Brief Intellectual History and Several Clarifications
Authors:
James P. Crutchfield
Abstract:
The principle goal of computational mechanics is to define pattern and structure so that the organization of complex systems can be detected and quantified. Computational mechanics developed from efforts in the 1970s and early 1980s to identify strange attractors as the mechanism driving weak fluid turbulence via the method of reconstructing attractor geometry from measurement time series and in t…
▽ More
The principle goal of computational mechanics is to define pattern and structure so that the organization of complex systems can be detected and quantified. Computational mechanics developed from efforts in the 1970s and early 1980s to identify strange attractors as the mechanism driving weak fluid turbulence via the method of reconstructing attractor geometry from measurement time series and in the mid-1980s to estimate equations of motion directly from complex time series. In providing a mathematical and operational definition of structure it addressed weaknesses of these early approaches to discovering patterns in natural systems.
Since then, computational mechanics has led to a range of results from theoretical physics and nonlinear mathematics to diverse applications---from closed-form analysis of Markov and non-Markov stochastic processes that are ergodic or nonergodic and their measures of information and intrinsic computation to complex materials and deterministic chaos and intelligence in Maxwellian demons to quantum compression of classical processes and the evolution of computation and language.
This brief review clarifies several misunderstandings and addresses concerns recently raised regarding early works in the field (1980s). We show that misguided evaluations of the contributions of computational mechanics are groundless and stem from a lack of familiarity with its basic goals and from a failure to consider its historical context. For all practical purposes, its modern methods and results largely supersede the early works. This not only renders recent criticism moot and shows the solid ground on which computational mechanics stands but, most importantly, shows the significant progress achieved over three decades and points to the many intriguing and outstanding challenges in understanding the computational nature of complex dynamic systems.
△ Less
Submitted 18 October, 2017;
originally announced October 2017.
-
Optimizing Quantum Models of Classical Channels: The reverse Holevo problem
Authors:
S. Loomis,
J. R. Mahoney,
C. Aghamohammadi,
J. P. Crutchfield
Abstract:
Given a classical channel---a stochastic map from inputs to outputs---the input can often be transformed to an intermediate variable that is informationally smaller than the input. The new channel accurately simulates the original but at a smaller transmission rate. Here, we examine this procedure when the intermediate variable is a quantum state. We determine when and how well quantum simulations…
▽ More
Given a classical channel---a stochastic map from inputs to outputs---the input can often be transformed to an intermediate variable that is informationally smaller than the input. The new channel accurately simulates the original but at a smaller transmission rate. Here, we examine this procedure when the intermediate variable is a quantum state. We determine when and how well quantum simulations of classical channels may improve upon the minimal rates of classical simulation. This inverts Holevo's original question of quantifying the capacity of quantum channels with classical resources. We also show that this problem is equivalent to another, involving the local generation of a distribution from common entanglement.
△ Less
Submitted 29 October, 2019; v1 submitted 23 September, 2017;
originally announced September 2017.
-
Unique Information via Dependency Constraints
Authors:
Ryan G. James,
Jeffrey Emenheiser,
James P. Crutchfield
Abstract:
The partial information decomposition (PID) is perhaps the leading proposal for resolving information shared between a set of sources and a target into redundant, synergistic, and unique constituents. Unfortunately, the PID framework has been hindered by a lack of a generally agreed-upon, multivariate method of quantifying the constituents. Here, we take a step toward rectifying this by developing…
▽ More
The partial information decomposition (PID) is perhaps the leading proposal for resolving information shared between a set of sources and a target into redundant, synergistic, and unique constituents. Unfortunately, the PID framework has been hindered by a lack of a generally agreed-upon, multivariate method of quantifying the constituents. Here, we take a step toward rectifying this by developing a decomposition based on a new method that quantifies unique information. We first develop a broadly applicable method---the dependency decomposition---that delineates how statistical dependencies influence the structure of a joint distribution. The dependency decomposition then allows us to define a measure of the information about a target that can be uniquely attributed to a particular source as the least amount which the source-target statistical dependency can influence the information shared between the sources and the target. The result is the first measure that satisfies the core axioms of the PID framework while not satisfying the Blackwell relation, which depends on a particular interpretation of how the variables are related. This makes a key step forward to a practical PID.
△ Less
Submitted 27 October, 2018; v1 submitted 19 September, 2017;
originally announced September 2017.
-
Above and Beyond the Landauer Bound: Thermodynamics of Modularity
Authors:
Alexander B. Boyd,
Dibyendu Mandal,
James P. Crutchfield
Abstract:
Information processing typically occurs via the composition of modular units, such as universal logic gates. The benefit of modular information processing, in contrast to globally integrated information processing, is that complex global computations are more easily and flexibly implemented via a series of simpler, localized information processing operations which only control and change local deg…
▽ More
Information processing typically occurs via the composition of modular units, such as universal logic gates. The benefit of modular information processing, in contrast to globally integrated information processing, is that complex global computations are more easily and flexibly implemented via a series of simpler, localized information processing operations which only control and change local degrees of freedom. We show that, despite these benefits, there are unavoidable thermodynamic costs to modularity---costs that arise directly from the operation of localized processing and that go beyond Landauer's dissipation bound for erasing information. Integrated computations can achieve Landauer's bound, however, when they globally coordinate the control of all of an information reservoir's degrees of freedom. Unfortunately, global correlations among the information-bearing degrees of freedom are easily lost by modular implementations. This is costly since such correlations are a thermodynamic fuel. We quantify the minimum irretrievable dissipation of modular computations in terms of the difference between the change in global nonequilibrium free energy, which captures these global correlations, and the local (marginal) change in nonequilibrium free energy, which bounds modular work production. This modularity dissipation is proportional to the amount of additional work required to perform the computational task modularly. It has immediate consequences for physically embedded transducers, known as information ratchets. We show how to circumvent modularity dissipation by designing internal ratchet states that capture the global correlations and patterns in the ratchet's information reservoir. Designed in this way, information ratchets match the optimum thermodynamic efficiency of globally integrated computations.
△ Less
Submitted 9 August, 2017;
originally announced August 2017.
-
Prediction and Generation of Binary Markov Processes: Can a Finite-State Fox Catch a Markov Mouse?
Authors:
J. Ruebeck,
R. G. James,
J. R. Mahoney,
J. P. Crutchfield
Abstract:
Understanding the generative mechanism of a natural system is a vital component of the scientific method. Here, we investigate one of the fundamental steps toward this goal by presenting the minimal generator of an arbitrary binary Markov process. This is a class of processes whose predictive model is well known. Surprisingly, the generative model requires three distinct topologies for different r…
▽ More
Understanding the generative mechanism of a natural system is a vital component of the scientific method. Here, we investigate one of the fundamental steps toward this goal by presenting the minimal generator of an arbitrary binary Markov process. This is a class of processes whose predictive model is well known. Surprisingly, the generative model requires three distinct topologies for different regions of parameter space. We show that a previously proposed generator for a particular set of binary Markov processes is, in fact, not minimal. Our results shed the first quantitative light on the relative (minimal) costs of prediction and generation. We find, for instance, that the difference between prediction and generation is maximized when the process is approximately independently, identically distributed.
△ Less
Submitted 31 July, 2017;
originally announced August 2017.
-
Extreme Quantum Advantage for Rare-Event Sampling
Authors:
C. Aghamohammadi,
S. P. Loomis,
J. R. Mahoney,
J. P. Crutchfield
Abstract:
We introduce a quantum algorithm for efficient biased sampling of the rare events generated by classical memoryful stochastic processes. We show that this quantum algorithm gives an extreme advantage over known classical biased sampling algorithms in terms of the memory resources required. The quantum memory advantage ranges from polynomial to exponential and when sampling the rare equilibrium con…
▽ More
We introduce a quantum algorithm for efficient biased sampling of the rare events generated by classical memoryful stochastic processes. We show that this quantum algorithm gives an extreme advantage over known classical biased sampling algorithms in terms of the memory resources required. The quantum memory advantage ranges from polynomial to exponential and when sampling the rare equilibrium configurations of spin systems the quantum advantage diverges.
△ Less
Submitted 29 July, 2017;
originally announced July 2017.
-
Prediction and Power in Molecular Sensors: Uncertainty and Dissipation When Conditionally Markovian Channels Are Driven by Semi-Markov Environments
Authors:
Sarah E. Marzen,
James P. Crutchfield
Abstract:
Sensors often serve at least two purposes: predicting their input and minimizing dissipated heat. However, determining whether or not a particular sensor is evolved or designed to be accurate and efficient is difficult. This arises partly from the functional constraints being at cross purposes and partly since quantifying the predictive performance of even in silico sensors can require prohibitive…
▽ More
Sensors often serve at least two purposes: predicting their input and minimizing dissipated heat. However, determining whether or not a particular sensor is evolved or designed to be accurate and efficient is difficult. This arises partly from the functional constraints being at cross purposes and partly since quantifying the predictive performance of even in silico sensors can require prohibitively long simulations. To circumvent these difficulties, we develop expressions for the predictive accuracy and thermodynamic costs of the broad class of conditionally Markovian sensors subject to unifilar hidden semi-Markov (memoryful) environmental inputs. Predictive metrics include the instantaneous memory and the mutual information between present sensor state and input future, while dissipative metrics include power consumption and the nonpredictive information rate. Success in deriving these formulae relies heavily on identifying the environment's causal states, the input's minimal sufficient statistics for prediction. Using these formulae, we study the simplest nontrivial biological sensor model---that of a Hill molecule, characterized by the number of ligands that bind simultaneously, the sensor's cooperativity. When energetic rewards are proportional to total predictable information, the closest cooperativity that optimizes the total energy budget generally depends on the environment's past hysteretically. In this way, the sensor gains robustness to environmental fluctuations. Given the simplicity of the Hill molecule, such hysteresis will likely be found in more complex predictive sensors as well. That is, adaptations that only locally optimize biochemical parameters for prediction and dissipation can lead to sensors that "remember" the past environment.
△ Less
Submitted 12 July, 2017;
originally announced July 2017.
-
Spectral Simplicity of Apparent Complexity, Part II: Exact Complexities and Complexity Spectra
Authors:
Paul M. Riechers,
James P. Crutchfield
Abstract:
The meromorphic functional calculus developed in Part I overcomes the nondiagonalizability of linear operators that arises often in the temporal evolution of complex systems and is generic to the metadynamics of predicting their behavior. Using the resulting spectral decomposition, we derive closed-form expressions for correlation functions, finite-length Shannon entropy-rate approximates, asympto…
▽ More
The meromorphic functional calculus developed in Part I overcomes the nondiagonalizability of linear operators that arises often in the temporal evolution of complex systems and is generic to the metadynamics of predicting their behavior. Using the resulting spectral decomposition, we derive closed-form expressions for correlation functions, finite-length Shannon entropy-rate approximates, asymptotic entropy rate, excess entropy, transient information, transient and asymptotic state uncertainty, and synchronization information of stochastic processes generated by finite-state hidden Markov models. This introduces analytical tractability to investigating information processing in discrete-event stochastic processes, symbolic dynamics, and chaotic dynamical systems. Comparisons reveal mathematical similarities between complexity measures originally thought to capture distinct informational and computational properties. We also introduce a new kind of spectral analysis via coronal spectrograms and the frequency-dependent spectra of past-future mutual information. We analyze a number of examples to illustrate the methods, emphasizing processes with multivariate dependencies beyond pairwise correlation. An appendix presents spectral decomposition calculations for one example in full detail.
△ Less
Submitted 2 June, 2017;
originally announced June 2017.
-
Spectral Simplicity of Apparent Complexity, Part I: The Nondiagonalizable Metadynamics of Prediction
Authors:
Paul M. Riechers,
James P. Crutchfield
Abstract:
Virtually all questions that one can ask about the behavioral and structural complexity of a stochastic process reduce to a linear algebraic framing of a time evolution governed by an appropriate hidden-Markov process generator. Each type of question---correlation, predictability, predictive cost, observer synchronization, and the like---induces a distinct generator class. Answers are then functio…
▽ More
Virtually all questions that one can ask about the behavioral and structural complexity of a stochastic process reduce to a linear algebraic framing of a time evolution governed by an appropriate hidden-Markov process generator. Each type of question---correlation, predictability, predictive cost, observer synchronization, and the like---induces a distinct generator class. Answers are then functions of the class-appropriate transition dynamic. Unfortunately, these dynamics are generically nonnormal, nondiagonalizable, singular, and so on. Tractably analyzing these dynamics relies on adapting the recently introduced meromorphic functional calculus, which specifies the spectral decomposition of functions of nondiagonalizable linear operators, even when the function poles and zeros coincide with the operator's spectrum. Along the way, we establish special properties of the projection operators that demonstrate how they capture the organization of subprocesses within a complex system. Circumventing the spurious infinities of alternative calculi, this leads in the sequel, Part II, to the first closed-form expressions for complexity measures, couched either in terms of the Drazin inverse (negative-one power of a singular operator) or the eigenvalues and projection operators of the appropriate transition dynamic.
△ Less
Submitted 22 May, 2017;
originally announced May 2017.
-
Structure and Randomness of Continuous-Time Discrete-Event Processes
Authors:
S. E. Marzen,
J. P. Crutchfield
Abstract:
Loosely speaking, the Shannon entropy rate is used to gauge a stochastic process' intrinsic randomness; the statistical complexity gives the cost of predicting the process. We calculate, for the first time, the entropy rate and statistical complexity of stochastic processes generated by finite unifilar hidden semi-Markov models---memoryful, state-dependent versions of renewal processes. Calculatin…
▽ More
Loosely speaking, the Shannon entropy rate is used to gauge a stochastic process' intrinsic randomness; the statistical complexity gives the cost of predicting the process. We calculate, for the first time, the entropy rate and statistical complexity of stochastic processes generated by finite unifilar hidden semi-Markov models---memoryful, state-dependent versions of renewal processes. Calculating these quantities requires introducing novel mathematical objects (ε-machines of hidden semi-Markov processes) and new information-theoretic methods to stochastic processes.
△ Less
Submitted 15 April, 2017;
originally announced April 2017.
-
Nearly Maximally Predictive Features and Their Dimensions
Authors:
Sarah E. Marzen,
James P. Crutchfield
Abstract:
Scientific explanation often requires inferring maximally predictive features from a given data set. Unfortunately, the collection of minimal maximally predictive features for most stochastic processes is uncountably infinite. In such cases, one compromises and instead seeks nearly maximally predictive features. Here, we derive upper-bounds on the rates at which the number and the coding cost of n…
▽ More
Scientific explanation often requires inferring maximally predictive features from a given data set. Unfortunately, the collection of minimal maximally predictive features for most stochastic processes is uncountably infinite. In such cases, one compromises and instead seeks nearly maximally predictive features. Here, we derive upper-bounds on the rates at which the number and the coding cost of nearly maximally predictive features scales with desired predictive power. The rates are determined by the fractal dimensions of a process' mixed-state distribution. These results, in turn, show how widely-used finite-order Markov models can fail as predictors and that mixed-state predictive features offer a substantial improvement.
△ Less
Submitted 27 February, 2017;
originally announced February 2017.