Active search for Bifurcations

Authors: Yorgos M. Psarellis, Themistoklis P. Sapsis, Ioannis G. Kevrekidis

Abstract: Bifurcations mark qualitative changes of long-term behavior in dynamical systems and can often signal sudden ("hard") transitions or catastrophic events (divergences). Accurately locating them is critical not just for deeper understanding of observed dynamic behavior, but also for designing efficient interventions. When the dynamical system at hand is complex, possibly noisy, and expensive to sample, standard (e.g. continuation based) numerical methods may become impractical. We propose an active learning framework, where Bayesian Optimization is leveraged to discover saddle-node or Hopf bifurcations, from a judiciously chosen small number of vector field observations. Such an approach becomes especially attractive in systems whose state × parameter space exploration is resource-limited. It also naturally provides a framework for uncertainty quantification (aleatoric and epistemic), useful in systems with inherent stochasticity.

Submitted 16 June, 2024; originally announced June 2024.

Comments: 27 pages, 7 figures

MSC Class: 37M20

arXiv:2406.04519 [pdf, other]

Multifidelity digital twin for real-time monitoring of structural dynamics in aquaculture net cages

Authors: Eirini Katsidoniotaki, Biao Su, Eleni Kelasidi, Themistoklis P. Sapsis

Abstract: As the global population grows and climate change intensifies, sustainable food production is critical. Marine aquaculture offers a viable solution, providing a sustainable protein source. However, the industry's expansion requires novel technologies for remote management and autonomous operations. Digital twin technology can advance the aquaculture industry, but its adoption has been limited. Fish net cages, which are flexible floating structures, are critical yet vulnerable components of aquaculture farms. Exposed to harsh and dynamic marine environments, the cages experience significant loads and risk damage, leading to fish escapes, environmental impacts, and financial losses. We propose a multifidelity surrogate modeling framework for integration into a digital twin for real-time monitoring of aquaculture net cage structural dynamics under stochastic marine conditions. Central to this framework is the nonlinear autoregressive Gaussian process method, which learns complex, nonlinear cross-correlations between models of varying fidelity. It combines low-fidelity simulation data with a small set of high-fidelity field sensor measurements, which offer the real dynamics but are costly and spatially sparse. Validated at the SINTEF ACE fish farm in Norway, our digital twin receives online metocean data and accurately predicts net cage displacements and mooring line loads, aligning closely with field measurements. The proposed framework is beneficial where application-specific data are scarce, offering rapid predictions and real-time system representation.
The developed digital twin prevents potential damages by assessing structural integrity and facilitates remote operations with unmanned underwater vehicles. Our work also compares GP and GCNs for predicting net cage deformation, highlighting the latter's effectiveness in complex structural applications.

Submitted 10 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

arXiv:2306.15159 [pdf, other]

Evaluation of machine learning architectures on the quantification of epistemic and aleatoric uncertainties in complex dynamical systems

Authors: Stephen Guth, Alireza Mojahed, Themistoklis P. Sapsis

Abstract: Machine learning methods for the construction of data-driven reduced order models are used in an increasing variety of engineering domains, especially as a supplement to expensive computational fluid dynamics for design problems. An important check on the reliability of surrogate models is Uncertainty Quantification (UQ), a self-assessed estimate of the model error. Accurate UQ allows for cost savings by reducing both the required size of training data sets and the required safety factors, while poor UQ prevents users from confidently relying on model predictions. We examine several machine learning techniques, including both Gaussian processes and a family of UQ-augmented neural networks: Ensemble neural networks (ENN), Bayesian neural networks (BNN), Dropout neural networks (D-NN), and Gaussian neural networks (G-NN). We evaluate UQ accuracy (distinct from model accuracy) using two metrics: the distribution of normalized residuals on validation data, and the distribution of estimated uncertainties. We apply these metrics to two model data sets, representative of complex dynamical systems: an ocean engineering problem in which a ship traverses irregular wave episodes, and a dispersive wave turbulence system with extreme events, the Majda-McLaughlin-Tabak model. We present conclusions concerning model architecture and hyperparameter tuning.

Submitted 26 June, 2023; originally announced June 2023.

Comments: Submitted for publication to "Computer Methods in Applied Mechanics and Engineering." 25 pages, 20 figures. arXiv admin note: text overlap with arXiv:1505.05424 by other authors

arXiv:2209.04744 [pdf, other]

Active Learning for Optimal Intervention Design in Causal Models

Authors: Jiaqi Zhang, Louis Cammarata, Chandler Squires, Themistoklis P. Sapsis, Caroline Uhler

Abstract: Sequential experimental design to discover interventions that achieve a desired outcome is a key problem in various domains including science, engineering and public policy. When the space of possible interventions is large, making an exhaustive search infeasible, experimental design strategies are needed. In this context, encoding the causal relationships between the variables, and thus the effect of interventions on the system, is critical for identifying desirable interventions more efficiently. Here, we develop a causal active learning strategy to identify interventions that are optimal, as measured by the discrepancy between the post-interventional mean of the distribution and a desired target mean. The approach employs a Bayesian update for the causal model and prioritizes interventions using a carefully designed, causally informed acquisition function. This acquisition function is evaluated in closed form, allowing for fast optimization. The resulting algorithms are theoretically grounded with information-theoretic bounds and provable consistency results for linear causal models with known causal graph. We apply our approach to both synthetic data and single-cell transcriptomic data from Perturb-CITE-seq experiments to identify optimal perturbations that induce a specific cell state transition. The causally informed acquisition function generally outperforms existing criteria allowing for optimal intervention design with fewer but carefully selected samples.

Submitted 16 August, 2023; v1 submitted 10 September, 2022; originally announced September 2022.

arXiv:2208.13080 [pdf, other]

Information FOMO: The unhealthy fear of missing out on information. A method for removing misleading data for healthier models

Authors: Ethan Pickering, Themistoklis P. Sapsis

Abstract: Misleading or unnecessary data can have out-sized impacts on the health or accuracy of Machine Learning (ML) models. We present a Bayesian sequential selection method, akin to Bayesian experimental design, that identifies critically important information within a dataset, while ignoring data that is either misleading or brings unnecessary complexity to the surrogate model of choice. Our method improves sample-wise error convergence and eliminates instances where more data leads to worse performance and instabilities of the surrogate model, often termed sample-wise "double descent". We find these instabilities are a result of the complexity of the underlying map and linked to extreme events and heavy tails. Our approach has two key features. First, the selection algorithm dynamically couples the chosen model and data. Data is chosen based on its merits towards improving the selected model, rather than being compared strictly against other data. Second, a natural convergence of the method removes the need for dividing the data into training, testing, and validation sets. Instead, the selection metric inherently assesses testing and validation error through global statistics of the model. This ensures that key information is never wasted in testing or validation. The method is applied using both Gaussian process regression and deep neural network surrogate models.

Submitted 7 July, 2024; v1 submitted 27 August, 2022; originally announced August 2022.

Comments: 13 pages, 6 figures, Submitted

arXiv:2204.02488 [pdf, other]

Discovering and forecasting extreme events via active learning in neural operators

Authors: Ethan Pickering, Stephen Guth, George Em Karniadakis, Themistoklis P. Sapsis

Abstract: Extreme events in society and nature, such as pandemic spikes, rogue waves, or structural failures, can have catastrophic consequences. Characterizing extremes is difficult as they occur rarely, arise from seemingly benign conditions, and belong to complex and often unknown infinite-dimensional systems. Such challenges render attempts at characterizing them moot. We address each of these difficulties by combining novel training schemes in Bayesian experimental design (BED) with an ensemble of deep neural operators (DNOs). This model-agnostic framework pairs a BED scheme that actively selects data for quantifying extreme events with an ensemble of DNOs that approximate infinite-dimensional nonlinear operators. We find that not only does this framework clearly beat Gaussian processes (GPs) but that 1) shallow ensembles of just two members perform best; 2) extremes are uncovered regardless of the state of initial data (i.e. with or without extremes); 3) our method eliminates "double-descent" phenomena; 4) the use of batches of suboptimal acquisition points compared to step-by-step global optima does not hinder BED performance; and 5) Monte Carlo acquisition outperforms standard optimizers in high dimensions. Together these conclusions form the foundation of an AI-assisted experimental infrastructure that can efficiently infer and pinpoint critical situations across many domains, from physical to societal systems.

Submitted 20 September, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

Comments: 25 pages, 8 figures, Submitted to Nature Computational Science

arXiv:2203.04515 [pdf, other]

Structure and Distribution Metric for Quantifying the Quality of Uncertainty: Assessing Gaussian Processes, Deep Neural Nets, and Deep Neural Operators for Regression

Authors: Ethan Pickering, Themistoklis P. Sapsis

Abstract: We propose two bounded comparison metrics that may be implemented to arbitrary dimensions in regression tasks. One quantifies the structure of uncertainty and the other quantifies the distribution of uncertainty. The structure metric assesses the similarity in shape and location of uncertainty with the true error, while the distribution metric quantifies the supported magnitudes between the two. We apply these metrics to Gaussian Processes (GPs), Ensemble Deep Neural Nets (DNNs), and Ensemble Deep Neural Operators (DNOs) on high-dimensional and nonlinear test cases. We find that comparing a model's uncertainty estimates with the model's squared error provides a compelling ground truth assessment. We also observe that both DNNs and DNOs, especially when compared to GPs, provide encouraging metric values in high dimensions with either sparse or plentiful data.

Submitted 8 March, 2022; originally announced March 2022.

Comments: 9 pages, 6 figures

arXiv:2110.01374 [pdf, other]

doi 10.1098/rsta.2021.0209

Hybrid quadrature moment method for accurate and stable representation of non-Gaussian processes and their dynamics

Authors: Alexis-Tzianni Charalampopoulos, Spencer H. Bryngelson, Tim Colonius, Themistoklis P. Sapsis

Abstract: Solving the population balance equation (PBE) for the dynamics of a dispersed phase coupled to a continuous fluid is expensive. Still, one can reduce the cost by representing the evolving particle density function in terms of its moments. In particular, quadrature-based moment methods (QBMMs) invert these moments with a quadrature rule, approximating the required statistics. QBMMs have been shown to accurately model sprays and soot with a relatively compact set of moments. However, significantly non-Gaussian processes such as bubble dynamics lead to numerical instabilities when extending their moment sets accordingly. We solve this problem by training a recurrent neural network (RNN) that adjusts the QBMM quadrature to evaluate unclosed moments with higher accuracy. The proposed method is tested on a simple model of bubbles oscillating in response to a temporally fluctuating pressure field. The approach decreases model-form error by a factor of 10 when compared to traditional QBMMs. It is both numerically stable and computationally efficient since it does not expand the baseline moment set. Additional quadrature points are also assessed, optimally placed and weighted according to an additional RNN. These points further decrease the error at low cost since the moment set is again unchanged.

Submitted 15 September, 2021; originally announced October 2021.

Journal ref: Philosophical Transactions of the Royal Society A, 380 (2229), 2022

arXiv:2005.08741 [pdf, other]

doi 10.1016/j.physd.2021.132843

Sparse Methods for Automatic Relevance Determination

Authors: Samuel H. Rudy, Themistoklis P. Sapsis

Abstract: This work considers methods for imposing sparsity in Bayesian regression with applications in nonlinear system identification. We first review automatic relevance determination (ARD) and analytically demonstrate the need for additional regularization or thresholding to achieve sparse models. We then discuss two classes of methods, regularization based and thresholding based, which build on ARD to learn parsimonious solutions to linear problems. In the case of orthogonal covariates, we analytically demonstrate favorable performance with regard to learning a small set of active terms in a linear system with a sparse solution. Several example problems are presented to compare the set of proposed methods in terms of advantages and limitations to ARD in bases with hundreds of elements. The aim of this paper is to analyze and understand the assumptions that lead to several algorithms and to provide theoretical and empirical results so that the reader may gain insight and make more informed choices regarding sparse Bayesian regression.

Submitted 18 May, 2020; originally announced May 2020.

arXiv:1910.05266 [pdf, other]

Backpropagation Algorithms and Reservoir Computing in Recurrent Neural Networks for the Forecasting of Complex Spatiotemporal Dynamics

Authors: Pantelis R. Vlachas, Jaideep Pathak, Brian R. Hunt, Themistoklis P. Sapsis, Michelle Girvan, Edward Ott, Petros Koumoutsakos

Abstract: We examine the efficiency of Recurrent Neural Networks in forecasting the spatiotemporal dynamics of high dimensional and reduced order complex systems using Reservoir Computing (RC) and Backpropagation through time (BPTT) for gated network architectures. We highlight advantages and limitations of each method and discuss their implementation for parallel computing architectures. We quantify the relative prediction accuracy of these algorithms for the long-term forecasting of chaotic systems using as benchmarks the Lorenz-96 and the Kuramoto-Sivashinsky (KS) equations. We find that, when the full state dynamics are available for training, RC outperforms BPTT approaches in terms of predictive performance and in capturing the long-term statistics, while at the same time requiring much less training time. However, in the case of reduced order data, large scale RC models can be unstable and more likely than the BPTT algorithms to diverge. In contrast, RNNs trained via BPTT show superior forecasting abilities and capture well the dynamics of reduced order systems. Furthermore, the present study quantifies for the first time the Lyapunov Spectrum of the KS equation with BPTT, achieving similar accuracy as RC. This study establishes that RNNs are a potent computational framework for the learning and forecasting of complex spatiotemporal systems.

Submitted 17 February, 2020; v1 submitted 9 October, 2019; originally announced October 2019.

Comments: 41 pages, submitted to Elsevier Journal of Neural Networks (accepted)

arXiv:1907.10413 [pdf, other]

doi 10.1063/1.5120830

Learning the Tangent Space of Dynamical Instabilities from Data

Authors: Antoine Blanchard, Themistoklis P. Sapsis

Abstract: For a large class of dynamical systems, the optimally time-dependent (OTD) modes, a set of deformable orthonormal tangent vectors that track directions of instabilities along any trajectory, are known to depend "pointwise" on the state of the system on the attractor, and not on the history of the trajectory. We leverage the power of neural networks to learn this "pointwise" mapping from phase space to OTD space directly from data. The result of the learning process is a cartography of directions associated with strongest instabilities in phase space. Implications for data-driven prediction and control of dynamical instabilities are discussed.

Submitted 8 November, 2019; v1 submitted 24 July, 2019; originally announced July 2019.

arXiv:1907.07552 [pdf, other]

doi 10.1098/rspa.2019.0834

Output-weighted optimal sampling for Bayesian regression and rare event statistics using few samples

Authors: Themistoklis P. Sapsis

Abstract: For many important problems the quantity of interest is an unknown function of the parameters, which is a random vector with known statistics. Since the dependence of the output on this random vector is unknown, the challenge is to identify its statistics, using the minimum number of function evaluations. This problem can be seen in the context of active learning or optimal experimental design. We employ Bayesian regression to represent the derived model uncertainty due to a finite and small number of input-output pairs. In this context we evaluate existing methods for optimal sample selection, such as model error minimization and mutual information maximization. We show that for the case of known output variance, the commonly employed criteria in the literature do not take into account the output values of the existing input-output pairs, while for the case of unknown output variance this dependence can be very weak. We introduce a criterion that takes into account the values of the output for the existing samples and adaptively selects inputs from regions of the parameter space which have important contribution to the output. The new method allows for application to high-dimensional inputs, paving the way for optimal experimental design in high dimensions.

Submitted 30 November, 2019; v1 submitted 17 July, 2019; originally announced July 2019.

Comments: 34 pages; 13 figures

arXiv:1804.07240 [pdf, other]

doi 10.1073/pnas.1813263115

A sequential sampling strategy for extreme event statistics in nonlinear dynamical systems

Authors: Mustafa A. Mohamad, Themistoklis P. Sapsis

Abstract: We develop a method for the evaluation of extreme event statistics associated with nonlinear dynamical systems, using a small number of samples. From an initial dataset of design points, we formulate a sequential strategy that provides the 'next-best' data point (set of parameters) that when evaluated results in improved estimates of the probability density function (pdf) for a scalar quantity of interest. The approach utilizes Gaussian process regression to perform Bayesian inference on the parameter-to-observation map describing the quantity of interest. We then approximate the desired pdf along with uncertainty bounds utilizing the posterior distribution of the inferred map. The 'next-best' design point is sequentially determined through an optimization procedure that selects the point in parameter space that maximally reduces uncertainty between the estimated bounds of the pdf prediction. Since the optimization process utilizes only information from the inferred map it has minimal computational cost. Moreover, the special form of the metric emphasizes the tails of the pdf. The method is practical for systems where the dimensionality of the parameter space is of moderate size, i.e. order O(10). We apply the method to estimate the extreme event statistics for a very high-dimensional system with millions of degrees of freedom: an offshore platform subjected to three-dimensional irregular waves. It is demonstrated that the developed approach can accurately determine the extreme event statistics using a limited number of samples.

Submitted 19 April, 2018; originally announced April 2018.

arXiv:1802.07486 [pdf, other]

doi 10.1098/rspa.2017.0844

Data-Driven Forecasting of High-Dimensional Chaotic Systems with Long Short-Term Memory Networks

Authors: Pantelis R. Vlachas, Wonmin Byeon, Zhong Y. Wan, Themistoklis P. Sapsis, Petros Koumoutsakos

Abstract: We introduce a data-driven forecasting method for high-dimensional chaotic systems using long short-term memory (LSTM) recurrent neural networks. The proposed LSTM neural networks perform inference of high-dimensional dynamical systems in their reduced order space and are shown to be an effective set of nonlinear approximators of their attractor. We demonstrate the forecasting performance of the LSTM and compare it with Gaussian processes (GPs) in time series obtained from the Lorenz 96 system, the Kuramoto-Sivashinsky equation and a prototype climate model. The LSTM networks outperform the GPs in short-term forecasting accuracy in all applications considered. A hybrid architecture, extending the LSTM with a mean stochastic model (MSM-LSTM), is proposed to ensure convergence to the invariant measure. This novel hybrid method is fully data-driven and extends the forecasting capabilities of LSTM networks.

Submitted 19 September, 2019; v1 submitted 21 February, 2018; originally announced February 2018.

Comments: 31 pages

arXiv:1706.00676 [pdf, other]

Extreme events and their optimal mitigation in nonlinear structural systems excited by stochastic loads: Application to ocean engineering systems

Authors: Han Kyul Joo, Mustafa A. Mohamad, Themistoklis P. Sapsis

Abstract: We develop an efficient numerical method for the probabilistic quantification of the response statistics of nonlinear multi-degree-of-freedom structural systems under extreme forcing events, emphasizing accurate heavy-tail statistics. The response is decomposed into a statistically stationary part and an intermittent component. The stationary part is quantified using a statistical linearization method while the intermittent part, associated with extreme transient responses, is quantified through either i) a few carefully selected simulations or ii) the use of effective measures (effective stiffness and damping). The developed approach is able to accurately capture the extreme response statistics orders of magnitude faster compared with direct methods. The scheme is applied to the design and optimization of small attachments that can mitigate and suppress extreme forcing events delivered to a primary structural system. Specifically, we consider the problem of suppression of extreme responses in two prototype ocean engineering systems. First, we consider linear and cubic springs and perform parametric optimization by minimizing the fourth-order moments of the response. We then consider a more generic, possibly asymmetric, piecewise linear spring and optimize its nonlinear characteristics. The resulting asymmetric spring design far outperforms the optimal cubic energy sink and the linear tuned mass dampers.

Submitted 1 June, 2017; originally announced June 2017.

Showing 1–15 of 15 results for author: Sapsis, T P