-
Fast Iterative Solver For Neural Network Method: II. 1D Diffusion-Reaction Problems And Data Fitting
Authors:
Zhiqiang Cai,
Anastassia Doktorova,
Robert D. Falgout,
César Herrera
Abstract:
This paper expands the damped block Newton (dBN) method introduced recently in [4] for 1D diffusion-reaction equations and least-squares data fitting problems. To determine the linear parameters (the weights and bias of the output layer) of the neural network (NN), the dBN method requires solving systems of linear equations involving the mass matrix. While the mass matrix for local hat basis functions is tri-diagonal and well-conditioned, the mass matrix for NNs is dense and ill-conditioned. For example, the condition number of the NN mass matrix for quasi-uniform meshes is at least ${\cal O}(n^4)$. We present a factorization of the mass matrix that enables solving the systems of linear equations in ${\cal O}(n)$ operations. To determine the non-linear parameters (the weights and bias of the hidden layer), one step of a damped Newton method is employed at each iteration. A Gauss-Newton method is used in place of Newton for the instances in which the Hessian matrices are singular. This modified dBN is referred to as dBGN. For both methods, the computational cost per iteration is ${\cal O}(n)$. Numerical results demonstrate the ability of dBN and dBGN to efficiently achieve accurate results and outperform BFGS for select examples.
Submitted 1 July, 2024;
originally announced July 2024.
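The abstract above contrasts the tri-diagonal hat-function mass matrix with the dense NN mass matrix. As a minimal sketch of why the tri-diagonal case costs only ${\cal O}(n)$ operations, the following Python code applies the standard Thomas algorithm to the hat-function mass matrix on a uniform mesh (diagonal entries $2h/3$, off-diagonal entries $h/6$). This is not the factorization proposed in the paper for the NN mass matrix; the function names `solve_tridiagonal` and `hat_mass_matrix` and the uniform mesh are assumptions made for illustration.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tri-diagonal system with the Thomas algorithm in O(n) operations.

    lower[i] is the entry A[i+1][i], upper[i] is the entry A[i][i+1];
    both have length n-1, while diag and rhs have length n.
    No pivoting is done, so the matrix should be diagonally dominant.
    """
    n = len(diag)
    c_prime = [0.0] * n  # modified super-diagonal
    d_prime = [0.0] * n  # modified right-hand side
    if n > 1:
        c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    # Forward elimination: one pass over the rows.
    for i in range(1, n):
        denom = diag[i] - lower[i - 1] * c_prime[i - 1]
        if i < n - 1:
            c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i - 1] * d_prime[i - 1]) / denom
    # Back substitution: one pass in reverse.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def hat_mass_matrix(n, h):
    """Bands of the mass matrix for n interior hat functions (assumed uniform mesh size h)."""
    return [h / 6.0] * (n - 1), [2.0 * h / 3.0] * n, [h / 6.0] * (n - 1)


if __name__ == "__main__":
    n = 8
    h = 1.0 / (n + 1)
    lower, diag, upper = hat_mass_matrix(n, h)
    print(solve_tridiagonal(lower, diag, upper, [h] * n))
```

Both loops touch each row once, which is where the linear cost comes from. A dense, ill-conditioned matrix admits no such shortcut without additional structure.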
-
Fast Iterative Solver For Neural Network Method: I. 1D Diffusion Problems
Authors:
Zhiqiang Cai,
Anastassia Doktorova,
Robert D. Falgout,
César Herrera
Abstract:
The discretization of the deep Ritz method [18] for the Poisson equation leads to a high-dimensional non-convex minimization problem that is difficult and expensive to solve numerically. In this paper, we consider the shallow Ritz approximation to one-dimensional diffusion problems and introduce an effective and efficient iterative method, a damped block Newton (dBN) method, for solving the resulting non-convex minimization problem.
The method employs the block Gauss-Seidel method as an outer iteration by dividing the parameters of a shallow neural network into the linear parameters (the weights and bias of the output layer) and the non-linear parameters (the weights and bias of the hidden layer). In each outer iteration, the linear and the non-linear parameters are updated by exact inversion and one step of a damped Newton method, respectively. Inverses of the coefficient matrix and the Hessian matrix are tridiagonal and diagonal, respectively, and hence the cost of each dBN iteration is $\mathcal{O}(n)$. To move the breakpoints (the non-linear parameters) more efficiently, we propose an adaptive damped block Newton (AdBN) method by combining the dBN with the adaptive neuron enhancement (ANE) method [25]. Numerical examples demonstrate the ability of dBN and AdBN not only to move the breakpoints quickly and efficiently but also to achieve a nearly optimal order of convergence for AdBN. These iterative solvers are capable of outperforming BFGS for select examples.
Submitted 26 April, 2024;
originally announced April 2024.
-
Parallel-in-time solution of scalar nonlinear conservation laws
Authors:
H. De Sterck,
R. D. Falgout,
O. A. Krzysik,
J. B. Schroder
Abstract:
We consider the parallel-in-time solution of scalar nonlinear conservation laws in one spatial dimension. The equations are discretized in space with a conservative finite-volume method using weighted essentially non-oscillatory (WENO) reconstructions, and in time with high-order explicit Runge-Kutta methods. The solution of the global, discretized space-time problem is sought via a nonlinear iteration that uses a novel linearization strategy in cases of non-differentiable equations. Under certain choices of discretization and algorithmic parameters, the nonlinear iteration coincides with Newton's method, although, more generally, it is a preconditioned residual correction scheme. At each nonlinear iteration, the linearized problem takes the form of a certain discretization of a linear conservation law over the space-time domain in question. An approximate parallel-in-time solution of the linearized problem is computed with a single multigrid reduction-in-time (MGRIT) iteration. The MGRIT iteration employs a novel coarse-grid operator that is a modified conservative semi-Lagrangian discretization and generalizes those we have developed previously for non-conservative scalar linear hyperbolic problems. Numerical tests are performed for the inviscid Burgers and Buckley--Leverett equations. For many test problems, the solver converges in just a handful of iterations with convergence rate independent of mesh resolution, including problems with (interacting) shocks and rarefactions.
Submitted 10 January, 2024;
originally announced January 2024.
-
Efficient multigrid reduction-in-time for method-of-lines discretizations of linear advection
Authors:
H. De Sterck,
R. D. Falgout,
O. A. Krzysik,
J. B. Schroder
Abstract:
Parallel-in-time methods for partial differential equations (PDEs) have been the subject of intense development over recent decades, particularly for diffusion-dominated problems. It has been widely reported in the literature, however, that many of these methods perform quite poorly for advection-dominated problems. Here we analyze the particular iterative parallel-in-time algorithm of multigrid reduction-in-time (MGRIT) for discretizations of constant-wave-speed linear advection problems. We focus on common method-of-lines discretizations that employ upwind finite differences in space and Runge-Kutta methods in time. Using a convergence framework we developed in previous work, we prove for a subclass of these discretizations that, if using the standard approach of rediscretizing the fine-grid problem on the coarse grid, robust MGRIT convergence with respect to CFL number and coarsening factor is not possible. This poor convergence and non-robustness is caused, at least in part, by an inadequate coarse-grid correction for smooth Fourier modes known as characteristic components. We propose an alternative coarse-grid operator that provides a better correction of these modes. This coarse-grid operator is related to previous work and uses a semi-Lagrangian discretization combined with an implicitly treated truncation error correction. Theory and numerical experiments show the coarse-grid operator yields fast MGRIT convergence for many of the method-of-lines discretizations considered, including for both implicit and explicit discretizations of high order. Parallel results demonstrate substantial speed-up over sequential time-stepping.
Submitted 20 March, 2023; v1 submitted 14 September, 2022;
originally announced September 2022.
-
Multigrid Reduction in Time for Chaotic Dynamical Systems
Authors:
David A. Vargas,
Robert D. Falgout,
Stefanie Günther,
Jacob B. Schroder
Abstract:
As CPU clock speeds have stagnated and high performance computers continue to have ever higher core counts, increased parallelism is needed to take advantage of these new architectures. Traditional serial time-marching schemes can be a significant bottleneck, as many types of simulations require large numbers of time-steps which must be computed sequentially. Parallel in Time schemes, such as the Multigrid Reduction in Time (MGRIT) method, remedy this by parallelizing across time-steps, and have shown promising results for parabolic problems. However, chaotic problems have proved more difficult, since chaotic initial value problems (IVPs) are inherently ill-conditioned. MGRIT relies on a hierarchy of successively coarser time-grids to iteratively correct the solution on the finest time-grid, but due to the nature of chaotic systems, small inaccuracies on the coarser levels can be greatly magnified and lead to poor coarse-grid corrections. Here we introduce a modified MGRIT algorithm based on an existing quadratically converging nonlinear extension to the multigrid Full Approximation Scheme (FAS), as well as a novel time-coarsening scheme. Together, these approaches better capture long-term chaotic behavior on coarse-grids and greatly improve convergence of MGRIT for chaotic IVPs. Further, we introduce a novel low memory variant of the algorithm for solving chaotic PDEs with MGRIT which not only solves the IVP, but also provides estimates for the unstable Lyapunov vectors of the system. We provide supporting numerical results for the Lorenz system and demonstrate parallel speedup for the chaotic Kuramoto-Sivashinsky partial differential equation over a significantly longer time-domain than in previous works.
Submitted 26 August, 2022;
originally announced August 2022.
-
A New Semi-Structured Algebraic Multigrid Method
Authors:
Victor A. Paludetto Magri,
Robert D. Falgout,
Ulrike M. Yang
Abstract:
Multigrid methods are well suited to large massively parallel computer architectures because they are mathematically optimal and display excellent parallelization properties. Since current architecture trends are favoring regular compute patterns to achieve high performance, the ability to express structure has become much more important. The hypre software library provides high-performance multigrid preconditioners and solvers through conceptual interfaces, including a semi-structured interface that describes matrices primarily in terms of stencils and logically structured grids. This paper presents a new semi-structured algebraic multigrid (SSAMG) method built on this interface. The numerical convergence and performance of a CPU implementation of this method are evaluated for a set of semi-structured problems. SSAMG achieves significantly better setup times than hypre's unstructured AMG solvers and comparable convergence. In addition, the new method is capable of solving more complex problems than hypre's structured solvers.
Submitted 27 May, 2022;
originally announced May 2022.
-
Fast multigrid reduction-in-time for advection via modified semi-Lagrangian coarse-grid operators
Authors:
H. De Sterck,
R. D. Falgout,
O. A. Krzysik
Abstract:
Many iterative parallel-in-time algorithms have been shown to be highly efficient for diffusion-dominated partial differential equations (PDEs), but are inefficient or even divergent when applied to advection-dominated PDEs. We consider the application of the multigrid reduction-in-time (MGRIT) algorithm to linear advection PDEs. The key to efficient time integration with this method is using a coarse-grid operator that provides a sufficiently accurate approximation to the so-called ideal coarse-grid operator. For certain classes of semi-Lagrangian discretizations, we present a novel semi-Lagrangian-based coarse-grid operator that leads to fast and scalable multilevel time integration of linear advection PDEs. The coarse-grid operator is composed of a semi-Lagrangian discretization followed by a correction term, with the correction designed so that the leading-order truncation error of the composite operator is approximately equal to that of the ideal coarse-grid operator. Parallel results show substantial speed-ups over sequential time integration for variable-wave-speed advection problems in one and two spatial dimensions, and using high-order discretizations up to order five. The proposed approach establishes the first practical method that provides small and scalable MGRIT iteration counts for advection problems.
Submitted 22 April, 2022; v1 submitted 24 March, 2022;
originally announced March 2022.
-
Toward Parallel in Time for Chaotic Dynamical Systems
Authors:
David A. Vargas,
Robert D. Falgout,
Stefanie Günther,
Jacob B. Schroder
Abstract:
As CPU clock speeds have stagnated, and high performance computers continue to have ever higher core counts, increased parallelism is needed to take advantage of these new architectures. Traditional serial time-marching schemes are a significant bottleneck, as many types of simulations require large numbers of time-steps which must be computed sequentially. Parallel in Time schemes, such as the Multigrid Reduction in Time (MGRIT) method, remedy this by parallelizing across time-steps, and have shown promising results for parabolic problems. However, chaotic problems have proved more difficult, since chaotic initial value problems are inherently ill-conditioned. MGRIT relies on a hierarchy of successively coarser time-grids to iteratively correct the solution on the finest time-grid, but due to the nature of chaotic systems, subtle inaccuracies on the coarser levels can lead to poor coarse-grid corrections. Here we propose a modification to nonlinear FAS multigrid, as well as a novel time-coarsening scheme, which together better capture long term behavior on coarse grids and greatly improve convergence of MGRIT for chaotic initial value problems. We provide supporting numerical results for the Lorenz system model problem.
Submitted 25 January, 2022;
originally announced January 2022.
-
Optimizing multigrid reduction-in-time (MGRIT) and Parareal coarse-grid operators for linear advection
Authors:
Hans De Sterck,
Robert D. Falgout,
Stephanie Friedhoff,
Oliver A. Krzysik,
Scott P. MacLachlan
Abstract:
Parallel-in-time methods, such as multigrid reduction-in-time (MGRIT) and Parareal, provide an attractive option for increasing concurrency when simulating time-dependent PDEs in modern high-performance computing environments. While these techniques have been very successful for parabolic equations, it has often been observed that their performance suffers dramatically when applied to advection-dominated problems or purely hyperbolic PDEs using standard rediscretization approaches on coarse grids. In this paper, we apply MGRIT or Parareal to the constant-coefficient linear advection equation, appealing to existing convergence theory to provide insight into the typically non-scalable or even divergent behavior of these solvers for this problem. To overcome these failings, we replace rediscretization on coarse grids with improved coarse-grid operators that are computed by applying optimization techniques to approximately minimize error estimates from the convergence theory. One of our main findings is that, in order to obtain fast convergence as for parabolic problems, coarse-grid operators should take into account the behavior of the hyperbolic problem by tracking the characteristic curves. Our approach is tested for schemes of various orders using explicit or implicit Runge-Kutta methods combined with upwind-finite-difference spatial discretizations. In all cases, we obtain scalable convergence in just a handful of iterations, with parallel tests also showing significant speed-ups over sequential time-stepping. Our insight of tracking characteristics on coarse grids provides a key idea for solving the long-standing problem of efficient parallel-in-time integration for hyperbolic PDEs.
Submitted 2 March, 2021; v1 submitted 8 October, 2019;
originally announced October 2019.
-
Parallel Performance of Algebraic Multigrid Domain Decomposition (AMG-DD)
Authors:
Wayne B. Mitchell,
Robert Strzodka,
Robert D. Falgout
Abstract:
Algebraic multigrid (AMG) is a widely used scalable solver and preconditioner for large-scale linear systems resulting from the discretization of a wide class of elliptic PDEs. While AMG has optimal computational complexity, the cost of communication has become a significant bottleneck that limits its scalability as processor counts continue to grow on modern machines. This paper examines the design, implementation, and parallel performance of a novel algorithm, Algebraic Multigrid Domain Decomposition (AMG-DD), designed specifically to limit communication. The goal of AMG-DD is to provide a low-communication alternative to standard AMG V-cycles by trading some additional computational overhead for a significant reduction in communication cost. Numerical results show that AMG-DD achieves superior accuracy per communication cost compared to AMG, and speedup over AMG is demonstrated on a large GPU cluster.
Submitted 21 January, 2020; v1 submitted 25 June, 2019;
originally announced June 2019.
-
Multilevel convergence analysis of multigrid-reduction-in-time
Authors:
Andreas Hessenthaler,
Ben S. Southworth,
David Nordsletten,
Oliver Röhrle,
Robert D. Falgout,
Jacob B. Schroder
Abstract:
This paper presents a multilevel convergence framework for multigrid-reduction-in-time (MGRIT) as a generalization of previous two-grid estimates. The framework provides a priori upper bounds on the convergence of MGRIT V- and F-cycles, with different relaxation schemes, by deriving the respective residual and error propagation operators. The residual and error operators are functions of the time stepping operator, analyzed directly and bounded in norm, both numerically and analytically. We present various upper bounds of different computational cost and varying sharpness. These upper bounds are complemented by proposing analytic formulae for the approximate convergence factor of V-cycle algorithms that take the number of fine grid time points, the temporal coarsening factors, and the eigenvalues of the time stepping operator as parameters.
The paper concludes with supporting numerical investigations of parabolic (anisotropic diffusion) and hyperbolic (wave equation) model problems. We assess the sharpness of the bounds and the quality of the approximate convergence factors. Observations from these numerical investigations demonstrate the value of the proposed multilevel convergence framework for estimating MGRIT convergence a priori and for the design of a convergent algorithm. We further highlight that observations in the literature are captured by the theory, including that two-level Parareal and multilevel MGRIT with F-relaxation do not yield scalable algorithms and the benefit of a stronger relaxation scheme. An important observation is that with increasing numbers of levels MGRIT convergence deteriorates for the hyperbolic model problem, while constant convergence factors can be achieved for the diffusion equation. The theory also indicates that L-stable Runge-Kutta schemes are more amenable to multilevel parallel-in-time integration with MGRIT than A-stable Runge-Kutta schemes.
Submitted 4 June, 2019; v1 submitted 30 December, 2018;
originally announced December 2018.
-
A Discretization-Accurate Stopping Criterion for Iterative Solvers for Finite Element Approximation
Authors:
Zhiqiang Cai,
Shuhao Cao,
Robert D. Falgout
Abstract:
This paper introduces a discretization-accurate stopping criterion for symmetric iterative methods for solving systems of algebraic equations resulting from the finite element approximation. The stopping criterion consists of the evaluations of the discretization and the algebraic error estimators, which are based on the respective duality error estimator and the difference of two consecutive iterates. Iterations are terminated when the algebraic estimator is of the same magnitude as the discretization estimator. Numerical results for multigrid $V(1,1)$-cycle and symmetric Gauss-Seidel iterative methods are presented for the linear finite element approximation to the Poisson equation. A large reduction in computational cost is observed compared to the standard residual-based stopping criterion.
Submitted 18 September, 2019; v1 submitted 27 November, 2016;
originally announced November 2016.
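The stopping rule described in the abstract above (terminate once the algebraic estimator is of the same magnitude as the discretization estimator) can be sketched in a few lines. The code below is a rough illustration under strong assumptions, not the paper's method: the algebraic estimator is taken to be the max-norm difference of two consecutive iterates, the discretization estimate is passed in as a plain number rather than computed from a duality error estimator, and the model problem is a 1D finite-difference Poisson system solved by symmetric Gauss-Seidel. All function and parameter names are hypothetical.

```python
def symmetric_gauss_seidel_step(u, f, h):
    """One forward and one backward Gauss-Seidel sweep for
    (2*u[i] - u[i-1] - u[i+1]) / h**2 = f[i] with zero boundary values."""
    u = list(u)
    n = len(u)
    for order in (range(n), range(n - 1, -1, -1)):
        for i in order:
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            u[i] = 0.5 * (h * h * f[i] + left + right)
    return u


def iterate_until_discretization_accurate(step, u0, disc_estimate, max_iter=10000):
    """Iterate until the difference of consecutive iterates (an assumed
    algebraic estimator) drops to the given discretization estimate.

    Returns the last iterate and the number of iterations performed.
    """
    u = list(u0)
    for k in range(1, max_iter + 1):
        u_new = step(u)
        alg_estimate = max(abs(a - b) for a, b in zip(u_new, u))
        u = u_new
        if alg_estimate <= disc_estimate:
            return u, k
    return u, max_iter


if __name__ == "__main__":
    n = 15
    h = 1.0 / (n + 1)
    f = [1.0] * n
    step = lambda u: symmetric_gauss_seidel_step(u, f, h)
    # Placeholder discretization estimate (assumed, not computed from the solution).
    _, k_matched = iterate_until_discretization_accurate(step, [0.0] * n, 1e-3)
    _, k_tight = iterate_until_discretization_accurate(step, [0.0] * n, 1e-8)
    print(k_matched, k_tight)
```

Comparing the two iteration counts shows the intended saving: iterating far below the discretization error level costs extra sweeps without improving the overall accuracy of the approximation.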