-
Dynamical Measure Transport and Neural PDE Solvers for Sampling
Authors:
Jingtong Sun,
Julius Berner,
Lorenz Richter,
Marius Zeinhofer,
Johannes Müller,
Kamyar Azizzadenesheli,
Anima Anandkumar
Abstract:
The task of sampling from a probability density can be approached as transporting a tractable density function to the target, known as dynamical measure transport. In this work, we tackle it through a principled unified framework using deterministic or stochastic evolutions described by partial differential equations (PDEs). This framework incorporates prior trajectory-based sampling methods, such…
▽ More
The task of sampling from a probability density can be approached as transporting a tractable density function to the target, known as dynamical measure transport. In this work, we tackle it through a principled unified framework using deterministic or stochastic evolutions described by partial differential equations (PDEs). This framework incorporates prior trajectory-based sampling methods, such as diffusion models or Schrödinger bridges, without relying on the concept of time-reversals. Moreover, it allows us to propose novel numerical methods for solving the transport task and thus sampling from complicated targets without the need for the normalization constant or data samples. We employ physics-informed neural networks (PINNs) to approximate the respective PDE solutions, implying both conceptional and computational advantages. In particular, PINNs allow for simulation- and discretization-free optimization and can be trained very efficiently, leading to significantly better mode coverage in the sampling task compared to alternative methods. Moreover, they can readily be fine-tuned with Gauss-Newton methods to achieve high accuracy in sampling.
△ Less
Submitted 10 July, 2024;
originally announced July 2024.
-
Bergman Kernels of the Cheng-Yau metrics on quasi-projective manifolds
Authors:
Jingzhoun Sun
Abstract:
We show the asymptotics of the Bergman kernel function near the smooth divisor at infinity of the Cheng-Yau metric on quasi-projective manifolds. In particular, we show that there is a quantum phenomenon for the points very close to the divisor at infinity.
We show the asymptotics of the Bergman kernel function near the smooth divisor at infinity of the Cheng-Yau metric on quasi-projective manifolds. In particular, we show that there is a quantum phenomenon for the points very close to the divisor at infinity.
△ Less
Submitted 10 July, 2024;
originally announced July 2024.
-
Kolmogorov Arnold Informed neural network: A physics-informed deep learning framework for solving PDEs based on Kolmogorov Arnold Networks
Authors:
Yizheng Wang,
Jia Sun,
Jinshuai Bai,
Cosmin Anitescu,
Mohammad Sadegh Eshaghi,
Xiaoying Zhuang,
Timon Rabczuk,
Yinghua Liu
Abstract:
AI for partial differential equations (PDEs) has garnered significant attention, particularly with the emergence of Physics-informed neural networks (PINNs). The recent advent of Kolmogorov-Arnold Network (KAN) indicates that there is potential to revisit and enhance the previously MLP-based PINNs. Compared to MLPs, KANs offer interpretability and require fewer parameters. PDEs can be described in…
▽ More
AI for partial differential equations (PDEs) has garnered significant attention, particularly with the emergence of Physics-informed neural networks (PINNs). The recent advent of Kolmogorov-Arnold Network (KAN) indicates that there is potential to revisit and enhance the previously MLP-based PINNs. Compared to MLPs, KANs offer interpretability and require fewer parameters. PDEs can be described in various forms, such as strong form, energy form, and inverse form. While mathematically equivalent, these forms are not computationally equivalent, making the exploration of different PDE formulations significant in computational physics. Thus, we propose different PDE forms based on KAN instead of MLP, termed Kolmogorov-Arnold-Informed Neural Network (KINN). We systematically compare MLP and KAN in various numerical examples of PDEs, including multi-scale, singularity, stress concentration, nonlinear hyperelasticity, heterogeneous, and complex geometry problems. Our results demonstrate that KINN significantly outperforms MLP in terms of accuracy and convergence speed for numerous PDEs in computational solid mechanics, except for the complex geometry problem. This highlights KINN's potential for more efficient and accurate PDE solutions in AI for PDEs.
△ Less
Submitted 16 June, 2024;
originally announced June 2024.
-
Global-in-time energy stability: a powerful analysis tool for the gradient flow problem without maximum principle or Lipschitz assumption
Authors:
J. Sun,
H. Wang,
H. Zhang,
X. Qian,
S. Song
Abstract:
Before proving (unconditional) energy stability for gradient flows, most existing studies either require a strong Lipschitz condition regarding the non-linearity or certain $L^{\infty}$ bounds on the numerical solutions (the maximum principle). However, proving energy stability without such premises is a very challenging task. In this paper, we aim to develop a novel analytical tool, namely global…
▽ More
Before proving (unconditional) energy stability for gradient flows, most existing studies either require a strong Lipschitz condition regarding the non-linearity or certain $L^{\infty}$ bounds on the numerical solutions (the maximum principle). However, proving energy stability without such premises is a very challenging task. In this paper, we aim to develop a novel analytical tool, namely global-in-time energy stability, to demonstrate energy dissipation without assuming any strong Lipschitz condition or $L^{\infty}$ boundedness. The fourth-order-in-space Swift-Hohenberg equation is used to elucidate the theoretical results in detail. We also propose a temporal second-order accurate scheme for efficiently solving such a strongly stiff equation. Furthermore, we present the corresponding optimal $L^2$ error estimate and provide several numerical simulations to demonstrate the dynamics.
△ Less
Submitted 12 June, 2024;
originally announced June 2024.
-
Long-Time Behavior of Zero-Sum Linear-Quadratic Stochastic Differential Games
Authors:
Jingrui Sun,
Jiongmin Yong
Abstract:
The paper investigates the long-time behavior of zero-sum linear-quadratic stochastic differential games, aiming to demonstrate that, under appropriate conditions, both the saddle strategy and the optimal state process exhibit the exponential turnpike property. Namely, for the majority of the time horizon, the distributions of the saddle strategy and the optimal state process closely stay near cer…
▽ More
The paper investigates the long-time behavior of zero-sum linear-quadratic stochastic differential games, aiming to demonstrate that, under appropriate conditions, both the saddle strategy and the optimal state process exhibit the exponential turnpike property. Namely, for the majority of the time horizon, the distributions of the saddle strategy and the optimal state process closely stay near certain (time-invariant) distributions $ν_1^*$, $ν_2^*$ and $μ^*$, respectively. Additionally, as a byproduct, we solve the infinite horizon version of the differential game and derive closed-loop representations for its open-loop saddle strategy, which has not been proved in the literature.
△ Less
Submitted 4 June, 2024;
originally announced June 2024.
-
A Ramsey-type theorem on deficiency
Authors:
Jin Sun,
Xinmin Hou
Abstract:
Ramsey's Theorem states that a graph $G$ has bounded order if and only if $G$ contains no complete graph $K_n$ or empty graph $E_n$ as its induced subgraph. The Gyárfás-Sumner conjecture says that a graph $G$ has bounded chromatic number if and only if it contains no induced subgraph isomorphic to $K_n$ or a tree $T$. The deficiency of a graph is the number of vertices that cannot be covered by a…
▽ More
Ramsey's Theorem states that a graph $G$ has bounded order if and only if $G$ contains no complete graph $K_n$ or empty graph $E_n$ as its induced subgraph. The Gyárfás-Sumner conjecture says that a graph $G$ has bounded chromatic number if and only if it contains no induced subgraph isomorphic to $K_n$ or a tree $T$. The deficiency of a graph is the number of vertices that cannot be covered by a maximum matching. In this paper, we prove a Ramsey type theorem for deficiency, i.e., we characterize all the forbidden induced subgraphs for graphs $G$ with bounded deficiency. As an application, we answer a question proposed by Fujita, Kawarabayashi, Lucchesi, Ota, Plummer and Saito (JCTB, 2006).
△ Less
Submitted 3 June, 2024;
originally announced June 2024.
-
Online Control in Population Dynamics
Authors:
Noah Golowich,
Elad Hazan,
Zhou Lu,
Dhruv Rohatgi,
Y. Jennifer Sun
Abstract:
The study of population dynamics originated with early sociological works but has since extended into many fields, including biology, epidemiology, evolutionary game theory, and economics. Most studies on population dynamics focus on the problem of prediction rather than control. Existing mathematical models for control in population dynamics are often restricted to specific, noise-free dynamics,…
▽ More
The study of population dynamics originated with early sociological works but has since extended into many fields, including biology, epidemiology, evolutionary game theory, and economics. Most studies on population dynamics focus on the problem of prediction rather than control. Existing mathematical models for control in population dynamics are often restricted to specific, noise-free dynamics, while real-world population changes can be complex and adversarial.
To address this gap, we propose a new framework based on the paradigm of online control. We first characterize a set of linear dynamical systems that can naturally model evolving populations. We then give an efficient gradient-based controller for these systems, with near-optimal regret bounds with respect to a broad class of linear policies. Our empirical evaluations demonstrate the effectiveness of the proposed algorithm for control in population dynamics even for non-linear models such as SIR and replicator dynamics.
△ Less
Submitted 6 June, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
Convergence of the Deep Galerkin Method for Mean Field Control Problems
Authors:
William Hofgard,
Jingruo Sun,
Asaf Cohen
Abstract:
We establish the convergence of the deep Galerkin method (DGM), a deep learning-based scheme for solving high-dimensional nonlinear PDEs, for Hamilton-Jacobi-Bellman (HJB) equations that arise from the study of mean field control problems (MFCPs). Based on a recent characterization of the value function of the MFCP as the unique viscosity solution of an HJB equation on the simplex, we establish bo…
▽ More
We establish the convergence of the deep Galerkin method (DGM), a deep learning-based scheme for solving high-dimensional nonlinear PDEs, for Hamilton-Jacobi-Bellman (HJB) equations that arise from the study of mean field control problems (MFCPs). Based on a recent characterization of the value function of the MFCP as the unique viscosity solution of an HJB equation on the simplex, we establish both an existence and convergence result for the DGM. First, we show that the loss functional of the DGM can be made arbitrarily small given that the value function of the MFCP possesses sufficient regularity. Then, we show that if the loss functional of the DGM converges to zero, the corresponding neural network approximators must converge uniformly to the true value function on the simplex. We also provide numerical experiments demonstrating the DGM's ability to generalize to high-dimensional HJB equations.
△ Less
Submitted 22 May, 2024;
originally announced May 2024.
-
Sweedler duality for Hom-algebras and Hom-modules
Authors:
Jiacheng Sun,
Shuanhong Wang,
Chi Zhang,
Haoran Zhu
Abstract:
The construction of Sweedler duality is an important tool in the theory of Hopf algebras over a field, which is a right adjoint to the dual algebra functor. In this paper, we study the Sweedler duality of Hom-algebras and their Hom-modules. We delve into the structure of Hom-coalgebras and derive the linear morphisms associated with them. Additionally, as an application, we present the (right) Hom…
▽ More
The construction of Sweedler duality is an important tool in the theory of Hopf algebras over a field, which is a right adjoint to the dual algebra functor. In this paper, we study the Sweedler duality of Hom-algebras and their Hom-modules. We delve into the structure of Hom-coalgebras and derive the linear morphisms associated with them. Additionally, as an application, we present the (right) Hom-(co)module morphisms under the Sweedler duality.
△ Less
Submitted 20 May, 2024;
originally announced May 2024.
-
An infinite dimensional balanced embedding problem III: Asymptotics near infinity
Authors:
Jingzhou Sun
Abstract:
We continue our study on the logarithmic balanced model metric initiated in our previous work. By a non-trivial refinement of the set of tools developed in our previous work, we are able to confirm partially a conjecture we made in our previous work on the asymptotic behavior of the balanced metric near infinity.
We continue our study on the logarithmic balanced model metric initiated in our previous work. By a non-trivial refinement of the set of tools developed in our previous work, we are able to confirm partially a conjecture we made in our previous work on the asymptotic behavior of the balanced metric near infinity.
△ Less
Submitted 14 May, 2024;
originally announced May 2024.
-
Existence and dynamical behaviour of vectorial standing waves with prescribed mass for Hartree-Fock type systems
Authors:
Shuai Yao,
Juntao Sun,
Tsung-fang Wu
Abstract:
In this paper, we investigate vectorial standing waves with prescribed mass for the Hartree-Fock type system (HF system) with the double coupled feature. Such system is viewed as an approximation of the Coulomb system with two particles appeared in quantum mechanics. By exploring the interaction of the double coupled terms, we prove the exis?tence/nonexistence and symmetry of vectorial energy grou…
▽ More
In this paper, we investigate vectorial standing waves with prescribed mass for the Hartree-Fock type system (HF system) with the double coupled feature. Such system is viewed as an approximation of the Coulomb system with two particles appeared in quantum mechanics. By exploring the interaction of the double coupled terms, we prove the exis?tence/nonexistence and symmetry of vectorial energy ground states for the corresponding stationary problem. Furthermore, we obtain the relation between vectorial energy ground states and vectorial action ground states in some cases. Finally, we establish conditions for global well-posedness and finite time blow-up to HF system with the initial data, and prove orbital stability/strong instability of standing waves.
△ Less
Submitted 7 May, 2024;
originally announced May 2024.
-
Exploring the Robustness of In-Context Learning with Noisy Labels
Authors:
Chen Cheng,
Xinzhi Yu,
Haodong Wen,
Jingsong Sun,
Guanzhang Yue,
Yihao Zhang,
Zeming Wei
Abstract:
Recently, the mysterious In-Context Learning (ICL) ability exhibited by Transformer architectures, especially in large language models (LLMs), has sparked significant research interest. However, the resilience of Transformers' in-context learning capabilities in the presence of noisy samples, prevalent in both training corpora and prompt demonstrations, remains underexplored. In this paper, inspir…
▽ More
Recently, the mysterious In-Context Learning (ICL) ability exhibited by Transformer architectures, especially in large language models (LLMs), has sparked significant research interest. However, the resilience of Transformers' in-context learning capabilities in the presence of noisy samples, prevalent in both training corpora and prompt demonstrations, remains underexplored. In this paper, inspired by prior research that studies ICL ability using simple function classes, we take a closer look at this problem by investigating the robustness of Transformers against noisy labels. Specifically, we first conduct a thorough evaluation and analysis of the robustness of Transformers against noisy labels during in-context learning and show that they exhibit notable resilience against diverse types of noise in demonstration labels. Furthermore, we delve deeper into this problem by exploring whether introducing noise into the training set, akin to a form of data augmentation, enhances such robustness during inference, and find that such noise can indeed improve the robustness of ICL. Overall, our fruitful analysis and findings provide a comprehensive understanding of the resilience of Transformer models against label noises during ICL and provide valuable insights into the research on Transformers in natural language processing. Our code is available at https://github.com/InezYu0928/in-context-learning.
△ Less
Submitted 1 May, 2024; v1 submitted 28 April, 2024;
originally announced April 2024.
-
Towards General Conceptual Model Editing via Adversarial Representation Engineering
Authors:
Yihao Zhang,
Zeming Wei,
Jun Sun,
Meng Sun
Abstract:
Since the development of Large Language Models (LLMs) has achieved remarkable success, understanding and controlling their internal complex mechanisms has become an urgent problem. Recent research has attempted to interpret their behaviors through the lens of inner representation. However, developing practical and efficient methods for applying these representations for general and flexible model…
▽ More
Since the development of Large Language Models (LLMs) has achieved remarkable success, understanding and controlling their internal complex mechanisms has become an urgent problem. Recent research has attempted to interpret their behaviors through the lens of inner representation. However, developing practical and efficient methods for applying these representations for general and flexible model editing remains challenging. In this work, we explore how to use representation engineering methods to guide the editing of LLMs by deploying a representation sensor as an oracle. We first identify the importance of a robust and reliable sensor during editing, then propose an Adversarial Representation Engineering (ARE) framework to provide a unified and interpretable approach for conceptual model editing without compromising baseline performance. Experiments on multiple model editing paradigms demonstrate the effectiveness of ARE in various settings. Code and data are available at https://github.com/Zhang-Yihao/Adversarial-Representation-Engineering.
△ Less
Submitted 23 May, 2024; v1 submitted 21 April, 2024;
originally announced April 2024.
-
Discrete non-commutative hungry Toda lattice and its application in matrix computation
Authors:
Zheng Wang,
Shi-Hao Li,
Kang-Ya Lu,
Jian-Qing Sun
Abstract:
In this paper, we plan to show an eigenvalue algorithm for block Hessenberg matrices by using the idea of non-commutative integrable systems and matrix-valued orthogonal polynomials. We introduce adjacent families of matrix-valued $θ$-deformed bi-orthogonal polynomials, and derive corresponding discrete non-commutative hungry Toda lattice from discrete spectral transformations for polynomials. It…
▽ More
In this paper, we plan to show an eigenvalue algorithm for block Hessenberg matrices by using the idea of non-commutative integrable systems and matrix-valued orthogonal polynomials. We introduce adjacent families of matrix-valued $θ$-deformed bi-orthogonal polynomials, and derive corresponding discrete non-commutative hungry Toda lattice from discrete spectral transformations for polynomials. It is shown that this discrete system can be used as a pre-precessing algorithm for block Hessenberg matrices. Besides, some convergence analysis and numerical examples of this algorithm are presented.
△ Less
Submitted 20 April, 2024;
originally announced April 2024.
-
Towards a Foundation Model for Partial Differential Equations: Multi-Operator Learning and Extrapolation
Authors:
Jingmin Sun,
Yuxuan Liu,
Zecheng Zhang,
Hayden Schaeffer
Abstract:
Foundation models, such as large language models, have demonstrated success in addressing various language and image processing tasks. In this work, we introduce a multi-modal foundation model for scientific problems, named PROSE-PDE. Our model, designed for bi-modality to bi-modality learning, is a multi-operator learning approach which can predict future states of spatiotemporal systems while co…
▽ More
Foundation models, such as large language models, have demonstrated success in addressing various language and image processing tasks. In this work, we introduce a multi-modal foundation model for scientific problems, named PROSE-PDE. Our model, designed for bi-modality to bi-modality learning, is a multi-operator learning approach which can predict future states of spatiotemporal systems while concurrently learning the underlying governing equations of the physical system. Specifically, we focus on multi-operator learning by training distinct one-dimensional time-dependent nonlinear constant coefficient partial differential equations, with potential applications to many physical applications including physics, geology, and biology. More importantly, we provide three extrapolation studies to demonstrate that PROSE-PDE can generalize physical features through the robust training of multiple operators and that the proposed model can extrapolate to predict PDE solutions whose models or data were unseen during the training. Furthermore, we show through systematic numerical experiments that the utilization of the symbolic modality in our model effectively resolves the well-posedness problems with training multiple operators and thus enhances our model's predictive capabilities.
△ Less
Submitted 19 April, 2024; v1 submitted 18 April, 2024;
originally announced April 2024.
-
Analysis of a finite element DtN method for scattering resonances of sound hard obstacles
Authors:
Yingxia Xi,
Bo Gong,
Jiguang Sun
Abstract:
Scattering resonances have important applications in many areas of science and engineering. They are the replacement of discrete spectral data for problems on non-compact domains. In this paper, we consider the computation of scattering resonances defined on the exterior to a compact sound hard obstacle. The resonances are the eigenvalues of a holomorphic Fredholm operator function. We truncate th…
▽ More
Scattering resonances have important applications in many areas of science and engineering. They are the replacement of discrete spectral data for problems on non-compact domains. In this paper, we consider the computation of scattering resonances defined on the exterior to a compact sound hard obstacle. The resonances are the eigenvalues of a holomorphic Fredholm operator function. We truncate the unbounded domain and impose the Dirichlet-to-Neumann (DtN) mapping. The problem is then discretized using the linear Lagrange element. Convergence of the resonances is proved using the abstract approximation theory for holomorphic Fredholm operator functions. The discretization leads to nonlinear algebraic eigenvalue problems, which are solved by the recently developed parallel spectral indicator methods. Numerical examples are presented for validation.
△ Less
Submitted 14 April, 2024;
originally announced April 2024.
-
A finite element contour integral method for computing the resonances of metallic grating structures with subwavelength holes
Authors:
Yingxia Xi,
Junshan Lin,
Jiguang Sun
Abstract:
We consider the numerical computation of resonances for metallic grating structures with dispersive media and small slit holes. The underlying eigenvalue problem is nonlinear and the mathematical model is multiscale due to the existence of several length scales in problem geometry and material contrast. We discretize the partial differential equation model over the truncated domain using the finit…
▽ More
We consider the numerical computation of resonances for metallic grating structures with dispersive media and small slit holes. The underlying eigenvalue problem is nonlinear and the mathematical model is multiscale due to the existence of several length scales in problem geometry and material contrast. We discretize the partial differential equation model over the truncated domain using the finite element method and develop a multi-step contour integral eigensolver to compute the resonances. The eigensolver first locates eigenvalues using a spectral indicator and then computes eigenvalues by a subspace projection scheme. The proposed numerical method is robust and scalable, and does not require initial guess as the iteration methods. Numerical examples are presented to demonstrate its effectiveness.
△ Less
Submitted 7 March, 2024;
originally announced March 2024.
-
On virtual resolutions of points in a product of projective spaces
Authors:
Isidora Bailly-Hall,
Christine Berkesch,
Karina Dovgodko,
Sean Guan,
Saisudharshan Sivakumar,
Jishi Sun
Abstract:
For finite sets of points in $\mathbb{P}^n \times \mathbb{P}^m$, we produce short virtual resolutions, as introduced by Berkesch--Erman--Smith. We first intersect with a sufficiently high power of one set of variables for points in $\mathbb{P}^n \times \mathbb{P}^m$ to produce a virtual resolution of length $n+m$. Then, we describe an explicit virtual resolution of length 3 for a set of points in…
▽ More
For finite sets of points in $\mathbb{P}^n \times \mathbb{P}^m$, we produce short virtual resolutions, as introduced by Berkesch--Erman--Smith. We first intersect with a sufficiently high power of one set of variables for points in $\mathbb{P}^n \times \mathbb{P}^m$ to produce a virtual resolution of length $n+m$. Then, we describe an explicit virtual resolution of length 3 for a set of points in sufficiently general position in $\mathbb{P}^1 \times \mathbb{P}^2$, via a subcomplex of a free resolution. This first result generalizes to $\mathbb{P}^n \times \mathbb{P}^m$ work of Harada--Nowroozi--Van Tuyl, and the second partially generalizes work of Harada--Nowroozi--Van Tuyl and Booms-Peot, which were both for $\mathbb{P}^1 \times \mathbb{P}^1$. Along the way, we also note an explicit relationship between Betti numbers and higher difference matrices of bigraded Hilbert functions for $\mathbb{P}^n \times \mathbb{P}^m$.
△ Less
Submitted 19 February, 2024;
originally announced February 2024.
-
On an optimal problem of bilinear forms
Authors:
Naihuan Jing,
Yibo Liu,
Jiacheng Sun,
Chengrui Zhao,
Haoran Zhu
Abstract:
We study an optimization problem originated from the Grothendieck constant. A generalized normal equation is proposed and analyzed. We establish a correspondence between solutions of the general normal equation and its dual equation. Explicit solutions are described for the two-dimensional case.
We study an optimization problem originated from the Grothendieck constant. A generalized normal equation is proposed and analyzed. We establish a correspondence between solutions of the general normal equation and its dual equation. Explicit solutions are described for the two-dimensional case.
△ Less
Submitted 13 February, 2024;
originally announced February 2024.
-
A high-order multi-time-step scheme for bond-based peridynamics
Authors:
Chenguang Liu,
Jie Sun,
Hao Tian,
WaiSun Don,
Lili Ju
Abstract:
A high-order multi-time-step (MTS) scheme for the bond-based peridynamic (PD) model, an extension of classical continuous mechanics widely used for analyzing discontinuous problems like cracks, is proposed. The MTS scheme discretizes the spatial domain with a meshfree method and advances in time with a high-order Runge-Kutta method. To effectively handle discontinuities (cracks) that appear in a l…
▽ More
A high-order multi-time-step (MTS) scheme for the bond-based peridynamic (PD) model, an extension of classical continuous mechanics widely used for analyzing discontinuous problems like cracks, is proposed. The MTS scheme discretizes the spatial domain with a meshfree method and advances in time with a high-order Runge-Kutta method. To effectively handle discontinuities (cracks) that appear in a local subdomain in the solution, the scheme employs the Taylor expansion and Lagrange interpolation polynomials with a finer time step size, that is, coarse and fine time step sizes for smooth and discontinuous subdomains, respectively, to achieve accurate and efficient simulations. By eliminating unnecessary fine-scale resolution imposed on the entire domain, the MTS scheme outperforms the standard PD scheme by significantly reducing computational costs, particularly for problems with discontinuous solutions, as demonstrated by comprehensive theoretical analysis and numerical experiments.
△ Less
Submitted 10 January, 2024;
originally announced January 2024.
-
A New Parallel Cooperative Landscape Smoothing Algorithm and Its Applications on TSP and UBQP
Authors:
Wei Wang,
Jialong Shi,
Jianyong Sun,
Arnaud Liefooghe,
Qingfu Zhang
Abstract:
Combinatorial optimization problem (COP) is difficult to solve because of the massive number of local optimal solutions in his solution space. Various methods have been put forward to smooth the solution space of COPs, including homotopic convex (HC) transformation for the traveling salesman problem (TSP). This paper extends the HC transformation approach to unconstrained binary quadratic programm…
▽ More
Combinatorial optimization problem (COP) is difficult to solve because of the massive number of local optimal solutions in his solution space. Various methods have been put forward to smooth the solution space of COPs, including homotopic convex (HC) transformation for the traveling salesman problem (TSP). This paper extends the HC transformation approach to unconstrained binary quadratic programming (UBQP) by proposing a method to construct a unimodal toy UBQP of any size. We theoretically prove the unimodality of the constructed toy UBQP. After that, we apply this unimodal toy UBQP to smooth the original UBQP by using the HC transformation framework and empirically verify the smoothing effects. Subsequently, we introduce an iterative algorithmic framework incorporating HC transformation, referred as landscape smoothing iterated local search (LSILS). Our experimental analyses, conducted on various UBQP instances show the effectiveness of LSILS. Furthermore, this paper proposes a parallel cooperative variant of LSILS, denoted as PC-LSILS and apply it to both the UBQP and the TSP. Our experimental findings highlight that PC-LSILS improves the smoothing performance of the HC transformation, and further improves the overall performance of the algorithm.
△ Less
Submitted 1 July, 2024; v1 submitted 6 January, 2024;
originally announced January 2024.
-
Quadratic and cubic Lagrange finite elements for mixed Laplace eigenvalue problems on criss-cross meshes
Authors:
Kaibo Hu,
Jiguang Sun,
Qian Zhang
Abstract:
In [6], it was shown that the linear Lagrange element space on criss-cross meshes and its divergence exhibit spurious eigenvalues when applied in the mixed formulation of the Laplace eigenvalue problem, despite satisfying both the inf-sup condition and ellipticity on the discrete kernel. The lack of a Fortin interpolation is responsible for the spurious eigenvalues produced by the linear Lagrange…
▽ More
In [6], it was shown that the linear Lagrange element space on criss-cross meshes and its divergence exhibit spurious eigenvalues when applied in the mixed formulation of the Laplace eigenvalue problem, despite satisfying both the inf-sup condition and ellipticity on the discrete kernel. The lack of a Fortin interpolation is responsible for the spurious eigenvalues produced by the linear Lagrange space. In contrast, results in [8] confirm that quartic and higher-order Lagrange elements do not yield spurious eigenvalues on general meshes without nearly singular vertices, including criss-cross meshes as a special case. In this paper, we investigate quadratic and cubic Lagrange elements on criss-cross meshes. We prove the convergence of discrete eigenvalues by fitting the Lagrange elements on criss-cross meshes into a complex and constructing a Fortin interpolation. As a by-product, we construct bounded commuting projections for the finite element Stokes complex, which induces isomorphisms between cohomologies of the continuous and discrete complexes. We provide numerical examples to validate the theoretical results.
△ Less
Submitted 20 December, 2023;
originally announced December 2023.
-
Parallel Multi-Step Contour Integral Methods for Nonlinear Eigenvalue Problems
Authors:
Yingxia Xi,
Jiguang Sun
Abstract:
We consider nonlinear eigenvalue problems to compute all eigenvalues in a bounded region on the complex plane. Based on domain decomposition and contour integrals, two robust and scalable parallel multi-step methods are proposed. The first method 1) uses the spectral indicator method to find eigenvalues and 2) calls a linear eigensolver to compute the associated eigenvectors. The second method 1)…
▽ More
We consider nonlinear eigenvalue problems to compute all eigenvalues in a bounded region on the complex plane. Based on domain decomposition and contour integrals, two robust and scalable parallel multi-step methods are proposed. The first method 1) uses the spectral indicator method to find eigenvalues and 2) calls a linear eigensolver to compute the associated eigenvectors. The second method 1) divides the region into subregions and uses the spectral indicator method to decide candidate regions that contain eigenvalues, 2) computes eigenvalues in each candidate subregion using Beyn's method; and 3) verifies each eigenvalue by substituting it back to the system and computes the smallest eigenvalue. Each step of the two methods is carried out in parallel. Both methods are robust, accurate, and does not require prior knowledge of the number and distribution of the eigenvalues in the region. Examples are presented to show the performance of the two methods.
△ Less
Submitted 17 January, 2024; v1 submitted 20 December, 2023;
originally announced December 2023.
-
Quasi-invariant theorem on the Gaussian path space
Authors:
Qinpin Chen,
Jian Sun,
Bo Wu
Abstract:
In this article, we will first introduce a class of Gaussian processes, and prove the quasi-invariant theorem with respect to the Gaussian Wiener measure, which is the law of the associated Gaussian process. In particular, it includes the case of the fractional Brownian motion.
As applications, we will establish the integration by parts formula and Bismut-Elworthy-Li formula on the Gaussian path…
▽ More
In this article, we will first introduce a class of Gaussian processes, and prove the quasi-invariant theorem with respect to the Gaussian Wiener measure, which is the law of the associated Gaussian process. In particular, it includes the case of the fractional Brownian motion.
As applications, we will establish the integration by parts formula and Bismut-Elworthy-Li formula on the Gaussian path space, and by which some logarithmic Sobolev inequalities will be presented. Moreover, we will also provides some applications in the field of financial mathematics.
△ Less
Submitted 1 January, 2024; v1 submitted 19 November, 2023;
originally announced November 2023.
-
An infinite dimensional balanced embedding problem II: uniqueness
Authors:
Jingzhou Sun
Abstract:
This is the sequel to our first paper concerning the balanced embedding of a non-compact complex manifold into an infinite-dimensional projective space. We prove the uniqueness of such an embedding. The proof relies on fine estimates of the asymptotics of a balanced embedding.
This is the sequel to our first paper concerning the balanced embedding of a non-compact complex manifold into an infinite-dimensional projective space. We prove the uniqueness of such an embedding. The proof relies on fine estimates of the asymptotics of a balanced embedding.
△ Less
Submitted 17 November, 2023;
originally announced November 2023.
-
The anisotropic interior transmission eigenvalue problem with a conductive boundary
Authors:
Victor Hughes,
Isaac Harris,
Jiguang Sun
Abstract:
In this paper, we study the transmission eigenvalue problem for an anisotropic material with a conductive boundary. We prove that the transmission eigenvalues for this problem exist and are at most a discrete set. We also study the dependence of the transmission eigenvalues on the physical parameters and prove that the first transmission eigenvalue is monotonic. We then consider the limiting behav…
▽ More
In this paper, we study the transmission eigenvalue problem for an anisotropic material with a conductive boundary. We prove that the transmission eigenvalues for this problem exist and are at most a discrete set. We also study the dependence of the transmission eigenvalues on the physical parameters and prove that the first transmission eigenvalue is monotonic. We then consider the limiting behavior of the transmission eigenvalues as the conductive boundary parameter $η$ vanishes or goes to infinity in magnitude. Finally, we provide some numerical examples on three different domains to demonstrate our theoretical results.
△ Less
Submitted 1 November, 2023;
originally announced November 2023.
-
Robust Model Predictive Control for Enhanced Fast Charging on Electric Vehicles through Integrated Power and Thermal Management
Authors:
Qiuhao Hu,
Mohammad Reza Amini,
Ashley Wiese,
Ilya Kolmanovsky,
Jing Sun
Abstract:
This paper explores the synergies between integrated power and thermal management (iPTM) and battery charging in an electric vehicle (EV). A multi-objective model predictive control (MPC) framework is developed to optimize the fast charging performance while enforcing the constraints in the power and thermal loops. The approach takes into account the coupling of the battery and cabin thermal manag…
▽ More
This paper explores the synergies between integrated power and thermal management (iPTM) and battery charging in an electric vehicle (EV). A multi-objective model predictive control (MPC) framework is developed to optimize the fast charging performance while enforcing the constraints in the power and thermal loops. The approach takes into account the coupling of the battery and cabin thermal management. The case study of a commercial EV demonstrates that the proposed method can effectively meet the requirements of fast charging and thermal management when accurate preview information is available. However, failure to predict the charging event can result in performance degradation with longer charging time. A time-varying weighting strategy is proposed to enhance charging performance in the presence of uncertainty. This strategy leverages the battery state-of-charge (SOC) and adjusts the priority of the multi-objective MPC at different phases during charging. Simulated results using a commercial EV use case show improved robustness in charging time using the proposed strategy.
△ Less
Submitted 20 October, 2023;
originally announced October 2023.
-
Federated Learning with Convex Global and Local Constraints
Authors:
Chuan He,
Le Peng,
Ju Sun
Abstract:
In practice, many machine learning (ML) problems come with constraints, and their applied domains involve distributed sensitive data that cannot be shared with others, e.g., in healthcare. Collaborative learning in such practical scenarios entails federated learning (FL) for ML problems with constraints, or FL with constraints for short. Despite the extensive developments of FL techniques in recen…
▽ More
In practice, many machine learning (ML) problems come with constraints, and their applied domains involve distributed sensitive data that cannot be shared with others, e.g., in healthcare. Collaborative learning in such practical scenarios entails federated learning (FL) for ML problems with constraints, or FL with constraints for short. Despite the extensive developments of FL techniques in recent years, these techniques only deal with unconstrained FL problems or FL problems with simple constraints that are amenable to easy projections. There is little work dealing with FL problems with general constraints. To fill this gap, we take the first step toward building an algorithmic framework for solving FL problems with general constraints. In particular, we propose a new FL algorithm for constrained ML problems based on the proximal augmented Lagrangian (AL) method. Assuming convex objective and convex constraints plus other mild conditions, we establish the worst-case complexity of the proposed algorithm. Our numerical experiments show the effectiveness of our algorithm in performing Neyman-Pearson classification and fairness-aware learning with nonconvex constraints, in an FL setting.
△ Less
Submitted 1 May, 2024; v1 submitted 16 October, 2023;
originally announced October 2023.
-
Vectorial ground state solutions for a class of Hartree-Fock type systems with the double coupled feature
Authors:
Juntao Sun,
Tsung-fang Wu
Abstract:
In this paper we study the Hartree-Fock type system as follows: \begin{equation*} \left\{ \begin{array}{ll} -Δu+u+λφ_{u,v}u=\left\vert u\right\vert ^{p-2}u+β\left\vert v\right\vert ^{\frac{p}{2}}\left\vert u\right\vert ^{\frac{p}{2}% -2}u & \text{ in }\mathbb{R}^{3}, \\ -Δv+v+λφ_{u,v}v=\left\vert v\right\vert ^{p-2}v+β\left\vert u\right\vert ^{\frac{p}{2}}\left\vert v\right\vert ^{\frac{p}{2}% -2}…
▽ More
In this paper we study the Hartree-Fock type system as follows: \begin{equation*} \left\{ \begin{array}{ll} -Δu+u+λφ_{u,v}u=\left\vert u\right\vert ^{p-2}u+β\left\vert v\right\vert ^{\frac{p}{2}}\left\vert u\right\vert ^{\frac{p}{2}% -2}u & \text{ in }\mathbb{R}^{3}, \\ -Δv+v+λφ_{u,v}v=\left\vert v\right\vert ^{p-2}v+β\left\vert u\right\vert ^{\frac{p}{2}}\left\vert v\right\vert ^{\frac{p}{2}% -2}v & \text{ in }\mathbb{R}^{3},% \end{array}% \right. \end{equation*}% where $φ_{u,v}(x)=\int_{\mathbb{R}^{3}}\frac{u^{2}(y)+v^{2}\left( y\right) }{|x-y|}dy,$ the parameters $λ,β>0$ and $2<p<4$. Such system is viewed as an approximation of the Coulomb system with two particles appeared in quantum mechanics, taking into account the Pauli principle. Its characteristic feature lies on the presence of the double coupled terms. When $2<p<3,$ we establish the existence and multiplicity of nontrivial radial solutions, including vectorial ones, in the radial space $% H_{r}$ by describing the internal relationship between the coupling constants $λ$ and $β.$ When $2<p<4,$ we study the existence of vectorial solutions in the non-radial space $H$ by developing a novel constraint method, together with some new analysis techniques. In particular, when $3\leq p<4,$ a vectorial ground state solution is found in $% H$, which is innovative as it was not discussed at all in any previous results. Our study can be regarded as an entire supplement in d'Avenia et al. [J. Differential Equations 335 (2022) 580--614].
△ Less
Submitted 27 September, 2023;
originally announced September 2023.
-
FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data
Authors:
Hao Sun,
Li Shen,
Shixiang Chen,
Jingwei Sun,
Jing Li,
Guangzhong Sun,
Dacheng Tao
Abstract:
Federated learning is an emerging distributed machine learning method, enables a large number of clients to train a model without exchanging their local data. The time cost of communication is an essential bottleneck in federated learning, especially for training large-scale deep neural networks. Some communication-efficient federated learning methods, such as FedAvg and FedAdam, share the same le…
▽ More
Federated learning is an emerging distributed machine learning method, enables a large number of clients to train a model without exchanging their local data. The time cost of communication is an essential bottleneck in federated learning, especially for training large-scale deep neural networks. Some communication-efficient federated learning methods, such as FedAvg and FedAdam, share the same learning rate across different clients. But they are not efficient when data is heterogeneous. To maximize the performance of optimization methods, the main challenge is how to adjust the learning rate without hurting the convergence. In this paper, we propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate based on local historical gradient squares and synchronized learning rates. Theoretical analysis shows that our client-specified auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients, which enables promising scalability in federated optimization. We also empirically compare our method with several communication-efficient federated optimization methods. Extensive experimental results on Computer Vision (CV) tasks and Natural Language Processing (NLP) task show the efficacy of our proposed FedLALR method and also coincides with our theoretical findings.
△ Less
Submitted 18 September, 2023;
originally announced September 2023.
-
On the Reduction of the Spherical Point-in-Polygon Problem for Antipode-Excluding Spherical Polygons
Authors:
Ziqiang Li,
Jindi Sun
Abstract:
Spherical polygons used in practice are nice, but the spherical point-in-polygon problem (SPiP) has long eluded solutions based on the winding number (wn). That a punctured sphere is simply connected is to blame. As a workaround, we prove that requiring the boundary of a spherical polygon to never intersect its antipode is sufficient to reduce its SPiP problem to the planar, point-in-polygon (PiP)…
▽ More
Spherical polygons used in practice are nice, but the spherical point-in-polygon problem (SPiP) has long eluded solutions based on the winding number (wn). That a punctured sphere is simply connected is to blame. As a workaround, we prove that requiring the boundary of a spherical polygon to never intersect its antipode is sufficient to reduce its SPiP problem to the planar, point-in-polygon (PiP) problem, whose state-of-the-art solution uses wn and does not utilize known interior points (KIP). We refer to such spherical polygons as boundary antipode-excluding (BAE) and show that all spherical polygons fully contained within an open hemisphere is BAE. We document two successful reduction methods, one based on rotation and the other on shearing, and address a common concern. Both reduction algorithms, when combined with a wn-PiP algorithm, solve SPiP correctly and efficiently for BAE spherical polygons. The MATLAB code provided demonstrates scenarios that are problematic for previous work.
△ Less
Submitted 7 September, 2023;
originally announced September 2023.
-
An infinite dimensional balanced embedding problem I:existence
Authors:
Jingzhou Sun,
Song Sun
Abstract:
We investigate the problem of balanced embedding of a non-compact complex manifold into an infinite-dimensional projective space. In this paper we prove the existence of such an embedding in a model case. The strategy is by using a gradient flow in a Hilbert space; both the long-time existence and convergence at infinite time are non-trivial. The long time existence is established by choosing a pe…
▽ More
We investigate the problem of balanced embedding of a non-compact complex manifold into an infinite-dimensional projective space. In this paper we prove the existence of such an embedding in a model case. The strategy is by using a gradient flow in a Hilbert space; both the long-time existence and convergence at infinite time are non-trivial. The long time existence is established by choosing a perturbation of the ODE; the convergence depends on a priori bounds that uses techniques in the proof of the Tauber-Hardy-Littlewood theorem.
△ Less
Submitted 3 September, 2023;
originally announced September 2023.
-
Semiparametric Modeling and Analysis for Longitudinal Network Data
Authors:
Yinqiu He,
Jiajin Sun,
Yuang Tian,
Zhiliang Ying,
Yang Feng
Abstract:
We introduce a semiparametric latent space model for analyzing longitudinal network data. The model consists of a static latent space component and a time-varying node-specific baseline component. We develop a semiparametric efficient score equation for the latent space parameter by adjusting for the baseline nuisance component. Estimation is accomplished through a one-step update estimator and an…
▽ More
We introduce a semiparametric latent space model for analyzing longitudinal network data. The model consists of a static latent space component and a time-varying node-specific baseline component. We develop a semiparametric efficient score equation for the latent space parameter by adjusting for the baseline nuisance component. Estimation is accomplished through a one-step update estimator and an appropriately penalized maximum likelihood estimator. We derive oracle error bounds for the two estimators and address identifiability concerns from a quotient manifold perspective. Our approach is demonstrated using the New York Citi Bike Dataset.
△ Less
Submitted 9 July, 2024; v1 submitted 23 August, 2023;
originally announced August 2023.
-
Ramsey-type results on parameters related to domination
Authors:
Jin Sun,
Xinmin Hou
Abstract:
The following inequality chain $$ ir(G)\le γ(G)\le i(G)\le α(G) \le Γ(G) \le I\!R(G)$$ is known as a domination chain, where $ir(G), γ(G), i(G), α(G), Γ(G)$, and $I\!R(G)$ are the lower irredundance number, the domination number, the independence domination number, the independence number, the upper domination number and the upper irredundance number of $G$, respectively. The Ramsey-type problem s…
▽ More
The following inequality chain $$ ir(G)\le γ(G)\le i(G)\le α(G) \le Γ(G) \le I\!R(G)$$ is known as a domination chain, where $ir(G), γ(G), i(G), α(G), Γ(G)$, and $I\!R(G)$ are the lower irredundance number, the domination number, the independence domination number, the independence number, the upper domination number and the upper irredundance number of $G$, respectively. The Ramsey-type problem seeks to characterize the family $\mathcal H$ of graphs such that every $\mathcal H$-free graph $G$ has a bounded parameter $μ$. The classical Ramsey's theorem states that every $\{K_n, E_n\}$-free graph has a bounded number of vertices. Furuya (Discrete Math.Theor 2018) characterized $\mathcal H$ such that every connected $\mathcal H$-free graph $G$ has a bounded domination number. The characterization of the graph family $\mathcal H$ for which every connected $\mathcal H$-free graph $G$ has a bounded independence number was due to Choi, Furuya, Kim, Park~(Discrete math. 2020) and Chiba, Furuya (Electron. J. Combin., 2022). In this paper, we further characterize $\mathcal H$ such that every connected $\mathcal H$-free graph $G$ has bounded $μ(G)$ for $μ$ belonging to the set $\{ir(G), i(G), Γ(G), \text{IR}(G)\}$. This completes the characterization of $\mathcal H$ for which every connected $\mathcal H$-free graph $G$ has bounded $μ(G)$ for $μ(G)$ along the domination chain. Additionally, we characterize $\mathcal H$ such that every connected $\mathcal H$-free graph $G$ has bounded $μ(G)$ for $μ$ related to the domination number. Specifically, we consider the parameters $O\!I\!R(G)$, $I\!S(G)$, or $I\!R\!S(G)\}$, where $O\!I\!R(G)$, $I\!S(G)$, and $I\!R\!S(G)$ are the open irredundance number, the independence saturation number, and the irredundance saturation number of graph $G$, respectively.
△ Less
Submitted 6 June, 2024; v1 submitted 15 August, 2023;
originally announced August 2023.
-
A Unified Distributed Method for Constrained Networked Optimization via Saddle-Point Dynamics
Authors:
Yi Huang,
Ziyang Meng,
Jian Sun,
Wei Ren
Abstract:
This paper develops a unified distributed method for solving two classes of constrained networked optimization problems, i.e., optimal consensus problem and resource allocation problem with non-identical set constraints. We first transform these two constrained networked optimization problems into a unified saddle-point problem framework with set constraints. Subsequently, two projection-based pri…
▽ More
This paper develops a unified distributed method for solving two classes of constrained networked optimization problems, i.e., optimal consensus problem and resource allocation problem with non-identical set constraints. We first transform these two constrained networked optimization problems into a unified saddle-point problem framework with set constraints. Subsequently, two projection-based primal-dual algorithms via Optimistic Gradient Descent Ascent (OGDA) method and Extra-gradient (EG) method are developed for solving constrained saddle-point problems. It is shown that the developed algorithms achieve exact convergence to a saddle point with an ergodic convergence rate $O(1/k)$ for general convex-concave functions. Based on the proposed primal-dual algorithms via saddle-point dynamics, we develop unified distributed algorithm design and convergence analysis for these two networked optimization problems. Finally, two numerical examples are presented to demonstrate the theoretical results.
△ Less
Submitted 14 July, 2023;
originally announced July 2023.
-
A Macro-Micro Approach to Reconstructing Vehicle Trajectories on Multi-Lane Freeways with Lane Changing
Authors:
Xuejian Chen,
Guoyang Qin,
Toru Seo,
Ye Tian,
Jian Sun
Abstract:
Vehicle trajectories can offer the most precise and detailed depiction of traffic flow and serve as a critical component in traffic management and control applications. Various technologies have been applied to reconstruct vehicle trajectories from sparse fixed and mobile detection data. However, existing methods predominantly concentrate on single-lane scenarios and neglect lane-changing (LC) beh…
▽ More
Vehicle trajectories can offer the most precise and detailed depiction of traffic flow and serve as a critical component in traffic management and control applications. Various technologies have been applied to reconstruct vehicle trajectories from sparse fixed and mobile detection data. However, existing methods predominantly concentrate on single-lane scenarios and neglect lane-changing (LC) behaviors that occur across multiple lanes, which limit their applicability in practical traffic systems. To address this research gap, we propose a macro-micro approach for reconstructing complete vehicle trajectories on multi-lane freeways, wherein the macro traffic state information and micro driving models are integrated to overcome the restrictions imposed by lane boundary. Particularly, the macroscopic velocity contour maps are established for each lane to regulate the movement of vehicle platoons, meanwhile the velocity difference between adjacent lanes provide valuable criteria for guiding LC behaviors. Simultaneously, the car-following models are extended from micro perspective to supply lane-based candidate trajectories and define the plausible range for LC positions. Later, a two-stage trajectory fusion algorithm is proposed to jointly infer both the car-following and LC behaviors, in which the optimal LC positions is identified and candidate trajectories are adjusted according to their weights. The proposed framework was evaluated using NGSIM dataset, and the results indicated a remarkable enhancement in both the accuracy and smoothness of reconstructed trajectories, with performance indicators reduced by over 30% compared to two representative reconstruction methods. Furthermore, the reconstruction process effectively reproduced LC behaviors across contiguous lanes, adding to the framework's comprehensiveness and realism.
△ Less
Submitted 8 June, 2023;
originally announced June 2023.
-
Local Randomized Neural Networks with Discontinuous Galerkin Methods for Diffusive-Viscous Wave Equation
Authors:
Jingbo Sun,
Fei Wang
Abstract:
The diffusive-viscous wave equation is an advancement in wave equation theory, as it accounts for both diffusion and viscosity effects. This has a wide range of applications in geophysics, such as the attenuation of seismic waves in fluid-saturated solids and frequency-dependent phenomena in porous media. Therefore, the development of an efficient numerical method for the equation is of both theor…
▽ More
The diffusive-viscous wave equation is an advancement in wave equation theory, as it accounts for both diffusion and viscosity effects. This has a wide range of applications in geophysics, such as the attenuation of seismic waves in fluid-saturated solids and frequency-dependent phenomena in porous media. Therefore, the development of an efficient numerical method for the equation is of both theoretical and practical importance. Recently, local randomized neural networks with discontinuous Galerkin (LRNN-DG) methods have been introduced in \cite{Sun2022lrnndg} to solve elliptic and parabolic equations. Numerical examples suggest that LRNN-DG can achieve high accuracy, and can handle time-dependent problems naturally and efficiently by using a space-time framework. In this paper, we develop LRNN-DG methods for solving the diffusive-viscous wave equation and present numerical experiments with several cases. The numerical results show that the proposed methods can solve the diffusive-viscous wave equation more accurately with less computing costs than traditional methods.
△ Less
Submitted 25 May, 2023;
originally announced May 2023.
-
On receding-horizon approximation in time-varying optimal control
Authors:
Jintao Sun,
Michael Cantoni
Abstract:
The closed-loop stability and infinite-horizon performance of receding-horizon approximations are studied for non-stationary linear-quadratic regulator (LQR) problems. The approach is based on a lifted reformulation of the optimal control problem, under assumed uniform controllability and observability, leading to a strict contraction property of the corresponding Riccati operator. Leveraging this…
▽ More
The closed-loop stability and infinite-horizon performance of receding-horizon approximations are studied for non-stationary linear-quadratic regulator (LQR) problems. The approach is based on a lifted reformulation of the optimal control problem, under assumed uniform controllability and observability, leading to a strict contraction property of the corresponding Riccati operator. Leveraging this contraction property, a stabilizing linear time-varying state-feedback approximation of the infinite-horizon optimal control policy is constructed to meet a performance-loss specification. Its synthesis involves only finite preview of the time-varying problem data at each time step, over a sufficiently long prediction horizon.
△ Less
Submitted 4 September, 2023; v1 submitted 10 May, 2023;
originally announced May 2023.
-
On Riccati contraction in time-varying linear-quadratic control
Authors:
Jintao Sun,
Michael Cantoni
Abstract:
Contraction properties of the Riccati operator are studied within the context of non-stationary linear-quadratic optimal control. A lifting approach is used to obtain a bound on the rate of strict contraction, with respect to the Riemannian metric, across a sufficient number of iterations. This number of iterations is related to an assumed uniform controllability and observability property of the…
▽ More
Contraction properties of the Riccati operator are studied within the context of non-stationary linear-quadratic optimal control. A lifting approach is used to obtain a bound on the rate of strict contraction, with respect to the Riemannian metric, across a sufficient number of iterations. This number of iterations is related to an assumed uniform controllability and observability property of the dynamics and stage-cost in the original formulation of the problem.
△ Less
Submitted 4 September, 2023; v1 submitted 10 May, 2023;
originally announced May 2023.
-
Some Ramsey-type results
Authors:
Jin Sun
Abstract:
The Ramsey's theorem says that a graph with sufficiently many vertices contains a clique or stable set with many vertices. Now we attach some parameter to every vertex, such as degree. Consider the case a graph with sufficiently many vertices of large degree, we can get the realted Ramsey-type result. The Ramsey's theorem of connected version says that every connected graph with sufficiently many…
▽ More
The Ramsey's theorem says that a graph with sufficiently many vertices contains a clique or stable set with many vertices. Now we attach some parameter to every vertex, such as degree. Consider the case a graph with sufficiently many vertices of large degree, we can get the realted Ramsey-type result. The Ramsey's theorem of connected version says that every connected graph with sufficiently many vertices contains an induced path, clique or star with many vertices. Now we require the vertex is non-trivial, i.e. the parameter of this vertex is non-trivial, such as $\operatorname{deg}(v)\ge 2$. A connected graph with sufficiently many non-trivial vertices must contain some special induced subgraph. We also get the non-connected version of this Ramsey-type result as a corollary.
△ Less
Submitted 15 July, 2023; v1 submitted 3 May, 2023;
originally announced May 2023.
-
The planar Schrodinger--Poisson system with exponential critical growth: The local well-posedness and standing waves with prescribed mass
Authors:
Juntao Sun,
Shuai Yao,
Jian Zhang
Abstract:
In this paper, we investigate a class of planar Schrödinger-Poisson systems with critical exponential growth. We establish conditions for the local well-posedness of the Cauchy problem in the energy space, which seems innovative as it was not discussed at all in any previous results. By introducing some new ideas and relaxing some of the classical growth assumptions on the nonlinearity, we show th…
▽ More
In this paper, we investigate a class of planar Schrödinger-Poisson systems with critical exponential growth. We establish conditions for the local well-posedness of the Cauchy problem in the energy space, which seems innovative as it was not discussed at all in any previous results. By introducing some new ideas and relaxing some of the classical growth assumptions on the nonlinearity, we show that such system has at least two standing waves with prescribed mass, where one is a ground state standing waves with positive energy, and the other one is a high-energy standing waves with positive energy. In addition, with the help of the local well-posedness, we show that the set of ground state standing waves is orbitally stable.
△ Less
Submitted 28 April, 2023;
originally announced May 2023.
-
Curvature Estimate of Nodal Sets of Harmonic Functions in the Plane
Authors:
Jin Sun
Abstract:
In this paper, we study curvature estimates for nodal sets of harmonic functions in the plane. We prove that at any point $p$, the curvature of any nodal curve of a harmonic function $u$ is upper bounded by
$$
\left\vert κ(u)(p)\right\vert\leq \frac{4(n+1)}{nr},
$$
where $u$ has only $n$ nodal curves in $B_r(p)$ intersecting at $p$. This result is sharp for odd $n$, and the extreme case ca…
▽ More
In this paper, we study curvature estimates for nodal sets of harmonic functions in the plane. We prove that at any point $p$, the curvature of any nodal curve of a harmonic function $u$ is upper bounded by
$$
\left\vert κ(u)(p)\right\vert\leq \frac{4(n+1)}{nr},
$$
where $u$ has only $n$ nodal curves in $B_r(p)$ intersecting at $p$. This result is sharp for odd $n$, and the extreme case can be given by Dirac measure and its derivatives. For even $n$, we prove there exists a smaller sharp upper bound. Moreover, we can prove the curvature of any nodal curve is upper bounded at every point in the nodal set of $u$ in a small neighborhood $B_{cr}(p)$, where $c<1$ depends only on $n$. Furthermore, with the tool of the frequency, we can prove the positive part and the negative part of $u$ have uniform lower bound, which depends only on the number of the nodal domains in $B_r(p)$.
△ Less
Submitted 23 June, 2024; v1 submitted 18 April, 2023;
originally announced April 2023.
-
Liouville Theorem for Harmonic Maps from Riemannian Manifold with Compact Boundary
Authors:
Jun Sun,
Xiaobao Zhu
Abstract:
In this note we will provide a gradient estimate for harmonic maps from a complete noncompact Riemannian manifold with compact boundary (which we call "Kasue manifold") into a simply connected complete Riemannian manifold with non-positive sectional curvature. As a consequence, we can obtain a Liouville theorem. We will also show the nonexistence of positive solutions to some linear elliptic equat…
▽ More
In this note we will provide a gradient estimate for harmonic maps from a complete noncompact Riemannian manifold with compact boundary (which we call "Kasue manifold") into a simply connected complete Riemannian manifold with non-positive sectional curvature. As a consequence, we can obtain a Liouville theorem. We will also show the nonexistence of positive solutions to some linear elliptic equation on Kasue manifold.
△ Less
Submitted 5 April, 2023;
originally announced April 2023.
-
AdaSAM: Boosting Sharpness-Aware Minimization with Adaptive Learning Rate and Momentum for Training Deep Neural Networks
Authors:
Hao Sun,
Li Shen,
Qihuang Zhong,
Liang Ding,
Shixiang Chen,
Jingwei Sun,
Jing Li,
Guangzhong Sun,
Dacheng Tao
Abstract:
Sharpness aware minimization (SAM) optimizer has been extensively explored as it can generalize better for training deep neural networks via introducing extra perturbation steps to flatten the landscape of deep learning models. Integrating SAM with adaptive learning rate and momentum acceleration, dubbed AdaSAM, has already been explored empirically to train large-scale deep neural networks withou…
▽ More
Sharpness aware minimization (SAM) optimizer has been extensively explored as it can generalize better for training deep neural networks via introducing extra perturbation steps to flatten the landscape of deep learning models. Integrating SAM with adaptive learning rate and momentum acceleration, dubbed AdaSAM, has already been explored empirically to train large-scale deep neural networks without theoretical guarantee due to the triple difficulties in analyzing the coupled perturbation step, adaptive learning rate and momentum step. In this paper, we try to analyze the convergence rate of AdaSAM in the stochastic non-convex setting. We theoretically show that AdaSAM admits a $\mathcal{O}(1/\sqrt{bT})$ convergence rate, which achieves linear speedup property with respect to mini-batch size $b$. Specifically, to decouple the stochastic gradient steps with the adaptive learning rate and perturbed gradient, we introduce the delayed second-order momentum term to decompose them to make them independent while taking an expectation during the analysis. Then we bound them by showing the adaptive learning rate has a limited range, which makes our analysis feasible. To the best of our knowledge, we are the first to provide the non-trivial convergence rate of SAM with an adaptive learning rate and momentum acceleration. At last, we conduct several experiments on several NLP tasks, which show that AdaSAM could achieve superior performance compared with SGD, AMSGrad, and SAM optimizers.
△ Less
Submitted 1 March, 2023;
originally announced March 2023.
-
Control Co-design of a Hydrokinetic Turbine: A Comparative Study of Open-loop Optimal Control and Feedback Control
Authors:
Mohammad Reza Amini,
Boxi Jiang,
Yingqian Liao,
Kartik Naik,
Joaquim R. R. A. Martins,
Jing Sun
Abstract:
Control co-design (CCD) explores physical and control design spaces simultaneously to optimize a system's performance. A commonly used CCD framework aims to achieve open-loop optimal control (OLOC) trajectory while optimizing the physical design variables subject to constraints on control and design parameters. In this study, in contrast with the conventional CCD methods based on OLOC schemes, we…
▽ More
Control co-design (CCD) explores physical and control design spaces simultaneously to optimize a system's performance. A commonly used CCD framework aims to achieve open-loop optimal control (OLOC) trajectory while optimizing the physical design variables subject to constraints on control and design parameters. In this study, in contrast with the conventional CCD methods based on OLOC schemes, we present a CCD formulation that explicitly considers a feedback controller. In the formulation, we consider two control laws based on proportional linear and quadratic state feedback, where the control gain is optimized. The simulation results show that the OLOC trajectory could be approximated by a feedback controller. While the total energy generated from the CCD with a feedback controller is slightly lower than that of the CCD with OLOC, it results in a much simpler control structure and more robust performance in the presence of uncertainties and disturbances, making it suitable for real-time control. The study in this paper investigates the performance of optimal hydrokinetic turbine design with a feedback controller in the presence of uncertainties and disturbances to demonstrate the benefits and highlight challenges associated with incorporating the feedback controller explicitly in the CCD stage.
△ Less
Submitted 6 February, 2023;
originally announced February 2023.
-
Sharp error estimates for spatial-temporal finite difference approximations to fractional sub-diffusion equation without regularity assumption on the exact solution
Authors:
Daxin Nie,
Jing Sun,
Weihua Deng
Abstract:
Finite difference method as a popular numerical method has been widely used to solve fractional diffusion equations. In the general spatial error analyses, an assumption $u\in C^{4}(\barΩ)$ is needed to preserve $\mathcal{O}(h^{2})$ convergence when using central finite difference scheme to solve fractional sub-diffusion equation with Laplace operator, but this assumption is somewhat strong, where…
▽ More
Finite difference method as a popular numerical method has been widely used to solve fractional diffusion equations. In the general spatial error analyses, an assumption $u\in C^{4}(\barΩ)$ is needed to preserve $\mathcal{O}(h^{2})$ convergence when using central finite difference scheme to solve fractional sub-diffusion equation with Laplace operator, but this assumption is somewhat strong, where $u$ is the exact solution and $h$ is the mesh size. In this paper, a novel analysis technique is proposed to show that the spatial convergence rate can reach $\mathcal{O}(h^{\min(σ+\frac{1}{2}-ε,2)})$ in both $l^{2}$-norm and $l^{\infty}$-norm in one-dimensional domain when the initial value and source term are both in $\hat{H}^σ(Ω)$ but without any regularity assumption on the exact solution, where $σ\geq 0$ and $ε>0$ being arbitrarily small. After making slight modifications on the scheme, acting on the initial value and source term, the spatial convergence rate can be improved to $\mathcal{O}(h^{2})$ in $l^{2}$-norm and $\mathcal{O}(h^{\min(σ+\frac{3}{2}-ε,2)})$ in $l^{\infty}$-norm. It's worth mentioning that our spatial error analysis is applicable to high dimensional cube domain by using the properties of tensor product. Moreover, two kinds of averaged schemes are provided to approximate the Riemann--Liouville fractional derivative, and $\mathcal{O}(τ^{2})$ convergence is obtained for all $α\in(0,1)$. Finally, some numerical experiments verify the effectiveness of the built theory.
△ Less
Submitted 6 February, 2023;
originally announced February 2023.
-
BINN: A deep learning approach for computational mechanics problems based on boundary integral equations
Authors:
Jia Sun,
Yinghua Liu,
Yizheng Wang,
Zhenhan Yao,
Xiaoping Zheng
Abstract:
We proposed the boundary-integral type neural networks (BINN) for the boundary value problems in computational mechanics. The boundary integral equations are employed to transfer all the unknowns to the boundary, then the unknowns are approximated using neural networks and solved through a training process. The loss function is chosen as the residuals of the boundary integral equations. Regulariza…
▽ More
We proposed the boundary-integral type neural networks (BINN) for the boundary value problems in computational mechanics. The boundary integral equations are employed to transfer all the unknowns to the boundary, then the unknowns are approximated using neural networks and solved through a training process. The loss function is chosen as the residuals of the boundary integral equations. Regularization techniques are adopted to efficiently evaluate the weakly singular and Cauchy principle integrals in boundary integral equations. Potential problems and elastostatic problems are mainly concerned in this article as a demonstration. The proposed method has several outstanding advantages: First, the dimensions of the original problem are reduced by one, thus the freedoms are greatly reduced. Second, the proposed method does not require any extra treatment to introduce the boundary conditions, since they are naturally considered through the boundary integral equations. Therefore, the method is suitable for complex geometries. Third, BINN is suitable for problems on the infinite or semi-infinite domains. Moreover, BINN can easily handle heterogeneous problems with a single neural network without domain decomposition.
△ Less
Submitted 11 January, 2023;
originally announced January 2023.
-
DOSnet as a Non-Black-Box PDE Solver: When Deep Learning Meets Operator Splitting
Authors:
Yuan Lan,
Zhen Li,
Jie Sun,
Yang Xiang
Abstract:
Deep neural networks (DNNs) recently emerged as a promising tool for analyzing and solving complex differential equations arising in science and engineering applications. Alternative to traditional numerical schemes, learning-based solvers utilize the representation power of DNNs to approximate the input-output relations in an automated manner. However, the lack of physics-in-the-loop often makes…
▽ More
Deep neural networks (DNNs) recently emerged as a promising tool for analyzing and solving complex differential equations arising in science and engineering applications. Alternative to traditional numerical schemes, learning-based solvers utilize the representation power of DNNs to approximate the input-output relations in an automated manner. However, the lack of physics-in-the-loop often makes it difficult to construct a neural network solver that simultaneously achieves high accuracy, low computational burden, and interpretability. In this work, focusing on a class of evolutionary PDEs characterized by having decomposable operators, we show that the classical ``operator splitting'' numerical scheme of solving these equations can be exploited to design neural network architectures. This gives rise to a learning-based PDE solver, which we name Deep Operator-Splitting Network (DOSnet). Such non-black-box network design is constructed from the physical rules and operators governing the underlying dynamics contains learnable parameters, and is thus more flexible than the standard operator splitting scheme. Once trained, it enables the fast solution of the same type of PDEs. To validate the special structure inside DOSnet, we take the linear PDEs as the benchmark and give the mathematical explanation for the weight behavior. Furthermore, to demonstrate the advantages of our new AI-enhanced PDE solver, we train and validate it on several types of operator-decomposable differential equations. We also apply DOSnet to nonlinear Schrödinger equations (NLSE) which have important applications in the signal processing for modern optical fiber transmission systems, and experimental results show that our model has better accuracy and lower computational complexity than numerical schemes and the baseline DNNs.
△ Less
Submitted 11 December, 2022;
originally announced December 2022.
-
Deep Galerkin Method for Mean Field Control Problem
Authors:
Jingruo Sun
Abstract:
We consider an optimal control problem where the average welfare of weakly interacting agents is of interest. We examine the mean-field control problem as the fluid approximation of the N-agent control problem with the setup of finite-state space, continuous-time, and finite-horizon. The value function of the mean-field control problem is characterized as the unique viscosity solution of a Hamilto…
▽ More
We consider an optimal control problem where the average welfare of weakly interacting agents is of interest. We examine the mean-field control problem as the fluid approximation of the N-agent control problem with the setup of finite-state space, continuous-time, and finite-horizon. The value function of the mean-field control problem is characterized as the unique viscosity solution of a Hamilton-Jacobi-Bellman equation in the simplex. We apply the DGM to estimate the value function and the evolution of the distribution. We also prove the numerical solution approximated by a neural network converges to the analytical solution.
△ Less
Submitted 9 February, 2024; v1 submitted 3 December, 2022;
originally announced December 2022.
-
Causal identification for continuous-time stochastic processes
Authors:
Jinghao Sun,
Forrest W. Crawford
Abstract:
Many real-world processes are trajectories that may be regarded as continuous-time "functional data". Examples include patients' biomarker concentrations, environmental pollutant levels, and prices of stocks. Corresponding advances in data collection have yielded near continuous-time measurements, from e.g. physiological monitors, wearable digital devices, and environmental sensors. Statistical me…
▽ More
Many real-world processes are trajectories that may be regarded as continuous-time "functional data". Examples include patients' biomarker concentrations, environmental pollutant levels, and prices of stocks. Corresponding advances in data collection have yielded near continuous-time measurements, from e.g. physiological monitors, wearable digital devices, and environmental sensors. Statistical methodology for estimating the causal effect of a time-varying treatment, measured discretely in time, is well developed. But discrete-time methods like the g-formula, structural nested models, and marginal structural models do not generalize easily to continuous time, due to the entanglement of uncountably infinite variables. Moreover, researchers have shown that the choice of discretization time scale can seriously affect the quality of causal inferences about the effects of an intervention. In this paper, we establish causal identification results for continuous-time treatment-outcome relationships for general cadlag stochastic processes under continuous-time confounding, through orthogonalization and weighting. We use three concrete running examples to demonstrate the plausibility of our identification assumptions, as well as their connections to the discrete-time g methods literature.
△ Less
Submitted 29 November, 2022;
originally announced November 2022.