-
Distributed Prescribed-Time Convex Optimization: Cascade Design and Time-Varying Gain Approach
Authors:
Gewei Zuo,
Lijun Zhu,
Yujuan Wang,
Zhiyong Chen
Abstract:
In this paper, we address the distributed prescribed-time convex optimization (DPTCO) problem for a class of nonlinear multi-agent systems (MASs) under undirected connected graph. A cascade design framework is proposed such that the DPTCO implementation is divided into two parts: distributed optimal trajectory generator design and local reference trajectory tracking controller design. The DPTCO pr…
▽ More
In this paper, we address the distributed prescribed-time convex optimization (DPTCO) problem for a class of nonlinear multi-agent systems (MASs) under undirected connected graph. A cascade design framework is proposed such that the DPTCO implementation is divided into two parts: distributed optimal trajectory generator design and local reference trajectory tracking controller design. The DPTCO problem is then transformed into the prescribed-time stabilization problem of a cascaded system. Changing Lyapunov function method and time-varying state transformation method together with the sufficient conditions are proposed to prove the prescribed-time stabilization of the cascaded system as well as the uniform boundedness of internal signals in the closed-loop systems. The proposed framework is then utilized to solve robust DPTCO problem for a class of chain-integrator MASs with external disturbances by constructing a novel variables and exploiting the property of time-varying gains. The proposed framework is further utilized to solve the adaptive DPTCO problem for a class of strict-feedback MASs with parameter uncertainty, in which backstepping method with prescribed-time dynamic filter is adopted. The descending power state transformation is introduced to compensate the growth of increasing rate induced by the derivative of time-varying gains in recursive steps and the high-order derivative of local reference trajectory is not required. Finally, theoretical results are verified by two numerical examples.
△ Less
Submitted 16 July, 2024;
originally announced July 2024.
-
Product Geometries on Cholesky Manifolds with Applications to SPD Manifolds
Authors:
Ziheng Chen,
Yue Song,
Xiao-Jun Wu,
Nicu Sebe
Abstract:
This paper presents two new metrics on the Symmetric Positive Definite (SPD) manifold via the Cholesky manifold, i.e., the space of lower triangular matrices with positive diagonal elements. We first unveil that the existing popular Riemannian metric on the Cholesky manifold can be generally characterized as the product metric of a Euclidean metric and a Riemannian metric on the space of n-dimensi…
▽ More
This paper presents two new metrics on the Symmetric Positive Definite (SPD) manifold via the Cholesky manifold, i.e., the space of lower triangular matrices with positive diagonal elements. We first unveil that the existing popular Riemannian metric on the Cholesky manifold can be generally characterized as the product metric of a Euclidean metric and a Riemannian metric on the space of n-dimensional positive vectors. Based on this analysis, we propose two novel metrics on the Cholesky manifolds, i.e., Diagonal Power Euclidean Metric and Diagonal Generalized Bures-Wasserstein Metric, which are numerically stabler than the existing Cholesky metric. We also discuss the gyro structures and deformed metrics associated with our metrics. The gyro structures connect the linear and geometric properties, while the deformed metrics interpolate between our proposed metrics and the existing metric. Further, by Cholesky decomposition, the proposed deformed metrics and gyro structures are pulled back to SPD manifolds. Compared with existing Riemannian metrics on SPD manifolds, our metrics are easy to use, computationally efficient, and numerically stable.
△ Less
Submitted 2 July, 2024;
originally announced July 2024.
-
Order-Optimal Instance-Dependent Bounds for Offline Reinforcement Learning with Preference Feedback
Authors:
Zhirui Chen,
Vincent Y. F. Tan
Abstract:
We consider offline reinforcement learning (RL) with preference feedback in which the implicit reward is a linear function of an unknown parameter. Given an offline dataset, our objective consists in ascertaining the optimal action for each state, with the ultimate goal of minimizing the {\em simple regret}. We propose an algorithm, \underline{RL} with \underline{L}ocally \underline{O}ptimal \unde…
▽ More
We consider offline reinforcement learning (RL) with preference feedback in which the implicit reward is a linear function of an unknown parameter. Given an offline dataset, our objective consists in ascertaining the optimal action for each state, with the ultimate goal of minimizing the {\em simple regret}. We propose an algorithm, \underline{RL} with \underline{L}ocally \underline{O}ptimal \underline{W}eights or {\sc RL-LOW}, which yields a simple regret of $\exp ( - Ω(n/H) )$ where $n$ is the number of data samples and $H$ denotes an instance-dependent hardness quantity that depends explicitly on the suboptimality gap of each action. Furthermore, we derive a first-of-its-kind instance-dependent lower bound in offline RL with preference feedback. Interestingly, we observe that the lower and upper bounds on the simple regret match order-wise in the exponent, demonstrating order-wise optimality of {\sc RL-LOW}. In view of privacy considerations in practical applications, we also extend {\sc RL-LOW} to the setting of $(\varepsilon,δ)$-differential privacy and show, somewhat surprisingly, that the hardness parameter $H$ is unchanged in the asymptotic regime as $n$ tends to infinity; this underscores the inherent efficiency of {\sc RL-LOW} in terms of preserving the privacy of the observed rewards. Given our focus on establishing instance-dependent bounds, our work stands in stark contrast to previous works that focus on establishing worst-case regrets for offline RL with preference feedback.
△ Less
Submitted 17 June, 2024;
originally announced June 2024.
-
Well-posedness and large deviations of fractional McKean-Vlasov stochastic reaction-diffusion equations on unbounded domains
Authors:
Zhang Chen,
Bixiang Wang
Abstract:
This paper is mainly concerned with the large deviation principle of the fractional McKean-Vlasov stochastic reaction-diffusion equation defined on R^n with polynomial drift of any degree. We first prove the well-posedness of the underlying equation under a dissipative condition, and then show the strong convergence of solutions of the corresponding controlled equation with respect to the weak top…
▽ More
This paper is mainly concerned with the large deviation principle of the fractional McKean-Vlasov stochastic reaction-diffusion equation defined on R^n with polynomial drift of any degree. We first prove the well-posedness of the underlying equation under a dissipative condition, and then show the strong convergence of solutions of the corresponding controlled equation with respect to the weak topology of controls, by employing the idea of uniform tail-ends estimates of solutions in order to circumvent the non-compactness of Sobolev embeddings on unbounded domains. We finally establish the large deviation principle of the fractional McKean-Vlasov equation by the weak convergence method without assuming the time Holder continuity of the non-autonomous diffusion coefficients.
△ Less
Submitted 15 June, 2024;
originally announced June 2024.
-
Nodewise Loreg: Nodewise $L_0$-penalized Regression for High-dimensional Sparse Precision Matrix Estimation
Authors:
Hai Shu,
Ziqi Chen,
Yingjie Zhang,
Hongtu Zhu
Abstract:
We propose Nodewise Loreg, a nodewise $L_0$-penalized regression method for estimating high-dimensional sparse precision matrices. We establish its asymptotic properties, including convergence rates, support recovery, and asymptotic normality under high-dimensional sub-Gaussian settings. Notably, the Nodewise Loreg estimator is asymptotically unbiased and normally distributed, eliminating the need…
▽ More
We propose Nodewise Loreg, a nodewise $L_0$-penalized regression method for estimating high-dimensional sparse precision matrices. We establish its asymptotic properties, including convergence rates, support recovery, and asymptotic normality under high-dimensional sub-Gaussian settings. Notably, the Nodewise Loreg estimator is asymptotically unbiased and normally distributed, eliminating the need for debiasing required by Nodewise Lasso. We also develop a desparsified version of Nodewise Loreg, similar to the desparsified Nodewise Lasso estimator. The asymptotic variances of the undesparsified Nodewise Loreg estimator are upper bounded by those of both desparsified Nodewise Loreg and Lasso estimators for Gaussian data, potentially offering more powerful statistical inference. Extensive simulations show that the undesparsified Nodewise Loreg estimator generally outperforms the two desparsified estimators in asymptotic normal behavior. Moreover, Nodewise Loreg surpasses Nodewise Lasso, CLIME, and GLasso in most simulations in terms of matrix norm losses, support recovery, and timing performance. Application to a breast cancer gene expression dataset further demonstrates Nodewise Loreg's superiority over the three $L_1$-norm based methods.
△ Less
Submitted 10 June, 2024;
originally announced June 2024.
-
Expressive Power of Graph Neural Networks for (Mixed-Integer) Quadratic Programs
Authors:
Ziang Chen,
Xiaohan Chen,
Jialin Liu,
Xinshang Wang,
Wotao Yin
Abstract:
Quadratic programming (QP) is the most widely applied category of problems in nonlinear programming. Many applications require real-time/fast solutions, though not necessarily with high precision. Existing methods either involve matrix decomposition or use the preconditioned conjugate gradient method. For relatively large instances, these methods cannot achieve the real-time requirement unless the…
▽ More
Quadratic programming (QP) is the most widely applied category of problems in nonlinear programming. Many applications require real-time/fast solutions, though not necessarily with high precision. Existing methods either involve matrix decomposition or use the preconditioned conjugate gradient method. For relatively large instances, these methods cannot achieve the real-time requirement unless there is an effective precondition.
Recently, graph neural networks (GNNs) opened new possibilities for QP. Some promising empirical studies of applying GNNs for QP tasks show that GNNs can capture key characteristics of an optimization instance and provide adaptive guidance accordingly to crucial configurations during the solving process, or directly provide an approximate solution. Despite notable empirical observations, theoretical foundations are still lacking. In this work, we investigate the expressive or representative power of GNNs, a crucial aspect of neural network theory, specifically in the context of QP tasks, with both continuous and mixed-integer settings. We prove the existence of message-passing GNNs that can reliably represent key properties of quadratic programs, including feasibility, optimal objective value, and optimal solution. Our theory is validated by numerical results.
△ Less
Submitted 9 June, 2024;
originally announced June 2024.
-
Nonlinear Optimal Guidance with Constraints on Impact Time and Impact Angle
Authors:
Fanchen Wu,
Zheng Chen,
Xueming Shao,
Kun Wang
Abstract:
This paper aims to address the nonlinear optimal guidance problem with impact-time and impact-angle constraints, which is fundamentally important for multiple pursuers to collaboratively achieve a target. Addressing such a guidance problem is equivalent to solving a nonlinear minimum-effort control problem in real time. To this end, the Pontryagain's maximum principle is employed to convert extrem…
▽ More
This paper aims to address the nonlinear optimal guidance problem with impact-time and impact-angle constraints, which is fundamentally important for multiple pursuers to collaboratively achieve a target. Addressing such a guidance problem is equivalent to solving a nonlinear minimum-effort control problem in real time. To this end, the Pontryagain's maximum principle is employed to convert extremal trajectories as the solutions of a parameterized differential system. The geometric property for the solution of the parameterized system is analyzed, leading to an additional optimality condition. By incorporating this optimality condition and the usual disconjugacy condition into the parameterized system, the dataset for optimal trajectories can be generated by propagating the parameterized system without using any optimization methods. In addition, a scaling invariance property is found for the solutions of the parameterized system. As a consequence of this scaling invariance property, a simple feedforward neural network trained by the solution of the parameterized system, selected at any fixed time, can be used to generate the nonlinear optimal guidance within milliseconds. Finally, numerical examples are presented, showing that the nonlinear optimal guidance command generated by the trained network can not only ensure the expected impact angle and impact time are precisely met but also requires less control effort compared with existing guidance methods.
△ Less
Submitted 7 June, 2024;
originally announced June 2024.
-
Stable degeneration of families of klt singularities with constant local volume
Authors:
Zhiyuan Chen
Abstract:
We prove that for a locally stable family of klt singularities with constant local volume, the ideal sequences of the minimizing valuations for the normalized volume function form a family of ideals with flat cosupport, which induces a degeneration to a locally stable family of K-semistable log Fano cone singularities. Our proof is a family version of the method of C. Xu and Z. Zhuang proving fini…
▽ More
We prove that for a locally stable family of klt singularities with constant local volume, the ideal sequences of the minimizing valuations for the normalized volume function form a family of ideals with flat cosupport, which induces a degeneration to a locally stable family of K-semistable log Fano cone singularities. Our proof is a family version of the method of C. Xu and Z. Zhuang proving finite generation by Kollár models and multiple degenerations.
△ Less
Submitted 29 May, 2024;
originally announced May 2024.
-
Graph Random Walk for Time-of-Flight Charge Mobilities
Authors:
Zhongquan Chen,
Pim van der Hoorn,
Björn Baumeier
Abstract:
We present a graph random walk (GRW) method for the study of charge transport properties of complex molecular materials in the time-of-flight regime. The molecules forming the material are represented by the vertices of a directed weighted graph, and the charge carriers are random walkers. The edge weights are rates for elementary jumping processes for a charge carrier to move along the edge and a…
▽ More
We present a graph random walk (GRW) method for the study of charge transport properties of complex molecular materials in the time-of-flight regime. The molecules forming the material are represented by the vertices of a directed weighted graph, and the charge carriers are random walkers. The edge weights are rates for elementary jumping processes for a charge carrier to move along the edge and are determined from a combination of the energies of the involved vertices and an interaction strength. Exclusions are built into the random walk to account for the Pauli exclusion principle. In time-of-flight experiments, charge carriers are injected into the material and the time until they reach a collecting electrode is recorded. In this setting, our GRW approach allows direct evaluation of the expected hitting time of the collecting nodes in the graph in terms of a typically sparse, linear system, thereby avoiding numerically cumbersome and potentially fluctuations-prone methods based on explicit time evolution from solutions of a high-dimensional system of coupled ordinary differential equations (the Master Equation) or from kinetic Monte Carlo (KMC). We validate the GRW approach by conducting numerical studies of charge dynamics of single and multiple carriers in diffusive and drift-diffusive regimes using a surrogate lattice model. The surrogate model allows varying types and strengths of energetic disorder from the reference baseline. Comparison with results from the Master Equation confirms the theoretical equivalence of both approaches also in numerical implementations. We further show that KMC results show substantial deviations due to inadequate sampling. All in all, we find that the GRW method provides a powerful alternative to the more commonly used methods without sampling issues and with the benefit of making use of sparse matrix methods.
△ Less
Submitted 24 May, 2024;
originally announced May 2024.
-
Newsvendor under Mean-Variance Ambiguity and Misspecification
Authors:
Feng Liu,
Zhi Chen,
Ruodu Wang,
Shuming Wang
Abstract:
Consider a newsvendor problem with an unknown demand distribution. When addressing the issue of distributional uncertainty, we distinguish ambiguity under which the newsvendor does not differentiate demand distributions of common distributional characteristics (e.g., mean and variance) and misspecification under which such characteristics might be misspecified (due to, e.g., estimation error and/o…
▽ More
Consider a newsvendor problem with an unknown demand distribution. When addressing the issue of distributional uncertainty, we distinguish ambiguity under which the newsvendor does not differentiate demand distributions of common distributional characteristics (e.g., mean and variance) and misspecification under which such characteristics might be misspecified (due to, e.g., estimation error and/or distribution shift). The newsvendor hedges against ambiguity and misspecification by maximizing the worst-case expected profit regularized by a distribution's distance to an ambiguity set of distributions with some specified characteristics. Focusing on the popular mean-variance ambiguity set and optimal-transport cost for the misspecification, we show that the decision criterion of misspecification aversion possesses insightful interpretations as distributional transforms and convex risk measures. We derive the closed-form optimal order quantity that generalizes the solution of the seminal Scarf model under only ambiguity aversion. This highlights the impact of misspecification aversion: the optimal order quantity under misspecification aversion can decrease as the price or variance increases, reversing the monotonicity of that under only ambiguity aversion. Hence, ambiguity and misspecification, as different layers of distributional uncertainty, can result in distinct operational consequences. We quantify the finite-sample performance guarantee, which consists of two parts: the in-sample optimal value and the out-of-sample effect of misspecification that can be decoupled into estimation error and distribution shift. This theoretically justifies the necessity of incorporating misspecification aversion in a non-stationary environment, which is also well demonstrated in our experiments with real-world retailing data.
△ Less
Submitted 11 May, 2024;
originally announced May 2024.
-
Quantum Steenrod operations and Fukaya categories
Authors:
Zihong Chen
Abstract:
This paper is concerned with quantum cohomology and Fukaya categories of a closed monotone symplectic manifold $X$, where we use coefficients in a field $\mathbf{k}$ of characteristic $p>0$. The first main result of this paper is that the quantum Steenrod operations $QΣ$ admit an interpretation in terms of the Fukaya category of $X$, via suitable versions of the open-closed maps. Using this, we sh…
▽ More
This paper is concerned with quantum cohomology and Fukaya categories of a closed monotone symplectic manifold $X$, where we use coefficients in a field $\mathbf{k}$ of characteristic $p>0$. The first main result of this paper is that the quantum Steenrod operations $QΣ$ admit an interpretation in terms of the Fukaya category of $X$, via suitable versions of the open-closed maps. Using this, we show that $QΣ$, whose definition is intrinsic to characteristic $p$, is compatible with certain structures inherited from the quantum connection in characteristic $0$. We then turn to applications of these results. The first application is an arithmetic proof of the unramified exponential type conjecture for $X$ that satisfies Abouzaid's generation criterion over $\overline{\mathbb{Q}}$, which uses a reduction mod $p$ argument. Next, we demonstrate how the categorical perspective provides new tools for computing $QΣ$ beyond the reach of known technology. We also explore potential connections of our work to arithmetic homological mirror symmetry.
△ Less
Submitted 8 May, 2024;
originally announced May 2024.
-
Planning of Truck Platooning for Road-Network Capacitated Vehicle Routing Problem
Authors:
Yilang Hao,
Zhibin Chen,
Xiaotong Sun,
Lu Tong
Abstract:
Truck platooning, a linking technology of trucks on the highway, has gained enormous attention in recent years due to its benefits in energy and operation cost savings. However, most existing studies on truck platooning limit their focus on scenarios in which each truck can serve only one customer demand and is thus with a specified origin-destination pair, so only routing and time schedules are c…
▽ More
Truck platooning, a linking technology of trucks on the highway, has gained enormous attention in recent years due to its benefits in energy and operation cost savings. However, most existing studies on truck platooning limit their focus on scenarios in which each truck can serve only one customer demand and is thus with a specified origin-destination pair, so only routing and time schedules are considered. Nevertheless, in real-world logistics, each truck may need to serve multiple customers located at different places, and the operator has to determine not only the routing and time schedules of each truck but also the set of customers allocated to each truck and their sequence to visit. This is well known as a capacitated vehicle routing problem with time windows (CVRPTW), and considering the application of truck platooning in such a problem entails new modeling frameworks and tailored solution algorithms. In light of this, this study makes the first attempt to optimize the truck platooning plan for a road-network CVRPTW to minimize the total operation cost, including vehicles' fixed dispatch cost and energy cost, while fulfilling all delivery demands within their time window constraints. Specifically, the operation plan will dictate the number of trucks to be dispatched, the set of customers, and the routing and time schedules for each truck. In addition, the modeling framework is constructed based on a road network instead of a traditional customer node graph to better resemble and facilitate the platooning operation. A 3-stage algorithm embedded with a "route-then-schedule" scheme, dynamic programming, and modified insertion heuristic, is developed to solve the proposed model in a timely manner. Numerical experiments are conducted to validate the modeling framework, demonstrate the performance of the proposed solution algorithm, and quantify the benefit of truck platooning.
△ Less
Submitted 20 April, 2024;
originally announced April 2024.
-
Matching the Statistical Query Lower Bound for k-sparse Parity Problems with Stochastic Gradient Descent
Authors:
Yiwen Kou,
Zixiang Chen,
Quanquan Gu,
Sham M. Kakade
Abstract:
The $k$-parity problem is a classical problem in computational complexity and algorithmic theory, serving as a key benchmark for understanding computational classes. In this paper, we solve the $k$-parity problem with stochastic gradient descent (SGD) on two-layer fully-connected neural networks. We demonstrate that SGD can efficiently solve the $k$-sparse parity problem on a $d$-dimensional hyper…
▽ More
The $k$-parity problem is a classical problem in computational complexity and algorithmic theory, serving as a key benchmark for understanding computational classes. In this paper, we solve the $k$-parity problem with stochastic gradient descent (SGD) on two-layer fully-connected neural networks. We demonstrate that SGD can efficiently solve the $k$-sparse parity problem on a $d$-dimensional hypercube ($k\le O(\sqrt{d})$) with a sample complexity of $\tilde{O}(d^{k-1})$ using $2^{Θ(k)}$ neurons, thus matching the established $Ω(d^{k})$ lower bounds of Statistical Query (SQ) models. Our theoretical analysis begins by constructing a good neural network capable of correctly solving the $k$-parity problem. We then demonstrate how a trained neural network with SGD can effectively approximate this good network, solving the $k$-parity problem with small statistical errors. Our theoretical results and findings are supported by empirical evidence, showcasing the efficiency and efficacy of our approach.
△ Less
Submitted 18 April, 2024;
originally announced April 2024.
-
TENG: Time-Evolving Natural Gradient for Solving PDEs With Deep Neural Nets Toward Machine Precision
Authors:
Zhuo Chen,
Jacob McCarran,
Esteban Vizcaino,
Marin Soljačić,
Di Luo
Abstract:
Partial differential equations (PDEs) are instrumental for modeling dynamical systems in science and engineering. The advent of neural networks has initiated a significant shift in tackling these complexities though challenges in accuracy persist, especially for initial value problems. In this paper, we introduce the $\textit{Time-Evolving Natural Gradient (TENG)}$, generalizing time-dependent var…
▽ More
Partial differential equations (PDEs) are instrumental for modeling dynamical systems in science and engineering. The advent of neural networks has initiated a significant shift in tackling these complexities though challenges in accuracy persist, especially for initial value problems. In this paper, we introduce the $\textit{Time-Evolving Natural Gradient (TENG)}$, generalizing time-dependent variational principles and optimization-based time integration, leveraging natural gradient optimization to obtain high accuracy in neural-network-based PDE solutions. Our comprehensive development includes algorithms like TENG-Euler and its high-order variants, such as TENG-Heun, tailored for enhanced precision and efficiency. TENG's effectiveness is further validated through its performance, surpassing current leading methods and achieving $\textit{machine precision}$ in step-by-step optimizations across a spectrum of PDEs, including the heat equation, Allen-Cahn equation, and Burgers' equation.
△ Less
Submitted 3 June, 2024; v1 submitted 16 April, 2024;
originally announced April 2024.
-
Nehari manifold optimization and its application for finding unstable solutions of semilinear elliptic PDEs
Authors:
Zhaoxing Chen,
Wei Liu,
Ziqing Xie,
Wenfan Yi
Abstract:
A Nehari manifold optimization method (NMOM) is introduced for finding 1-saddles, i.e., saddle points with the Morse index equal to one, of a generic nonlinear functional in Hilbert spaces. Actually, it is based on the variational characterization that 1-saddles of the generic functional are local minimizers of the same functional restricted on the associated Nehari manifold. The framework contain…
▽ More
A Nehari manifold optimization method (NMOM) is introduced for finding 1-saddles, i.e., saddle points with the Morse index equal to one, of a generic nonlinear functional in Hilbert spaces. Actually, it is based on the variational characterization that 1-saddles of the generic functional are local minimizers of the same functional restricted on the associated Nehari manifold. The framework contains two important ingredients: one is the retraction mapping to make the iteration points always lie on the Nehari manifold; the other is the tangential search direction to decrease the generic functional with suitable step-size search rules. Particularly, the global convergence is rigorously established by virtue of some crucial analysis techniques (including a weak convergence method) overcoming difficulties in the infinite-dimensional setting. In practice, combining with an easy-to-implement Nehari retraction and the negative Riemannian gradient direction, the NMOM is successfully applied to compute the unstable ground-state solutions of a class of typical semilinear elliptic PDEs such as Hénon equation and the stationary nonlinear Schrödinger equation. In particular, the symmetry-breaking phenomenon of the ground states of Hénon equation is explored numerically in 1D and 2D with interesting numerical findings on the critical value of symmetry-breaking reported.
△ Less
Submitted 15 April, 2024;
originally announced April 2024.
-
A decomposition theorem for the affine Springer fibers
Authors:
Zongbin Chen
Abstract:
According to Laumon, an affine Springer fiber is homeomorphic to the universal abelian covering of the compactified Jacobian of a spectral curve. We construct equivariant deformations $f_{n}:\overline{\mathcal{P}}_{n}\to \mathcal{B}_{n}$ of the finite abelian coverings of this compactified Jacobian, and decompose the complex $Rf_{n,*}\mathbf{Q}_{\ell}$ as direct sum of intersection complexes. Pass…
▽ More
According to Laumon, an affine Springer fiber is homeomorphic to the universal abelian covering of the compactified Jacobian of a spectral curve. We construct equivariant deformations $f_{n}:\overline{\mathcal{P}}_{n}\to \mathcal{B}_{n}$ of the finite abelian coverings of this compactified Jacobian, and decompose the complex $Rf_{n,*}\mathbf{Q}_{\ell}$ as direct sum of intersection complexes. Pass to the limit, we obtain a similar expression for the homology of the affine Springer fibers. A quite surprising consequence is that we can reduce the homology to its $Λ^{0}$-invariant subspace. As an application, we get a sheaf-theoretic reformulation of the purity hypothesis of Goresky, Kottwitz and MacPherson. In an attempt to solve it, we propose a conjecture about the punctural weight of the intermediate extension of a smooth $\ell$-adic sheaf of pure weight.
△ Less
Submitted 11 April, 2024;
originally announced April 2024.
-
On the dependence of the affine Springer fibers on their root valuation datum
Authors:
Zongbin Chen
Abstract:
For the group $\mathrm{GL}_{d}$, we confirm a conjecture of Goresky, Kottwitz and MacPherson, which states that the cohomology of the affine Springer fibers depend only on the root valuation datum of their defining elements. The proof relies on a microlocal analysis of the intermediate extensions appearing in Ngô's support theorem, and a Whitney regularity property of the union of the strict $δ$-s…
▽ More
For the group $\mathrm{GL}_{d}$, we confirm a conjecture of Goresky, Kottwitz and MacPherson, which states that the cohomology of the affine Springer fibers depend only on the root valuation datum of their defining elements. The proof relies on a microlocal analysis of the intermediate extensions appearing in Ngô's support theorem, and a Whitney regularity property of the union of the strict $δ$-strata over the equisingular strata for the miniversal deformation of plane curve singularities.
△ Less
Submitted 11 April, 2024;
originally announced April 2024.
-
Optimal State Equation for the Control of a Diffusion with Two Distinct Dynamics
Authors:
Zengjing Chen,
Panyu Wu,
Xiaowen Zhou
Abstract:
We consider a class of stochastic control problems which has been widely used in optimal foraging theory. The state processes have two distinct dynamics, characterized by two pairs of drift and diffusion coefficients, depending on whether it takes values bigger or smaller than a threshold value. Adopting a perturbation type approach, we find an expression for potential measure of the optimal state…
▽ More
We consider a class of stochastic control problems which has been widely used in optimal foraging theory. The state processes have two distinct dynamics, characterized by two pairs of drift and diffusion coefficients, depending on whether it takes values bigger or smaller than a threshold value. Adopting a perturbation type approach, we find an expression for potential measure of the optimal state process. We then obtain an expression for the transition density of the optimal state process by inverting the associated Laplace transform. Properties including the stationary distribution of the optimal state process are discussed. Finally, the expression of the value function is given for this class stochastic control problems.
△ Less
Submitted 11 April, 2024;
originally announced April 2024.
-
Fuel-optimal powered descent guidance for lunar pinpoint landing using neural networks
Authors:
Kun Wang,
Zheng Chen,
Jun Li
Abstract:
This paper presents a Neural Networks (NNs) based approach for designing the Fuel-Optimal Powered Descent Guidance (FOPDG) for lunar pinpoint landing. According to Pontryagin's Minimum Principle, the optimality conditions are first derived. To generate the dataset of optimal trajectories for training NNs, we formulate a parameterized system, which allows for generating each optimal trajectory by a…
▽ More
This paper presents a Neural Networks (NNs) based approach for designing the Fuel-Optimal Powered Descent Guidance (FOPDG) for lunar pinpoint landing. According to Pontryagin's Minimum Principle, the optimality conditions are first derived. To generate the dataset of optimal trajectories for training NNs, we formulate a parameterized system, which allows for generating each optimal trajectory by a simple propagation without using any optimization method. Then, a dataset containing the optimal state and optimal thrust vector pairs can be readily collected. Since it is challenging for NNs to approximate bang-bang (or discontinuous) type of optimal thrust magnitude, we introduce a regularisation function to the switching function so that the regularized switching function approximated by a simple NN can be used to represent the optimal thrust magnitude. Meanwhile, another two well-trained NNs are used to predict the thrust steering angle and time of flight given a flight state. Finally, numerical simulations show that the proposed method is capable of generating the FOPDG that steers the lunar lander to the desired landing site with acceptable landing errors.
△ Less
Submitted 10 April, 2024;
originally announced April 2024.
-
A Max-Min-Max Algorithm for Large-Scale Robust Optimization
Authors:
Kai Tu,
Zhi Chen,
Man-Chung Yue
Abstract:
Robust optimization (RO) is a powerful paradigm for decision making under uncertainty. Existing algorithms for solving RO, including the reformulation approach and the cutting-plane method, do not scale well, hindering the application of RO to large-scale decision problems. In this paper, we devise a first-order algorithm for solving RO based on a novel max-min-max perspective. Our algorithm opera…
▽ More
Robust optimization (RO) is a powerful paradigm for decision making under uncertainty. Existing algorithms for solving RO, including the reformulation approach and the cutting-plane method, do not scale well, hindering the application of RO to large-scale decision problems. In this paper, we devise a first-order algorithm for solving RO based on a novel max-min-max perspective. Our algorithm operates directly on the model functions and sets through the subgradient and projection oracles, which enables the exploitation of problem structures and is especially suitable for large-scale RO. Theoretically, we prove that the oracle complexity of our algorithm for attaining an $\varepsilon$-approximate optimal solution is $\mathcal{O}(\varepsilon^{-3})$ or $\mathcal{O}(\varepsilon^{-2})$, depending on the smoothness of the model functions. The algorithm and its theoretical results are then extended to RO with projection-unfriendly uncertainty sets. We also show via extensive numerical experiments that the proposed algorithm outperforms the reformulation approach, the cutting-plane method and two other recent first-order algorithms.
△ Less
Submitted 8 April, 2024;
originally announced April 2024.
-
Fuel-Optimal Trajectory Planning for Lunar Vertical Landing
Authors:
Kun Wang,
Zheng Chen,
Jun Li
Abstract:
In this paper, we consider a trajectory planning problem arising from a lunar vertical landing with minimum fuel consumption. The vertical landing requirement is written as a final steering angle constraint, and a nonnegative regularization term is proposed to modify the cost functional. In this way, the final steering angle constraint will be inherently satisfied according to Pontryagin's Minimum…
▽ More
In this paper, we consider a trajectory planning problem arising from a lunar vertical landing with minimum fuel consumption. The vertical landing requirement is written as a final steering angle constraint, and a nonnegative regularization term is proposed to modify the cost functional. In this way, the final steering angle constraint will be inherently satisfied according to Pontryagin's Minimum Principle. As a result, the modified optimal steering angle has to be determined by solving a transcendental equation. To this end, a transforming procedure is employed, which allows for finding the desired optimal steering angle by a simple bisection method. Consequently, the modified optimal control problem can be solved by the indirect shooting method. Finally, some numerical examples are presented to demonstrate and verify the developments of the paper.
△ Less
Submitted 5 April, 2024;
originally announced April 2024.
-
A simple lower bound for the complexity of estimating partition functions on a quantum computer
Authors:
Zherui Chen,
Giacomo Nannicini
Abstract:
We study the complexity of estimating the partition function $\mathsf{Z}(β)=\sum_{x\inχ} e^{-βH(x)}$ for a Gibbs distribution characterized by the Hamiltonian $H(x)$. We provide a simple and natural lower bound for quantum algorithms that solve this task by relying on reflections through the coherent encoding of Gibbs states. Our primary contribution is a $\varOmega(1/ε)$ lower bound for the numbe…
▽ More
We study the complexity of estimating the partition function $\mathsf{Z}(β)=\sum_{x\inχ} e^{-βH(x)}$ for a Gibbs distribution characterized by the Hamiltonian $H(x)$. We provide a simple and natural lower bound for quantum algorithms that solve this task by relying on reflections through the coherent encoding of Gibbs states. Our primary contribution is a $\varOmega(1/ε)$ lower bound for the number of reflections needed to estimate the partition function with a quantum algorithm. The proof is based on a reduction from the problem of estimating the Hamming weight of an unknown binary string.
△ Less
Submitted 8 April, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
Monodromy of generalized Lame equations with Darboux-Treibich-Verdier potentials: A universal law
Authors:
Zhijie Chen,
Chang-Shou Lin
Abstract:
The Darboux-Treibich-Verdier (DTV) potential $\sum_{k=0}^{3}n_{k}(n_{k}+1)\wp(z+\tfrac{ ω_{k}}{2};τ)$ is well-known as doubly-periodic solutions of the stationary KdV hierarchy (Treibich-Verdier, Duke Math. J. {\bf 68} (1992), 217-236). In this paper, we study the generalized Lamé equation with the DTV potential \begin{equation*} y^{\prime \prime }(z)=\bigg[ \sum_{k=0}^{3}n_{k}(n_{k}+1)\wp(z+\tfra…
▽ More
The Darboux-Treibich-Verdier (DTV) potential $\sum_{k=0}^{3}n_{k}(n_{k}+1)\wp(z+\tfrac{ ω_{k}}{2};τ)$ is well-known as doubly-periodic solutions of the stationary KdV hierarchy (Treibich-Verdier, Duke Math. J. {\bf 68} (1992), 217-236). In this paper, we study the generalized Lamé equation with the DTV potential \begin{equation*} y^{\prime \prime }(z)=\bigg[ \sum_{k=0}^{3}n_{k}(n_{k}+1)\wp(z+\tfrac{ ω_{k}}{2};τ)+B\bigg] y(z),\quad n_{k}\in \mathbb{N} \end{equation*} from the monodromy aspect. We prove that the map from $(τ, B)$ to the monodromy data $(r,s)$ satisfies a surprising universal law $dτ\wedge dB\equiv8π^2 dr\wedge ds.$ Our proof applies Panlevé VI equation and modular forms. We also give applications to the algebraic multiplicity of (anti)periodic eigenvalues for the associated Hill operator.
△ Less
Submitted 2 April, 2024;
originally announced April 2024.
-
On a stability of higher level Coxeter unipotent representations
Authors:
Zhe Chen
Abstract:
Let $\mathbb{G}$ be a connected reductive group over $\mathcal{O}$, a complete discrete valuation ring with finite residue field $\mathbb{F}_q$. Let $R_{T_r,U_r}^θ$ be a level $r$ Deligne--Lusztig representation of $\mathbb{G}(\mathcal{O})$. We show that, if $q$ is not small, the Coxeter unipotent $R_{T_r,U_r}^1$ degenerates to the $r=1$ case. For $\mathbb{G}=\mathrm{GL}_2$ (or $\mathrm{SL}_2$), a…
▽ More
Let $\mathbb{G}$ be a connected reductive group over $\mathcal{O}$, a complete discrete valuation ring with finite residue field $\mathbb{F}_q$. Let $R_{T_r,U_r}^θ$ be a level $r$ Deligne--Lusztig representation of $\mathbb{G}(\mathcal{O})$. We show that, if $q$ is not small, the Coxeter unipotent $R_{T_r,U_r}^1$ degenerates to the $r=1$ case. For $\mathbb{G}=\mathrm{GL}_2$ (or $\mathrm{SL}_2$), as an application we give the dimensions and decompositions of all Coxeter $R_{T_r,U_r}^θ$. For general $\mathbb{G}$ we state a conjectural sign formula for $R_{T_r,U_r}^θ$.
△ Less
Submitted 27 May, 2024; v1 submitted 29 March, 2024;
originally announced April 2024.
-
Non-Abelian observable-geometric phases and the Riemann zeros
Authors:
Zeqian Chen
Abstract:
The Hilbert-Pólya conjecture asserts that the imaginary parts of the nontrivial zeros of the Riemann zeta function (the Riemann zeros) are the eigenvalues of a self-adjoint operator (a quantum mechanical Hamiltonian, in the physical sense), as a promising approach to prove the Riemann hypothesis (cf.\cite{SH2011}). Instead of the eigenvalues, in this paper we consider observable-geometric phases a…
▽ More
The Hilbert-Pólya conjecture asserts that the imaginary parts of the nontrivial zeros of the Riemann zeta function (the Riemann zeros) are the eigenvalues of a self-adjoint operator (a quantum mechanical Hamiltonian, in the physical sense), as a promising approach to prove the Riemann hypothesis (cf.\cite{SH2011}). Instead of the eigenvalues, in this paper we consider observable-geometric phases as the realization of the Riemann zeros in a periodically driven quantum system, which were introduced in \cite{Chen2020} for the study of geometric quantum computation. To this end, we further introduce the notion of non-Abelian observable-geometric phases, involving which we give an approach to finding a physical system to study the Riemann zeros. Since the observable-geometric phases are connected with the geometry of the observable space according to the evolution of the Heisenberg equation, this sheds some light on the investigation of the Riemann hypothesis.
△ Less
Submitted 27 March, 2024;
originally announced March 2024.
-
On the flag structure and classification of the holomorphic curves on C*-algebras
Authors:
Zhimeng Chen,
Jing Xu
Abstract:
In this note, we will define the formulas of curvature and it's covariant derivatives for holomorphic curves on C*-algebras for the multivariable case. As applications, the unitarily and similarly classification theorems for holomorphic bundle and commuting operator tuples in Cowen-Douglas class are given.
In this note, we will define the formulas of curvature and it's covariant derivatives for holomorphic curves on C*-algebras for the multivariable case. As applications, the unitarily and similarly classification theorems for holomorphic bundle and commuting operator tuples in Cowen-Douglas class are given.
△ Less
Submitted 20 March, 2024;
originally announced March 2024.
-
Local well-posedness for dispersion generalized Benjamin-Ono equations in Fourier-Lebesgue spaces
Authors:
Zijun Chen
Abstract:
We prove that the Cauchy problem for the dispersion generalized Benjamin-Ono equation where $0<α\leq 1$ \begin{eqnarray*} \left\{ \begin{array}{l} \partial_t u+|\partial_x|^{1+α}\partial_x u+uu_x=0,\\ u(x,0)=u_0(x), \end{array} \right. \end{eqnarray*} is locally well-posed in the Fourier-Lebesgue space $\widehat{H}^{s}_{r}(\mathbb{R})$. This is proved via Picard iteration arguments using…
▽ More
We prove that the Cauchy problem for the dispersion generalized Benjamin-Ono equation where $0<α\leq 1$ \begin{eqnarray*} \left\{ \begin{array}{l} \partial_t u+|\partial_x|^{1+α}\partial_x u+uu_x=0,\\ u(x,0)=u_0(x), \end{array} \right. \end{eqnarray*} is locally well-posed in the Fourier-Lebesgue space $\widehat{H}^{s}_{r}(\mathbb{R})$. This is proved via Picard iteration arguments using $X^{s,b}$-type space adapted to the Fourier-Lebesgue space, inspired by the work of Grünrock and Vega. Note that, previously, Molinet, Saut and Tzvetkov \cite{MST2001} proved that the solution map is not $C^2$ in $H^s$ for any $s$ if $0\leq α<1$. However, in the Fourier-Lebesgue space, we have a stronger smoothing effect to handle the $high\times low$ interactions.
△ Less
Submitted 16 April, 2024; v1 submitted 18 March, 2024;
originally announced March 2024.
-
Fully discretized Sobolev gradient flow for the Gross-Pitaevskii eigenvalue problem
Authors:
Ziang Chen,
Jianfeng Lu,
Yulong Lu,
Xiangxiong Zhang
Abstract:
For the ground state of the Gross-Pitaevskii (GP) eigenvalue problem, we consider a fully discretized Sobolev gradient flow, which can be regarded as the Riemannian gradient descent on the sphere under a metric induced by a modified $H^1$-norm. We prove its global convergence to a critical point of the discrete GP energy and its local exponential convergence to the ground state of the discrete GP…
▽ More
For the ground state of the Gross-Pitaevskii (GP) eigenvalue problem, we consider a fully discretized Sobolev gradient flow, which can be regarded as the Riemannian gradient descent on the sphere under a metric induced by a modified $H^1$-norm. We prove its global convergence to a critical point of the discrete GP energy and its local exponential convergence to the ground state of the discrete GP energy. The local exponential convergence rate depends on the eigengap of the discrete GP energy. When the discretization is the classical second-order finite difference in two dimensions, such an eigengap can be further proven to be mesh independent, i.e., it has a uniform positive lower bound, thus the local exponential convergence rate is mesh independent. Numerical experiments with discretization by high order $Q^k$ spectral element methods in two and three dimensions are provided to validate the efficiency of the proposed method.
△ Less
Submitted 9 March, 2024;
originally announced March 2024.
-
An arbitrarily high order unfitted finite element method for elliptic interface problems with automatic mesh generation, Part II. Piecewise-smooth interfaces
Authors:
Zhiming Chen,
Yong Liu
Abstract:
We consider the reliable implementation of an adaptive high-order unfitted finite element method on Cartesian meshes for solving elliptic interface problems with geometrically curved singularities. We extend our previous work on the reliable cell merging algorithm for smooth interfaces to automatically generate the induced mesh for piecewise smooth interfaces. An $hp$ a posteriori error estimate i…
▽ More
We consider the reliable implementation of an adaptive high-order unfitted finite element method on Cartesian meshes for solving elliptic interface problems with geometrically curved singularities. We extend our previous work on the reliable cell merging algorithm for smooth interfaces to automatically generate the induced mesh for piecewise smooth interfaces. An $hp$ a posteriori error estimate is derived for a new unfitted finite element method whose finite element functions are conforming in each subdomain. Numerical examples illustrate the competitive performance of the method.
△ Less
Submitted 5 March, 2024;
originally announced March 2024.
-
On the dynamics of three-layer neural networks: initial condensation
Authors:
Zheng-An Chen,
Tao Luo
Abstract:
Empirical and theoretical works show that the input weights of two-layer neural networks, when initialized with small values, converge towards isolated orientations. This phenomenon, referred to as condensation, indicates that the gradient descent methods tend to spontaneously reduce the complexity of neural networks during the training process. In this work, we elucidate the mechanisms behind the…
▽ More
Empirical and theoretical works show that the input weights of two-layer neural networks, when initialized with small values, converge towards isolated orientations. This phenomenon, referred to as condensation, indicates that the gradient descent methods tend to spontaneously reduce the complexity of neural networks during the training process. In this work, we elucidate the mechanisms behind the condensation phenomena occurring in the training of three-layer neural networks and distinguish it from the training of two-layer neural networks. Through rigorous theoretical analysis, we establish the blow-up property of effective dynamics and present a sufficient condition for the occurrence of condensation, findings that are substantiated by experimental results. Additionally, we explore the association between condensation and the low-rank bias observed in deep matrix factorization.
△ Less
Submitted 27 February, 2024; v1 submitted 24 February, 2024;
originally announced February 2024.
-
Neural-Network-Based Optimal Guidance for Lunar Vertical Landing
Authors:
Kun Wang,
Zheng Chen,
Fangmin Lu,
Jun Li
Abstract:
This paper addresses an optimal guidance problem concerning the vertical landing of a lunar lander with the objective of minimizing fuel consumption. The vertical landing imposes a final attitude constraint, which is treated as a final control constraint. To handle this constraint, we propose a nonnegative small regularization term to augment the original cost functional. This ensures the satisfac…
▽ More
This paper addresses an optimal guidance problem concerning the vertical landing of a lunar lander with the objective of minimizing fuel consumption. The vertical landing imposes a final attitude constraint, which is treated as a final control constraint. To handle this constraint, we propose a nonnegative small regularization term to augment the original cost functional. This ensures the satisfaction of the final control constraint in accordance with Pontryagin's Minimum Principle. By leveraging the necessary conditions for optimality, we establish a parameterized system that facilitates the generation of numerous optimal trajectories, which contain the nonlinear mapping from the flight state to the optimal guidance command. Subsequently, a neural network is trained to approximate such mapping. Finally, numerical examples are presented to validate the proposed method.
△ Less
Submitted 20 February, 2024;
originally announced February 2024.
-
Diffeomorphism Neural Operator for various domains and parameters of partial differential equations
Authors:
Zhiwei Zhao,
Changqing Liu,
Yingguang Li,
Zhibin Chen,
Xu Liu
Abstract:
In scientific and engineering applications, solving partial differential equations (PDEs) across various parameters and domains normally relies on resource-intensive numerical methods. Neural operators based on deep learning offered a promising alternative to PDEs solving by directly learning physical laws from data. However, the current neural operator methods were limited to solve PDEs on fixed…
▽ More
In scientific and engineering applications, solving partial differential equations (PDEs) across various parameters and domains normally relies on resource-intensive numerical methods. Neural operators based on deep learning offered a promising alternative to PDEs solving by directly learning physical laws from data. However, the current neural operator methods were limited to solve PDEs on fixed domains. Expanding neural operators to solve PDEs on various domains hold significant promise in medical imaging, engineering design and manufacturing applications, where geometric and parameter changes are essential. This paper presents a novel neural operator learning framework for solving PDEs with various domains and parameters defined for physical systems, named diffeomorphism neural operator (DNO). The main idea is that a neural operator learns in a generic domain which is diffeomorphically mapped from various physics domains expressed by the same PDE. In this way, the challenge of operator learning on various domains is transformed into operator learning on the generic domain. The generalization performance of DNO on different domains can be assessed by a proposed method which evaluates the geometric similarity between a new domain and the domains of training dataset after diffeomorphism. Experiments on Darcy flow, pipe flow, airfoil flow and mechanics were carried out, where harmonic and volume parameterization were used as the diffeomorphism for 2D and 3D domains. The DNO framework demonstrated robust learning capabilities and strong generalization performance across various domains and parameters.
△ Less
Submitted 20 June, 2024; v1 submitted 19 February, 2024;
originally announced February 2024.
-
Convergence rate and exponential stability of backward Euler method for neutral stochastic delay differential equations under generalized monotonicity conditions
Authors:
Jingjing Cai,
Ziheng Chen,
Yuanling Niu
Abstract:
This work focuses on the numerical approximations of neutral stochastic delay differential equations with their drift and diffusion coefficients growing super-linearly with respect to both delay variables and state variables. Under generalized monotonicity conditions, we prove that the backward Euler method not only converges strongly in the mean square sense with order $1/2$, but also inherit the…
▽ More
This work focuses on the numerical approximations of neutral stochastic delay differential equations with their drift and diffusion coefficients growing super-linearly with respect to both delay variables and state variables. Under generalized monotonicity conditions, we prove that the backward Euler method not only converges strongly in the mean square sense with order $1/2$, but also inherit the mean square exponential stability of the original equations. As a byproduct, we obtain the same results on convergence rate and exponential stability of the backward Euler method for stochastic delay differential equations with generalized monotonicity conditions. These theoretical results are finally supported by several numerical experiments.
△ Less
Submitted 14 February, 2024;
originally announced February 2024.
-
Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input
Authors:
Ziang Chen,
Rong Ge
Abstract:
In this work, we study the mean-field flow for learning subspace-sparse polynomials using stochastic gradient descent and two-layer neural networks, where the input distribution is standard Gaussian and the output only depends on the projection of the input onto a low-dimensional subspace. We propose a basis-free generalization of the merged-staircase property in Abbe et al. (2022) and establish a…
▽ More
In this work, we study the mean-field flow for learning subspace-sparse polynomials using stochastic gradient descent and two-layer neural networks, where the input distribution is standard Gaussian and the output only depends on the projection of the input onto a low-dimensional subspace. We propose a basis-free generalization of the merged-staircase property in Abbe et al. (2022) and establish a necessary condition for the SGD-learnability. In addition, we prove that the condition is almost sufficient, in the sense that a condition slightly stronger than the necessary condition can guarantee the exponential decay of the loss functional to zero.
△ Less
Submitted 8 June, 2024; v1 submitted 14 February, 2024;
originally announced February 2024.
-
Rethinking the Capacity of Graph Neural Networks for Branching Strategy
Authors:
Ziang Chen,
Jialin Liu,
Xiaohan Chen,
Xinshang Wang,
Wotao Yin
Abstract:
Graph neural networks (GNNs) have been widely used to predict properties and heuristics of mixed-integer linear programs (MILPs) and hence accelerate MILP solvers. This paper investigates the capacity of GNNs to represent strong branching (SB), the most effective yet computationally expensive heuristic employed in the branch-and-bound algorithm. In the literature, message-passing GNN (MP-GNN), as…
▽ More
Graph neural networks (GNNs) have been widely used to predict properties and heuristics of mixed-integer linear programs (MILPs) and hence accelerate MILP solvers. This paper investigates the capacity of GNNs to represent strong branching (SB), the most effective yet computationally expensive heuristic employed in the branch-and-bound algorithm. In the literature, message-passing GNN (MP-GNN), as the simplest GNN structure, is frequently used as a fast approximation of SB and we find that not all MILPs's SB can be represented with MP-GNN. We precisely define a class of "MP-tractable" MILPs for which MP-GNNs can accurately approximate SB scores. Particularly, we establish a universal approximation theorem: for any data distribution over the MP-tractable class, there always exists an MP-GNN that can approximate the SB score with arbitrarily high accuracy and arbitrarily high probability, which lays a theoretical foundation of the existing works on imitating SB with MP-GNN. For MILPs without the MP-tractability, unfortunately, a similar result is impossible, which can be illustrated by two MILP instances with different SB scores that cannot be distinguished by any MP-GNN, regardless of the number of parameters. Recognizing this, we explore another GNN structure called the second-order folklore GNN (2-FGNN) that overcomes this limitation, and the aforementioned universal approximation theorem can be extended to the entire MILP space using 2-FGNN, regardless of the MP-tractability. A small-scale numerical experiment is conducted to directly validate our theoretical findings.
△ Less
Submitted 8 June, 2024; v1 submitted 10 February, 2024;
originally announced February 2024.
-
Efficient Algorithms for Sum-of-Minimum Optimization
Authors:
Lisang Ding,
Ziang Chen,
Xinshang Wang,
Wotao Yin
Abstract:
In this work, we propose a novel optimization model termed "sum-of-minimum" optimization. This model seeks to minimize the sum or average of $N$ objective functions over $k$ parameters, where each objective takes the minimum value of a predefined sub-function with respect to the $k$ parameters. This universal framework encompasses numerous clustering applications in machine learning and related fi…
▽ More
In this work, we propose a novel optimization model termed "sum-of-minimum" optimization. This model seeks to minimize the sum or average of $N$ objective functions over $k$ parameters, where each objective takes the minimum value of a predefined sub-function with respect to the $k$ parameters. This universal framework encompasses numerous clustering applications in machine learning and related fields. We develop efficient algorithms for solving sum-of-minimum optimization problems, inspired by a randomized initialization algorithm for the classic $k$-means (Arthur & Vassilvitskii, 2007) and Lloyd's algorithm (Lloyd, 1982). We establish a new tight bound for the generalized initialization algorithm and prove a gradient-descent-like convergence rate for generalized Lloyd's algorithm. The efficiency of our algorithms is numerically examined on multiple tasks, including generalized principal component analysis, mixed linear regression, and small-scale neural network training. Our approach compares favorably to previous ones based on simpler-but-less-precise optimization reformulations.
△ Less
Submitted 9 June, 2024; v1 submitted 10 February, 2024;
originally announced February 2024.
-
On operadic open-closed maps in characteristic $p$
Authors:
Zihong Chen
Abstract:
Consider a closed monotone symplectic manifold $(M,ω)$. \cite{Gan2} constructed a cyclic open-closed map, which goes from the cyclic homology of the Fukaya category of $M$ to the $S^1$-equivariant quantum cohomology of $M$. In this paper, we show that with mod $p$ coefficients, Ganatra's cyclic open-closed map is compatible with a certain $\mathbb{Z}/p$-equivariant open-closed map under the natura…
▽ More
Consider a closed monotone symplectic manifold $(M,ω)$. \cite{Gan2} constructed a cyclic open-closed map, which goes from the cyclic homology of the Fukaya category of $M$ to the $S^1$-equivariant quantum cohomology of $M$. In this paper, we show that with mod $p$ coefficients, Ganatra's cyclic open-closed map is compatible with a certain $\mathbb{Z}/p$-equivariant open-closed map under the natural $\mathbb{Z}/p$-Gysin type comparison map for Hochschild homology. Along with the proof, this paper gives a new homotopy theoretic framework for studying open-closed maps in symplectic topology. These will be used in an upcoming work \cite{Che} to study mod $p$ equivariant enumerative invariants such as the Quantum Steenrod operations. The main insights of this paper are: 1) a $\mathbb{Z}/p$-Gysin comparison result for ($\mathcal{A}_{\infty}$-) cyclic objects, 2) a new construction of the open-closed map using operadic Floer theory of \cite{AGV}, which gives rise to a new interpretation of its `$S^1$-equivariant' property, and 3) comparison of the new construction with its classical counterpart.
△ Less
Submitted 14 May, 2024; v1 submitted 8 February, 2024;
originally announced February 2024.
-
Whether $p$-conductive homogeneity holds depends on $p$
Authors:
Shiping Cao,
Zhen-Qing Chen
Abstract:
We introduce two fractals, in Euclidean spaces of dimension two and three respectively, such the $2$-conductive homogeneity holds but there is some $\eps \in (0, 1)$ so that the $p$-conductive homogeneity fails for every $p\in (1, 1+\eps)$. In addition, these two fractals have Ahlfors regular conformal dimension within the interval $(1, 2)$ and $(2, 3)$, respectively.
We introduce two fractals, in Euclidean spaces of dimension two and three respectively, such the $2$-conductive homogeneity holds but there is some $\eps \in (0, 1)$ so that the $p$-conductive homogeneity fails for every $p\in (1, 1+\eps)$. In addition, these two fractals have Ahlfors regular conformal dimension within the interval $(1, 2)$ and $(2, 3)$, respectively.
△ Less
Submitted 2 February, 2024;
originally announced February 2024.
-
Convergence of resistances on generalized {S}ierpiński carpets
Authors:
Shiping Cao,
Zhen-Qing Chen
Abstract:
We positively answer the open question of Barlow and Bass about the convergence of renormalized effective resistance between opposite faces of Euclidean domains approximating a generalized {S}ierpiński carpet.
We positively answer the open question of Barlow and Bass about the convergence of renormalized effective resistance between opposite faces of Euclidean domains approximating a generalized {S}ierpiński carpet.
△ Less
Submitted 2 February, 2024;
originally announced February 2024.
-
Nonconvex weighted variational metal artifacts removal via convergent primal-dual algorithms
Authors:
Lianfang Wang,
Zhangling Chen,
Zhifang Liu,
Yutong Li,
Yunsong Zhao,
Hongwei Li,
Huibin Chang
Abstract:
Direct reconstruction through filtered back projection engenders metal artifacts in polychromatic computed tomography images, attributed to highly attenuating implants, which further poses great challenges for subsequent image analysis. Inpainting the metal trace directly in the Radon domain for the extant variational method leads to strong edge diffusion and potential inherent artifacts. With nor…
▽ More
Direct reconstruction through filtered back projection engenders metal artifacts in polychromatic computed tomography images, attributed to highly attenuating implants, which further poses great challenges for subsequent image analysis. Inpainting the metal trace directly in the Radon domain for the extant variational method leads to strong edge diffusion and potential inherent artifacts. With normalization based on pre-segmentation, the inpainted outcome can be notably ameliorated. However, its reconstructive fidelity is heavily contingent on the precision of the presegmentation, and highly accurate segmentation of images with metal artifacts is non-trivial in actuality. In this paper, we propose a nonconvex weighted variational approach for metal artifact reduction. Specifically, in lieu of employing a binary function with zeros in the metal trace, an adaptive weight function is designed in the Radon domain, with zeros in the overlapping regions of multiple disjoint metals as well as areas of highly attenuated projections, and the inverse square root of the measured projection in other regions. A nonconvex L1-alpha L2 regularization term is incorporated to further enhance edge contrast, alongside a box-constraint in the image domain. Efficient first-order primal-dual algorithms, proven to be globally convergent and of low computational cost owing to the closed-form solution of all subproblems, are devised to resolve such a constrained nonconvex model. Both simulated and real experiments are conducted with comparisons to other variational algorithms, validating the superiority of the presented method. Especially in comparison to Reweighted JSR, our proposed algorithm can curtail the total computational cost to at most one-third, and for the case of inaccurate pre-segmentation, the recovery outcomes by the proposed algorithms are notably enhanced.
△ Less
Submitted 2 February, 2024;
originally announced February 2024.
-
A Physics-Informed Indirect Method for Trajectory Optimization
Authors:
Kun Wang,
Fangmin Lu,
Zheng Chen,
Jun Li
Abstract:
This work presents a Physics-Informed Indirect Method (PIIM) that propagates the dynamics of both states and co-states backward in time for trajectory optimization problems. In the case of a Time-Optimal Soft Landing Problem (TOSLP), based on the initial co-state vector normalization technique, we show that the initial guess of the mass co-state and the numerical factor can be eliminated from the…
▽ More
This work presents a Physics-Informed Indirect Method (PIIM) that propagates the dynamics of both states and co-states backward in time for trajectory optimization problems. In the case of a Time-Optimal Soft Landing Problem (TOSLP), based on the initial co-state vector normalization technique, we show that the initial guess of the mass co-state and the numerical factor can be eliminated from the shooting procedure. As a result, the initial guess of the unknown co-states can be constrained to lie on a unit 3-D hypersphere. Then, using the PIIM allows one to exploit the physical significance of the optimal control law, which further narrows down the solution space to a unit 3-D octant sphere. Meanwhile, the analytical estimations of the fuel consumption and final time are provided. Additionally, a usually overlooked issue that results in an infeasible solution with a negative final time, is fixed by a simple remedy strategy. Consequently, the reduced solution space becomes sufficiently small to ensure fast, robust, and guaranteed convergence for the TOSLP. Then, we extend the PIIM to solve the Fuel-Optimal Soft Landing Problem (FOSLP) with a homotopy approach. The numerical simulations show that compared with the conventional indirect method with a success rate of 89.35%, it takes a shorter time for the proposed method to find the feasible solution to the FOSLP with a success rate of 100%.
△ Less
Submitted 1 February, 2024;
originally announced February 2024.
-
Stochastic theta methods for random periodic solution of stochastic differential equations under non-globally Lipschitz conditions
Authors:
Ziheng Chen,
Liangmin Cao,
Lin Chen
Abstract:
This work focuses on the numerical approximations of random periodic solutions of stochastic differential equations (SDEs). Under non-globally Lipschitz conditions, we prove the existence and uniqueness of random periodic solutions for the considered equations and its numerical approximations generated by the stochastic theta (ST) methods with theta within (1/2,1]. It is shown that the random peri…
▽ More
This work focuses on the numerical approximations of random periodic solutions of stochastic differential equations (SDEs). Under non-globally Lipschitz conditions, we prove the existence and uniqueness of random periodic solutions for the considered equations and its numerical approximations generated by the stochastic theta (ST) methods with theta within (1/2,1]. It is shown that the random periodic solution of each ST method converges strongly in the mean square sense to that of SDEs for all step size. More precisely, the mean square convergence order is 1/2 for SDEs with multiplicative noise and 1 for SDEs with additive noise. Numerical results are finally reported to confirm these theoretical findings.
△ Less
Submitted 20 June, 2024; v1 submitted 18 January, 2024;
originally announced January 2024.
-
Mutation invariants of cluster algebras of rank 2
Authors:
Zhichao Chen,
Zixu Li
Abstract:
We consider the mutation invariants of cluster algebras of rank 2. We characterize the mutation invariants of finite type. Two examples are provided for the affine type and we prove the non-existence of Laurent mutation invariants of non-affine type. As an application, a class of Diophantine equations encoded with cluster algebras are studied.
We consider the mutation invariants of cluster algebras of rank 2. We characterize the mutation invariants of finite type. Two examples are provided for the affine type and we prove the non-existence of Laurent mutation invariants of non-affine type. As an application, a class of Diophantine equations encoded with cluster algebras are studied.
△ Less
Submitted 4 May, 2024; v1 submitted 17 January, 2024;
originally announced January 2024.
-
Fixed-Budget Differentially Private Best Arm Identification
Authors:
Zhirui Chen,
P. N. Karthik,
Yeow Meng Chee,
Vincent Y. F. Tan
Abstract:
We study best arm identification (BAI) in linear bandits in the fixed-budget regime under differential privacy constraints, when the arm rewards are supported on the unit interval. Given a finite budget $T$ and a privacy parameter $\varepsilon>0$, the goal is to minimise the error probability in finding the arm with the largest mean after $T$ sampling rounds, subject to the constraint that the pol…
▽ More
We study best arm identification (BAI) in linear bandits in the fixed-budget regime under differential privacy constraints, when the arm rewards are supported on the unit interval. Given a finite budget $T$ and a privacy parameter $\varepsilon>0$, the goal is to minimise the error probability in finding the arm with the largest mean after $T$ sampling rounds, subject to the constraint that the policy of the decision maker satisfies a certain {\em $\varepsilon$-differential privacy} ($\varepsilon$-DP) constraint. We construct a policy satisfying the $\varepsilon$-DP constraint (called {\sc DP-BAI}) by proposing the principle of {\em maximum absolute determinants}, and derive an upper bound on its error probability. Furthermore, we derive a minimax lower bound on the error probability, and demonstrate that the lower and the upper bounds decay exponentially in $T$, with exponents in the two bounds matching order-wise in (a) the sub-optimality gaps of the arms, (b) $\varepsilon$, and (c) the problem complexity that is expressible as the sum of two terms, one characterising the complexity of standard fixed-budget BAI (without privacy constraints), and the other accounting for the $\varepsilon$-DP constraint. Additionally, we present some auxiliary results that contribute to the derivation of the lower bound on the error probability. These results, we posit, may be of independent interest and could prove instrumental in proving lower bounds on error probabilities in several other bandit problems. Whereas prior works provide results for BAI in the fixed-budget regime without privacy constraints or in the fixed-confidence regime with privacy constraints, our work fills the gap in the literature by providing the results for BAI in the fixed-budget regime under the $\varepsilon$-DP constraint.
△ Less
Submitted 17 January, 2024;
originally announced January 2024.
-
Boosting Gradient Ascent for Continuous DR-submodular Maximization
Authors:
Qixin Zhang,
Zongqi Wan,
Zengde Deng,
Zaiyi Chen,
Xiaoming Sun,
Jialin Zhang,
Yu Yang
Abstract:
Projected Gradient Ascent (PGA) is the most commonly used optimization scheme in machine learning and operations research areas. Nevertheless, numerous studies and examples have shown that the PGA methods may fail to achieve the tight approximation ratio for continuous DR-submodular maximization problems. To address this challenge, we present a boosting technique in this paper, which can efficient…
▽ More
Projected Gradient Ascent (PGA) is the most commonly used optimization scheme in machine learning and operations research areas. Nevertheless, numerous studies and examples have shown that the PGA methods may fail to achieve the tight approximation ratio for continuous DR-submodular maximization problems. To address this challenge, we present a boosting technique in this paper, which can efficiently improve the approximation guarantee of the standard PGA to \emph{optimal} with only small modifications on the objective function. The fundamental idea of our boosting technique is to exploit non-oblivious search to derive a novel auxiliary function $F$, whose stationary points are excellent approximations to the global maximum of the original DR-submodular objective $f$. Specifically, when $f$ is monotone and $γ$-weakly DR-submodular, we propose an auxiliary function $F$ whose stationary points can provide a better $(1-e^{-γ})$-approximation than the $(γ^2/(1+γ^2))$-approximation guaranteed by the stationary points of $f$ itself. Similarly, for the non-monotone case, we devise another auxiliary function $F$ whose stationary points can achieve an optimal $\frac{1-\min_{\boldsymbol{x}\in\mathcal{C}}\|\boldsymbol{x}\|_{\infty}}{4}$-approximation guarantee where $\mathcal{C}$ is a convex constraint set. In contrast, the stationary points of the original non-monotone DR-submodular function can be arbitrarily bad~\citep{chen2023continuous}. Furthermore, we demonstrate the scalability of our boosting technique on four problems. In all of these four problems, our resulting variants of boosting PGA algorithm beat the previous standard PGA in several aspects such as approximation ratio and efficiency. Finally, we corroborate our theoretical findings with numerical experiments, which demonstrate the effectiveness of our boosting PGA methods.
△ Less
Submitted 16 January, 2024;
originally announced January 2024.
-
Nonlinear Optimal Guidance for Cooperatively Imposing Relative Intercept Angles
Authors:
Han Wang,
Zheng Chen
Abstract:
The optimal cooperative guidance in the nonlinear setting for intercepting a target by multiple pursuers is studied in the paper. As certain relative angles can improve observability, the guidance command is required to cooperatively control the pursuers to intercept the target with specific relative angles. By using the neural networks, an approach for real-time generation of the nonlinear cooper…
▽ More
The optimal cooperative guidance in the nonlinear setting for intercepting a target by multiple pursuers is studied in the paper. As certain relative angles can improve observability, the guidance command is required to cooperatively control the pursuers to intercept the target with specific relative angles. By using the neural networks, an approach for real-time generation of the nonlinear cooperative optimal guidance command is developed. Specifically, the optimal control problem with constraints on relative intercepting angles is formulated. Then, Pontryagin's maximum principle is used to derive the necessary conditions for optimality, which are further employed to parameterize the nonlinear optimal guidance law. As a result, the dataset for the mapping from state to nonlinear optimal guidance command can be generated by a simple propagation. A simple feedforward neural network is trained by the dataset to generate the nonlinear optimal guidance command. Finally, numerical examples are presented, showing that a nonlinear optimal guidance command with specific relative angles can be generated within a faction of a millisecond.
△ Less
Submitted 14 January, 2024;
originally announced January 2024.
-
A dynamical neural network approach for distributionally robust chance constrained Markov decision process
Authors:
Tian Xia,
Jia Liu,
Zhiping Chen
Abstract:
In this paper, we study the distributionally robust joint chance constrained Markov decision process. {Utilizing the logarithmic transformation technique,} we derive its deterministic reformulation with bi-convex terms under the moment-based uncertainty set. To cope with the non-convexity and improve the robustness of the solution, we propose a dynamical neural network approach to solve the reform…
▽ More
In this paper, we study the distributionally robust joint chance constrained Markov decision process. {Utilizing the logarithmic transformation technique,} we derive its deterministic reformulation with bi-convex terms under the moment-based uncertainty set. To cope with the non-convexity and improve the robustness of the solution, we propose a dynamical neural network approach to solve the reformulated optimization problem. Numerical results on a machine replacement problem demonstrate the efficiency of the proposed dynamical neural network approach when compared with the sequential convex approximation approach.
△ Less
Submitted 2 January, 2024; v1 submitted 23 December, 2023;
originally announced December 2023.
-
Time Lower Bounds for the Metropolis Process and Simulated Annealing
Authors:
Zongchen Chen,
Dan Mikulincer,
Daniel Reichman,
Alexander S. Wein
Abstract:
The Metropolis process (MP) and Simulated Annealing (SA) are stochastic local search heuristics that are often used in solving combinatorial optimization problems. Despite significant interest, there are very few theoretical results regarding the quality of approximation obtained by MP and SA (with polynomially many iterations) for NP-hard optimization problems.
We provide rigorous lower bounds…
▽ More
The Metropolis process (MP) and Simulated Annealing (SA) are stochastic local search heuristics that are often used in solving combinatorial optimization problems. Despite significant interest, there are very few theoretical results regarding the quality of approximation obtained by MP and SA (with polynomially many iterations) for NP-hard optimization problems.
We provide rigorous lower bounds for MP and SA with respect to the classical maximum independent set problem when the algorithms are initialized from the empty set. We establish the existence of a family of graphs for which both MP and SA fail to find approximate solutions in polynomial time. More specifically, we show that for any $\varepsilon \in (0,1)$ there are $n$-vertex graphs for which the probability SA (when limited to polynomially many iterations) will approximate the optimal solution within ratio $Ω\left(\frac{1}{n^{1-\varepsilon}}\right)$ is exponentially small. Our lower bounds extend to graphs of constant average degree $d$, illustrating the failure of MP to achieve an approximation ratio of $Ω\left(\frac{\log (d)}{d}\right)$ in polynomial time. In some cases, our impossibility results also go beyond Simulated Annealing and apply even when the temperature is chosen adaptively. Finally, we prove time lower bounds when the inputs to these algorithms are bipartite graphs, and even trees, which are known to admit polynomial-time algorithms for the independent set problem.
△ Less
Submitted 20 December, 2023;
originally announced December 2023.
-
On a conjecture of transposed Poisson $n$-Lie algebras
Authors:
Junyuan Huang,
Xueqing Chen,
Zhiqi Chen,
Ming Ding
Abstract:
In this paper, we obtain a rich family of identities for transposed Poisson $n$-Lie algebras, and then prove the conjecture of Bai, Bai, Guo and Wu in \cite{BBGW} under certain strong condition.
In this paper, we obtain a rich family of identities for transposed Poisson $n$-Lie algebras, and then prove the conjecture of Bai, Bai, Guo and Wu in \cite{BBGW} under certain strong condition.
△ Less
Submitted 6 December, 2023;
originally announced December 2023.
-
Boundary Harnack principle for non-local operators on metric measure spaces
Authors:
Zhen-Qing Chen,
Jie-Ming Wang
Abstract:
In this paper, a necessary and sufficient condition is obtained for the scale invariant boundary Harnack inequality (BHP in abbreviation) for a large class of Hunt processes on metric measure spaces that are in weak duality with another Hunt process. We next consider a discontinuous subordinate Brownian motion with Gaussian component $X_t=W_{S_t}$ in ${\bf R}^d$ for which the Lévy density of the s…
▽ More
In this paper, a necessary and sufficient condition is obtained for the scale invariant boundary Harnack inequality (BHP in abbreviation) for a large class of Hunt processes on metric measure spaces that are in weak duality with another Hunt process. We next consider a discontinuous subordinate Brownian motion with Gaussian component $X_t=W_{S_t}$ in ${\bf R}^d$ for which the Lévy density of the subordinator $S$ satisfies some mild comparability condition. We show that the scale invariant BHP holds for the subordinate Brownian motion $X$ in any Lipschitz domain satisfying the interior cone condition with common angle $θ\in (\cos^{-1}(1/\sqrt d), π)$, but fails in any truncated circular cone with angle $θ\leq \cos^{-1}(1/\sqrt d)$, a Lipschitz domain whose Lipschitz constant is larger than or equal to $1/\sqrt{d-1}.$
△ Less
Submitted 4 December, 2023;
originally announced December 2023.