subscribe to arXiv mailings

arXiv:2406.19550 [pdf, other]

Provably Efficient Posterior Sampling for Sparse Linear Regression via Measure Decomposition

Abstract: We consider the problem of sampling from the posterior distribution of a $d$-dimensional coefficient vector $\boldsymbolθ$, given linear observations $\boldsymbol{y} = \boldsymbol{X}\boldsymbolθ+\boldsymbol{\varepsilon}$. In general, such posteriors are multimodal, and therefore challenging to sample from. This observation has prompted the exploration of various heuristics that aim at approximatin… ▽ More We consider the problem of sampling from the posterior distribution of a $d$-dimensional coefficient vector $\boldsymbolθ$, given linear observations $\boldsymbol{y} = \boldsymbol{X}\boldsymbolθ+\boldsymbol{\varepsilon}$. In general, such posteriors are multimodal, and therefore challenging to sample from. This observation has prompted the exploration of various heuristics that aim at approximating the posterior distribution. In this paper, we study a different approach based on decomposing the posterior distribution into a log-concave mixture of simple product measures. This decomposition allows us to reduce sampling from a multimodal distribution of interest to sampling from a log-concave one, which is tractable and has been investigated in detail. We prove that, under mild conditions on the prior, for random designs, such measure decomposition is generally feasible when the number of samples per parameter $n/d$ exceeds a constant threshold. We thus obtain a provably efficient (polynomial time) sampling algorithm in a regime where this was previously not known. Numerical simulations confirm that the algorithm is practical, and reveal that it has attractive statistical properties compared to state-of-the-art methods. △ Less

Submitted 27 June, 2024; originally announced June 2024.

Comments: 29 pages, 10 figures

arXiv:2406.16089 [pdf, ps, other]

A projected Euler Method for Random Periodic Solutions of Semi-linear SDEs with non-globally Lipschitz coefficients

Authors: Yujia Guo, Xiaojie Wang, Yue Wu

Abstract: The present work introduces and investigates an explicit time discretization scheme, called the projected Euler method, to numerically approximate random periodic solutions of semi-linear SDEs under non-globally Lipschitz conditions. The existence of the random periodic solution is demonstrated as the limit of the pull-back of the discretized SDE. Without relying on a priori high-order moment boun… ▽ More The present work introduces and investigates an explicit time discretization scheme, called the projected Euler method, to numerically approximate random periodic solutions of semi-linear SDEs under non-globally Lipschitz conditions. The existence of the random periodic solution is demonstrated as the limit of the pull-back of the discretized SDE. Without relying on a priori high-order moment bounds of the numerical approximations, the mean square convergence rate is proved to be order 0.5 for SDEs with multiplicative noise and order 1 for SDEs with additive noise. Numerical examples are also provided to validate our theoretical findings. △ Less

Submitted 27 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

Comments: 25 pages,5 figures

MSC Class: 37H99; 60H10; 60H35; 65C30

arXiv:2406.15822 [pdf, ps, other]

On the Weisfeiler-Leman dimension of circulant graphs

Authors: Yulai Wu, Ilia Ponomarenko

Abstract: A circulant graph is a Cayley graph of a finite cyclic group. The Weisfeiler-Leman-dimension of a circulant graph $X$ with respect to the class of all circulant graphs is the smallest positive integer~$m$ such that the $m$-dimensional Weisfeiler-Leman algorithm correctly tests the isomorphism between $X$ and any other circulant graph. It is proved that for a circulant graph of order $n$ this dimen… ▽ More A circulant graph is a Cayley graph of a finite cyclic group. The Weisfeiler-Leman-dimension of a circulant graph $X$ with respect to the class of all circulant graphs is the smallest positive integer~$m$ such that the $m$-dimensional Weisfeiler-Leman algorithm correctly tests the isomorphism between $X$ and any other circulant graph. It is proved that for a circulant graph of order $n$ this dimension is less than or equal to $Ω(n)+3$, where $Ω(n)$ is the number of prime divisors of~$n$. △ Less

Submitted 22 June, 2024; originally announced June 2024.

Comments: 21 pages

MSC Class: 05E30 05C60 05E18 ACM Class: F.2.2

arXiv:2406.13684 [pdf, ps, other]

Some non-Archimedean pluripotential theory on polarized affine cones

Authors: Yueqiao Wu

Abstract: We undertake a preliminary step towards studying non-Archimedean pluripotential theory on polarized affine cones over a trivially valued field. We study plurisubharmonic functions and the Monge--Ampère operator defined on the finite energy class, partially generalizing a result of Boucksom--Jonsson on projective varieties. We undertake a preliminary step towards studying non-Archimedean pluripotential theory on polarized affine cones over a trivially valued field. We study plurisubharmonic functions and the Monge--Ampère operator defined on the finite energy class, partially generalizing a result of Boucksom--Jonsson on projective varieties. △ Less

Submitted 19 June, 2024; originally announced June 2024.

Comments: 26 pages

arXiv:2406.13428 [pdf, ps, other]

Petty projection inequality on the sphere and on the hyperbolic space

Authors: Y. Lin, Y. Wu

Abstract: Using gnomonic projection and Poincaré model, we first define the spherical projection body and hyperbolic projection body in spherical space $\mathbb{S}^n$ and hyperbolic space $\mathbb{H}^n$, then define the spherical Steiner symmetrization and hyperbolic Steiner symmetrization, finally prove the spherical projection inequality and hyperbolic projection inequality. Using gnomonic projection and Poincaré model, we first define the spherical projection body and hyperbolic projection body in spherical space $\mathbb{S}^n$ and hyperbolic space $\mathbb{H}^n$, then define the spherical Steiner symmetrization and hyperbolic Steiner symmetrization, finally prove the spherical projection inequality and hyperbolic projection inequality. △ Less

Submitted 19 June, 2024; originally announced June 2024.

MSC Class: 52A55; 28A75; 52A20; 53A35

arXiv:2406.03238 [pdf, ps, other]

The parity of Lusztig's restriction functor and Green's formula for a quiver with automorphism

Authors: Jiepeng Fang, Yixin Lan, Yumeng Wu

Abstract: In [8], Fang-Lan-Xiao proved a formula about Lusztig's induction and restriction functors which can induce Green's formula for the path algebra of a quiver over a finite field via the trace map. In this paper, we generalize their formula to that for the mixed semisimple perverse sheaves for a quiver with an automorphism. By applying the trace map, we obtain Green's formula for any finite-dimension… ▽ More In [8], Fang-Lan-Xiao proved a formula about Lusztig's induction and restriction functors which can induce Green's formula for the path algebra of a quiver over a finite field via the trace map. In this paper, we generalize their formula to that for the mixed semisimple perverse sheaves for a quiver with an automorphism. By applying the trace map, we obtain Green's formula for any finite-dimensional hereditary algebra over a finite field. △ Less

Submitted 5 June, 2024; originally announced June 2024.

MSC Class: 16G20; 17B37

arXiv:2405.14207 [pdf, ps, other]

The Boolean polynomial polytope with multiple choice constraints

Authors: Sihong Shao, Yishan Wu

Abstract: We consider a class of $0$-$1$ polynomial programming termed multiple choice polynomial programming (MCPP) where the constraint requires exact one component per subset of the partition to be $1$ after all the entries are partitioned. Compared to the unconstrained counterpart, there are few polyhedral studies of MCPP in general form. This paper serves as the first attempt to propose a polytope asso… ▽ More We consider a class of $0$-$1$ polynomial programming termed multiple choice polynomial programming (MCPP) where the constraint requires exact one component per subset of the partition to be $1$ after all the entries are partitioned. Compared to the unconstrained counterpart, there are few polyhedral studies of MCPP in general form. This paper serves as the first attempt to propose a polytope associated with a hypergraph to study MCPP, which is the convex hull of $0$-$1$ vectors satisfying multiple choice constraints and production constraints. With the help of the decomposability property, we obtain an explicit half-space representation of the MCPP polytope when the underlying hypergraph is $α$-acyclic by induction on the number of hyperedges, which is an analogy of the acyclicity results on the multilinear polytope by Del Pia and Khajavirad (SIAM J Optim 28 (2018) 1049) when the hypergraph is $γ$-acyclic. We also present a necessary and sufficient condition for the inequalities lifted from the facet-inducing ones for the multilinear polytope to be still facet-inducing for the MCPP polytope. This result covers the particular cases by Bärmann, Martin and Schneider (SIAM J Optim 33 (2023) 2909). △ Less

Submitted 19 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

Comments: 41 pages, submitted for publication on June 19 2024

MSC Class: 90C09; 52B12; 90C57; 05C65; 90C26

arXiv:2405.03993 [pdf, other]

Capillary Surfaces in Manifolds with Nonnegative Scalar Curvature and Strictly Mean Convex Boundary

Authors: Yujie Wu

Abstract: In this paper we use stable capillary surfaces (analogous to the $μ$-bubble construction) to study manifolds with strictly mean convex boundary and nonnegative scalar curvature. We give an obstruction to filling 2-manifolds by such 3-manifolds based on the Urysohn width. We also obtain a bandwidth estimate and establish other geometric properties of such manifolds. In this paper we use stable capillary surfaces (analogous to the $μ$-bubble construction) to study manifolds with strictly mean convex boundary and nonnegative scalar curvature. We give an obstruction to filling 2-manifolds by such 3-manifolds based on the Urysohn width. We also obtain a bandwidth estimate and establish other geometric properties of such manifolds. △ Less

Submitted 7 May, 2024; originally announced May 2024.

Comments: 19 pages, 1 figure, comments welcome!

arXiv:2405.02034 [pdf, other]

Multi-Agent Coverage Control on Surfaces Using Conformal Mapping

Authors: Chao Zhai, Yuming Wu

Abstract: Real-time environmental monitoring using a multi-agent system (MAS) has long been a focal point of cooperative control. It is still a challenging task to provide cost-effective services for potential emergencies in surface environments. This paper explores the transformation of a general surface into a two-dimensional (2D) disk through the construction of a conformal mapping. Multiple agents are s… ▽ More Real-time environmental monitoring using a multi-agent system (MAS) has long been a focal point of cooperative control. It is still a challenging task to provide cost-effective services for potential emergencies in surface environments. This paper explores the transformation of a general surface into a two-dimensional (2D) disk through the construction of a conformal mapping. Multiple agents are strategically deployed within the mapped convex disk, followed by mapping back to the original surface environment. This approach circumvents the complexities associated with handling the difficulties and intricacies of path planning. Technical analysis encompasses the design of distributed control laws and the method to eliminate distortions introduced by the mapping. Moreover, the developed coverage algorithm is applied to a scenario of monitoring surface deformation. Finally, the effectiveness of the proposed algorithm is validated through numerical simulations. △ Less

Submitted 3 May, 2024; originally announced May 2024.

arXiv:2404.15791 [pdf, ps, other]

Error estimates of a regularized finite difference method for the Logarithmic Schrödinger equation with Dirac delta potential

Authors: Xuanxuan Zhou, Tingchun Wang, Yong Wu, Yongyong Cai

Abstract: In this paper, we introduce a conservative Crank-Nicolson-type finite difference schemes for the regularized logarithmic Schrödinger equation (RLSE) with Dirac delta potential in 1D. The regularized logarithmic Schrödinger equation with a small regularized parameter $0<\eps \ll 1$ is adopted to approximate the logarithmic Schrödinger equation (LSE) with linear convergence rate $O(\eps)$. The numer… ▽ More In this paper, we introduce a conservative Crank-Nicolson-type finite difference schemes for the regularized logarithmic Schrödinger equation (RLSE) with Dirac delta potential in 1D. The regularized logarithmic Schrödinger equation with a small regularized parameter $0<\eps \ll 1$ is adopted to approximate the logarithmic Schrödinger equation (LSE) with linear convergence rate $O(\eps)$. The numerical method can be used to avoid numerical blow-up and/or to suppress round-off error due to the logarithmic nonlinearity in LSE. Then, by using domain-decomposition technique, we can transform the original problem into an interface problem. Different treatments on the interface conditions lead to different discrete schemes and it turns out that a simple discrete approximation of the Dirac potential coincides with one of the conservative finite difference schemes. The optimal $H^1$ error estimates and the conservative properties of the finite difference schemes are investigated. The Crank-Nicolson finite difference methods enjoy the second-order convergence rate in time and space. Numerical examples are provided to support our analysis and show the accuracy and efficiency of the numerical method. △ Less

Submitted 24 April, 2024; originally announced April 2024.

Comments: 20 pages, 3 figures

MSC Class: 35Q41; 65M70; 65N35

arXiv:2404.15454 [pdf, ps, other]

Prediction from compression for models with infinite memory, with applications to hidden Markov and renewal processes

Authors: Yanjun Han, Tianze Jiang, Yihong Wu

Abstract: Consider the problem of predicting the next symbol given a sample path of length n, whose joint distribution belongs to a distribution class that may have long-term memory. The goal is to compete with the conditional predictor that knows the true model. For both hidden Markov models (HMMs) and renewal processes, we determine the optimal prediction risk in Kullback- Leibler divergence up to univers… ▽ More Consider the problem of predicting the next symbol given a sample path of length n, whose joint distribution belongs to a distribution class that may have long-term memory. The goal is to compete with the conditional predictor that knows the true model. For both hidden Markov models (HMMs) and renewal processes, we determine the optimal prediction risk in Kullback- Leibler divergence up to universal constant factors. Extending existing results in finite-order Markov models [HJW23] and drawing ideas from universal compression, the proposed estimator has a prediction risk bounded by redundancy of the distribution class and a memory term that accounts for the long-range dependency of the model. Notably, for HMMs with bounded state and observation spaces, a polynomial-time estimator based on dynamic programming is shown to achieve the optimal prediction risk Θ(log n/n); prior to this work, the only known result of this type is O(1/log n) obtained using Markov approximation [Sha+18]. Matching minimax lower bounds are obtained by making connections to redundancy and mutual information via a reduction argument. △ Less

Submitted 23 April, 2024; originally announced April 2024.

Comments: 37 Pages

arXiv:2404.09266 [pdf, ps, other]

Multivariate confluent Vandermonde with G-Arnoldi and applications

Authors: Lei-Hong Zhang, Ya-Nan Zhang, Linyi Yang, Yifu Wu

Abstract: In the least-squares fitting framework, the Vandermonde with Arnoldi (V+A) method presented in [Brubeck, Nakatsukasa, and Trefethen, SIAM Review, 63 (2021), pp. 405-415] is an effective approach to compute a polynomial that approximates an underlying univariate function f. Extensions of V+A include its multivariate version and the univariate confluent V+A; the latter enables us to use the informat… ▽ More In the least-squares fitting framework, the Vandermonde with Arnoldi (V+A) method presented in [Brubeck, Nakatsukasa, and Trefethen, SIAM Review, 63 (2021), pp. 405-415] is an effective approach to compute a polynomial that approximates an underlying univariate function f. Extensions of V+A include its multivariate version and the univariate confluent V+A; the latter enables us to use the information of the derivative of f in obtaining the approximation polynomial. In this paper, we shall extend V+A further to the multivariate confluent V+A. Besides the technical generalization of the univariate confluent V+A, we also introduce a general and application-dependent G-orthogonalization in the Arnoldi process. We shall demonstrate with several applications that, by specifying an application-related G-inner product, the desired approximate multivariate polynomial as well as its certain partial derivatives can be computed accurately from a well-conditioned least-squares problem whose coefficient matrix is orthonormal. The desired multivariate polynomial is represented in a discrete G-orthogonal polynomials basis which admits an explicit recurrence, and therefore, facilitates evaluating function values and certain partial derivatives at new nodes efficiently. We demonstrate its flexibility by applying it to solve the multivariate Hermite least-squares problem and PDEs with various boundary conditions in irregular domains. △ Less

Submitted 14 April, 2024; originally announced April 2024.

Comments: 7 figures

MSC Class: 41A10; 41A63; 65F20; 65F25; 65M22

arXiv:2404.08913 [pdf, ps, other]

On the best approximation by finite Gaussian mixtures

Authors: Yun Ma, Yihong Wu, Pengkun Yang

Abstract: We consider the problem of approximating a general Gaussian location mixture by finite mixtures. The minimum order of finite mixtures that achieve a prescribed accuracy (measured by various $f$-divergences) is determined within constant factors for the family of mixing distributions with compactly support or appropriate assumptions on the tail probability including subgaussian and subexponential.… ▽ More We consider the problem of approximating a general Gaussian location mixture by finite mixtures. The minimum order of finite mixtures that achieve a prescribed accuracy (measured by various $f$-divergences) is determined within constant factors for the family of mixing distributions with compactly support or appropriate assumptions on the tail probability including subgaussian and subexponential. While the upper bound is achieved using the technique of local moment matching, the lower bound is established by relating the best approximation error to the low-rank approximation of certain trigonometric moment matrices, followed by a refined spectral analysis of their minimum eigenvalue. In the case of Gaussian mixing distributions, this result corrects a previous lower bound in [Allerton Conference 48 (2010) 620-628]. △ Less

Submitted 13 April, 2024; originally announced April 2024.

arXiv:2404.04357 [pdf, other]

Why does the two-timescale Q-learning converge to different mean field solutions? A unified convergence analysis

Authors: Jing An, Jianfeng Lu, Yue Wu, Yang Xiang

Abstract: We revisit the unified two-timescale Q-learning algorithm as initially introduced by Angiuli et al. \cite{angiuli2022unified}. This algorithm demonstrates efficacy in solving mean field game (MFG) and mean field control (MFC) problems, simply by tuning the ratio of two learning rates for mean field distribution and the Q-functions respectively. In this paper, we provide a comprehensive theoretical… ▽ More We revisit the unified two-timescale Q-learning algorithm as initially introduced by Angiuli et al. \cite{angiuli2022unified}. This algorithm demonstrates efficacy in solving mean field game (MFG) and mean field control (MFC) problems, simply by tuning the ratio of two learning rates for mean field distribution and the Q-functions respectively. In this paper, we provide a comprehensive theoretical explanation of the algorithm's bifurcated numerical outcomes under fixed learning rates. We achieve this by establishing a diagram that correlates continuous-time mean field problems to their discrete-time Q-function counterparts, forming the basis of the algorithm. Our key contribution lies in the construction of a Lyapunov function integrating both mean field distribution and Q-function iterates. This Lyapunov function facilitates a unified convergence of the algorithm across the entire spectrum of learning rates, thus providing a cohesive framework for analysis. △ Less

Submitted 28 May, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

Comments: 34 pages. Updated version for submission. We added more numerical results and fixed several minor mistakes

arXiv:2403.19823 [pdf, ps, other]

Maffei's action and symplectic Springer action for quiver varieties

Authors: Yaochen Wu

Abstract: We examine the relationship between the actions of two Weyl groups on the cohomology of a smooth quiver variety: the Maffei's action of the Weyl group associated to the quiver, and the symplectic Springer action of the Namikawa-Weyl group of the affine quiver variety. We show there is a natural map from the former group to the latter, which is an embedding in favorable situations, and this map int… ▽ More We examine the relationship between the actions of two Weyl groups on the cohomology of a smooth quiver variety: the Maffei's action of the Weyl group associated to the quiver, and the symplectic Springer action of the Namikawa-Weyl group of the affine quiver variety. We show there is a natural map from the former group to the latter, which is an embedding in favorable situations, and this map intertwines their actions on the cohomology. This answers a question raised in arXiv:1904.10497. △ Less

Submitted 28 March, 2024; originally announced March 2024.

Comments: 24 pages. Comments are welcome!

MSC Class: 16G20

arXiv:2403.19284 [pdf, ps, other]

Existence of solutions for a class of Kirchhoff-type equations with indefinite potential

Authors: Linlian Xiao, Jiaqian Yuan, Jian Zhou, Yunshun Wu

Abstract: In this paper, we consider the existence of solutions of the following Kirchhoff-type problem \[ \left\{ \begin{array} [c]{ll} -\left(a+b\int_{\mathbb{R}^3}|\nabla u|^2dx\right)Δu+ V(x)u=f(x,u),~{\rm{in}}~ \mathbb{R}^{3},\\ u\in H^1(\mathbb{R}^3), \end{array} \right. \] where $a,b$ are postive constants, and the potential $V(x)$ is continuous and indefinite in sign. Under some suitable… ▽ More In this paper, we consider the existence of solutions of the following Kirchhoff-type problem \[ \left\{ \begin{array} [c]{ll} -\left(a+b\int_{\mathbb{R}^3}|\nabla u|^2dx\right)Δu+ V(x)u=f(x,u),~{\rm{in}}~ \mathbb{R}^{3},\\ u\in H^1(\mathbb{R}^3), \end{array} \right. \] where $a,b$ are postive constants, and the potential $V(x)$ is continuous and indefinite in sign. Under some suitable assumptions on $V(x)$ and $f$, we obtain the existence of solutions by the Symmetric Mountain Pass Theorem. △ Less

Submitted 28 March, 2024; originally announced March 2024.

arXiv:2403.16822 [pdf, ps, other]

Locally-primitive block designs

Authors: Jianfu Chen, Peice Hua, Cai Heng Li, Yanni Wu

Abstract: A locally-primitive design is a block design $(\mathcal{P},\mathcal{B})$ which admits an automorphism group $G$ with primitive local actions. It is proved that $G$ is primitive on the points $\mathcal{P}$, and either $G$ is an almost simple group, or $G$ acting on $\mathcal{P}$ is an affine group. A locally-primitive design is a block design $(\mathcal{P},\mathcal{B})$ which admits an automorphism group $G$ with primitive local actions. It is proved that $G$ is primitive on the points $\mathcal{P}$, and either $G$ is an almost simple group, or $G$ acting on $\mathcal{P}$ is an affine group. △ Less

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.16772 [pdf, other]

The cubic nonlinear Schrödinger equation with rough potential

Authors: Norbert J. Mauser, Yifei Wu, Xiaofei Zhao

Abstract: We consider the cubic nonlinear Schrödinger equation with a spatially rough potential, a key equation in the mathematical setup for nonlinear Anderson localization. Our study comprises two main parts: new optimal results on the well-posedness analysis on the PDE level, and subsequently a new efficient numerical method, its convergence analysis and simulations that illustrate our analytical results… ▽ More We consider the cubic nonlinear Schrödinger equation with a spatially rough potential, a key equation in the mathematical setup for nonlinear Anderson localization. Our study comprises two main parts: new optimal results on the well-posedness analysis on the PDE level, and subsequently a new efficient numerical method, its convergence analysis and simulations that illustrate our analytical results. In the analysis part, our results focus on understanding how the regularity of the solution is influenced by the regularity of the potential, where we provide quantitative and explicit characterizations. Ill-posedness results are also established to demonstrate the sharpness of the obtained regularity characterizations and to indicate the minimum regularity required from the potential for the NLS to be solvable. Building upon the obtained regularity results, we design an appropriate numerical discretization for the model and establish its convergence with an optimal error bound. The numerical experiments in the end not only verify the theoretical regularity results, but also confirm the established convergence rate of the proposed scheme. Additionally, a comparison with other existing schemes is conducted to demonstrate the better accuracy of our new scheme in the case of a rough potential. △ Less

Submitted 25 March, 2024; originally announced March 2024.

Comments: 54 pages, 8 figures

MSC Class: 65M12; 65M15; 35Q55

arXiv:2403.15663 [pdf, ps, other]

Nonlinear Stability for the Superposition of Viscous Contact Wave and Rarefaction Waves to Non-isentropic Compressible Navier-Stokes System with General Initial Perturbations

Authors: Yi Peng, Xiaoding Shi, Yuhang Wu

Abstract: In this paper, the large time behavior of the solutions for the Cauchy problem to the one-dimensional compressible Navier-Stokes system with the motion of a viscous heat-conducting perfect polytropic gas is investigated.Our result shows that the combination of a viscous contact wave with rarefaction waves is asymptotically stable, when the large initial disturbance of the density, velocity and tem… ▽ More In this paper, the large time behavior of the solutions for the Cauchy problem to the one-dimensional compressible Navier-Stokes system with the motion of a viscous heat-conducting perfect polytropic gas is investigated.Our result shows that the combination of a viscous contact wave with rarefaction waves is asymptotically stable, when the large initial disturbance of the density, velocity and temperature belong to $H^{1}(\mathbb{R})$, $L^{2}(\mathbb{R})\cap L^{4}(\mathbb{R})$ and $L^{2}(\mathbb{R})$, provided the strength of the combination waves is suitably small. In addition, the initial disturbance on the derivation of velocity and temperature belong to $L^{2}(\mathbb{R})$ can be arbitrarily large. △ Less

Submitted 22 March, 2024; originally announced March 2024.

Comments: 38 pages. arXiv admin note: text overlap with arXiv:1502.00211 by other authors

MSC Class: 35Q35; 35B65; 76N10; 35M10; 35B40; 35C20; 76T30 ACM Class: G.1.8

arXiv:2403.12379 [pdf, other]

Probabilistic reachable sets of stochastic nonlinear systems with contextual uncertainties

Authors: Xun Shen, Ye Wang, Kazumune Hashimoto, Yuhu Wu, Sebastien Gros

Abstract: Validating and controlling safety-critical systems in uncertain environments necessitates probabilistic reachable sets of future state evolutions. The existing methods of computing probabilistic reachable sets normally assume that the uncertainties are independent of the state. However, this assumption falls short in many real-world applications, where uncertainties are state-dependent, referred t… ▽ More Validating and controlling safety-critical systems in uncertain environments necessitates probabilistic reachable sets of future state evolutions. The existing methods of computing probabilistic reachable sets normally assume that the uncertainties are independent of the state. However, this assumption falls short in many real-world applications, where uncertainties are state-dependent, referred to as contextual uncertainties. This paper formulates the problem of computing probabilistic reachable sets of stochastic nonlinear states with contextual uncertainties by seeking minimum-volume polynomial sublevel sets with contextual chance constraints. The formulated problem cannot be solved by the existing sample-based approximation method since the existing methods do not consider the conditional probability densities. To address this, we propose a consistent sample approximation of the original problem by leveraging the conditional density estimation and resampling. The obtained approximate problem is a tractable optimization problem. Additionally, we prove the almost uniform convergence of the proposed sample-based approximation, showing that it gives the optimal solution almost consistently with the original ones. Through a numerical example, we evaluate the effectiveness of the proposed method against existing approaches, highlighting its capability to significantly reduce the bias inherent in sample-based approximation without considering a conditional probability density. △ Less

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.11437 [pdf, other]

Formalization of Complexity Analysis of the First-order Optimization Algorithms

Authors: Chenyi Li, Ziyu Wang, Wanyi He, Yuxuan Wu, Shengyang Xu, Zaiwen Wen

Abstract: The convergence rate of various first-order optimization algorithms is a pivotal concern within the numerical optimization community, as it directly reflects the efficiency of these algorithms across different optimization problems. Our goal is making a significant step forward in the formal mathematical representation of optimization techniques using the Lean4 theorem prover. We first formalize t… ▽ More The convergence rate of various first-order optimization algorithms is a pivotal concern within the numerical optimization community, as it directly reflects the efficiency of these algorithms across different optimization problems. Our goal is making a significant step forward in the formal mathematical representation of optimization techniques using the Lean4 theorem prover. We first formalize the gradient for smooth functions and the subgradient for convex functions on a Hilbert space, laying the groundwork for the accurate formalization of algorithmic structures. Then, we extend our contribution by proving several properties of differentiable convex functions that have not yet been formalized in Mathlib. Finally, a comprehensive formalization of these algorithms is presented. These developments are not only noteworthy on their own but also serve as essential precursors to the formalization of a broader spectrum of numerical algorithms and their applications in machine learning as well as many other areas. △ Less

Submitted 19 March, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

ACM Class: G.1.6

arXiv:2403.04139 [pdf, other]

Snevily's Conjecture about $\mathcal{L}$-intersecting Families on Set Systems and its Analogue on Vector Spaces

Authors: Jiuqiang Liu, Guihai Yu, Lihua Feng, Yongjiang Wu

Abstract: The classical Erdős-Ko-Rado theorem on the size of an intersecting family of $k$-subsets of the set $[n] = \{1, 2, \dots, n\}$ is one of the fundamental intersection theorems for set systems. After the establishment of the EKR theorem, many intersection theorems on set systems have appeared in the literature, such as the well-known Frankl-Wilson theorem, Alon-Babai-Suzuki theorem, and Grolmusz-Sud… ▽ More The classical Erdős-Ko-Rado theorem on the size of an intersecting family of $k$-subsets of the set $[n] = \{1, 2, \dots, n\}$ is one of the fundamental intersection theorems for set systems. After the establishment of the EKR theorem, many intersection theorems on set systems have appeared in the literature, such as the well-known Frankl-Wilson theorem, Alon-Babai-Suzuki theorem, and Grolmusz-Sudakov theorem. In 1995, Snevily proposed the conjecture that the upper bound for the size of an $\mathcal{L}$-intersecting family of subsets of $[n]$ is ${{n} \choose {s}}$ under the condition $\max \{l_{i}\} < \min \{k_{j}\}$, where $\mathcal{L} = \{l_{1}, \dots, l_{s}\}$ with $0 \leq l_{1} < \cdots < l_{s}$ and $k_{j}$ are subset sizes in the family. In this paper, we prove that Snevily's conjecture holds for $n \geq {k^{2} \choose {l_{1}+1}}s + l_{1}$, where $k$ is the maximum subset size in the family. We then derive an analogous result for $\mathcal{L}$-intersecting families of subspaces of an $n$-dimensional vector space over a finite field $\mathbb{F}_{q}$. △ Less

Submitted 6 March, 2024; originally announced March 2024.

Comments: arXiv admin note: text overlap with arXiv:1701.00585 by other authors

arXiv:2402.16793 [pdf, other]

Failures and Successes of Cross-Validation for Early-Stopped Gradient Descent

Authors: Pratik Patil, Yuchen Wu, Ryan J. Tibshirani

Abstract: We analyze the statistical properties of generalized cross-validation (GCV) and leave-one-out cross-validation (LOOCV) applied to early-stopped gradient descent (GD) in high-dimensional least squares regression. We prove that GCV is generically inconsistent as an estimator of the prediction risk of early-stopped GD, even for a well-specified linear model with isotropic features. In contrast, we sh… ▽ More We analyze the statistical properties of generalized cross-validation (GCV) and leave-one-out cross-validation (LOOCV) applied to early-stopped gradient descent (GD) in high-dimensional least squares regression. We prove that GCV is generically inconsistent as an estimator of the prediction risk of early-stopped GD, even for a well-specified linear model with isotropic features. In contrast, we show that LOOCV converges uniformly along the GD trajectory to the prediction risk. Our theory requires only mild assumptions on the data distribution and does not require the underlying regression function to be linear. Furthermore, by leveraging the individual LOOCV errors, we construct consistent estimators for the entire prediction error distribution along the GD trajectory and consistent estimators for a wide class of error functionals. This in particular enables the construction of pathwise prediction intervals based on GD iterates that have asymptotically correct nominal coverage conditional on the training data. △ Less

Submitted 26 February, 2024; originally announced February 2024.

Comments: 76 pages, 27 figures

arXiv:2402.11970 [pdf, other]

Port-Hamiltonian modeling and control of a curling HASEL actuator

Authors: Nelson Cisneros, Yongxin Wu, Kanty Rabenorosoa, Yann Le Gorrec

Abstract: This paper is concerned with the modeling and control of a curling Hydraulically Amplified Self-healing Electrostatic (HASEL) actuator using the port-Hamiltonian (PH) approach. For that purpose, we use a modular approach and consider the HASEL actuator as an interconnection of elementary subsystems. Each subsystem is modeled by an electrical component consisting of a capacitor in parallel with an… ▽ More This paper is concerned with the modeling and control of a curling Hydraulically Amplified Self-healing Electrostatic (HASEL) actuator using the port-Hamiltonian (PH) approach. For that purpose, we use a modular approach and consider the HASEL actuator as an interconnection of elementary subsystems. Each subsystem is modeled by an electrical component consisting of a capacitor in parallel with an inductor connected through the conservation of volume of the moving liquid to a mechanical structure based on inertia, linear, and torsional springs. The parameters are then identified, and the model is validated on the experimental setup. Position control is achieved by using Interconnection and Damping Assignment-Passivity Based Control (IDA-PBC) with integral action (IA) for disturbance rejection. Simulation results show the efficiency of the proposed controller. △ Less

Submitted 19 February, 2024; originally announced February 2024.

Journal ref: 8th IFAC Workshop on Lagrangian and Hamiltonian Methods for Non Linear Control, Jun 2024, Besan{\c c}on, France

arXiv:2402.09693 [pdf, ps, other]

Sharp Information-Theoretic Thresholds for Shuffled Linear Regression

Authors: Leon Lufkin, Yihong Wu, Jiaming Xu

Abstract: This paper studies the problem of shuffled linear regression, where the correspondence between predictors and responses in a linear model is obfuscated by a latent permutation. Specifically, we consider the model $y = Π_* X β_* + w$, where $X$ is an $n \times d$ standard Gaussian design matrix, $w$ is Gaussian noise with entrywise variance $σ^2$, $Π_*$ is an unknown $n \times n$ permutation matrix… ▽ More This paper studies the problem of shuffled linear regression, where the correspondence between predictors and responses in a linear model is obfuscated by a latent permutation. Specifically, we consider the model $y = Π_* X β_* + w$, where $X$ is an $n \times d$ standard Gaussian design matrix, $w$ is Gaussian noise with entrywise variance $σ^2$, $Π_*$ is an unknown $n \times n$ permutation matrix, and $β_*$ is the regression coefficient, also unknown. Previous work has shown that, in the large $n$-limit, the minimal signal-to-noise ratio ($\mathsf{SNR}$), $\lVert β_* \rVert^2/σ^2$, for recovering the unknown permutation exactly with high probability is between $n^2$ and $n^C$ for some absolute constant $C$ and the sharp threshold is unknown even for $d=1$. We show that this threshold is precisely $\mathsf{SNR} = n^4$ for exact recovery throughout the sublinear regime $d=o(n)$. As a by-product of our analysis, we also determine the sharp threshold of almost exact recovery to be $\mathsf{SNR} = n^2$, where all but a vanishing fraction of the permutation is reconstructed. △ Less

Submitted 14 February, 2024; originally announced February 2024.

Comments: 18 pages (9 main, 1 references, 8 appendix)

arXiv:2402.07747 [pdf, ps, other]

Optimal score estimation via empirical Bayes smoothing

Authors: Andre Wibisono, Yihong Wu, Kaylee Yingxi Yang

Abstract: We study the problem of estimating the score function of an unknown probability distribution $ρ^*$ from $n$ independent and identically distributed observations in $d$ dimensions. Assuming that $ρ^*$ is subgaussian and has a Lipschitz-continuous score function $s^*$, we establish the optimal rate of $\tilde Θ(n^{-\frac{2}{d+4}})$ for this estimation problem under the loss function… ▽ More We study the problem of estimating the score function of an unknown probability distribution $ρ^*$ from $n$ independent and identically distributed observations in $d$ dimensions. Assuming that $ρ^*$ is subgaussian and has a Lipschitz-continuous score function $s^*$, we establish the optimal rate of $\tilde Θ(n^{-\frac{2}{d+4}})$ for this estimation problem under the loss function $\|\hat s - s^*\|^2_{L^2(ρ^*)}$ that is commonly used in the score matching literature, highlighting the curse of dimensionality where sample complexity for accurate score estimation grows exponentially with the dimension $d$. Leveraging key insights in empirical Bayes theory as well as a new convergence rate of smoothed empirical distribution in Hellinger distance, we show that a regularized score estimator based on a Gaussian kernel attains this rate, shown optimal by a matching minimax lower bound. We also discuss extensions to estimating $β$-Hölder continuous scores with $β\leq 1$, as well as the implication of our theory on the sample complexity of score-based generative models. △ Less

Submitted 12 June, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

Comments: COLT 2024; added the new results on extending to beta-Holder scores with beta <= 1

arXiv:2402.03862 [pdf, other]

Nonstationary Discounted Stochastic Games under Prospect Theory with Applications to the Smart Grid

Authors: Yiting Wu, Junyu Zhang

Abstract: This paper considers the discounted criterion of nonzero-sum decentralized stochastic games with prospect players. The state and action spaces are finite. The state transition probability is nonstationary. Each player independently controls their own Markov chain. The subjective behavior of players is described by the prospect theory (PT). Compared to the average criterion of stochastic games unde… ▽ More This paper considers the discounted criterion of nonzero-sum decentralized stochastic games with prospect players. The state and action spaces are finite. The state transition probability is nonstationary. Each player independently controls their own Markov chain. The subjective behavior of players is described by the prospect theory (PT). Compared to the average criterion of stochastic games under PT studied firstly in 2018, we are concerned with the time value of utility, i.e., the utility should be discounted in the future. Since PT distorts the probability, the optimality equation that plays a significant role in proving the existence of equilibrium does not exist. On the other hand, the games change into Markov decision processes (MDPs) with nonstationary payoff function when fixing others' stationary Markov strategies, then the occupation measure and the linear programming of stationary MDPs are no longer suitable. Therefore, we explore a new technique by constructing the marginal distribution on the state-action pairs at any time, and establish the existence of Nash equilibrium. When the initial state is given, there exists a Markov Nash equilibrium. Furthermore, this novel technique can be extended to the finite horizon criterion. Then, we present an algorithm to find a Markov $\varepsilon$-equilibrium. Finally, the model is applied to a noncooperative stochastic game among prosumers who can produce and consume energy in the smart grid, and we give some simulation results. △ Less

Submitted 15 May, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

arXiv:2402.03210 [pdf, other]

Universal Gradient Methods for Stochastic Convex Optimization

Authors: Anton Rodomanov, Ali Kavis, Yongtao Wu, Kimon Antonakopoulos, Volkan Cevher

Abstract: We develop universal gradient methods for Stochastic Convex Optimization (SCO). Our algorithms automatically adapt not only to the oracle's noise but also to the Hölder smoothness of the objective function without a priori knowledge of the particular setting. The key ingredient is a novel strategy for adjusting step-size coefficients in the Stochastic Gradient Method (SGD). Unlike AdaGrad, which a… ▽ More We develop universal gradient methods for Stochastic Convex Optimization (SCO). Our algorithms automatically adapt not only to the oracle's noise but also to the Hölder smoothness of the objective function without a priori knowledge of the particular setting. The key ingredient is a novel strategy for adjusting step-size coefficients in the Stochastic Gradient Method (SGD). Unlike AdaGrad, which accumulates gradient norms, our Universal Gradient Method accumulates appropriate combinations of gradient- and iterate differences. The resulting algorithm has state-of-the-art worst-case convergence rate guarantees for the entire Hölder class including, in particular, both nonsmooth functions and those with Lipschitz continuous gradient. We also present the Universal Fast Gradient Method for SCO enjoying optimal efficiency estimates. △ Less

Submitted 11 July, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

arXiv:2402.02284 [pdf, ps, other]

Variable-order fractional Laplacian and its accurate and efficient computations with meshfree methods

Authors: Yixuan Wu, Yanzhi Zhang

Abstract: The variable-order fractional Laplacian plays an important role in the study of heterogeneous systems. In this paper, we propose the first numerical methods for the variable-order Laplacian $(-Δ)^{α({\bf x})/2}$ with $0 < α({\bf x}) \le 2$, which will also be referred as the variable-order fractional Laplacian if $α({\bf x})$ is strictly less than 2. We present a class of hypergeometric functions… ▽ More The variable-order fractional Laplacian plays an important role in the study of heterogeneous systems. In this paper, we propose the first numerical methods for the variable-order Laplacian $(-Δ)^{α({\bf x})/2}$ with $0 < α({\bf x}) \le 2$, which will also be referred as the variable-order fractional Laplacian if $α({\bf x})$ is strictly less than 2. We present a class of hypergeometric functions whose variable-order Laplacian can be analytically expressed. Building on these analytical results, we design the meshfree methods based on globally supported radial basis functions (RBFs), including Gaussian, generalized inverse multiquadric, and Bessel-type RBFs, to approximate the variable-order Laplacian $(-Δ)^{α({\bf x})/2}$. Our meshfree methods integrate the advantages of both pseudo-differential and hypersingular integral forms of the variable-order fractional Laplacian, and thus avoid numerically approximating the hypersingular integral. Moreover, our methods are simple and flexible of domain geometry, and their computer implementation remains the same for any dimension $d \ge 1$. Compared to finite difference methods, our methods can achieve a desired accuracy with much fewer points. This fact makes our method much attractive for problems involving variable-order fractional Laplacian where the number of points required is a critical cost. We then apply our method to study solution behaviors of variable-order fractional PDEs arising in different fields, including transition of waves between classical and fractional media, and coexistence of anomalous and normal diffusion in both diffusion equation and the Allen-Cahn equation. These results would provide insights for further understanding and applications of variable-order fractional derivatives. △ Less

Submitted 3 February, 2024; originally announced February 2024.

Comments: 27 pages, 8 figures

arXiv:2402.00531 [pdf, other]

Preconditioning for Physics-Informed Neural Networks

Authors: Songming Liu, Chang Su, Jiachen Yao, Zhongkai Hao, Hang Su, Youjia Wu, Jun Zhu

Abstract: Physics-informed neural networks (PINNs) have shown promise in solving various partial differential equations (PDEs). However, training pathologies have negatively affected the convergence and prediction accuracy of PINNs, which further limits their practical applications. In this paper, we propose to use condition number as a metric to diagnose and mitigate the pathologies in PINNs. Inspired by c… ▽ More Physics-informed neural networks (PINNs) have shown promise in solving various partial differential equations (PDEs). However, training pathologies have negatively affected the convergence and prediction accuracy of PINNs, which further limits their practical applications. In this paper, we propose to use condition number as a metric to diagnose and mitigate the pathologies in PINNs. Inspired by classical numerical analysis, where the condition number measures sensitivity and stability, we highlight its pivotal role in the training dynamics of PINNs. We prove theorems to reveal how condition number is related to both the error control and convergence of PINNs. Subsequently, we present an algorithm that leverages preconditioning to improve the condition number. Evaluations of 18 PDE problems showcase the superior performance of our method. Significantly, in 7 of these problems, our method reduces errors by an order of magnitude. These empirical findings verify the critical role of the condition number in PINNs' training. △ Less

Submitted 1 February, 2024; originally announced February 2024.

arXiv:2401.15389 [pdf]

An Enhanced Modelling Approach for Warehouse Sharing Platform System Designing Problem

Authors: Zeren Xing, Yuehui Wu, Shuangyuan Yu

Abstract: With the increasing importance of sustainability, warehouse sharing arises as a possible way to improve the efficiency of the existing logistics system. This paper studied the warehouse sharing platform systems (WSPS) and investigated its supply chain network, including factories, warehouses, and customers. We proposed an enhanced modelling approach for the warehouse sharing platform system design… ▽ More With the increasing importance of sustainability, warehouse sharing arises as a possible way to improve the efficiency of the existing logistics system. This paper studied the warehouse sharing platform systems (WSPS) and investigated its supply chain network, including factories, warehouses, and customers. We proposed an enhanced modelling approach for the warehouse sharing platform system design problem (WSPSDP) using the multi-allocation hub location routing problem framework. New elements such as inter-warehouse transportation and multiple-allocation scheme were added compared to the existing WSPS model. Then an adaptive large neighbourhood decomposition search heuristic was applied to solve our problem. Computational experiments were conducted on different-sized instances for comparison with the WSPS model without inter-warehouse transportation and the WSPS model with single-allocation scheme. The results suggested that our proposed WSPSDP model is more cost-efficient than the existing WSPS models, and it has the potential to promote the utilisation of existing cheap idle warehouses. △ Less

Submitted 27 January, 2024; originally announced January 2024.

arXiv:2401.11658 [pdf, other]

A Randomized Runge-Kutta Method for time-irregular delay differential equations

Authors: Fabio V. Difonzo, Paweł Przybyłowicz, Yue Wu, Xinheng Xie

Abstract: In this paper we investigate the existence, uniqueness and approximation of solutions of delay differential equations (DDEs) with the right-hand side functions $f=f(t,x,z)$ that are Lipschitz continuous with respect to $x$ but only Hölder continuous with respect to $(t,z)$. We give a construction of the randomized two-stage Runge-Kutta scheme for DDEs and investigate its upper error bound in the… ▽ More In this paper we investigate the existence, uniqueness and approximation of solutions of delay differential equations (DDEs) with the right-hand side functions $f=f(t,x,z)$ that are Lipschitz continuous with respect to $x$ but only Hölder continuous with respect to $(t,z)$. We give a construction of the randomized two-stage Runge-Kutta scheme for DDEs and investigate its upper error bound in the $L^p(Ω)$-norm for $p\in [2,+\infty)$. Finally, we report on results of numerical experiments. △ Less

Submitted 21 January, 2024; originally announced January 2024.

Comments: arXiv admin note: text overlap with arXiv:2204.02016

arXiv:2401.11380 [pdf, other]

MoMA: Model-based Mirror Ascent for Offline Reinforcement Learning

Authors: Mao Hong, Zhiyue Zhang, Yue Wu, Yanxun Xu

Abstract: Model-based offline reinforcement learning methods (RL) have achieved state-of-the-art performance in many decision-making problems thanks to their sample efficiency and generalizability. Despite these advancements, existing model-based offline RL approaches either focus on theoretical studies without developing practical algorithms or rely on a restricted parametric policy space, thus not fully l… ▽ More Model-based offline reinforcement learning methods (RL) have achieved state-of-the-art performance in many decision-making problems thanks to their sample efficiency and generalizability. Despite these advancements, existing model-based offline RL approaches either focus on theoretical studies without developing practical algorithms or rely on a restricted parametric policy space, thus not fully leveraging the advantages of an unrestricted policy space inherent to model-based methods. To address this limitation, we develop MoMA, a model-based mirror ascent algorithm with general function approximations under partial coverage of offline data. MoMA distinguishes itself from existing literature by employing an unrestricted policy class. In each iteration, MoMA conservatively estimates the value function by a minimization procedure within a confidence set of transition models in the policy evaluation step, then updates the policy with general function approximations instead of commonly-used parametric policy classes in the policy improvement step. Under some mild assumptions, we establish theoretical guarantees of MoMA by proving an upper bound on the suboptimality of the returned policy. We also provide a practically implementable, approximate version of the algorithm. The effectiveness of MoMA is demonstrated via numerical studies. △ Less

Submitted 20 January, 2024; originally announced January 2024.

arXiv:2401.06956 [pdf, ps, other]

A note on Rational Maps with three branching points on the Riemann sphere

Authors: Zhiqiang Wei, Yingyi Wu, Bin Xu

Abstract: Studying the existence of rational functions with given branching datum is a classical problem in the field of complex analysis and algebraic geometry. This problem dates back to Hurwitz and remains open to this day. In this paper, we utilize complex analysis to establish a property of rational functions with 3 branching points on the Riemann sphere. Given two compact Riemann surfaces $M$ and $N$,… ▽ More Studying the existence of rational functions with given branching datum is a classical problem in the field of complex analysis and algebraic geometry. This problem dates back to Hurwitz and remains open to this day. In this paper, we utilize complex analysis to establish a property of rational functions with 3 branching points on the Riemann sphere. Given two compact Riemann surfaces $M$ and $N$, a pair $(d,\mathcal{D})$ of an integer $d\geq2$ and a collection $\mathcal{D}$ of nontrivial partitions of $d$ is called a candidate branching datum if it satisfies the Riemann-Hurwitz formula. And a candidate branching datum is exceptional if there does not exist a rational function realization it. As applications, we present some new types of exceptional branching datum. These results cover some previous results mentioned in \cite{EKS84,PP06,Zhu19}. We also deduce the realizability of a certain type of candidate branching datum on the Riemann sphere. △ Less

Submitted 28 May, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

Comments: 9 pages. We have revised certain sentences, corrected several errors, and added additional content to enhance the clarity and depth of our paper. Comments are welcome

MSC Class: 57M12

arXiv:2401.06266 [pdf, other]

Supervised Gromov-Wasserstein Optimal Transport

Authors: Zixuan Cang, Yaqi Wu, Yanxiang Zhao

Abstract: We introduce the supervised Gromov-Wasserstein (sGW) optimal transport, an extension of Gromov-Wasserstein by incorporating potential infinity patterns in the cost tensor. sGW enables the enforcement of application-induced constraints such as the preservation of pairwise distances by implementing the constraints as an infinity pattern. A numerical solver is proposed for the sGW problem and the eff… ▽ More We introduce the supervised Gromov-Wasserstein (sGW) optimal transport, an extension of Gromov-Wasserstein by incorporating potential infinity patterns in the cost tensor. sGW enables the enforcement of application-induced constraints such as the preservation of pairwise distances by implementing the constraints as an infinity pattern. A numerical solver is proposed for the sGW problem and the effectiveness is demonstrated in various numerical experiments. The high-order constraints in sGW are transferred to constraints on the coupling matrix by solving a minimal vertex cover problem. The transformed problem is solved by the Mirror-C descent iteration coupled with the supervised optimal transport solver. In the numerical experiments, we first validate the proposed framework by applying it to matching synthetic datasets and investigating the impact of the model parameters. Additionally, we successfully apply sGW to real single-cell RNA sequencing data. Through comparisons with other Gromov-Wasserstein variants on real data, we demonstrate that sGW offers the novel utility of controlling distance preservation, leading to the automatic estimation of overlapping portions of datasets, which brings improved stability and flexibility in data-driven applications. △ Less

Submitted 11 January, 2024; originally announced January 2024.

MSC Class: 49Q22

arXiv:2401.05320 [pdf, other]

Hausdorff dimensions of topologically transitive Markov hom tree-shifts

Authors: Jung-Chao Ban, Guan-Yu Lai, Yu-Liang Wu

Abstract: This paper features an analog of Sanov's theorem for finite-state Markov chains indexed by rooted d-trees, obtained via the method of types in the classical analysis of large deviations. Along with the theorem comes two applications: an almost-sure type convergence of sample means and a formula for the Hausdorff dimension of the symbolic space associated with the irreducible Markov chain. This paper features an analog of Sanov's theorem for finite-state Markov chains indexed by rooted d-trees, obtained via the method of types in the classical analysis of large deviations. Along with the theorem comes two applications: an almost-sure type convergence of sample means and a formula for the Hausdorff dimension of the symbolic space associated with the irreducible Markov chain. △ Less

Submitted 10 January, 2024; originally announced January 2024.

MSC Class: 28A80; 60J10 (Primary) 37B10 (Secondary)

arXiv:2401.03391 [pdf, ps, other]

More MDS codes of non-Reed-Solomon type

Authors: Yansheng Wu, Ziling Heng, Chengju Li, Cunsheng Ding

Abstract: MDS codes have diverse practical applications in communication systems, data storage, and quantum codes due to their algebraic properties and optimal error-correcting capability. In this paper, we focus on a class of linear codes and establish some sufficient and necessary conditions for them being MDS. Notably, these codes differ from Reed-Solomon codes up to monomial equivalence. Additionally, w… ▽ More MDS codes have diverse practical applications in communication systems, data storage, and quantum codes due to their algebraic properties and optimal error-correcting capability. In this paper, we focus on a class of linear codes and establish some sufficient and necessary conditions for them being MDS. Notably, these codes differ from Reed-Solomon codes up to monomial equivalence. Additionally, we also explore the cases in which these codes are almost MDS or near MDS. Applying our main results, we determine the covering radii and deep holes of the dual codes associated with specific Roth-Lempel codes and discover an infinite family of (almost) optimally extendable codes with dimension three. △ Less

Submitted 7 January, 2024; originally announced January 2024.

Comments: 22 pages

arXiv:2401.01428 [pdf, ps, other]

Connections between K-stability and Vojta's conjecture

Authors: Jackson S. Morrow, Yueqiao Wu

Abstract: In this note, we use recent advances concerning the K-stability of $\mathbb{Q}$-Fano varieties to provide settings for which Vojta's conjecture holds. In this note, we use recent advances concerning the K-stability of $\mathbb{Q}$-Fano varieties to provide settings for which Vojta's conjecture holds. △ Less

Submitted 2 January, 2024; originally announced January 2024.

Comments: 11 pages, comments welcome!

arXiv:2401.01120 [pdf, ps, other]

Van der Corput and metric theorems for geometric progressions for self-similar measures

Authors: Amir Algom, Yuanyang Chang, Meng Wu, Yu-Liang Wu

Abstract: We prove a van der Corput lemma for non-atomic self-similar measures $μ$. As an application, we show that the correlations of all finite orders of $( x^n \mod 1 )_{n\geq 1}$ converge to the Poissonian model for $μ$-a.e. $x$, assuming $x>1$. We also complete a recent result of Algom, Rodriguez Hertz, and Wang (obtained simultaneously by Baker and Banaji), showing that any self-conformal measure wit… ▽ More We prove a van der Corput lemma for non-atomic self-similar measures $μ$. As an application, we show that the correlations of all finite orders of $( x^n \mod 1 )_{n\geq 1}$ converge to the Poissonian model for $μ$-a.e. $x$, assuming $x>1$. We also complete a recent result of Algom, Rodriguez Hertz, and Wang (obtained simultaneously by Baker and Banaji), showing that any self-conformal measure with respect to a non-affine real analytic IFS has polynomial Fourier decay. △ Less

Submitted 2 January, 2024; originally announced January 2024.

arXiv:2401.01047 [pdf, other]

Sharp Analysis of Power Iteration for Tensor PCA

Authors: Yuchen Wu, Kangjie Zhou

Abstract: We investigate the power iteration algorithm for the tensor PCA model introduced in Richard and Montanari (2014). Previous work studying the properties of tensor power iteration is either limited to a constant number of iterations, or requires a non-trivial data-independent initialization. In this paper, we move beyond these limitations and analyze the dynamics of randomly initialized tensor power… ▽ More We investigate the power iteration algorithm for the tensor PCA model introduced in Richard and Montanari (2014). Previous work studying the properties of tensor power iteration is either limited to a constant number of iterations, or requires a non-trivial data-independent initialization. In this paper, we move beyond these limitations and analyze the dynamics of randomly initialized tensor power iteration up to polynomially many steps. Our contributions are threefold: First, we establish sharp bounds on the number of iterations required for power method to converge to the planted signal, for a broad range of the signal-to-noise ratios. Second, our analysis reveals that the actual algorithmic threshold for power iteration is smaller than the one conjectured in literature by a polylog(n) factor, where n is the ambient dimension. Finally, we propose a simple and effective stopping criterion for power iteration, which provably outputs a solution that is highly correlated with the true signal. Extensive numerical experiments verify our theoretical results. △ Less

Submitted 2 January, 2024; originally announced January 2024.

Comments: 40 pages, 8 figures

arXiv:2312.17461 [pdf, ps, other]

Gaussian radial basis functions collocation for fractional PDEs: methodology and error analysis

Authors: Xiaochuan Tian, Yixuan Wu, Yanzhi Zhang

Abstract: The paper introduces a new meshfree pseudospectral method based on Gaussian radial basis functions (RBFs) collocation to solve fractional Poisson equations. Hypergeometric functions are used to represent the fractional Laplacian of Gaussian RBFs, enabling an efficient computation of stiffness matrix entries. Unlike existing RBF-based methods, our approach ensures a Toeplitz structure in the stiffn… ▽ More The paper introduces a new meshfree pseudospectral method based on Gaussian radial basis functions (RBFs) collocation to solve fractional Poisson equations. Hypergeometric functions are used to represent the fractional Laplacian of Gaussian RBFs, enabling an efficient computation of stiffness matrix entries. Unlike existing RBF-based methods, our approach ensures a Toeplitz structure in the stiffness matrix with equally spaced RBF centers, enabling efficient matrix-vector multiplications using fast Fourier transforms. We conduct a comprehensive study on the shape parameter selection, addressing challenges related to ill-conditioning and numerical stability. The main contribution of our work includes rigorous stability analysis and error estimates of the Gaussian RBF collocation method, representing a first attempt at the rigorous analysis of RBF-based methods for fractional PDEs to the best of our knowledge. We conduct numerical experiments to validate our analysis and provide practical insights for implementation. △ Less

Submitted 28 December, 2023; originally announced December 2023.

arXiv:2312.17077 [pdf, ps, other]

Projected Langevin Monte Carlo algorithms in non-convex and super-linear setting

Authors: Chenxu Pang, Xiaojie Wang, Yue Wu

Abstract: It is of significant interest in many applications to sample from a high-dimensional target distribution $π$ with the density $π(\text{d} x) \propto e^{-U(x)} (\text{d} x) $, based on the temporal discretization of the Langevin stochastic differential equations (SDEs). In this paper, we propose an explicit projected Langevin Monte Carlo (PLMC) algorithm with non-convex potential $U$ and super-line… ▽ More It is of significant interest in many applications to sample from a high-dimensional target distribution $π$ with the density $π(\text{d} x) \propto e^{-U(x)} (\text{d} x) $, based on the temporal discretization of the Langevin stochastic differential equations (SDEs). In this paper, we propose an explicit projected Langevin Monte Carlo (PLMC) algorithm with non-convex potential $U$ and super-linear gradient of $U$ and investigate the non-asymptotic analysis of its sampling error in total variation distance. Equipped with time-independent regularity estimates for the corresponding Kolmogorov equation, we derive the non-asymptotic bounds on the total variation distance between the target distribution of the Langevin SDEs and the law induced by the PLMC scheme with order $\mathcal{O}(h |\ln h|)$. Moreover, for a given precision $ε$, the smallest number of iterations of the classical Langevin Monte Carlo (LMC) scheme with the non-convex potential $U$ and the globally Lipshitz gradient of $U$ can be guaranteed by order ${\mathcal{O}}\big(\tfrac{d^{3/2}}ε \cdot \ln (\tfrac{d}ε) \cdot \ln (\tfrac{1}ε) \big)$. Numerical experiments are provided to confirm the theoretical findings. △ Less

Submitted 1 January, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

Comments: 31 pages, 6 figures

MSC Class: 60H35; 65C05; 65C30

arXiv:2312.16116 [pdf, other]

On the best convergence rates of lightning plus polynomial approximations

Authors: Shuhuang Xiang, Shunfeng Yang, Yanghao Wu

Abstract: Building on introducing exponentially clustered poles, Trefethen and his collaborators introduced lightning algorithms for approximating functions of singularities. These schemes may achieve root-exponential convergence rates. In particular, based on a specific choice of the parameter of the tapered exponentially clustered poles, the lightning approximation with either a low-degree polynomial basi… ▽ More Building on introducing exponentially clustered poles, Trefethen and his collaborators introduced lightning algorithms for approximating functions of singularities. These schemes may achieve root-exponential convergence rates. In particular, based on a specific choice of the parameter of the tapered exponentially clustered poles, the lightning approximation with either a low-degree polynomial basis may achieve the optimal convergence rate simply as the best rational approximation for prototype $x^α$ on $[0,1]$, which was illustrated through delicate numerical experiments and conjectured in [SIAM J. Numer. Anal., 61:2580-2600, 2023]. By utilizing Poisson's summation formula and results akin to Paley-Wiener Theorem, we rigorously show that all these schemes with a low-degree polynomial basis achieve root-exponential convergence rates with exact orders in approximating $x^α$ for arbitrary clustered parameters theoretically, and provide the best choices of the parameter to achieve the fastest convergence rate for each type of clustered poles, from which the conjecture is confirmed as a special case. Ample numerical evidences demonstrate the optimality and sharpness of the estimates. △ Less

Submitted 16 June, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

Comments: 37pages, 12figures

MSC Class: 41A20; 65E05; 65D15; 30C10

arXiv:2312.15718 [pdf, ps, other]

An Inexact Projected Regularized Newton Method for Fused Zero-norms Regularization Problems

Authors: Yuqia Wu, Shaohua Pan, Xiaoqi Yang

Abstract: We are concerned with structured $\ell_0$-norms regularization problems, with a twice continuously differentiable loss function and a box constraint. This class of problems have a wide range of applications in statistics, machine learning and image processing. To the best of our knowledge, there is no effective algorithm in the literature for solving them. In this paper, we first obtain a polynomi… ▽ More We are concerned with structured $\ell_0$-norms regularization problems, with a twice continuously differentiable loss function and a box constraint. This class of problems have a wide range of applications in statistics, machine learning and image processing. To the best of our knowledge, there is no effective algorithm in the literature for solving them. In this paper, we first obtain a polynomial-time algorithm to find a point in the proximal mapping of the fused $\ell_0$-norms with a box constraint based on dynamic programming principle. We then propose a hybrid algorithm of proximal gradient method and inexact projected regularized Newton method to solve structured $\ell_0$-norms regularization problems. The whole sequence generated by the algorithm is shown to be convergent by virtue of a non-degeneracy condition, a curvature condition and a Kurdyka-Łojasiewicz property. A superlinear convergence rate of the iterates is established under a locally Hölderian error bound condition on a second-order stationary point set, without requiring the local optimality of the limit point. Finally, numerical experiments are conducted to highlight the features of our considered model, and the superiority of our proposed algorithm. △ Less

Submitted 25 December, 2023; originally announced December 2023.

arXiv:2312.13681 [pdf, ps, other]

Irreducible characters and bitrace for the $q$-rook monoid

Authors: Naihuan Jing, Yu Wu, Ning Liu

Abstract: This paper studies irreducible characters of the $q$-rook monoid algebra $R_n(q)$ using the vertex algebraic method. Based on the Frobenius formula for $R_n(q)$, a new iterative character formula is derived with the help of the vertex operator realization of the Schur symmetric function. The same idea also leads to a simple proof of the Murnaghan-Nakayama rule for $R_n(q)$. We also introduce the b… ▽ More This paper studies irreducible characters of the $q$-rook monoid algebra $R_n(q)$ using the vertex algebraic method. Based on the Frobenius formula for $R_n(q)$, a new iterative character formula is derived with the help of the vertex operator realization of the Schur symmetric function. The same idea also leads to a simple proof of the Murnaghan-Nakayama rule for $R_n(q)$. We also introduce the bitrace for the $q$-monoid and obtain a general combinatorial formula for the bitrace, which generalizes the counterpart for the Iwahori-Hecke algebra. The character table of $R_n(q)$ with $|μ|=5$ is listed in the appendix. △ Less

Submitted 21 December, 2023; originally announced December 2023.

Comments: 22 pages

MSC Class: Primary: 20C08; Secondary: 17B69; 05E10

arXiv:2312.12708 [pdf, other]

Gradient flows for empirical Bayes in high-dimensional linear models

Authors: Zhou Fan, Leying Guan, Yandi Shen, Yihong Wu

Abstract: Empirical Bayes provides a powerful approach to learning and adapting to latent structure in data. Theory and algorithms for empirical Bayes have a rich literature for sequence models, but are less understood in settings where latent variables and data interact through more complex designs. In this work, we study empirical Bayes estimation of an i.i.d. prior in Bayesian linear models, via the nonp… ▽ More Empirical Bayes provides a powerful approach to learning and adapting to latent structure in data. Theory and algorithms for empirical Bayes have a rich literature for sequence models, but are less understood in settings where latent variables and data interact through more complex designs. In this work, we study empirical Bayes estimation of an i.i.d. prior in Bayesian linear models, via the nonparametric maximum likelihood estimator (NPMLE). We introduce and study a system of gradient flow equations for optimizing the marginal log-likelihood, jointly over the prior and posterior measures in its Gibbs variational representation using a smoothed reparametrization of the regression coefficients. A diffusion-based implementation yields a Langevin dynamics MCEM algorithm, where the prior law evolves continuously over time to optimize a sequence-model log-likelihood defined by the coordinates of the current Langevin iterate. We show consistency of the NPMLE as $n, p \rightarrow \infty$ under mild conditions, including settings of random sub-Gaussian designs when $n \asymp p$. In high noise, we prove a uniform log-Sobolev inequality for the mixing of Langevin dynamics, for possibly misspecified priors and non-log-concave posteriors. We then establish polynomial-time convergence of the joint gradient flow to a near-NPMLE if the marginal negative log-likelihood is convex in a sub-level set of the initialization. △ Less

Submitted 19 December, 2023; originally announced December 2023.

arXiv:2312.08781 [pdf, ps, other]

The von Bahr-Esseen type inequality under sub-linear expectations and applications

Authors: Yi Wu, Xuejun Wang

Abstract: Moment inequalities play important roles in probability limit theory and mathematical statistics. In this work, the von Bahr-Esseen type inequality for extended negatively dependent random variables under sub-linear expectations is established successfully. By virtue of the inequality, we further obtain the Kolmogorov type weak law of large numbers for partial sums and the complete convergence for… ▽ More Moment inequalities play important roles in probability limit theory and mathematical statistics. In this work, the von Bahr-Esseen type inequality for extended negatively dependent random variables under sub-linear expectations is established successfully. By virtue of the inequality, we further obtain the Kolmogorov type weak law of large numbers for partial sums and the complete convergence for weighted sums, which extend and improve corresponding results in sub-linear expectation space. △ Less

Submitted 14 December, 2023; originally announced December 2023.

Comments: 18 pages

arXiv:2312.06980 [pdf, other]

SPFNO: Spectral operator learning for PDEs with Dirichlet and Neumann boundary conditions

Authors: Ziyuan Liu, Yuhang Wu, Daniel Zhengyu Huang, Hong Zhang, Xu Qian, Songhe Song

Abstract: Neural operators have been validated as promising deep surrogate models for solving partial differential equations (PDEs). Despite the critical role of boundary conditions in PDEs, however, only a limited number of neural operators robustly enforce these conditions. In this paper we introduce semi-periodic Fourier neural operator (SPFNO), a novel spectral operator learning method, to learn the tar… ▽ More Neural operators have been validated as promising deep surrogate models for solving partial differential equations (PDEs). Despite the critical role of boundary conditions in PDEs, however, only a limited number of neural operators robustly enforce these conditions. In this paper we introduce semi-periodic Fourier neural operator (SPFNO), a novel spectral operator learning method, to learn the target operators of PDEs with non-periodic BCs. This method extends our previous work (arXiv:2206.12698), which showed significant improvements by employing enhanced neural operators that precisely satisfy the boundary conditions. However, the previous work is associated with Gaussian grids, restricting comprehensive comparisons across most public datasets. Additionally, we present numerical results for various PDEs such as the viscous Burgers' equation, Darcy flow, incompressible pipe flow, and coupled reactiondiffusion equations. These results demonstrate the computational efficiency, resolution invariant property, and BC-satisfaction behavior of proposed model. An accuracy improvement of approximately 1.7X-4.7X over the non-BC-satisfying baselines is also achieved. Furthermore, our studies on SOL underscore the significance of satisfying BCs as a criterion for deep surrogate models of PDEs. △ Less

Submitted 11 December, 2023; originally announced December 2023.

arXiv:2312.05534 [pdf, ps, other]

Extended codes and deep holes of MDS codes

Authors: Yansheng Wu, Cunsheng Ding, Tingfang Chen

Abstract: For a given linear code $\C$ of length $n$ over $\gf(q)$ and a nonzero vector $\bu$ in $\gf(q)^n$, Sun, Ding and Chen defined an extended linear code $\overline{\C}(\bu)$ of $\C$, which is a generalisation of the classical extended code $\overline{\C}(-\bone)$ of $\C$ and called the second kind of an extended code of $\C$ (see arXiv:2307.04076 and arXiv:2307.08053). They developed some general the… ▽ More For a given linear code $\C$ of length $n$ over $\gf(q)$ and a nonzero vector $\bu$ in $\gf(q)^n$, Sun, Ding and Chen defined an extended linear code $\overline{\C}(\bu)$ of $\C$, which is a generalisation of the classical extended code $\overline{\C}(-\bone)$ of $\C$ and called the second kind of an extended code of $\C$ (see arXiv:2307.04076 and arXiv:2307.08053). They developed some general theory of the extended codes $\overline{\C}(\bu)$ and studied the extended codes $\overline{\C}(\bu)$ of several families of linear codes, including cyclic codes, projective two-weight codes, nonbinary Hamming codes, and a family of reversible MDS cyclic codes. The objective of this paper is to investigate the extended codes $\overline{\C}(\bu)$ of MDS codes $\C$ over finite fields. The main result of this paper is that the extended code $\overline{\C}(\bu)$ of an MDS $[n,k]$ code $\C$ remains MDS if and only if the covering radius $ρ(\mathcal{C}^{\bot})=k$ and the vector $\bu$ is a deep hole of the dual code $\C^\perp$. As applications of this main result, the extended codes of the GRS codes and extended GRS codes are investigated and the covering radii of several families of MDS codes are determined. △ Less

Submitted 9 December, 2023; originally announced December 2023.

Comments: 22 pages, submitted for possible publication

arXiv:2311.09119 [pdf, ps, other]

A high-order local discontinuous Galerkin method for the $p$-Laplace equation

Authors: Yue Wu, Yan Xu

Abstract: We study the high-order local discontinuous Galerkin (LDG) method for the $p$-Laplace equation. We reformulate our spatial discretization as an equivalent convex minimization problem and use a preconditioned gradient descent method as the nonlinear solver. For the first time, a weighted preconditioner that provides $hk$-independent convergence is applied in the LDG setting. For polynomial order… ▽ More We study the high-order local discontinuous Galerkin (LDG) method for the $p$-Laplace equation. We reformulate our spatial discretization as an equivalent convex minimization problem and use a preconditioned gradient descent method as the nonlinear solver. For the first time, a weighted preconditioner that provides $hk$-independent convergence is applied in the LDG setting. For polynomial order $k \geqslant 1$, we rigorously establish the solvability of our scheme and provide a priori error estimates in a mesh-dependent energy norm. Our error estimates are under a different and non-equivalent distance from existing LDG results. For arbitrarily high-order polynomials under the assumption that the exact solution has enough regularity, the error estimates demonstrate the potential for high-order accuracy. Our numerical results exhibit the desired convergence speed facilitated by the preconditioner, and we observe best convergence rates in gradient variables in alignment with linear LDG, and optimal rates in the primal variable when $1 < p \leqslant 2$. △ Less

Submitted 15 November, 2023; originally announced November 2023.

Comments: 36 pages, 36 figures

MSC Class: 65N30; 65N15; 65N22; 65N12

Showing 1–50 of 583 results for author: Wu, Y