-
Provably Efficient Posterior Sampling for Sparse Linear Regression via Measure Decomposition
Authors:
Andrea Montanari,
Yuchen Wu
Abstract:
We consider the problem of sampling from the posterior distribution of a $d$-dimensional coefficient vector $\boldsymbolθ$, given linear observations $\boldsymbol{y} = \boldsymbol{X}\boldsymbolθ+\boldsymbol{\varepsilon}$. In general, such posteriors are multimodal, and therefore challenging to sample from. This observation has prompted the exploration of various heuristics that aim at approximatin…
▽ More
We consider the problem of sampling from the posterior distribution of a $d$-dimensional coefficient vector $\boldsymbolθ$, given linear observations $\boldsymbol{y} = \boldsymbol{X}\boldsymbolθ+\boldsymbol{\varepsilon}$. In general, such posteriors are multimodal, and therefore challenging to sample from. This observation has prompted the exploration of various heuristics that aim at approximating the posterior distribution.
In this paper, we study a different approach based on decomposing the posterior distribution into a log-concave mixture of simple product measures. This decomposition allows us to reduce sampling from a multimodal distribution of interest to sampling from a log-concave one, which is tractable and has been investigated in detail. We prove that, under mild conditions on the prior, for random designs, such measure decomposition is generally feasible when the number of samples per parameter $n/d$ exceeds a constant threshold. We thus obtain a provably efficient (polynomial time) sampling algorithm in a regime where this was previously not known. Numerical simulations confirm that the algorithm is practical, and reveal that it has attractive statistical properties compared to state-of-the-art methods.
△ Less
Submitted 27 June, 2024;
originally announced June 2024.
-
A projected Euler Method for Random Periodic Solutions of Semi-linear SDEs with non-globally Lipschitz coefficients
Authors:
Yujia Guo,
Xiaojie Wang,
Yue Wu
Abstract:
The present work introduces and investigates an explicit time discretization scheme, called the projected Euler method, to numerically approximate random periodic solutions of semi-linear SDEs under non-globally Lipschitz conditions. The existence of the random periodic solution is demonstrated as the limit of the pull-back of the discretized SDE. Without relying on a priori high-order moment boun…
▽ More
The present work introduces and investigates an explicit time discretization scheme, called the projected Euler method, to numerically approximate random periodic solutions of semi-linear SDEs under non-globally Lipschitz conditions. The existence of the random periodic solution is demonstrated as the limit of the pull-back of the discretized SDE. Without relying on a priori high-order moment bounds of the numerical approximations, the mean square convergence rate is proved to be order 0.5 for SDEs with multiplicative noise and order 1 for SDEs with additive noise. Numerical examples are also provided to validate our theoretical findings.
△ Less
Submitted 27 June, 2024; v1 submitted 23 June, 2024;
originally announced June 2024.
-
On the Weisfeiler-Leman dimension of circulant graphs
Authors:
Yulai Wu,
Ilia Ponomarenko
Abstract:
A circulant graph is a Cayley graph of a finite cyclic group. The Weisfeiler-Leman-dimension of a circulant graph $X$ with respect to the class of all circulant graphs is the smallest positive integer~$m$ such that the $m$-dimensional Weisfeiler-Leman algorithm correctly tests the isomorphism between $X$ and any other circulant graph. It is proved that for a circulant graph of order $n$ this dimen…
▽ More
A circulant graph is a Cayley graph of a finite cyclic group. The Weisfeiler-Leman-dimension of a circulant graph $X$ with respect to the class of all circulant graphs is the smallest positive integer~$m$ such that the $m$-dimensional Weisfeiler-Leman algorithm correctly tests the isomorphism between $X$ and any other circulant graph. It is proved that for a circulant graph of order $n$ this dimension is less than or equal to $Ω(n)+3$, where $Ω(n)$ is the number of prime divisors of~$n$.
△ Less
Submitted 22 June, 2024;
originally announced June 2024.
-
Some non-Archimedean pluripotential theory on polarized affine cones
Authors:
Yueqiao Wu
Abstract:
We undertake a preliminary step towards studying non-Archimedean pluripotential theory on polarized affine cones over a trivially valued field. We study plurisubharmonic functions and the Monge--Ampère operator defined on the finite energy class, partially generalizing a result of Boucksom--Jonsson on projective varieties.
We undertake a preliminary step towards studying non-Archimedean pluripotential theory on polarized affine cones over a trivially valued field. We study plurisubharmonic functions and the Monge--Ampère operator defined on the finite energy class, partially generalizing a result of Boucksom--Jonsson on projective varieties.
△ Less
Submitted 19 June, 2024;
originally announced June 2024.
-
Petty projection inequality on the sphere and on the hyperbolic space
Authors:
Y. Lin,
Y. Wu
Abstract:
Using gnomonic projection and Poincaré model, we first define the spherical projection body and hyperbolic projection body in spherical space $\mathbb{S}^n$ and hyperbolic space $\mathbb{H}^n$, then define the spherical Steiner symmetrization and hyperbolic Steiner symmetrization, finally prove the spherical projection inequality and hyperbolic projection inequality.
Using gnomonic projection and Poincaré model, we first define the spherical projection body and hyperbolic projection body in spherical space $\mathbb{S}^n$ and hyperbolic space $\mathbb{H}^n$, then define the spherical Steiner symmetrization and hyperbolic Steiner symmetrization, finally prove the spherical projection inequality and hyperbolic projection inequality.
△ Less
Submitted 19 June, 2024;
originally announced June 2024.
-
The parity of Lusztig's restriction functor and Green's formula for a quiver with automorphism
Authors:
Jiepeng Fang,
Yixin Lan,
Yumeng Wu
Abstract:
In [8], Fang-Lan-Xiao proved a formula about Lusztig's induction and restriction functors which can induce Green's formula for the path algebra of a quiver over a finite field via the trace map. In this paper, we generalize their formula to that for the mixed semisimple perverse sheaves for a quiver with an automorphism. By applying the trace map, we obtain Green's formula for any finite-dimension…
▽ More
In [8], Fang-Lan-Xiao proved a formula about Lusztig's induction and restriction functors which can induce Green's formula for the path algebra of a quiver over a finite field via the trace map. In this paper, we generalize their formula to that for the mixed semisimple perverse sheaves for a quiver with an automorphism. By applying the trace map, we obtain Green's formula for any finite-dimensional hereditary algebra over a finite field.
△ Less
Submitted 5 June, 2024;
originally announced June 2024.
-
The Boolean polynomial polytope with multiple choice constraints
Authors:
Sihong Shao,
Yishan Wu
Abstract:
We consider a class of $0$-$1$ polynomial programming termed multiple choice polynomial programming (MCPP) where the constraint requires exact one component per subset of the partition to be $1$ after all the entries are partitioned. Compared to the unconstrained counterpart, there are few polyhedral studies of MCPP in general form. This paper serves as the first attempt to propose a polytope asso…
▽ More
We consider a class of $0$-$1$ polynomial programming termed multiple choice polynomial programming (MCPP) where the constraint requires exact one component per subset of the partition to be $1$ after all the entries are partitioned. Compared to the unconstrained counterpart, there are few polyhedral studies of MCPP in general form. This paper serves as the first attempt to propose a polytope associated with a hypergraph to study MCPP, which is the convex hull of $0$-$1$ vectors satisfying multiple choice constraints and production constraints. With the help of the decomposability property, we obtain an explicit half-space representation of the MCPP polytope when the underlying hypergraph is $α$-acyclic by induction on the number of hyperedges, which is an analogy of the acyclicity results on the multilinear polytope by Del Pia and Khajavirad (SIAM J Optim 28 (2018) 1049) when the hypergraph is $γ$-acyclic. We also present a necessary and sufficient condition for the inequalities lifted from the facet-inducing ones for the multilinear polytope to be still facet-inducing for the MCPP polytope. This result covers the particular cases by Bärmann, Martin and Schneider (SIAM J Optim 33 (2023) 2909).
△ Less
Submitted 19 June, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
-
Capillary Surfaces in Manifolds with Nonnegative Scalar Curvature and Strictly Mean Convex Boundary
Authors:
Yujie Wu
Abstract:
In this paper we use stable capillary surfaces (analogous to the $μ$-bubble construction) to study manifolds with strictly mean convex boundary and nonnegative scalar curvature. We give an obstruction to filling 2-manifolds by such 3-manifolds based on the Urysohn width. We also obtain a bandwidth estimate and establish other geometric properties of such manifolds.
In this paper we use stable capillary surfaces (analogous to the $μ$-bubble construction) to study manifolds with strictly mean convex boundary and nonnegative scalar curvature. We give an obstruction to filling 2-manifolds by such 3-manifolds based on the Urysohn width. We also obtain a bandwidth estimate and establish other geometric properties of such manifolds.
△ Less
Submitted 7 May, 2024;
originally announced May 2024.
-
Multi-Agent Coverage Control on Surfaces Using Conformal Mapping
Authors:
Chao Zhai,
Yuming Wu
Abstract:
Real-time environmental monitoring using a multi-agent system (MAS) has long been a focal point of cooperative control. It is still a challenging task to provide cost-effective services for potential emergencies in surface environments. This paper explores the transformation of a general surface into a two-dimensional (2D) disk through the construction of a conformal mapping. Multiple agents are s…
▽ More
Real-time environmental monitoring using a multi-agent system (MAS) has long been a focal point of cooperative control. It is still a challenging task to provide cost-effective services for potential emergencies in surface environments. This paper explores the transformation of a general surface into a two-dimensional (2D) disk through the construction of a conformal mapping. Multiple agents are strategically deployed within the mapped convex disk, followed by mapping back to the original surface environment. This approach circumvents the complexities associated with handling the difficulties and intricacies of path planning. Technical analysis encompasses the design of distributed control laws and the method to eliminate distortions introduced by the mapping. Moreover, the developed coverage algorithm is applied to a scenario of monitoring surface deformation. Finally, the effectiveness of the proposed algorithm is validated through numerical simulations.
△ Less
Submitted 3 May, 2024;
originally announced May 2024.
-
Error estimates of a regularized finite difference method for the Logarithmic Schrödinger equation with Dirac delta potential
Authors:
Xuanxuan Zhou,
Tingchun Wang,
Yong Wu,
Yongyong Cai
Abstract:
In this paper, we introduce a conservative Crank-Nicolson-type finite difference schemes for the regularized logarithmic Schrödinger equation (RLSE) with Dirac delta potential in 1D. The regularized logarithmic Schrödinger equation with a small regularized parameter $0<\eps \ll 1$ is adopted to approximate the logarithmic Schrödinger equation (LSE) with linear convergence rate $O(\eps)$. The numer…
▽ More
In this paper, we introduce a conservative Crank-Nicolson-type finite difference schemes for the regularized logarithmic Schrödinger equation (RLSE) with Dirac delta potential in 1D. The regularized logarithmic Schrödinger equation with a small regularized parameter $0<\eps \ll 1$ is adopted to approximate the logarithmic Schrödinger equation (LSE) with linear convergence rate $O(\eps)$. The numerical method can be used to avoid numerical blow-up and/or to suppress round-off error due to the logarithmic nonlinearity in LSE. Then, by using domain-decomposition technique, we can transform the original problem into an interface problem. Different treatments on the interface conditions lead to different discrete schemes and it turns out that a simple discrete approximation of the Dirac potential coincides with one of the conservative finite difference schemes. The optimal $H^1$ error estimates and the conservative properties of the finite difference schemes are investigated. The Crank-Nicolson finite difference methods enjoy the second-order convergence rate in time and space. Numerical examples are provided to support our analysis and show the accuracy and efficiency of the numerical method.
△ Less
Submitted 24 April, 2024;
originally announced April 2024.
-
Prediction from compression for models with infinite memory, with applications to hidden Markov and renewal processes
Authors:
Yanjun Han,
Tianze Jiang,
Yihong Wu
Abstract:
Consider the problem of predicting the next symbol given a sample path of length n, whose joint distribution belongs to a distribution class that may have long-term memory. The goal is to compete with the conditional predictor that knows the true model. For both hidden Markov models (HMMs) and renewal processes, we determine the optimal prediction risk in Kullback- Leibler divergence up to univers…
▽ More
Consider the problem of predicting the next symbol given a sample path of length n, whose joint distribution belongs to a distribution class that may have long-term memory. The goal is to compete with the conditional predictor that knows the true model. For both hidden Markov models (HMMs) and renewal processes, we determine the optimal prediction risk in Kullback- Leibler divergence up to universal constant factors. Extending existing results in finite-order Markov models [HJW23] and drawing ideas from universal compression, the proposed estimator has a prediction risk bounded by redundancy of the distribution class and a memory term that accounts for the long-range dependency of the model. Notably, for HMMs with bounded state and observation spaces, a polynomial-time estimator based on dynamic programming is shown to achieve the optimal prediction risk Θ(log n/n); prior to this work, the only known result of this type is O(1/log n) obtained using Markov approximation [Sha+18]. Matching minimax lower bounds are obtained by making connections to redundancy and mutual information via a reduction argument.
△ Less
Submitted 23 April, 2024;
originally announced April 2024.
-
Multivariate confluent Vandermonde with G-Arnoldi and applications
Authors:
Lei-Hong Zhang,
Ya-Nan Zhang,
Linyi Yang,
Yifu Wu
Abstract:
In the least-squares fitting framework, the Vandermonde with Arnoldi (V+A) method presented in [Brubeck, Nakatsukasa, and Trefethen, SIAM Review, 63 (2021), pp. 405-415] is an effective approach to compute a polynomial that approximates an underlying univariate function f. Extensions of V+A include its multivariate version and the univariate confluent V+A; the latter enables us to use the informat…
▽ More
In the least-squares fitting framework, the Vandermonde with Arnoldi (V+A) method presented in [Brubeck, Nakatsukasa, and Trefethen, SIAM Review, 63 (2021), pp. 405-415] is an effective approach to compute a polynomial that approximates an underlying univariate function f. Extensions of V+A include its multivariate version and the univariate confluent V+A; the latter enables us to use the information of the derivative of f in obtaining the approximation polynomial. In this paper, we shall extend V+A further to the multivariate confluent V+A. Besides the technical generalization of the univariate confluent V+A, we also introduce a general and application-dependent G-orthogonalization in the Arnoldi process. We shall demonstrate with several applications that, by specifying an application-related G-inner product, the desired approximate multivariate polynomial as well as its certain partial derivatives can be computed accurately from a well-conditioned least-squares problem whose coefficient matrix is orthonormal. The desired multivariate polynomial is represented in a discrete G-orthogonal polynomials basis which admits an explicit recurrence, and therefore, facilitates evaluating function values and certain partial derivatives at new nodes efficiently. We demonstrate its flexibility by applying it to solve the multivariate Hermite least-squares problem and PDEs with various boundary conditions in irregular domains.
△ Less
Submitted 14 April, 2024;
originally announced April 2024.
-
On the best approximation by finite Gaussian mixtures
Authors:
Yun Ma,
Yihong Wu,
Pengkun Yang
Abstract:
We consider the problem of approximating a general Gaussian location mixture by finite mixtures. The minimum order of finite mixtures that achieve a prescribed accuracy (measured by various $f$-divergences) is determined within constant factors for the family of mixing distributions with compactly support or appropriate assumptions on the tail probability including subgaussian and subexponential.…
▽ More
We consider the problem of approximating a general Gaussian location mixture by finite mixtures. The minimum order of finite mixtures that achieve a prescribed accuracy (measured by various $f$-divergences) is determined within constant factors for the family of mixing distributions with compactly support or appropriate assumptions on the tail probability including subgaussian and subexponential. While the upper bound is achieved using the technique of local moment matching, the lower bound is established by relating the best approximation error to the low-rank approximation of certain trigonometric moment matrices, followed by a refined spectral analysis of their minimum eigenvalue. In the case of Gaussian mixing distributions, this result corrects a previous lower bound in [Allerton Conference 48 (2010) 620-628].
△ Less
Submitted 13 April, 2024;
originally announced April 2024.
-
Why does the two-timescale Q-learning converge to different mean field solutions? A unified convergence analysis
Authors:
Jing An,
Jianfeng Lu,
Yue Wu,
Yang Xiang
Abstract:
We revisit the unified two-timescale Q-learning algorithm as initially introduced by Angiuli et al. \cite{angiuli2022unified}. This algorithm demonstrates efficacy in solving mean field game (MFG) and mean field control (MFC) problems, simply by tuning the ratio of two learning rates for mean field distribution and the Q-functions respectively. In this paper, we provide a comprehensive theoretical…
▽ More
We revisit the unified two-timescale Q-learning algorithm as initially introduced by Angiuli et al. \cite{angiuli2022unified}. This algorithm demonstrates efficacy in solving mean field game (MFG) and mean field control (MFC) problems, simply by tuning the ratio of two learning rates for mean field distribution and the Q-functions respectively. In this paper, we provide a comprehensive theoretical explanation of the algorithm's bifurcated numerical outcomes under fixed learning rates. We achieve this by establishing a diagram that correlates continuous-time mean field problems to their discrete-time Q-function counterparts, forming the basis of the algorithm. Our key contribution lies in the construction of a Lyapunov function integrating both mean field distribution and Q-function iterates. This Lyapunov function facilitates a unified convergence of the algorithm across the entire spectrum of learning rates, thus providing a cohesive framework for analysis.
△ Less
Submitted 28 May, 2024; v1 submitted 5 April, 2024;
originally announced April 2024.
-
Maffei's action and symplectic Springer action for quiver varieties
Authors:
Yaochen Wu
Abstract:
We examine the relationship between the actions of two Weyl groups on the cohomology of a smooth quiver variety: the Maffei's action of the Weyl group associated to the quiver, and the symplectic Springer action of the Namikawa-Weyl group of the affine quiver variety. We show there is a natural map from the former group to the latter, which is an embedding in favorable situations, and this map int…
▽ More
We examine the relationship between the actions of two Weyl groups on the cohomology of a smooth quiver variety: the Maffei's action of the Weyl group associated to the quiver, and the symplectic Springer action of the Namikawa-Weyl group of the affine quiver variety. We show there is a natural map from the former group to the latter, which is an embedding in favorable situations, and this map intertwines their actions on the cohomology. This answers a question raised in arXiv:1904.10497.
△ Less
Submitted 28 March, 2024;
originally announced March 2024.
-
Existence of solutions for a class of Kirchhoff-type equations with indefinite potential
Authors:
Linlian Xiao,
Jiaqian Yuan,
Jian Zhou,
Yunshun Wu
Abstract:
In this paper, we consider the existence of solutions of the following Kirchhoff-type problem \[
\left\{
\begin{array}
[c]{ll}
-\left(a+b\int_{\mathbb{R}^3}|\nabla u|^2dx\right)Δu+ V(x)u=f(x,u),~{\rm{in}}~ \mathbb{R}^{3},\\
u\in H^1(\mathbb{R}^3),
\end{array} \right. \] where $a,b$ are postive constants, and the potential $V(x)$ is continuous and indefinite in sign. Under some suitable…
▽ More
In this paper, we consider the existence of solutions of the following Kirchhoff-type problem \[
\left\{
\begin{array}
[c]{ll}
-\left(a+b\int_{\mathbb{R}^3}|\nabla u|^2dx\right)Δu+ V(x)u=f(x,u),~{\rm{in}}~ \mathbb{R}^{3},\\
u\in H^1(\mathbb{R}^3),
\end{array} \right. \] where $a,b$ are postive constants, and the potential $V(x)$ is continuous and indefinite in sign. Under some suitable assumptions on $V(x)$ and $f$, we obtain the existence of solutions by the Symmetric Mountain Pass Theorem.
△ Less
Submitted 28 March, 2024;
originally announced March 2024.
-
Locally-primitive block designs
Authors:
Jianfu Chen,
Peice Hua,
Cai Heng Li,
Yanni Wu
Abstract:
A locally-primitive design is a block design $(\mathcal{P},\mathcal{B})$ which admits an automorphism group $G$ with primitive local actions. It is proved that $G$ is primitive on the points $\mathcal{P}$, and either $G$ is an almost simple group, or $G$ acting on $\mathcal{P}$ is an affine group.
A locally-primitive design is a block design $(\mathcal{P},\mathcal{B})$ which admits an automorphism group $G$ with primitive local actions. It is proved that $G$ is primitive on the points $\mathcal{P}$, and either $G$ is an almost simple group, or $G$ acting on $\mathcal{P}$ is an affine group.
△ Less
Submitted 25 March, 2024;
originally announced March 2024.
-
The cubic nonlinear Schrödinger equation with rough potential
Authors:
Norbert J. Mauser,
Yifei Wu,
Xiaofei Zhao
Abstract:
We consider the cubic nonlinear Schrödinger equation with a spatially rough potential, a key equation in the mathematical setup for nonlinear Anderson localization. Our study comprises two main parts: new optimal results on the well-posedness analysis on the PDE level, and subsequently a new efficient numerical method, its convergence analysis and simulations that illustrate our analytical results…
▽ More
We consider the cubic nonlinear Schrödinger equation with a spatially rough potential, a key equation in the mathematical setup for nonlinear Anderson localization. Our study comprises two main parts: new optimal results on the well-posedness analysis on the PDE level, and subsequently a new efficient numerical method, its convergence analysis and simulations that illustrate our analytical results. In the analysis part, our results focus on understanding how the regularity of the solution is influenced by the regularity of the potential, where we provide quantitative and explicit characterizations. Ill-posedness results are also established to demonstrate the sharpness of the obtained regularity characterizations and to indicate the minimum regularity required from the potential for the NLS to be solvable. Building upon the obtained regularity results, we design an appropriate numerical discretization for the model and establish its convergence with an optimal error bound. The numerical experiments in the end not only verify the theoretical regularity results, but also confirm the established convergence rate of the proposed scheme. Additionally, a comparison with other existing schemes is conducted to demonstrate the better accuracy of our new scheme in the case of a rough potential.
△ Less
Submitted 25 March, 2024;
originally announced March 2024.
-
Nonlinear Stability for the Superposition of Viscous Contact Wave and Rarefaction Waves to Non-isentropic Compressible Navier-Stokes System with General Initial Perturbations
Authors:
Yi Peng,
Xiaoding Shi,
Yuhang Wu
Abstract:
In this paper, the large time behavior of the solutions for the Cauchy problem to the one-dimensional compressible Navier-Stokes system with the motion of a viscous heat-conducting perfect polytropic gas is investigated.Our result shows that the combination of a viscous contact wave with rarefaction waves is asymptotically stable, when the large initial disturbance of the density, velocity and tem…
▽ More
In this paper, the large time behavior of the solutions for the Cauchy problem to the one-dimensional compressible Navier-Stokes system with the motion of a viscous heat-conducting perfect polytropic gas is investigated.Our result shows that the combination of a viscous contact wave with rarefaction waves is asymptotically stable, when the large initial disturbance of the density, velocity and temperature belong to $H^{1}(\mathbb{R})$, $L^{2}(\mathbb{R})\cap L^{4}(\mathbb{R})$ and $L^{2}(\mathbb{R})$, provided the strength of the combination waves is suitably small. In addition, the initial disturbance on the derivation of velocity and temperature belong to $L^{2}(\mathbb{R})$ can be arbitrarily large.
△ Less
Submitted 22 March, 2024;
originally announced March 2024.
-
Probabilistic reachable sets of stochastic nonlinear systems with contextual uncertainties
Authors:
Xun Shen,
Ye Wang,
Kazumune Hashimoto,
Yuhu Wu,
Sebastien Gros
Abstract:
Validating and controlling safety-critical systems in uncertain environments necessitates probabilistic reachable sets of future state evolutions. The existing methods of computing probabilistic reachable sets normally assume that the uncertainties are independent of the state. However, this assumption falls short in many real-world applications, where uncertainties are state-dependent, referred t…
▽ More
Validating and controlling safety-critical systems in uncertain environments necessitates probabilistic reachable sets of future state evolutions. The existing methods of computing probabilistic reachable sets normally assume that the uncertainties are independent of the state. However, this assumption falls short in many real-world applications, where uncertainties are state-dependent, referred to as contextual uncertainties. This paper formulates the problem of computing probabilistic reachable sets of stochastic nonlinear states with contextual uncertainties by seeking minimum-volume polynomial sublevel sets with contextual chance constraints. The formulated problem cannot be solved by the existing sample-based approximation method since the existing methods do not consider the conditional probability densities. To address this, we propose a consistent sample approximation of the original problem by leveraging the conditional density estimation and resampling. The obtained approximate problem is a tractable optimization problem. Additionally, we prove the almost uniform convergence of the proposed sample-based approximation, showing that it gives the optimal solution almost consistently with the original ones. Through a numerical example, we evaluate the effectiveness of the proposed method against existing approaches, highlighting its capability to significantly reduce the bias inherent in sample-based approximation without considering a conditional probability density.
△ Less
Submitted 18 March, 2024;
originally announced March 2024.
-
Formalization of Complexity Analysis of the First-order Optimization Algorithms
Authors:
Chenyi Li,
Ziyu Wang,
Wanyi He,
Yuxuan Wu,
Shengyang Xu,
Zaiwen Wen
Abstract:
The convergence rate of various first-order optimization algorithms is a pivotal concern within the numerical optimization community, as it directly reflects the efficiency of these algorithms across different optimization problems. Our goal is making a significant step forward in the formal mathematical representation of optimization techniques using the Lean4 theorem prover. We first formalize t…
▽ More
The convergence rate of various first-order optimization algorithms is a pivotal concern within the numerical optimization community, as it directly reflects the efficiency of these algorithms across different optimization problems. Our goal is making a significant step forward in the formal mathematical representation of optimization techniques using the Lean4 theorem prover. We first formalize the gradient for smooth functions and the subgradient for convex functions on a Hilbert space, laying the groundwork for the accurate formalization of algorithmic structures. Then, we extend our contribution by proving several properties of differentiable convex functions that have not yet been formalized in Mathlib. Finally, a comprehensive formalization of these algorithms is presented. These developments are not only noteworthy on their own but also serve as essential precursors to the formalization of a broader spectrum of numerical algorithms and their applications in machine learning as well as many other areas.
△ Less
Submitted 19 March, 2024; v1 submitted 17 March, 2024;
originally announced March 2024.
-
Snevily's Conjecture about $\mathcal{L}$-intersecting Families on Set Systems and its Analogue on Vector Spaces
Authors:
Jiuqiang Liu,
Guihai Yu,
Lihua Feng,
Yongjiang Wu
Abstract:
The classical Erdős-Ko-Rado theorem on the size of an intersecting family of $k$-subsets of the set $[n] = \{1, 2, \dots, n\}$ is one of the fundamental intersection theorems for set systems. After the establishment of the EKR theorem, many intersection theorems on set systems have appeared in the literature, such as the well-known Frankl-Wilson theorem, Alon-Babai-Suzuki theorem, and Grolmusz-Sud…
▽ More
The classical Erdős-Ko-Rado theorem on the size of an intersecting family of $k$-subsets of the set $[n] = \{1, 2, \dots, n\}$ is one of the fundamental intersection theorems for set systems. After the establishment of the EKR theorem, many intersection theorems on set systems have appeared in the literature, such as the well-known Frankl-Wilson theorem, Alon-Babai-Suzuki theorem, and Grolmusz-Sudakov theorem. In 1995, Snevily proposed the conjecture that the upper bound for the size of an $\mathcal{L}$-intersecting family of subsets of $[n]$ is ${{n} \choose {s}}$ under the condition $\max \{l_{i}\} < \min \{k_{j}\}$, where $\mathcal{L} = \{l_{1}, \dots, l_{s}\}$ with $0 \leq l_{1} < \cdots < l_{s}$ and $k_{j}$ are subset sizes in the family. In this paper, we prove that Snevily's conjecture holds for $n \geq {k^{2} \choose {l_{1}+1}}s + l_{1}$, where $k$ is the maximum subset size in the family. We then derive an analogous result for $\mathcal{L}$-intersecting families of subspaces of an $n$-dimensional vector space over a finite field $\mathbb{F}_{q}$.
△ Less
Submitted 6 March, 2024;
originally announced March 2024.
-
Failures and Successes of Cross-Validation for Early-Stopped Gradient Descent
Authors:
Pratik Patil,
Yuchen Wu,
Ryan J. Tibshirani
Abstract:
We analyze the statistical properties of generalized cross-validation (GCV) and leave-one-out cross-validation (LOOCV) applied to early-stopped gradient descent (GD) in high-dimensional least squares regression. We prove that GCV is generically inconsistent as an estimator of the prediction risk of early-stopped GD, even for a well-specified linear model with isotropic features. In contrast, we sh…
▽ More
We analyze the statistical properties of generalized cross-validation (GCV) and leave-one-out cross-validation (LOOCV) applied to early-stopped gradient descent (GD) in high-dimensional least squares regression. We prove that GCV is generically inconsistent as an estimator of the prediction risk of early-stopped GD, even for a well-specified linear model with isotropic features. In contrast, we show that LOOCV converges uniformly along the GD trajectory to the prediction risk. Our theory requires only mild assumptions on the data distribution and does not require the underlying regression function to be linear. Furthermore, by leveraging the individual LOOCV errors, we construct consistent estimators for the entire prediction error distribution along the GD trajectory and consistent estimators for a wide class of error functionals. This in particular enables the construction of pathwise prediction intervals based on GD iterates that have asymptotically correct nominal coverage conditional on the training data.
△ Less
Submitted 26 February, 2024;
originally announced February 2024.
-
Port-Hamiltonian modeling and control of a curling HASEL actuator
Authors:
Nelson Cisneros,
Yongxin Wu,
Kanty Rabenorosoa,
Yann Le Gorrec
Abstract:
This paper is concerned with the modeling and control of a curling Hydraulically Amplified Self-healing Electrostatic (HASEL) actuator using the port-Hamiltonian (PH) approach. For that purpose, we use a modular approach and consider the HASEL actuator as an interconnection of elementary subsystems. Each subsystem is modeled by an electrical component consisting of a capacitor in parallel with an…
▽ More
This paper is concerned with the modeling and control of a curling Hydraulically Amplified Self-healing Electrostatic (HASEL) actuator using the port-Hamiltonian (PH) approach. For that purpose, we use a modular approach and consider the HASEL actuator as an interconnection of elementary subsystems. Each subsystem is modeled by an electrical component consisting of a capacitor in parallel with an inductor connected through the conservation of volume of the moving liquid to a mechanical structure based on inertia, linear, and torsional springs. The parameters are then identified, and the model is validated on the experimental setup. Position control is achieved by using Interconnection and Damping Assignment-Passivity Based Control (IDA-PBC) with integral action (IA) for disturbance rejection. Simulation results show the efficiency of the proposed controller.
△ Less
Submitted 19 February, 2024;
originally announced February 2024.
-
Sharp Information-Theoretic Thresholds for Shuffled Linear Regression
Authors:
Leon Lufkin,
Yihong Wu,
Jiaming Xu
Abstract:
This paper studies the problem of shuffled linear regression, where the correspondence between predictors and responses in a linear model is obfuscated by a latent permutation. Specifically, we consider the model $y = Π_* X β_* + w$, where $X$ is an $n \times d$ standard Gaussian design matrix, $w$ is Gaussian noise with entrywise variance $σ^2$, $Π_*$ is an unknown $n \times n$ permutation matrix…
▽ More
This paper studies the problem of shuffled linear regression, where the correspondence between predictors and responses in a linear model is obfuscated by a latent permutation. Specifically, we consider the model $y = Π_* X β_* + w$, where $X$ is an $n \times d$ standard Gaussian design matrix, $w$ is Gaussian noise with entrywise variance $σ^2$, $Π_*$ is an unknown $n \times n$ permutation matrix, and $β_*$ is the regression coefficient, also unknown. Previous work has shown that, in the large $n$-limit, the minimal signal-to-noise ratio ($\mathsf{SNR}$), $\lVert β_* \rVert^2/σ^2$, for recovering the unknown permutation exactly with high probability is between $n^2$ and $n^C$ for some absolute constant $C$ and the sharp threshold is unknown even for $d=1$. We show that this threshold is precisely $\mathsf{SNR} = n^4$ for exact recovery throughout the sublinear regime $d=o(n)$. As a by-product of our analysis, we also determine the sharp threshold of almost exact recovery to be $\mathsf{SNR} = n^2$, where all but a vanishing fraction of the permutation is reconstructed.
△ Less
Submitted 14 February, 2024;
originally announced February 2024.
-
Optimal score estimation via empirical Bayes smoothing
Authors:
Andre Wibisono,
Yihong Wu,
Kaylee Yingxi Yang
Abstract:
We study the problem of estimating the score function of an unknown probability distribution $ρ^*$ from $n$ independent and identically distributed observations in $d$ dimensions. Assuming that $ρ^*$ is subgaussian and has a Lipschitz-continuous score function $s^*$, we establish the optimal rate of $\tilde Θ(n^{-\frac{2}{d+4}})$ for this estimation problem under the loss function…
▽ More
We study the problem of estimating the score function of an unknown probability distribution $ρ^*$ from $n$ independent and identically distributed observations in $d$ dimensions. Assuming that $ρ^*$ is subgaussian and has a Lipschitz-continuous score function $s^*$, we establish the optimal rate of $\tilde Θ(n^{-\frac{2}{d+4}})$ for this estimation problem under the loss function $\|\hat s - s^*\|^2_{L^2(ρ^*)}$ that is commonly used in the score matching literature, highlighting the curse of dimensionality where sample complexity for accurate score estimation grows exponentially with the dimension $d$. Leveraging key insights in empirical Bayes theory as well as a new convergence rate of smoothed empirical distribution in Hellinger distance, we show that a regularized score estimator based on a Gaussian kernel attains this rate, shown optimal by a matching minimax lower bound. We also discuss extensions to estimating $β$-Hölder continuous scores with $β\leq 1$, as well as the implication of our theory on the sample complexity of score-based generative models.
△ Less
Submitted 12 June, 2024; v1 submitted 12 February, 2024;
originally announced February 2024.
-
Nonstationary Discounted Stochastic Games under Prospect Theory with Applications to the Smart Grid
Authors:
Yiting Wu,
Junyu Zhang
Abstract:
This paper considers the discounted criterion of nonzero-sum decentralized stochastic games with prospect players. The state and action spaces are finite. The state transition probability is nonstationary. Each player independently controls their own Markov chain. The subjective behavior of players is described by the prospect theory (PT). Compared to the average criterion of stochastic games unde…
▽ More
This paper considers the discounted criterion of nonzero-sum decentralized stochastic games with prospect players. The state and action spaces are finite. The state transition probability is nonstationary. Each player independently controls their own Markov chain. The subjective behavior of players is described by the prospect theory (PT). Compared to the average criterion of stochastic games under PT studied firstly in 2018, we are concerned with the time value of utility, i.e., the utility should be discounted in the future. Since PT distorts the probability, the optimality equation that plays a significant role in proving the existence of equilibrium does not exist. On the other hand, the games change into Markov decision processes (MDPs) with nonstationary payoff function when fixing others' stationary Markov strategies, then the occupation measure and the linear programming of stationary MDPs are no longer suitable. Therefore, we explore a new technique by constructing the marginal distribution on the state-action pairs at any time, and establish the existence of Nash equilibrium. When the initial state is given, there exists a Markov Nash equilibrium. Furthermore, this novel technique can be extended to the finite horizon criterion. Then, we present an algorithm to find a Markov $\varepsilon$-equilibrium. Finally, the model is applied to a noncooperative stochastic game among prosumers who can produce and consume energy in the smart grid, and we give some simulation results.
△ Less
Submitted 15 May, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
Universal Gradient Methods for Stochastic Convex Optimization
Authors:
Anton Rodomanov,
Ali Kavis,
Yongtao Wu,
Kimon Antonakopoulos,
Volkan Cevher
Abstract:
We develop universal gradient methods for Stochastic Convex Optimization (SCO). Our algorithms automatically adapt not only to the oracle's noise but also to the Hölder smoothness of the objective function without a priori knowledge of the particular setting. The key ingredient is a novel strategy for adjusting step-size coefficients in the Stochastic Gradient Method (SGD). Unlike AdaGrad, which a…
▽ More
We develop universal gradient methods for Stochastic Convex Optimization (SCO). Our algorithms automatically adapt not only to the oracle's noise but also to the Hölder smoothness of the objective function without a priori knowledge of the particular setting. The key ingredient is a novel strategy for adjusting step-size coefficients in the Stochastic Gradient Method (SGD). Unlike AdaGrad, which accumulates gradient norms, our Universal Gradient Method accumulates appropriate combinations of gradient- and iterate differences. The resulting algorithm has state-of-the-art worst-case convergence rate guarantees for the entire Hölder class including, in particular, both nonsmooth functions and those with Lipschitz continuous gradient. We also present the Universal Fast Gradient Method for SCO enjoying optimal efficiency estimates.
△ Less
Submitted 11 July, 2024; v1 submitted 5 February, 2024;
originally announced February 2024.
-
Variable-order fractional Laplacian and its accurate and efficient computations with meshfree methods
Authors:
Yixuan Wu,
Yanzhi Zhang
Abstract:
The variable-order fractional Laplacian plays an important role in the study of heterogeneous systems. In this paper, we propose the first numerical methods for the variable-order Laplacian $(-Δ)^{α({\bf x})/2}$ with $0 < α({\bf x}) \le 2$, which will also be referred as the variable-order fractional Laplacian if $α({\bf x})$ is strictly less than 2. We present a class of hypergeometric functions…
▽ More
The variable-order fractional Laplacian plays an important role in the study of heterogeneous systems. In this paper, we propose the first numerical methods for the variable-order Laplacian $(-Δ)^{α({\bf x})/2}$ with $0 < α({\bf x}) \le 2$, which will also be referred as the variable-order fractional Laplacian if $α({\bf x})$ is strictly less than 2. We present a class of hypergeometric functions whose variable-order Laplacian can be analytically expressed. Building on these analytical results, we design the meshfree methods based on globally supported radial basis functions (RBFs), including Gaussian, generalized inverse multiquadric, and Bessel-type RBFs, to approximate the variable-order Laplacian $(-Δ)^{α({\bf x})/2}$. Our meshfree methods integrate the advantages of both pseudo-differential and hypersingular integral forms of the variable-order fractional Laplacian, and thus avoid numerically approximating the hypersingular integral. Moreover, our methods are simple and flexible of domain geometry, and their computer implementation remains the same for any dimension $d \ge 1$. Compared to finite difference methods, our methods can achieve a desired accuracy with much fewer points. This fact makes our method much attractive for problems involving variable-order fractional Laplacian where the number of points required is a critical cost. We then apply our method to study solution behaviors of variable-order fractional PDEs arising in different fields, including transition of waves between classical and fractional media, and coexistence of anomalous and normal diffusion in both diffusion equation and the Allen-Cahn equation. These results would provide insights for further understanding and applications of variable-order fractional derivatives.
△ Less
Submitted 3 February, 2024;
originally announced February 2024.
-
Preconditioning for Physics-Informed Neural Networks
Authors:
Songming Liu,
Chang Su,
Jiachen Yao,
Zhongkai Hao,
Hang Su,
Youjia Wu,
Jun Zhu
Abstract:
Physics-informed neural networks (PINNs) have shown promise in solving various partial differential equations (PDEs). However, training pathologies have negatively affected the convergence and prediction accuracy of PINNs, which further limits their practical applications. In this paper, we propose to use condition number as a metric to diagnose and mitigate the pathologies in PINNs. Inspired by c…
▽ More
Physics-informed neural networks (PINNs) have shown promise in solving various partial differential equations (PDEs). However, training pathologies have negatively affected the convergence and prediction accuracy of PINNs, which further limits their practical applications. In this paper, we propose to use condition number as a metric to diagnose and mitigate the pathologies in PINNs. Inspired by classical numerical analysis, where the condition number measures sensitivity and stability, we highlight its pivotal role in the training dynamics of PINNs. We prove theorems to reveal how condition number is related to both the error control and convergence of PINNs. Subsequently, we present an algorithm that leverages preconditioning to improve the condition number. Evaluations of 18 PDE problems showcase the superior performance of our method. Significantly, in 7 of these problems, our method reduces errors by an order of magnitude. These empirical findings verify the critical role of the condition number in PINNs' training.
△ Less
Submitted 1 February, 2024;
originally announced February 2024.
-
An Enhanced Modelling Approach for Warehouse Sharing Platform System Designing Problem
Authors:
Zeren Xing,
Yuehui Wu,
Shuangyuan Yu
Abstract:
With the increasing importance of sustainability, warehouse sharing arises as a possible way to improve the efficiency of the existing logistics system. This paper studied the warehouse sharing platform systems (WSPS) and investigated its supply chain network, including factories, warehouses, and customers. We proposed an enhanced modelling approach for the warehouse sharing platform system design…
▽ More
With the increasing importance of sustainability, warehouse sharing arises as a possible way to improve the efficiency of the existing logistics system. This paper studied the warehouse sharing platform systems (WSPS) and investigated its supply chain network, including factories, warehouses, and customers. We proposed an enhanced modelling approach for the warehouse sharing platform system design problem (WSPSDP) using the multi-allocation hub location routing problem framework. New elements such as inter-warehouse transportation and multiple-allocation scheme were added compared to the existing WSPS model. Then an adaptive large neighbourhood decomposition search heuristic was applied to solve our problem. Computational experiments were conducted on different-sized instances for comparison with the WSPS model without inter-warehouse transportation and the WSPS model with single-allocation scheme. The results suggested that our proposed WSPSDP model is more cost-efficient than the existing WSPS models, and it has the potential to promote the utilisation of existing cheap idle warehouses.
△ Less
Submitted 27 January, 2024;
originally announced January 2024.
-
A Randomized Runge-Kutta Method for time-irregular delay differential equations
Authors:
Fabio V. Difonzo,
Paweł Przybyłowicz,
Yue Wu,
Xinheng Xie
Abstract:
In this paper we investigate the existence, uniqueness and approximation of solutions of delay differential equations (DDEs) with the right-hand side functions $f=f(t,x,z)$ that are Lipschitz continuous with respect to $x$ but only Hölder continuous with respect to $(t,z)$. We give a construction of the randomized two-stage Runge-Kutta scheme for DDEs and investigate its upper error bound in the…
▽ More
In this paper we investigate the existence, uniqueness and approximation of solutions of delay differential equations (DDEs) with the right-hand side functions $f=f(t,x,z)$ that are Lipschitz continuous with respect to $x$ but only Hölder continuous with respect to $(t,z)$. We give a construction of the randomized two-stage Runge-Kutta scheme for DDEs and investigate its upper error bound in the $L^p(Ω)$-norm for $p\in [2,+\infty)$. Finally, we report on results of numerical experiments.
△ Less
Submitted 21 January, 2024;
originally announced January 2024.
-
MoMA: Model-based Mirror Ascent for Offline Reinforcement Learning
Authors:
Mao Hong,
Zhiyue Zhang,
Yue Wu,
Yanxun Xu
Abstract:
Model-based offline reinforcement learning methods (RL) have achieved state-of-the-art performance in many decision-making problems thanks to their sample efficiency and generalizability. Despite these advancements, existing model-based offline RL approaches either focus on theoretical studies without developing practical algorithms or rely on a restricted parametric policy space, thus not fully l…
▽ More
Model-based offline reinforcement learning methods (RL) have achieved state-of-the-art performance in many decision-making problems thanks to their sample efficiency and generalizability. Despite these advancements, existing model-based offline RL approaches either focus on theoretical studies without developing practical algorithms or rely on a restricted parametric policy space, thus not fully leveraging the advantages of an unrestricted policy space inherent to model-based methods. To address this limitation, we develop MoMA, a model-based mirror ascent algorithm with general function approximations under partial coverage of offline data. MoMA distinguishes itself from existing literature by employing an unrestricted policy class. In each iteration, MoMA conservatively estimates the value function by a minimization procedure within a confidence set of transition models in the policy evaluation step, then updates the policy with general function approximations instead of commonly-used parametric policy classes in the policy improvement step. Under some mild assumptions, we establish theoretical guarantees of MoMA by proving an upper bound on the suboptimality of the returned policy. We also provide a practically implementable, approximate version of the algorithm. The effectiveness of MoMA is demonstrated via numerical studies.
△ Less
Submitted 20 January, 2024;
originally announced January 2024.
-
A note on Rational Maps with three branching points on the Riemann sphere
Authors:
Zhiqiang Wei,
Yingyi Wu,
Bin Xu
Abstract:
Studying the existence of rational functions with given branching datum is a classical problem in the field of complex analysis and algebraic geometry. This problem dates back to Hurwitz and remains open to this day. In this paper, we utilize complex analysis to establish a property of rational functions with 3 branching points on the Riemann sphere. Given two compact Riemann surfaces $M$ and $N$,…
▽ More
Studying the existence of rational functions with given branching datum is a classical problem in the field of complex analysis and algebraic geometry. This problem dates back to Hurwitz and remains open to this day. In this paper, we utilize complex analysis to establish a property of rational functions with 3 branching points on the Riemann sphere. Given two compact Riemann surfaces $M$ and $N$, a pair $(d,\mathcal{D})$ of an integer $d\geq2$ and a collection $\mathcal{D}$ of nontrivial partitions of $d$ is called a candidate branching datum if it satisfies the Riemann-Hurwitz formula. And a candidate branching datum is exceptional if there does not exist a rational function realization it. As applications, we present some new types of exceptional branching datum. These results cover some previous results mentioned in \cite{EKS84,PP06,Zhu19}. We also deduce the realizability of a certain type of candidate branching datum on the Riemann sphere.
△ Less
Submitted 28 May, 2024; v1 submitted 12 January, 2024;
originally announced January 2024.
-
Supervised Gromov-Wasserstein Optimal Transport
Authors:
Zixuan Cang,
Yaqi Wu,
Yanxiang Zhao
Abstract:
We introduce the supervised Gromov-Wasserstein (sGW) optimal transport, an extension of Gromov-Wasserstein by incorporating potential infinity patterns in the cost tensor. sGW enables the enforcement of application-induced constraints such as the preservation of pairwise distances by implementing the constraints as an infinity pattern. A numerical solver is proposed for the sGW problem and the eff…
▽ More
We introduce the supervised Gromov-Wasserstein (sGW) optimal transport, an extension of Gromov-Wasserstein by incorporating potential infinity patterns in the cost tensor. sGW enables the enforcement of application-induced constraints such as the preservation of pairwise distances by implementing the constraints as an infinity pattern. A numerical solver is proposed for the sGW problem and the effectiveness is demonstrated in various numerical experiments. The high-order constraints in sGW are transferred to constraints on the coupling matrix by solving a minimal vertex cover problem. The transformed problem is solved by the Mirror-C descent iteration coupled with the supervised optimal transport solver. In the numerical experiments, we first validate the proposed framework by applying it to matching synthetic datasets and investigating the impact of the model parameters. Additionally, we successfully apply sGW to real single-cell RNA sequencing data. Through comparisons with other Gromov-Wasserstein variants on real data, we demonstrate that sGW offers the novel utility of controlling distance preservation, leading to the automatic estimation of overlapping portions of datasets, which brings improved stability and flexibility in data-driven applications.
△ Less
Submitted 11 January, 2024;
originally announced January 2024.
-
Hausdorff dimensions of topologically transitive Markov hom tree-shifts
Authors:
Jung-Chao Ban,
Guan-Yu Lai,
Yu-Liang Wu
Abstract:
This paper features an analog of Sanov's theorem for finite-state Markov chains indexed by rooted d-trees, obtained via the method of types in the classical analysis of large deviations. Along with the theorem comes two applications: an almost-sure type convergence of sample means and a formula for the Hausdorff dimension of the symbolic space associated with the irreducible Markov chain.
This paper features an analog of Sanov's theorem for finite-state Markov chains indexed by rooted d-trees, obtained via the method of types in the classical analysis of large deviations. Along with the theorem comes two applications: an almost-sure type convergence of sample means and a formula for the Hausdorff dimension of the symbolic space associated with the irreducible Markov chain.
△ Less
Submitted 10 January, 2024;
originally announced January 2024.
-
More MDS codes of non-Reed-Solomon type
Authors:
Yansheng Wu,
Ziling Heng,
Chengju Li,
Cunsheng Ding
Abstract:
MDS codes have diverse practical applications in communication systems, data storage, and quantum codes due to their algebraic properties and optimal error-correcting capability. In this paper, we focus on a class of linear codes and establish some sufficient and necessary conditions for them being MDS. Notably, these codes differ from Reed-Solomon codes up to monomial equivalence. Additionally, w…
▽ More
MDS codes have diverse practical applications in communication systems, data storage, and quantum codes due to their algebraic properties and optimal error-correcting capability. In this paper, we focus on a class of linear codes and establish some sufficient and necessary conditions for them being MDS. Notably, these codes differ from Reed-Solomon codes up to monomial equivalence. Additionally, we also explore the cases in which these codes are almost MDS or near MDS. Applying our main results, we determine the covering radii and deep holes of the dual codes associated with specific Roth-Lempel codes and discover an infinite family of (almost) optimally extendable codes with dimension three.
△ Less
Submitted 7 January, 2024;
originally announced January 2024.
-
Connections between K-stability and Vojta's conjecture
Authors:
Jackson S. Morrow,
Yueqiao Wu
Abstract:
In this note, we use recent advances concerning the K-stability of $\mathbb{Q}$-Fano varieties to provide settings for which Vojta's conjecture holds.
In this note, we use recent advances concerning the K-stability of $\mathbb{Q}$-Fano varieties to provide settings for which Vojta's conjecture holds.
△ Less
Submitted 2 January, 2024;
originally announced January 2024.
-
Van der Corput and metric theorems for geometric progressions for self-similar measures
Authors:
Amir Algom,
Yuanyang Chang,
Meng Wu,
Yu-Liang Wu
Abstract:
We prove a van der Corput lemma for non-atomic self-similar measures $μ$. As an application, we show that the correlations of all finite orders of $( x^n \mod 1 )_{n\geq 1}$ converge to the Poissonian model for $μ$-a.e. $x$, assuming $x>1$. We also complete a recent result of Algom, Rodriguez Hertz, and Wang (obtained simultaneously by Baker and Banaji), showing that any self-conformal measure wit…
▽ More
We prove a van der Corput lemma for non-atomic self-similar measures $μ$. As an application, we show that the correlations of all finite orders of $( x^n \mod 1 )_{n\geq 1}$ converge to the Poissonian model for $μ$-a.e. $x$, assuming $x>1$. We also complete a recent result of Algom, Rodriguez Hertz, and Wang (obtained simultaneously by Baker and Banaji), showing that any self-conformal measure with respect to a non-affine real analytic IFS has polynomial Fourier decay.
△ Less
Submitted 2 January, 2024;
originally announced January 2024.
-
Sharp Analysis of Power Iteration for Tensor PCA
Authors:
Yuchen Wu,
Kangjie Zhou
Abstract:
We investigate the power iteration algorithm for the tensor PCA model introduced in Richard and Montanari (2014). Previous work studying the properties of tensor power iteration is either limited to a constant number of iterations, or requires a non-trivial data-independent initialization. In this paper, we move beyond these limitations and analyze the dynamics of randomly initialized tensor power…
▽ More
We investigate the power iteration algorithm for the tensor PCA model introduced in Richard and Montanari (2014). Previous work studying the properties of tensor power iteration is either limited to a constant number of iterations, or requires a non-trivial data-independent initialization. In this paper, we move beyond these limitations and analyze the dynamics of randomly initialized tensor power iteration up to polynomially many steps. Our contributions are threefold: First, we establish sharp bounds on the number of iterations required for power method to converge to the planted signal, for a broad range of the signal-to-noise ratios. Second, our analysis reveals that the actual algorithmic threshold for power iteration is smaller than the one conjectured in literature by a polylog(n) factor, where n is the ambient dimension. Finally, we propose a simple and effective stopping criterion for power iteration, which provably outputs a solution that is highly correlated with the true signal. Extensive numerical experiments verify our theoretical results.
△ Less
Submitted 2 January, 2024;
originally announced January 2024.
-
Gaussian radial basis functions collocation for fractional PDEs: methodology and error analysis
Authors:
Xiaochuan Tian,
Yixuan Wu,
Yanzhi Zhang
Abstract:
The paper introduces a new meshfree pseudospectral method based on Gaussian radial basis functions (RBFs) collocation to solve fractional Poisson equations. Hypergeometric functions are used to represent the fractional Laplacian of Gaussian RBFs, enabling an efficient computation of stiffness matrix entries. Unlike existing RBF-based methods, our approach ensures a Toeplitz structure in the stiffn…
▽ More
The paper introduces a new meshfree pseudospectral method based on Gaussian radial basis functions (RBFs) collocation to solve fractional Poisson equations. Hypergeometric functions are used to represent the fractional Laplacian of Gaussian RBFs, enabling an efficient computation of stiffness matrix entries. Unlike existing RBF-based methods, our approach ensures a Toeplitz structure in the stiffness matrix with equally spaced RBF centers, enabling efficient matrix-vector multiplications using fast Fourier transforms. We conduct a comprehensive study on the shape parameter selection, addressing challenges related to ill-conditioning and numerical stability. The main contribution of our work includes rigorous stability analysis and error estimates of the Gaussian RBF collocation method, representing a first attempt at the rigorous analysis of RBF-based methods for fractional PDEs to the best of our knowledge. We conduct numerical experiments to validate our analysis and provide practical insights for implementation.
△ Less
Submitted 28 December, 2023;
originally announced December 2023.
-
Projected Langevin Monte Carlo algorithms in non-convex and super-linear setting
Authors:
Chenxu Pang,
Xiaojie Wang,
Yue Wu
Abstract:
It is of significant interest in many applications to sample from a high-dimensional target distribution $π$ with the density $π(\text{d} x) \propto e^{-U(x)} (\text{d} x) $, based on the temporal discretization of the Langevin stochastic differential equations (SDEs). In this paper, we propose an explicit projected Langevin Monte Carlo (PLMC) algorithm with non-convex potential $U$ and super-line…
▽ More
It is of significant interest in many applications to sample from a high-dimensional target distribution $π$ with the density $π(\text{d} x) \propto e^{-U(x)} (\text{d} x) $, based on the temporal discretization of the Langevin stochastic differential equations (SDEs). In this paper, we propose an explicit projected Langevin Monte Carlo (PLMC) algorithm with non-convex potential $U$ and super-linear gradient of $U$ and investigate the non-asymptotic analysis of its sampling error in total variation distance. Equipped with time-independent regularity estimates for the corresponding Kolmogorov equation, we derive the non-asymptotic bounds on the total variation distance between the target distribution of the Langevin SDEs and the law induced by the PLMC scheme with order $\mathcal{O}(h |\ln h|)$. Moreover, for a given precision $ε$, the smallest number of iterations of the classical Langevin Monte Carlo (LMC) scheme with the non-convex potential $U$ and the globally Lipshitz gradient of $U$ can be guaranteed by order ${\mathcal{O}}\big(\tfrac{d^{3/2}}ε \cdot \ln (\tfrac{d}ε) \cdot \ln (\tfrac{1}ε) \big)$. Numerical experiments are provided to confirm the theoretical findings.
△ Less
Submitted 1 January, 2024; v1 submitted 28 December, 2023;
originally announced December 2023.
-
On the best convergence rates of lightning plus polynomial approximations
Authors:
Shuhuang Xiang,
Shunfeng Yang,
Yanghao Wu
Abstract:
Building on introducing exponentially clustered poles, Trefethen and his collaborators introduced lightning algorithms for approximating functions of singularities. These schemes may achieve root-exponential convergence rates. In particular, based on a specific choice of the parameter of the tapered exponentially clustered poles, the lightning approximation with either a low-degree polynomial basi…
▽ More
Building on introducing exponentially clustered poles, Trefethen and his collaborators introduced lightning algorithms for approximating functions of singularities. These schemes may achieve root-exponential convergence rates. In particular, based on a specific choice of the parameter of the tapered exponentially clustered poles, the lightning approximation with either a low-degree polynomial basis may achieve the optimal convergence rate simply as the best rational approximation for prototype $x^α$ on $[0,1]$, which was illustrated through delicate numerical experiments and conjectured in [SIAM J. Numer. Anal., 61:2580-2600, 2023]. By utilizing Poisson's summation formula and results akin to Paley-Wiener Theorem, we rigorously show that all these schemes with a low-degree polynomial basis achieve root-exponential convergence rates with exact orders in approximating $x^α$ for arbitrary clustered parameters theoretically, and provide the best choices of the parameter to achieve the fastest convergence rate for each type of clustered poles, from which the conjecture is confirmed as a special case. Ample numerical evidences demonstrate the optimality and sharpness of the estimates.
△ Less
Submitted 16 June, 2024; v1 submitted 26 December, 2023;
originally announced December 2023.
-
An Inexact Projected Regularized Newton Method for Fused Zero-norms Regularization Problems
Authors:
Yuqia Wu,
Shaohua Pan,
Xiaoqi Yang
Abstract:
We are concerned with structured $\ell_0$-norms regularization problems, with a twice continuously differentiable loss function and a box constraint. This class of problems have a wide range of applications in statistics, machine learning and image processing. To the best of our knowledge, there is no effective algorithm in the literature for solving them. In this paper, we first obtain a polynomi…
▽ More
We are concerned with structured $\ell_0$-norms regularization problems, with a twice continuously differentiable loss function and a box constraint. This class of problems have a wide range of applications in statistics, machine learning and image processing. To the best of our knowledge, there is no effective algorithm in the literature for solving them. In this paper, we first obtain a polynomial-time algorithm to find a point in the proximal mapping of the fused $\ell_0$-norms with a box constraint based on dynamic programming principle. We then propose a hybrid algorithm of proximal gradient method and inexact projected regularized Newton method to solve structured $\ell_0$-norms regularization problems. The whole sequence generated by the algorithm is shown to be convergent by virtue of a non-degeneracy condition, a curvature condition and a Kurdyka-Łojasiewicz property. A superlinear convergence rate of the iterates is established under a locally Hölderian error bound condition on a second-order stationary point set, without requiring the local optimality of the limit point. Finally, numerical experiments are conducted to highlight the features of our considered model, and the superiority of our proposed algorithm.
△ Less
Submitted 25 December, 2023;
originally announced December 2023.
-
Irreducible characters and bitrace for the $q$-rook monoid
Authors:
Naihuan Jing,
Yu Wu,
Ning Liu
Abstract:
This paper studies irreducible characters of the $q$-rook monoid algebra $R_n(q)$ using the vertex algebraic method. Based on the Frobenius formula for $R_n(q)$, a new iterative character formula is derived with the help of the vertex operator realization of the Schur symmetric function. The same idea also leads to a simple proof of the Murnaghan-Nakayama rule for $R_n(q)$. We also introduce the b…
▽ More
This paper studies irreducible characters of the $q$-rook monoid algebra $R_n(q)$ using the vertex algebraic method. Based on the Frobenius formula for $R_n(q)$, a new iterative character formula is derived with the help of the vertex operator realization of the Schur symmetric function. The same idea also leads to a simple proof of the Murnaghan-Nakayama rule for $R_n(q)$. We also introduce the bitrace for the $q$-monoid and obtain a general combinatorial formula for the bitrace, which generalizes the counterpart for the Iwahori-Hecke algebra. The character table of $R_n(q)$ with $|μ|=5$ is listed in the appendix.
△ Less
Submitted 21 December, 2023;
originally announced December 2023.
-
Gradient flows for empirical Bayes in high-dimensional linear models
Authors:
Zhou Fan,
Leying Guan,
Yandi Shen,
Yihong Wu
Abstract:
Empirical Bayes provides a powerful approach to learning and adapting to latent structure in data. Theory and algorithms for empirical Bayes have a rich literature for sequence models, but are less understood in settings where latent variables and data interact through more complex designs. In this work, we study empirical Bayes estimation of an i.i.d. prior in Bayesian linear models, via the nonp…
▽ More
Empirical Bayes provides a powerful approach to learning and adapting to latent structure in data. Theory and algorithms for empirical Bayes have a rich literature for sequence models, but are less understood in settings where latent variables and data interact through more complex designs. In this work, we study empirical Bayes estimation of an i.i.d. prior in Bayesian linear models, via the nonparametric maximum likelihood estimator (NPMLE). We introduce and study a system of gradient flow equations for optimizing the marginal log-likelihood, jointly over the prior and posterior measures in its Gibbs variational representation using a smoothed reparametrization of the regression coefficients. A diffusion-based implementation yields a Langevin dynamics MCEM algorithm, where the prior law evolves continuously over time to optimize a sequence-model log-likelihood defined by the coordinates of the current Langevin iterate. We show consistency of the NPMLE as $n, p \rightarrow \infty$ under mild conditions, including settings of random sub-Gaussian designs when $n \asymp p$. In high noise, we prove a uniform log-Sobolev inequality for the mixing of Langevin dynamics, for possibly misspecified priors and non-log-concave posteriors. We then establish polynomial-time convergence of the joint gradient flow to a near-NPMLE if the marginal negative log-likelihood is convex in a sub-level set of the initialization.
△ Less
Submitted 19 December, 2023;
originally announced December 2023.
-
The von Bahr-Esseen type inequality under sub-linear expectations and applications
Authors:
Yi Wu,
Xuejun Wang
Abstract:
Moment inequalities play important roles in probability limit theory and mathematical statistics. In this work, the von Bahr-Esseen type inequality for extended negatively dependent random variables under sub-linear expectations is established successfully. By virtue of the inequality, we further obtain the Kolmogorov type weak law of large numbers for partial sums and the complete convergence for…
▽ More
Moment inequalities play important roles in probability limit theory and mathematical statistics. In this work, the von Bahr-Esseen type inequality for extended negatively dependent random variables under sub-linear expectations is established successfully. By virtue of the inequality, we further obtain the Kolmogorov type weak law of large numbers for partial sums and the complete convergence for weighted sums, which extend and improve corresponding results in sub-linear expectation space.
△ Less
Submitted 14 December, 2023;
originally announced December 2023.
-
SPFNO: Spectral operator learning for PDEs with Dirichlet and Neumann boundary conditions
Authors:
Ziyuan Liu,
Yuhang Wu,
Daniel Zhengyu Huang,
Hong Zhang,
Xu Qian,
Songhe Song
Abstract:
Neural operators have been validated as promising deep surrogate models for solving partial differential equations (PDEs). Despite the critical role of boundary conditions in PDEs, however, only a limited number of neural operators robustly enforce these conditions. In this paper we introduce semi-periodic Fourier neural operator (SPFNO), a novel spectral operator learning method, to learn the tar…
▽ More
Neural operators have been validated as promising deep surrogate models for solving partial differential equations (PDEs). Despite the critical role of boundary conditions in PDEs, however, only a limited number of neural operators robustly enforce these conditions. In this paper we introduce semi-periodic Fourier neural operator (SPFNO), a novel spectral operator learning method, to learn the target operators of PDEs with non-periodic BCs. This method extends our previous work (arXiv:2206.12698), which showed significant improvements by employing enhanced neural operators that precisely satisfy the boundary conditions. However, the previous work is associated with Gaussian grids, restricting comprehensive comparisons across most public datasets. Additionally, we present numerical results for various PDEs such as the viscous Burgers' equation, Darcy flow, incompressible pipe flow, and coupled reactiondiffusion equations. These results demonstrate the computational efficiency, resolution invariant property, and BC-satisfaction behavior of proposed model. An accuracy improvement of approximately 1.7X-4.7X over the non-BC-satisfying baselines is also achieved. Furthermore, our studies on SOL underscore the significance of satisfying BCs as a criterion for deep surrogate models of PDEs.
△ Less
Submitted 11 December, 2023;
originally announced December 2023.
-
Extended codes and deep holes of MDS codes
Authors:
Yansheng Wu,
Cunsheng Ding,
Tingfang Chen
Abstract:
For a given linear code $\C$ of length $n$ over $\gf(q)$ and a nonzero vector $\bu$ in $\gf(q)^n$, Sun, Ding and Chen defined an extended linear code $\overline{\C}(\bu)$ of $\C$, which is a generalisation of the classical extended code $\overline{\C}(-\bone)$ of $\C$ and called the second kind of an extended code of $\C$ (see arXiv:2307.04076 and arXiv:2307.08053). They developed some general the…
▽ More
For a given linear code $\C$ of length $n$ over $\gf(q)$ and a nonzero vector $\bu$ in $\gf(q)^n$, Sun, Ding and Chen defined an extended linear code $\overline{\C}(\bu)$ of $\C$, which is a generalisation of the classical extended code $\overline{\C}(-\bone)$ of $\C$ and called the second kind of an extended code of $\C$ (see arXiv:2307.04076 and arXiv:2307.08053). They developed some general theory of the extended codes $\overline{\C}(\bu)$ and studied the extended codes $\overline{\C}(\bu)$ of several families of linear codes, including cyclic codes, projective two-weight codes, nonbinary Hamming codes, and a family of reversible MDS cyclic codes. The objective of this paper is to investigate the extended codes $\overline{\C}(\bu)$ of MDS codes $\C$ over finite fields. The main result of this paper is that the extended code $\overline{\C}(\bu)$ of an MDS $[n,k]$ code $\C$ remains MDS if and only if the covering radius $ρ(\mathcal{C}^{\bot})=k$ and the vector $\bu$ is a deep hole of the dual code $\C^\perp$. As applications of this main result, the extended codes of the GRS codes and extended GRS codes are investigated and the covering radii of several families of MDS codes are determined.
△ Less
Submitted 9 December, 2023;
originally announced December 2023.
-
A high-order local discontinuous Galerkin method for the $p$-Laplace equation
Authors:
Yue Wu,
Yan Xu
Abstract:
We study the high-order local discontinuous Galerkin (LDG) method for the $p$-Laplace equation. We reformulate our spatial discretization as an equivalent convex minimization problem and use a preconditioned gradient descent method as the nonlinear solver. For the first time, a weighted preconditioner that provides $hk$-independent convergence is applied in the LDG setting. For polynomial order…
▽ More
We study the high-order local discontinuous Galerkin (LDG) method for the $p$-Laplace equation. We reformulate our spatial discretization as an equivalent convex minimization problem and use a preconditioned gradient descent method as the nonlinear solver. For the first time, a weighted preconditioner that provides $hk$-independent convergence is applied in the LDG setting. For polynomial order $k \geqslant 1$, we rigorously establish the solvability of our scheme and provide a priori error estimates in a mesh-dependent energy norm. Our error estimates are under a different and non-equivalent distance from existing LDG results. For arbitrarily high-order polynomials under the assumption that the exact solution has enough regularity, the error estimates demonstrate the potential for high-order accuracy. Our numerical results exhibit the desired convergence speed facilitated by the preconditioner, and we observe best convergence rates in gradient variables in alignment with linear LDG, and optimal rates in the primal variable when $1 < p \leqslant 2$.
△ Less
Submitted 15 November, 2023;
originally announced November 2023.