subscribe to arXiv mailings

Local Convergence of Gradient Descent-Ascent for Training Generative Adversarial Networks

Authors: Evan Becker, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

Abstract: Generative Adversarial Networks (GANs) are a popular formulation to train generative models for complex high dimensional data. The standard method for training GANs involves a gradient descent-ascent (GDA) procedure on a minimax optimization problem. This procedure is hard to analyze in general due to the nonlinear nature of the dynamics. We study the local dynamics of GDA for training a GAN with… ▽ More Generative Adversarial Networks (GANs) are a popular formulation to train generative models for complex high dimensional data. The standard method for training GANs involves a gradient descent-ascent (GDA) procedure on a minimax optimization problem. This procedure is hard to analyze in general due to the nonlinear nature of the dynamics. We study the local dynamics of GDA for training a GAN with a kernel-based discriminator. This convergence analysis is based on a linearization of a non-linear dynamical system that describes the GDA iterations, under an \textit{isolated points model} assumption from [Becker et al. 2022]. Our analysis brings out the effect of the learning rates, regularization, and the bandwidth of the kernel discriminator, on the local convergence rate of GDA. Importantly, we show phase transitions that indicate when the system converges, oscillates, or diverges. We also provide numerical simulations that verify our claims. △ Less

Submitted 29 May, 2023; v1 submitted 14 May, 2023; originally announced May 2023.

arXiv:2212.13631

Proceedings of AAAI 2022 Fall Symposium: The Role of AI in Responding to Climate Challenges

Authors: Feras A. Batarseh, Priya L. Donti, Ján Drgoňa, Kristen Fletcher, Pierre-Adrien Hanania, Melissa Hatton, Srinivasan Keshav, Bran Knowles, Raphaela Kotsch, Sean McGinnis, Peetak Mitra, Alex Philp, Jim Spohrer, Frank Stein, Meghna Tare, Svitlana Volkov, Gege Wen

Abstract: Climate change is one of the most pressing challenges of our time, requiring rapid action across society. As artificial intelligence tools (AI) are rapidly deployed, it is therefore crucial to understand how they will impact climate action. On the one hand, AI can support applications in climate change mitigation (reducing or preventing greenhouse gas emissions), adaptation (preparing for the effe… ▽ More Climate change is one of the most pressing challenges of our time, requiring rapid action across society. As artificial intelligence tools (AI) are rapidly deployed, it is therefore crucial to understand how they will impact climate action. On the one hand, AI can support applications in climate change mitigation (reducing or preventing greenhouse gas emissions), adaptation (preparing for the effects of a changing climate), and climate science. These applications have implications in areas ranging as widely as energy, agriculture, and finance. At the same time, AI is used in many ways that hinder climate action (e.g., by accelerating the use of greenhouse gas-emitting fossil fuels). In addition, AI technologies have a carbon and energy footprint themselves. This symposium brought together participants from across academia, industry, government, and civil society to explore these intersections of AI with climate change, as well as how each of these sectors can contribute to solutions. △ Less

Submitted 29 January, 2023; v1 submitted 27 December, 2022; originally announced December 2022.

arXiv:2208.09938 [pdf, other]

Instability and Local Minima in GAN Training with Kernel Discriminators

Authors: Evan Becker, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

Abstract: Generative Adversarial Networks (GANs) are a widely-used tool for generative modeling of complex data. Despite their empirical success, the training of GANs is not fully understood due to the min-max optimization of the generator and discriminator. This paper analyzes these joint dynamics when the true samples, as well as the generated samples, are discrete, finite sets, and the discriminator is k… ▽ More Generative Adversarial Networks (GANs) are a widely-used tool for generative modeling of complex data. Despite their empirical success, the training of GANs is not fully understood due to the min-max optimization of the generator and discriminator. This paper analyzes these joint dynamics when the true samples, as well as the generated samples, are discrete, finite sets, and the discriminator is kernel-based. A simple yet expressive framework for analyzing training called the $\textit{Isolated Points Model}$ is introduced. In the proposed model, the distance between true samples greatly exceeds the kernel width, so each generated point is influenced by at most one true point. Our model enables precise characterization of the conditions for convergence, both to good and bad minima. In particular, the analysis explains two common failure modes: (i) an approximate mode collapse and (ii) divergence. Numerical simulations are provided that predictably replicate these behaviors. △ Less

Submitted 21 August, 2022; originally announced August 2022.

arXiv:2201.08082 [pdf, other]

Kernel Methods and Multi-layer Perceptrons Learn Linear Models in High Dimensions

Authors: Mojtaba Sahraee-Ardakan, Melikasadat Emami, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

Abstract: Empirical observation of high dimensional phenomena, such as the double descent behaviour, has attracted a lot of interest in understanding classical techniques such as kernel methods, and their implications to explain generalization properties of neural networks. Many recent works analyze such models in a certain high-dimensional regime where the covariates are independent and the number of sampl… ▽ More Empirical observation of high dimensional phenomena, such as the double descent behaviour, has attracted a lot of interest in understanding classical techniques such as kernel methods, and their implications to explain generalization properties of neural networks. Many recent works analyze such models in a certain high-dimensional regime where the covariates are independent and the number of samples and the number of covariates grow at a fixed ratio (i.e. proportional asymptotics). In this work we show that for a large class of kernels, including the neural tangent kernel of fully connected networks, kernel methods can only perform as well as linear models in this regime. More surprisingly, when the data is generated by a kernel model where the relationship between input and the response could be very nonlinear, we show that linear models are in fact optimal, i.e. linear models achieve the minimum risk among all models, linear or nonlinear. These results suggest that more complex models for the data other than independent features are needed for high-dimensional analysis. △ Less

Submitted 20 January, 2022; originally announced January 2022.

arXiv:2103.04557 [pdf, other]

Asymptotics of Ridge Regression in Convolutional Models

Authors: Mojtaba Sahraee-Ardakan, Tung Mai, Anup Rao, Ryan Rossi, Sundeep Rangan, Alyson K. Fletcher

Abstract: Understanding generalization and estimation error of estimators for simple models such as linear and generalized linear models has attracted a lot of attention recently. This is in part due to an interesting observation made in machine learning community that highly over-parameterized neural networks achieve zero training error, and yet they are able to generalize well over the test samples. This… ▽ More Understanding generalization and estimation error of estimators for simple models such as linear and generalized linear models has attracted a lot of attention recently. This is in part due to an interesting observation made in machine learning community that highly over-parameterized neural networks achieve zero training error, and yet they are able to generalize well over the test samples. This phenomenon is captured by the so called double descent curve, where the generalization error starts decreasing again after the interpolation threshold. A series of recent works tried to explain such phenomenon for simple models. In this work, we analyze the asymptotics of estimation error in ridge estimators for convolutional linear models. These convolutional inverse problems, also known as deconvolution, naturally arise in different fields such as seismology, imaging, and acoustics among others. Our results hold for a large class of input distributions that include i.i.d. features as a special case. We derive exact formulae for estimation error of ridge estimators that hold in a certain high-dimensional regime. We show the double descent phenomenon in our experiments for convolutional models and show that our theoretical results match the experiments. △ Less

Submitted 8 March, 2021; originally announced March 2021.

arXiv:2101.07833 [pdf, ps, other]

Implicit Bias of Linear RNNs

Authors: Melikasadat Emami, Mojtaba Sahraee-Ardakan, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

Abstract: Contemporary wisdom based on empirical studies suggests that standard recurrent neural networks (RNNs) do not perform well on tasks requiring long-term memory. However, precise reasoning for this behavior is still unknown. This paper provides a rigorous explanation of this property in the special case of linear RNNs. Although this work is limited to linear RNNs, even these systems have traditional… ▽ More Contemporary wisdom based on empirical studies suggests that standard recurrent neural networks (RNNs) do not perform well on tasks requiring long-term memory. However, precise reasoning for this behavior is still unknown. This paper provides a rigorous explanation of this property in the special case of linear RNNs. Although this work is limited to linear RNNs, even these systems have traditionally been difficult to analyze due to their non-linear parameterization. Using recently-developed kernel regime analysis, our main result shows that linear RNNs learned from random initializations are functionally equivalent to a certain weighted 1D-convolutional network. Importantly, the weightings in the equivalent model cause an implicit bias to elements with smaller time lags in the convolution and hence, shorter memory. The degree of this bias depends on the variance of the transition kernel matrix at initialization and is related to the classic exploding and vanishing gradients problem. The theory is validated in both synthetic and real data experiments. △ Less

Submitted 19 January, 2021; originally announced January 2021.

Comments: 30 pages, 4 figures

arXiv:2005.05053 [pdf, other]

Low-Rank Nonlinear Decoding of $μ$-ECoG from the Primary Auditory Cortex

Authors: Melikasadat Emami, Mojtaba Sahraee-Ardakan, Parthe Pandit, Alyson K. Fletcher, Sundeep Rangan, Michael Trumpis, Brinnae Bent, Chia-Han Chiang, Jonathan Viventi

Abstract: This paper considers the problem of neural decoding from parallel neural measurements systems such as micro-electrocorticography ($μ$-ECoG). In systems with large numbers of array elements at very high sampling rates, the dimension of the raw measurement data may be large. Learning neural decoders for this high-dimensional data can be challenging, particularly when the number of training samples i… ▽ More This paper considers the problem of neural decoding from parallel neural measurements systems such as micro-electrocorticography ($μ$-ECoG). In systems with large numbers of array elements at very high sampling rates, the dimension of the raw measurement data may be large. Learning neural decoders for this high-dimensional data can be challenging, particularly when the number of training samples is limited. To address this challenge, this work presents a novel neural network decoder with a low-rank structure in the first hidden layer. The low-rank constraints dramatically reduce the number of parameters in the decoder while still enabling a rich class of nonlinear decoder maps. The low-rank decoder is illustrated on $μ$-ECoG data from the primary auditory cortex (A1) of awake rats. This decoding problem is particularly challenging due to the complexity of neural responses in the auditory cortex and the presence of confounding signals in awake animals. It is shown that the proposed low-rank decoder significantly outperforms models using standard dimensionality reduction techniques such as principal component analysis (PCA). △ Less

Submitted 6 May, 2020; originally announced May 2020.

Comments: 4 pages, 3 figures

arXiv:2005.00180 [pdf, other]

Generalization Error of Generalized Linear Models in High Dimensions

Authors: Melikasadat Emami, Mojtaba Sahraee-Ardakan, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

Abstract: At the heart of machine learning lies the question of generalizability of learned rules over previously unseen data. While over-parameterized models based on neural networks are now ubiquitous in machine learning applications, our understanding of their generalization capabilities is incomplete. This task is made harder by the non-convexity of the underlying learning problems. We provide a general… ▽ More At the heart of machine learning lies the question of generalizability of learned rules over previously unseen data. While over-parameterized models based on neural networks are now ubiquitous in machine learning applications, our understanding of their generalization capabilities is incomplete. This task is made harder by the non-convexity of the underlying learning problems. We provide a general framework to characterize the asymptotic generalization error for single-layer neural networks (i.e., generalized linear models) with arbitrary non-linearities, making it applicable to regression as well as classification problems. This framework enables analyzing the effect of (i) over-parameterization and non-linearity during modeling; and (ii) choices of loss function, initialization, and regularizer during learning. Our model also captures mismatch between training and test distributions. As examples, we analyze a few special cases, namely linear regression and logistic regression. We are also able to rigorously and analytically explain the \emph{double descent} phenomenon in generalized linear models. △ Less

Submitted 30 April, 2020; originally announced May 2020.

Comments: 20 pages, 4 figures

arXiv:2001.09396 [pdf, other]

Inference in Multi-Layer Networks with Matrix-Valued Unknowns

Authors: Parthe Pandit, Mojtaba Sahraee-Ardakan, Sundeep Rangan, Philip Schniter, Alyson K. Fletcher

Abstract: We consider the problem of inferring the input and hidden variables of a stochastic multi-layer neural network from an observation of the output. The hidden variables in each layer are represented as matrices. This problem applies to signal recovery via deep generative prior models, multi-task and mixed regression and learning certain classes of two-layer neural networks. A unified approximation a… ▽ More We consider the problem of inferring the input and hidden variables of a stochastic multi-layer neural network from an observation of the output. The hidden variables in each layer are represented as matrices. This problem applies to signal recovery via deep generative prior models, multi-task and mixed regression and learning certain classes of two-layer neural networks. A unified approximation algorithm for both MAP and MMSE inference is proposed by extending a recently-developed Multi-Layer Vector Approximate Message Passing (ML-VAMP) algorithm to handle matrix-valued unknowns. It is shown that the performance of the proposed Multi-Layer Matrix VAMP (ML-Mat-VAMP) algorithm can be exactly predicted in a certain random large-system limit, where the dimensions $N\times d$ of the unknown quantities grow as $N\rightarrow\infty$ with $d$ fixed. In the two-layer neural-network learning problem, this scaling corresponds to the case where the number of input features and training samples grow to infinity but the number of hidden nodes stays fixed. The analysis enables a precise prediction of the parameter and test error of the learning. △ Less

Submitted 25 January, 2020; originally announced January 2020.

Comments: 3 figures, 6 pages (two-column) + Appendix. arXiv admin note: text overlap with arXiv:1911.03409

arXiv:1911.03409 [pdf, other]

Inference with Deep Generative Priors in High Dimensions

Authors: Parthe Pandit, Mojtaba Sahraee-Ardakan, Sundeep Rangan, Philip Schniter, Alyson K. Fletcher

Abstract: Deep generative priors offer powerful models for complex-structured data, such as images, audio, and text. Using these priors in inverse problems typically requires estimating the input and/or hidden signals in a multi-layer deep neural network from observation of its output. While these approaches have been successful in practice, rigorous performance analysis is complicated by the non-convex nat… ▽ More Deep generative priors offer powerful models for complex-structured data, such as images, audio, and text. Using these priors in inverse problems typically requires estimating the input and/or hidden signals in a multi-layer deep neural network from observation of its output. While these approaches have been successful in practice, rigorous performance analysis is complicated by the non-convex nature of the underlying optimization problems. This paper presents a novel algorithm, Multi-Layer Vector Approximate Message Passing (ML-VAMP), for inference in multi-layer stochastic neural networks. ML-VAMP can be configured to compute maximum a priori (MAP) or approximate minimum mean-squared error (MMSE) estimates for these networks. We show that the performance of ML-VAMP can be exactly predicted in a certain high-dimensional random limit. Furthermore, under certain conditions, ML-VAMP yields estimates that achieve the minimum (i.e., Bayes-optimal) MSE as predicted by the replica method. In this way, ML-VAMP provides a computationally efficient method for multi-layer inference with an exact performance characterization and testable conditions for optimality in the large-system limit. △ Less

Submitted 8 November, 2019; originally announced November 2019.

Comments: 50 pages, double-spaced

arXiv:1910.13672 [pdf, other]

Input-Output Equivalence of Unitary and Contractive RNNs

Authors: M. Emami, M. Sahraee-Ardakan, S. Rangan, A. K. Fletcher

Abstract: Unitary recurrent neural networks (URNNs) have been proposed as a method to overcome the vanishing and exploding gradient problem in modeling data with long-term dependencies. A basic question is how restrictive is the unitary constraint on the possible input-output mappings of such a network? This work shows that for any contractive RNN with ReLU activations, there is a URNN with at most twice th… ▽ More Unitary recurrent neural networks (URNNs) have been proposed as a method to overcome the vanishing and exploding gradient problem in modeling data with long-term dependencies. A basic question is how restrictive is the unitary constraint on the possible input-output mappings of such a network? This work shows that for any contractive RNN with ReLU activations, there is a URNN with at most twice the number of hidden states and the identical input-output mapping. Hence, with ReLU activations, URNNs are as expressive as general RNNs. In contrast, for certain smooth activations, it is shown that the input-output mapping of an RNN cannot be matched with a URNN, even with an arbitrary number of states. The theoretical results are supported by experiments on modeling of slowly-varying dynamical systems. △ Less

Submitted 30 October, 2019; originally announced October 2019.

arXiv:1903.09631 [pdf, other]

High-Dimensional Bernoulli Autoregressive Process with Long-Range Dependence

Authors: Parthe Pandit, Mojtaba Sahraee-Ardakan, Arash A. Amini, Sundeep Rangan, Alyson K. Fletcher

Abstract: We consider the problem of estimating the parameters of a multivariate Bernoulli process with auto-regressive feedback in the high-dimensional setting where the number of samples available is much less than the number of parameters. This problem arises in learning interconnections of networks of dynamical systems with spiking or binary-valued data. We allow the process to depend on its past up to… ▽ More We consider the problem of estimating the parameters of a multivariate Bernoulli process with auto-regressive feedback in the high-dimensional setting where the number of samples available is much less than the number of parameters. This problem arises in learning interconnections of networks of dynamical systems with spiking or binary-valued data. We allow the process to depend on its past up to a lag $p$, for a general $p \ge 1$, allowing for more realistic modeling in many applications. We propose and analyze an $\ell_1$-regularized maximum likelihood estimator (MLE) under the assumption that the parameter tensor is approximately sparse. Rigorous analysis of such estimators is made challenging by the dependent and non-Gaussian nature of the process as well as the presence of the nonlinearities and multi-level feedback. We derive precise upper bounds on the mean-squared estimation error in terms of the number of samples, dimensions of the process, the lag $p$ and other key statistical properties of the model. The ideas presented can be used in the high-dimensional analysis of regularized $M$-estimators for other sparse nonlinear and non-Gaussian processes with long-range dependence. △ Less

Submitted 19 March, 2019; originally announced March 2019.

Comments: To appear at AISTATS 2019 titled "Sparse Multivariate Bernoulli Processes in High Dimensions"

Journal ref: Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS) 2019, Naha, Okinawa, Japan. PMLR: Volume 89

arXiv:1903.01293 [pdf, other]

Asymptotics of MAP Inference in Deep Networks

Authors: Parthe Pandit, Mojtaba Sahraee, Sundeep Rangan, Alyson K. Fletcher

Abstract: Deep generative priors are a powerful tool for reconstruction problems with complex data such as images and text. Inverse problems using such models require solving an inference problem of estimating the input and hidden units of the multi-layer network from its output. Maximum a priori (MAP) estimation is a widely-used inference method as it is straightforward to implement, and has been successfu… ▽ More Deep generative priors are a powerful tool for reconstruction problems with complex data such as images and text. Inverse problems using such models require solving an inference problem of estimating the input and hidden units of the multi-layer network from its output. Maximum a priori (MAP) estimation is a widely-used inference method as it is straightforward to implement, and has been successful in practice. However, rigorous analysis of MAP inference in multi-layer networks is difficult. This work considers a recently-developed method, multi-layer vector approximate message passing (ML-VAMP), to study MAP inference in deep networks. It is shown that the mean squared error of the ML-VAMP estimate can be exactly and rigorously characterized in a certain high-dimensional random limit. The proposed method thus provides a tractable method for MAP inference with exact performance guarantees. △ Less

Submitted 1 March, 2019; originally announced March 2019.

Comments: 11 pages. arXiv admin note: text overlap with arXiv:1706.06549

arXiv:1809.00024 [pdf, ps, other]

doi 10.1109/TSP.2019.2916100

Bilinear Recovery using Adaptive Vector-AMP

Authors: Subrata Sarkar, Alyson K. Fletcher, Sundeep Rangan, Philip Schniter

Abstract: We consider the problem of jointly recovering the vector $\boldsymbol{b}$ and the matrix $\boldsymbol{C}$ from noisy measurements $\boldsymbol{Y} = \boldsymbol{A}(\boldsymbol{b})\boldsymbol{C} + \boldsymbol{W}$, where $\boldsymbol{A}(\cdot)$ is a known affine linear function of $\boldsymbol{b}$ (i.e., $\boldsymbol{A}(\boldsymbol{b})=\boldsymbol{A}_0+\sum_{i=1}^Q b_i \boldsymbol{A}_i$ with known ma… ▽ More We consider the problem of jointly recovering the vector $\boldsymbol{b}$ and the matrix $\boldsymbol{C}$ from noisy measurements $\boldsymbol{Y} = \boldsymbol{A}(\boldsymbol{b})\boldsymbol{C} + \boldsymbol{W}$, where $\boldsymbol{A}(\cdot)$ is a known affine linear function of $\boldsymbol{b}$ (i.e., $\boldsymbol{A}(\boldsymbol{b})=\boldsymbol{A}_0+\sum_{i=1}^Q b_i \boldsymbol{A}_i$ with known matrices $\boldsymbol{A}_i$). This problem has applications in matrix completion, robust PCA, dictionary learning, self-calibration, blind deconvolution, joint-channel/symbol estimation, compressive sensing with matrix uncertainty, and many other tasks. To solve this bilinear recovery problem, we propose the Bilinear Adaptive Vector Approximate Message Passing (BAd-VAMP) algorithm. We demonstrate numerically that the proposed approach is competitive with other state-of-the-art approaches to bilinear recovery, including lifted VAMP and Bilinear GAMP. △ Less

Submitted 8 March, 2019; v1 submitted 31 August, 2018; originally announced September 2018.

arXiv:1806.10466 [pdf, ps, other]

doi 10.1088/1742-5468/ab321a

Plug-in Estimation in High-Dimensional Linear Inverse Problems: A Rigorous Analysis

Authors: Alyson K. Fletcher, Sundeep Rangan, Subrata Sarkar, Philip Schniter

Abstract: Estimating a vector $\mathbf{x}$ from noisy linear measurements $\mathbf{Ax}+\mathbf{w}$ often requires use of prior knowledge or structural constraints on $\mathbf{x}$ for accurate reconstruction. Several recent works have considered combining linear least-squares estimation with a generic or "plug-in" denoiser function that can be designed in a modular manner based on the prior knowledge about… ▽ More Estimating a vector $\mathbf{x}$ from noisy linear measurements $\mathbf{Ax}+\mathbf{w}$ often requires use of prior knowledge or structural constraints on $\mathbf{x}$ for accurate reconstruction. Several recent works have considered combining linear least-squares estimation with a generic or "plug-in" denoiser function that can be designed in a modular manner based on the prior knowledge about $\mathbf{x}$. While these methods have shown excellent performance, it has been difficult to obtain rigorous performance guarantees. This work considers plug-in denoising combined with the recently-developed Vector Approximate Message Passing (VAMP) algorithm, which is itself derived via Expectation Propagation techniques. It shown that the mean squared error of this "plug-and-play" VAMP can be exactly predicted for high-dimensional right-rotationally invariant random $\mathbf{A}$ and Lipschitz denoisers. The method is demonstrated on applications in image recovery and parametric bilinear estimation. △ Less

Submitted 1 August, 2018; v1 submitted 27 June, 2018; originally announced June 2018.

arXiv:1706.06549 [pdf, other]

Inference in Deep Networks in High Dimensions

Authors: Alyson K. Fletcher, Sundeep Rangan

Abstract: Deep generative networks provide a powerful tool for modeling complex data in a wide range of applications. In inverse problems that use these networks as generative priors on data, one must often perform inference of the inputs of the networks from the outputs. Inference is also required for sampling during stochastic training on these generative models. This paper considers inference in a deep s… ▽ More Deep generative networks provide a powerful tool for modeling complex data in a wide range of applications. In inverse problems that use these networks as generative priors on data, one must often perform inference of the inputs of the networks from the outputs. Inference is also required for sampling during stochastic training on these generative models. This paper considers inference in a deep stochastic neural network where the parameters (e.g., weights, biases and activation functions) are known and the problem is to estimate the values of the input and hidden units from the output. While several approximate algorithms have been proposed for this task, there are few analytic tools that can provide rigorous guarantees in the reconstruction error. This work presents a novel and computationally tractable output-to-input inference method called Multi-Layer Vector Approximate Message Passing (ML-VAMP). The proposed algorithm, derived from expectation propagation, extends earlier AMP methods that are known to achieve the replica predictions for optimality in simple linear inverse problems. Our main contribution shows that the mean-squared error (MSE) of ML-VAMP can be exactly predicted in a certain large system limit (LSL) where the numbers of layers is fixed and weight matrices are random and orthogonally-invariant with dimensions that grow to infinity. ML-VAMP is thus a principled method for output-to-input inference in deep networks with a rigorous and precise performance achievability result in high dimensions. △ Less

Submitted 20 June, 2017; originally announced June 2017.

Comments: 27 pages

arXiv:1706.06054 [pdf, other]

Rigorous Dynamics and Consistent Estimation in Arbitrarily Conditioned Linear Systems

Authors: Alyson K. Fletcher, Mojtaba Sahraee-Ardakan, Philip Schniter, Sundeep Rangan

Abstract: The problem of estimating a random vector x from noisy linear measurements y = A x + w with unknown parameters on the distributions of x and w, which must also be learned, arises in a wide range of statistical learning and linear inverse problems. We show that a computationally simple iterative message-passing algorithm can provably obtain asymptotically consistent estimates in a certain high-dime… ▽ More The problem of estimating a random vector x from noisy linear measurements y = A x + w with unknown parameters on the distributions of x and w, which must also be learned, arises in a wide range of statistical learning and linear inverse problems. We show that a computationally simple iterative message-passing algorithm can provably obtain asymptotically consistent estimates in a certain high-dimensional large-system limit (LSL) under very general parameterizations. Previous message passing techniques have required i.i.d. sub-Gaussian A matrices and often fail when the matrix is ill-conditioned. The proposed algorithm, called adaptive vector approximate message passing (Adaptive VAMP) with auto-tuning, applies to all right-rotationally random A. Importantly, this class includes matrices with arbitrarily poor conditioning. We show that the parameter estimates and mean squared error (MSE) of x in each iteration converge to deterministic limits that can be precisely predicted by a simple set of state evolution (SE) equations. In addition, a simple testable condition is provided in which the MSE matches the Bayes-optimal value predicted by the replica method. The paper thus provides a computationally simple method with provable guarantees of optimality and consistency over a large class of linear inverse problems. △ Less

Submitted 19 June, 2017; originally announced June 2017.

arXiv:1612.01186 [pdf, ps, other]

Vector Approximate Message Passing for the Generalized Linear Model

Authors: Philip Schniter, Sundeep Rangan, Alyson K. Fletcher

Abstract: The generalized linear model (GLM), where a random vector $\boldsymbol{x}$ is observed through a noisy, possibly nonlinear, function of a linear transform output $\boldsymbol{z}=\boldsymbol{Ax}$, arises in a range of applications such as robust regression, binary classification, quantized compressed sensing, phase retrieval, photon-limited imaging, and inference from neural spike trains. When… ▽ More The generalized linear model (GLM), where a random vector $\boldsymbol{x}$ is observed through a noisy, possibly nonlinear, function of a linear transform output $\boldsymbol{z}=\boldsymbol{Ax}$, arises in a range of applications such as robust regression, binary classification, quantized compressed sensing, phase retrieval, photon-limited imaging, and inference from neural spike trains. When $\boldsymbol{A}$ is large and i.i.d. Gaussian, the generalized approximate message passing (GAMP) algorithm is an efficient means of MAP or marginal inference, and its performance can be rigorously characterized by a scalar state evolution. For general $\boldsymbol{A}$, though, GAMP can misbehave. Damping and sequential-updating help to robustify GAMP, but their effects are limited. Recently, a "vector AMP" (VAMP) algorithm was proposed for additive white Gaussian noise channels. VAMP extends AMP's guarantees from i.i.d. Gaussian $\boldsymbol{A}$ to the larger class of rotationally invariant $\boldsymbol{A}$. In this paper, we show how VAMP can be extended to the GLM. Numerical experiments show that the proposed GLM-VAMP is much more robust to ill-conditioning in $\boldsymbol{A}$ than damped GAMP. △ Less

Submitted 4 December, 2016; originally announced December 2016.

arXiv:1610.03082 [pdf, ps, other]

Vector Approximate Message Passing

Authors: Sundeep Rangan, Philip Schniter, Alyson K. Fletcher

Abstract: The standard linear regression (SLR) problem is to recover a vector $\mathbf{x}^0$ from noisy linear observations $\mathbf{y}=\mathbf{Ax}^0+\mathbf{w}$. The approximate message passing (AMP) algorithm recently proposed by Donoho, Maleki, and Montanari is a computationally efficient iterative approach to SLR that has a remarkable property: for large i.i.d.\ sub-Gaussian matrices $\mathbf{A}$, its p… ▽ More The standard linear regression (SLR) problem is to recover a vector $\mathbf{x}^0$ from noisy linear observations $\mathbf{y}=\mathbf{Ax}^0+\mathbf{w}$. The approximate message passing (AMP) algorithm recently proposed by Donoho, Maleki, and Montanari is a computationally efficient iterative approach to SLR that has a remarkable property: for large i.i.d.\ sub-Gaussian matrices $\mathbf{A}$, its per-iteration behavior is rigorously characterized by a scalar state-evolution whose fixed points, when unique, are Bayes optimal. The AMP algorithm, however, is fragile in that even small deviations from the i.i.d.\ sub-Gaussian model can cause the algorithm to diverge. This paper considers a "vector AMP" (VAMP) algorithm and shows that VAMP has a rigorous scalar state-evolution that holds under a much broader class of large random matrices $\mathbf{A}$: those that are right-orthogonally invariant. After performing an initial singular value decomposition (SVD) of $\mathbf{A}$, the per-iteration complexity of VAMP can be made similar to that of AMP. In addition, the fixed points of VAMP's state evolution are consistent with the replica prediction of the minimum mean-squared error recently derived by Tulino, Caire, Verdú, and Shamai. Numerical experiments are used to confirm the effectiveness of VAMP and its consistency with state-evolution predictions. △ Less

Submitted 12 June, 2018; v1 submitted 10 October, 2016; originally announced October 2016.

arXiv:1602.08207 [pdf, ps, other]

Learning and Free Energies for Vector Approximate Message Passing

Authors: Alyson K. Fletcher, Philip Schniter

Abstract: Vector approximate message passing (VAMP) is a computationally simple approach to the recovery of a signal $\mathbf{x}$ from noisy linear measurements $\mathbf{y}=\mathbf{Ax}+\mathbf{w}$. Like the AMP proposed by Donoho, Maleki, and Montanari in 2009, VAMP is characterized by a rigorous state evolution (SE) that holds under certain large random matrices and that matches the replica prediction of o… ▽ More Vector approximate message passing (VAMP) is a computationally simple approach to the recovery of a signal $\mathbf{x}$ from noisy linear measurements $\mathbf{y}=\mathbf{Ax}+\mathbf{w}$. Like the AMP proposed by Donoho, Maleki, and Montanari in 2009, VAMP is characterized by a rigorous state evolution (SE) that holds under certain large random matrices and that matches the replica prediction of optimality. But while AMP's SE holds only for large i.i.d. sub-Gaussian $\mathbf{A}$, VAMP's SE holds under the much larger class: right-rotationally invariant $\mathbf{A}$. To run VAMP, however, one must specify the statistical parameters of the signal and noise. This work combines VAMP with Expectation-Maximization to yield an algorithm, EM-VAMP, that can jointly recover $\mathbf{x}$ while learning those statistical parameters. The fixed points of the proposed EM-VAMP algorithm are shown to be stationary points of a certain constrained free-energy, providing a variational interpretation of the algorithm. Numerical simulations show that EM-VAMP is robust to highly ill-conditioned $\mathbf{A}$ with performance nearly matching oracle-parameter VAMP. △ Less

Submitted 8 March, 2018; v1 submitted 26 February, 2016; originally announced February 2016.

arXiv:1602.07795 [pdf, ps, other]

Expectation Consistent Approximate Inference: Generalizations and Convergence

Authors: Alyson K. Fletcher, Mojtaba Sahraee-Ardakan, Sundeep Rangan, Philip Schniter

Abstract: Approximations of loopy belief propagation, including expectation propagation and approximate message passing, have attracted considerable attention for probabilistic inference problems. This paper proposes and analyzes a generalization of Opper and Winther's expectation consistent (EC) approximate inference method. The proposed method, called Generalized Expectation Consistency (GEC), can be appl… ▽ More Approximations of loopy belief propagation, including expectation propagation and approximate message passing, have attracted considerable attention for probabilistic inference problems. This paper proposes and analyzes a generalization of Opper and Winther's expectation consistent (EC) approximate inference method. The proposed method, called Generalized Expectation Consistency (GEC), can be applied to both maximum a posteriori (MAP) and minimum mean squared error (MMSE) estimation. Here we characterize its fixed points, convergence, and performance relative to the replica prediction of optimality. △ Less

Submitted 24 January, 2017; v1 submitted 25 February, 2016; originally announced February 2016.

Comments: 10 pages

arXiv:1501.01797 [pdf, other]

Inference for Generalized Linear Models via Alternating Directions and Bethe Free Energy Minimization

Authors: Sundeep Rangan, Alyson K. Fletcher, Philip Schniter, Ulugbek Kamilov

Abstract: Generalized Linear Models (GLMs), where a random vector $\mathbf{x}$ is observed through a noisy, possibly nonlinear, function of a linear transform $\mathbf{z}=\mathbf{Ax}$ arise in a range of applications in nonlinear filtering and regression. Approximate Message Passing (AMP) methods, based on loopy belief propagation, are a promising class of approaches for approximate inference in these model… ▽ More Generalized Linear Models (GLMs), where a random vector $\mathbf{x}$ is observed through a noisy, possibly nonlinear, function of a linear transform $\mathbf{z}=\mathbf{Ax}$ arise in a range of applications in nonlinear filtering and regression. Approximate Message Passing (AMP) methods, based on loopy belief propagation, are a promising class of approaches for approximate inference in these models. AMP methods are computationally simple, general, and admit precise analyses with testable conditions for optimality for large i.i.d. transforms $\mathbf{A}$. However, the algorithms can easily diverge for general $\mathbf{A}$. This paper presents a convergent approach to the generalized AMP (GAMP) algorithm based on direct minimization of a large-system limit approximation of the Bethe Free Energy (LSL-BFE). The proposed method uses a double-loop procedure, where the outer loop successively linearizes the LSL-BFE and the inner loop minimizes the linearized LSL-BFE using the Alternating Direction Method of Multipliers (ADMM). The proposed method, called ADMM-GAMP, is similar in structure to the original GAMP method, but with an additional least-squares minimization. It is shown that for strictly convex, smooth penalties, ADMM-GAMP is guaranteed to converge to a local minima of the LSL-BFE, thus providing a convergent alternative to GAMP that is stable under arbitrary transforms. Simulations are also presented that demonstrate the robustness of the method for non-convex penalties as well. △ Less

Submitted 2 May, 2016; v1 submitted 8 January, 2015; originally announced January 2015.

arXiv:1409.0289 [pdf, other]

Scalable Inference for Neuronal Connectivity from Calcium Imaging

Authors: Alyson K. Fletcher, Sundeep Rangan

Abstract: Fluorescent calcium imaging provides a potentially powerful tool for inferring connectivity in neural circuits with up to thousands of neurons. However, a key challenge in using calcium imaging for connectivity detection is that current systems often have a temporal response and frame rate that can be orders of magnitude slower than the underlying neural spiking process. Bayesian inference methods… ▽ More Fluorescent calcium imaging provides a potentially powerful tool for inferring connectivity in neural circuits with up to thousands of neurons. However, a key challenge in using calcium imaging for connectivity detection is that current systems often have a temporal response and frame rate that can be orders of magnitude slower than the underlying neural spiking process. Bayesian inference methods based on expectation-maximization (EM) have been proposed to overcome these limitations, but are often computationally demanding since the E-step in the EM procedure typically involves state estimation for a high-dimensional nonlinear dynamical system. In this work, we propose a computationally fast method for the state estimation based on a hybrid of loopy belief propagation and approximate message passing (AMP). The key insight is that a neural system as viewed through calcium imaging can be factorized into simple scalar dynamical systems for each neuron with linear interconnections between the neurons. Using the structure, the updates in the proposed hybrid AMP methodology can be computed by a set of one-dimensional state estimation procedures and linear transforms with the connectivity matrix. This yields a computationally scalable method for inferring connectivity of large neural circuits. Simulations of the method on realistic neural networks demonstrate good accuracy with computation times that are potentially significantly faster than current approaches based on Markov Chain Monte Carlo methods. △ Less

Submitted 3 September, 2014; v1 submitted 1 September, 2014; originally announced September 2014.

Comments: 14 pages, 3 figures

arXiv:1402.3210 [pdf, ps, other]

On the Convergence of Approximate Message Passing with Arbitrary Matrices

Authors: Sundeep Rangan, Philip Schniter, Alyson K. Fletcher, Subrata Sarkar

Abstract: Approximate message passing (AMP) methods and their variants have attracted considerable recent attention for the problem of estimating a random vector $\mathbf{x}$ observed through a linear transform $\mathbf{A}$. In the case of large i.i.d. zero-mean Gaussian $\mathbf{A}$, the methods exhibit fast convergence with precise analytic characterizations on the algorithm behavior. However, the converg… ▽ More Approximate message passing (AMP) methods and their variants have attracted considerable recent attention for the problem of estimating a random vector $\mathbf{x}$ observed through a linear transform $\mathbf{A}$. In the case of large i.i.d. zero-mean Gaussian $\mathbf{A}$, the methods exhibit fast convergence with precise analytic characterizations on the algorithm behavior. However, the convergence of AMP under general transforms $\mathbf{A}$ is not fully understood. In this paper, we provide sufficient conditions for the convergence of a damped version of the generalized AMP (GAMP) algorithm in the case of quadratic cost functions (i.e., Gaussian likelihood and prior). It is shown that, with sufficient damping, the algorithm is guaranteed to converge, although the amount of damping grows with peak-to-average ratio of the squared singular values of the transforms $\mathbf{A}$. This result explains the good performance of AMP on i.i.d. Gaussian transforms $\mathbf{A}$, but also their difficulties with ill-conditioned or non-zero-mean transforms $\mathbf{A}$. A related sufficient condition is then derived for the local stability of the damped GAMP method under general cost functions, assuming certain strict convexity conditions. △ Less

Submitted 2 March, 2018; v1 submitted 13 February, 2014; originally announced February 2014.

arXiv:1207.3859 [pdf, other]

Approximate Message Passing with Consistent Parameter Estimation and Applications to Sparse Learning

Authors: Ulugbek S. Kamilov, Sundeep Rangan, Alyson K. Fletcher, Michael Unser

Abstract: We consider the estimation of an i.i.d. (possibly non-Gaussian) vector $\xbf \in \R^n$ from measurements $\ybf \in \R^m$ obtained by a general cascade model consisting of a known linear transform followed by a probabilistic componentwise (possibly nonlinear) measurement channel. A novel method, called adaptive generalized approximate message passing (Adaptive GAMP), that enables joint learning of… ▽ More We consider the estimation of an i.i.d. (possibly non-Gaussian) vector $\xbf \in \R^n$ from measurements $\ybf \in \R^m$ obtained by a general cascade model consisting of a known linear transform followed by a probabilistic componentwise (possibly nonlinear) measurement channel. A novel method, called adaptive generalized approximate message passing (Adaptive GAMP), that enables joint learning of the statistics of the prior and measurement channel along with estimation of the unknown vector $\xbf$ is presented. The proposed algorithm is a generalization of a recently-developed EM-GAMP that uses expectation-maximization (EM) iterations where the posteriors in the E-steps are computed via approximate message passing. The methodology can be applied to a large class of learning problems including the learning of sparse priors in compressed sensing or identification of linear-nonlinear cascade models in dynamical systems and neural spiking processes. We prove that for large i.i.d. Gaussian transform matrices the asymptotic componentwise behavior of the adaptive GAMP algorithm is predicted by a simple set of scalar state evolution equations. In addition, we show that when a certain maximum-likelihood estimation can be performed in each step, the adaptive GAMP method can yield asymptotically consistent parameter estimates, which implies that the algorithm achieves a reconstruction quality equivalent to the oracle algorithm that knows the correct parameter values. Remarkably, this result applies to essentially arbitrary parametrizations of the unknown distributions, including ones that are nonlinear and non-Gaussian. The adaptive GAMP methodology thus provides a systematic, general and computationally efficient method applicable to a large range of complex linear-nonlinear models with provable guarantees. △ Less

Submitted 1 December, 2012; v1 submitted 16 July, 2012; originally announced July 2012.

Comments: 14 pages, 3 figures

arXiv:1202.2759 [pdf, ps, other]

Iterative Reconstruction of Rank-One Matrices in Noise

Authors: Alyson K. Fletcher, Sundeep Rangan

Abstract: We consider the problem of estimating a rank-one matrix in Gaussian noise under a probabilistic model for the left and right factors of the matrix. The probabilistic model can impose constraints on the factors including sparsity and positivity that arise commonly in learning problems. We propose a family of algorithms that reduce the problem to a sequence of scalar estimation computations. These a… ▽ More We consider the problem of estimating a rank-one matrix in Gaussian noise under a probabilistic model for the left and right factors of the matrix. The probabilistic model can impose constraints on the factors including sparsity and positivity that arise commonly in learning problems. We propose a family of algorithms that reduce the problem to a sequence of scalar estimation computations. These algorithms are similar to approximate message passing techniques based on Gaussian approximations of loopy belief propagation that have been used recently in compressed sensing. Leveraging analysis methods by Bayati and Montanari, we show that the asymptotic behavior of the algorithm is described by a simple scalar equivalent model, where the distribution of the estimates at each iteration is identical to certain scalar estimates of the variables in Gaussian noise. Moreover, the effective Gaussian noise level is described by a set of state evolution equations. The proposed approach to deriving algorithms thus provides a computationally simple and general method for rank-one estimation problems with a precise analysis in certain high-dimensional settings. △ Less

Submitted 15 September, 2015; v1 submitted 13 February, 2012; originally announced February 2012.

Comments: 28 pages, 2 figures

arXiv:1111.2581 [pdf, ps, other]

Hybrid Approximate Message Passing

Authors: Sundeep Rangan, Alyson K. Fletcher, Vivek K. Goyal, Evan Byrne, Philip Schniter

Abstract: Gaussian and quadratic approximations of message passing algorithms on graphs have attracted considerable recent attention due to their computational simplicity, analytic tractability, and wide applicability in optimization and statistical inference problems. This paper presents a systematic framework for incorporating such approximate message passing (AMP) methods in general graphical models. The… ▽ More Gaussian and quadratic approximations of message passing algorithms on graphs have attracted considerable recent attention due to their computational simplicity, analytic tractability, and wide applicability in optimization and statistical inference problems. This paper presents a systematic framework for incorporating such approximate message passing (AMP) methods in general graphical models. The key concept is a partition of dependencies of a general graphical model into strong and weak edges, with the weak edges representing interactions through aggregates of small, linearizable couplings of variables. AMP approximations based on the Central Limit Theorem can be readily applied to aggregates of many weak edges and integrated with standard message passing updates on the strong edges. The resulting algorithm, which we call hybrid generalized approximate message passing (HyGAMP), can yield significantly simpler implementations of sum-product and max-sum loopy belief propagation. By varying the partition of strong and weak edges, a performance--complexity trade-off can be achieved. Group sparsity and multinomial logistic regression problems are studied as examples of the proposed methodology. △ Less

Submitted 23 March, 2017; v1 submitted 10 November, 2011; originally announced November 2011.

arXiv:1110.6188 [pdf, ps, other]

doi 10.1109/TSP.2012.2208957

Ranked Sparse Signal Support Detection

Authors: Alyson K. Fletcher, Sundeep Rangan, Vivek K Goyal

Abstract: This paper considers the problem of detecting the support (sparsity pattern) of a sparse vector from random noisy measurements. Conditional power of a component of the sparse vector is defined as the energy conditioned on the component being nonzero. Analysis of a simplified version of orthogonal matching pursuit (OMP) called sequential OMP (SequOMP) demonstrates the importance of knowledge of the… ▽ More This paper considers the problem of detecting the support (sparsity pattern) of a sparse vector from random noisy measurements. Conditional power of a component of the sparse vector is defined as the energy conditioned on the component being nonzero. Analysis of a simplified version of orthogonal matching pursuit (OMP) called sequential OMP (SequOMP) demonstrates the importance of knowledge of the rankings of conditional powers. When the simple SequOMP algorithm is applied to components in nonincreasing order of conditional power, the detrimental effect of dynamic range on thresholding performance is eliminated. Furthermore, under the most favorable conditional powers, the performance of SequOMP approaches maximum likelihood performance at high signal-to-noise ratio. △ Less

Submitted 27 October, 2011; originally announced October 2011.

Comments: 13 pages

Journal ref: IEEE Trans. on Signal Processing, vol. 60, no. 11, pp. 5919-5931, November 2012

arXiv:1105.5853 [pdf, ps, other]

doi 10.1109/TSP.2011.2176936

Orthogonal Matching Pursuit: A Brownian Motion Analysis

Authors: Alyson K. Fletcher, Sundeep Rangan

Abstract: A well-known analysis of Tropp and Gilbert shows that orthogonal matching pursuit (OMP) can recover a k-sparse n-dimensional real vector from 4 k log(n) noise-free linear measurements obtained through a random Gaussian measurement matrix with a probability that approaches one as n approaches infinity. This work strengthens this result by showing that a lower number of measurements, 2 k log(n - k),… ▽ More A well-known analysis of Tropp and Gilbert shows that orthogonal matching pursuit (OMP) can recover a k-sparse n-dimensional real vector from 4 k log(n) noise-free linear measurements obtained through a random Gaussian measurement matrix with a probability that approaches one as n approaches infinity. This work strengthens this result by showing that a lower number of measurements, 2 k log(n - k), is in fact sufficient for asymptotic recovery. More generally, when the sparsity level satisfies kmin <= k <= kmax but is unknown, 2 kmax log(n - kmin) measurements is sufficient. Furthermore, this number of measurements is also sufficient for detection of the sparsity pattern (support) of the vector with measurement errors provided the signal-to-noise ratio (SNR) scales to infinity. The scaling 2 k log(n - k) exactly matches the number of measurements required by the more complex lasso method for signal recovery with a similar SNR scaling. △ Less

Submitted 29 May, 2011; originally announced May 2011.

Comments: 11 pages, 2 figures

arXiv:0906.3234 [pdf, ps, other]

doi 10.1109/TIT.2011.2177575

Asymptotic Analysis of MAP Estimation via the Replica Method and Applications to Compressed Sensing

Authors: Sundeep Rangan, Alyson K. Fletcher, Vivek K Goyal

Abstract: The replica method is a non-rigorous but well-known technique from statistical physics used in the asymptotic analysis of large, random, nonlinear problems. This paper applies the replica method, under the assumption of replica symmetry, to study estimators that are maximum a posteriori (MAP) under a postulated prior distribution. It is shown that with random linear measurements and Gaussian noise… ▽ More The replica method is a non-rigorous but well-known technique from statistical physics used in the asymptotic analysis of large, random, nonlinear problems. This paper applies the replica method, under the assumption of replica symmetry, to study estimators that are maximum a posteriori (MAP) under a postulated prior distribution. It is shown that with random linear measurements and Gaussian noise, the replica-symmetric prediction of the asymptotic behavior of the postulated MAP estimate of an n-dimensional vector "decouples" as n scalar postulated MAP estimators. The result is based on applying a hardening argument to the replica analysis of postulated posterior mean estimators of Tanaka and of Guo and Verdu. The replica-symmetric postulated MAP analysis can be readily applied to many estimators used in compressed sensing, including basis pursuit, lasso, linear estimation with thresholding, and zero norm-regularized estimation. In the case of lasso estimation the scalar estimator reduces to a soft-thresholding operator, and for zero norm-regularized estimation it reduces to a hard-threshold. Among other benefits, the replica method provides a computationally-tractable method for precisely predicting various performance metrics including mean-squared error and sparsity pattern recovery probability. △ Less

Submitted 8 October, 2011; v1 submitted 17 June, 2009; originally announced June 2009.

Comments: 22 pages; added details on the replica symmetry assumption

Journal ref: IEEE Trans. on Information Theory, vol. 58, no. 3, pp. 1903-1923, March 2012

arXiv:0903.1022 [pdf, ps, other]

On-Off Random Access Channels: A Compressed Sensing Framework

Authors: Alyson K. Fletcher, Sundeep Rangan, Vivek K Goyal

Abstract: This paper considers a simple on-off random multiple access channel, where n users communicate simultaneously to a single receiver over m degrees of freedom. Each user transmits with probability lambda, where typically lambda n < m << n, and the receiver must detect which users transmitted. We show that when the codebook has i.i.d. Gaussian entries, detecting which users transmitted is mathemati… ▽ More This paper considers a simple on-off random multiple access channel, where n users communicate simultaneously to a single receiver over m degrees of freedom. Each user transmits with probability lambda, where typically lambda n < m << n, and the receiver must detect which users transmitted. We show that when the codebook has i.i.d. Gaussian entries, detecting which users transmitted is mathematically equivalent to a certain sparsity detection problem considered in compressed sensing. Using recent sparsity results, we derive upper and lower bounds on the capacities of these channels. We show that common sparsity detection algorithms, such as lasso and orthogonal matching pursuit (OMP), can be used as tractable multiuser detection schemes and have significantly better performance than single-user detection. These methods do achieve some near-far resistance but--at high signal-to-noise ratios (SNRs)--may achieve capacities far below optimal maximum likelihood detection. We then present a new algorithm, called sequential OMP, that illustrates that iterative detection combined with power ordering or power shaping can significantly improve the high SNR performance. Sequential OMP is analogous to successive interference cancellation in the classic multiple access channel. Our results thereby provide insight into the roles of power control and multiuser detection on random-access signalling. △ Less

Submitted 9 March, 2009; v1 submitted 5 March, 2009; originally announced March 2009.

Comments: 18 pages, 5 figures; addition of inadvertently omitted support information and acknowledgments

arXiv:0804.1839 [pdf, ps, other]

doi 10.1109/TIT.2009.2032726

Necessary and Sufficient Conditions on Sparsity Pattern Recovery

Authors: Alyson K. Fletcher, Sundeep Rangan, Vivek K. Goyal

Abstract: The problem of detecting the sparsity pattern of a k-sparse vector in R^n from m random noisy measurements is of interest in many areas such as system identification, denoising, pattern recognition, and compressed sensing. This paper addresses the scaling of the number of measurements m, with signal dimension n and sparsity-level nonzeros k, for asymptotically-reliable detection. We show a neces… ▽ More The problem of detecting the sparsity pattern of a k-sparse vector in R^n from m random noisy measurements is of interest in many areas such as system identification, denoising, pattern recognition, and compressed sensing. This paper addresses the scaling of the number of measurements m, with signal dimension n and sparsity-level nonzeros k, for asymptotically-reliable detection. We show a necessary condition for perfect recovery at any given SNR for all algorithms, regardless of complexity, is m = Omega(k log(n-k)) measurements. Conversely, it is shown that this scaling of Omega(k log(n-k)) measurements is sufficient for a remarkably simple ``maximum correlation'' estimator. Hence this scaling is optimal and does not require more sophisticated techniques such as lasso or matching pursuit. The constants for both the necessary and sufficient conditions are precisely defined in terms of the minimum-to-average ratio of the nonzero components and the SNR. The necessary condition improves upon previous results for maximum likelihood estimation. For lasso, it also provides a necessary condition at any SNR and for low SNR improves upon previous work. The sufficient condition provides the first asymptotically-reliable detection guarantee at finite SNR. △ Less

Submitted 11 April, 2008; originally announced April 2008.

Comments: Submitted to IEEE Transactions on Information Theory

Journal ref: IEEE Trans. on Information Theory, vol. 55, no. 12, pp. 5758-5772, December 2009

Showing 1–32 of 32 results for author: Fletcher, K