-
Optimization Guarantees of Unfolded ISTA and ADMM Networks With Smooth Soft-Thresholding
Authors:
Shaik Basheeruddin Shah,
Pradyumna Pradhan,
Wei Pu,
Ramunaidu Randhi,
Miguel R. D. Rodrigues,
Yonina C. Eldar
Abstract:
Solving linear inverse problems plays a crucial role in numerous applications. Algorithm unfolding based, model-aware data-driven approaches have gained significant attention for effectively addressing these problems. Learned iterative soft-thresholding algorithm (LISTA) and alternating direction method of multipliers compressive sensing network (ADMM-CSNet) are two widely used such approaches, ba…
▽ More
Solving linear inverse problems plays a crucial role in numerous applications. Algorithm unfolding based, model-aware data-driven approaches have gained significant attention for effectively addressing these problems. Learned iterative soft-thresholding algorithm (LISTA) and alternating direction method of multipliers compressive sensing network (ADMM-CSNet) are two widely used such approaches, based on ISTA and ADMM algorithms, respectively. In this work, we study optimization guarantees, i.e., achieving near-zero training loss with the increase in the number of learning epochs, for finite-layer unfolded networks such as LISTA and ADMM-CSNet with smooth soft-thresholding in an over-parameterized (OP) regime. We achieve this by leveraging a modified version of the Polyak-Lojasiewicz, denoted PL$^*$, condition. Satisfying the PL$^*$ condition within a specific region of the loss landscape ensures the existence of a global minimum and exponential convergence from initialization using gradient descent based methods. Hence, we provide conditions, in terms of the network width and the number of training samples, on these unfolded networks for the PL$^*$ condition to hold. We achieve this by deriving the Hessian spectral norm of these networks. Additionally, we show that the threshold on the number of training samples increases with the increase in the network width. Furthermore, we compare the threshold on training samples of unfolded networks with that of a standard fully-connected feed-forward network (FFNN) with smooth soft-thresholding non-linearity. We prove that unfolded networks have a higher threshold value than FFNN. Consequently, one can expect a better expected error for unfolded networks than FFNN.
△ Less
Submitted 12 September, 2023;
originally announced September 2023.
-
Generalization and Estimation Error Bounds for Model-based Neural Networks
Authors:
Avner Shultzman,
Eyar Azar,
Miguel R. D. Rodrigues,
Yonina C. Eldar
Abstract:
Model-based neural networks provide unparalleled performance for various tasks, such as sparse coding and compressed sensing problems. Due to the strong connection with the sensing model, these networks are interpretable and inherit prior structure of the problem. In practice, model-based neural networks exhibit higher generalization capability compared to ReLU neural networks. However, this pheno…
▽ More
Model-based neural networks provide unparalleled performance for various tasks, such as sparse coding and compressed sensing problems. Due to the strong connection with the sensing model, these networks are interpretable and inherit prior structure of the problem. In practice, model-based neural networks exhibit higher generalization capability compared to ReLU neural networks. However, this phenomenon was not addressed theoretically. Here, we leverage complexity measures including the global and local Rademacher complexities, in order to provide upper bounds on the generalization and estimation errors of model-based networks. We show that the generalization abilities of model-based networks for sparse recovery outperform those of regular ReLU networks, and derive practical design rules that allow to construct model-based networks with guaranteed high generalization. We demonstrate through a series of experiments that our theoretical insights shed light on a few behaviours experienced in practice, including the fact that ISTA and ADMM networks exhibit higher generalization abilities (especially for small number of training samples), compared to ReLU networks.
△ Less
Submitted 19 April, 2023;
originally announced April 2023.
-
Information-theoretic Characterizations of Generalization Error for the Gibbs Algorithm
Authors:
Gholamali Aminian,
Yuheng Bu,
Laura Toni,
Miguel R. D. Rodrigues,
Gregory W. Wornell
Abstract:
Various approaches have been developed to upper bound the generalization error of a supervised learning algorithm. However, existing bounds are often loose and even vacuous when evaluated in practice. As a result, they may fail to characterize the exact generalization ability of a learning algorithm. Our main contributions are exact characterizations of the expected generalization error of the wel…
▽ More
Various approaches have been developed to upper bound the generalization error of a supervised learning algorithm. However, existing bounds are often loose and even vacuous when evaluated in practice. As a result, they may fail to characterize the exact generalization ability of a learning algorithm. Our main contributions are exact characterizations of the expected generalization error of the well-known Gibbs algorithm (a.k.a. Gibbs posterior) using different information measures, in particular, the symmetrized KL information between the input training samples and the output hypothesis. Our result can be applied to tighten existing expected generalization error and PAC-Bayesian bounds. Our information-theoretic approach is versatile, as it also characterizes the generalization error of the Gibbs algorithm with a data-dependent regularizer and that of the Gibbs algorithm in the asymptotic regime, where it converges to the standard empirical risk minimization algorithm. Of particular relevance, our results highlight the role the symmetrized KL information plays in controlling the generalization error of the Gibbs algorithm.
△ Less
Submitted 18 October, 2022;
originally announced October 2022.
-
Learning Algorithm Generalization Error Bounds via Auxiliary Distributions
Authors:
Gholamali Aminian,
Saeed Masiha,
Laura Toni,
Miguel R. D. Rodrigues
Abstract:
Generalization error bounds are essential for comprehending how well machine learning models work. In this work, we suggest a novel method, i.e., the Auxiliary Distribution Method, that leads to new upper bounds on expected generalization errors that are appropriate for supervised learning scenarios. We show that our general upper bounds can be specialized under some conditions to new bounds invol…
▽ More
Generalization error bounds are essential for comprehending how well machine learning models work. In this work, we suggest a novel method, i.e., the Auxiliary Distribution Method, that leads to new upper bounds on expected generalization errors that are appropriate for supervised learning scenarios. We show that our general upper bounds can be specialized under some conditions to new bounds involving the $α$-Jensen-Shannon, $α$-Rényi ($0< α< 1$) information between a random variable modeling the set of training samples and another random variable modeling the set of hypotheses. Our upper bounds based on $α$-Jensen-Shannon information are also finite. Additionally, we demonstrate how our auxiliary distribution method can be used to derive the upper bounds on excess risk of some learning algorithms in the supervised learning context {\blue and the generalization error under the distribution mismatch scenario in supervised learning algorithms, where the distribution mismatch is modeled as $α$-Jensen-Shannon or $α$-Rényi divergence between the distribution of test and training data samples distributions.} We also outline the conditions for which our proposed upper bounds might be tighter than other earlier upper bounds.
△ Less
Submitted 16 April, 2024; v1 submitted 2 October, 2022;
originally announced October 2022.
-
Semi-supervised Batch Learning From Logged Data
Authors:
Gholamali Aminian,
Armin Behnamnia,
Roberto Vega,
Laura Toni,
Chengchun Shi,
Hamid R. Rabiee,
Omar Rivasplata,
Miguel R. D. Rodrigues
Abstract:
Off-policy learning methods are intended to learn a policy from logged data, which includes context, action, and feedback (cost or reward) for each sample point. In this work, we build on the counterfactual risk minimization framework, which also assumes access to propensity scores. We propose learning methods for problems where feedback is missing for some samples, so there are samples with feedb…
▽ More
Off-policy learning methods are intended to learn a policy from logged data, which includes context, action, and feedback (cost or reward) for each sample point. In this work, we build on the counterfactual risk minimization framework, which also assumes access to propensity scores. We propose learning methods for problems where feedback is missing for some samples, so there are samples with feedback and samples missing-feedback in the logged data. We refer to this type of learning as semi-supervised batch learning from logged data, which arises in a wide range of application domains. We derive a novel upper bound for the true risk under the inverse propensity score estimator to address this kind of learning problem. Using this bound, we propose a regularized semi-supervised batch learning method with logged data where the regularization term is feedback-independent and, as a result, can be evaluated using the logged missing-feedback data. Consequently, even though feedback is only present for some samples, a learning policy can be learned by leveraging the missing-feedback samples. The results of experiments derived from benchmark datasets indicate that these algorithms achieve policies with better performance in comparison with logging policies.
△ Less
Submitted 18 February, 2024; v1 submitted 15 September, 2022;
originally announced September 2022.
-
Theoretical Perspectives on Deep Learning Methods in Inverse Problems
Authors:
Jonathan Scarlett,
Reinhard Heckel,
Miguel R. D. Rodrigues,
Paul Hand,
Yonina C. Eldar
Abstract:
In recent years, there have been significant advances in the use of deep learning methods in inverse problems such as denoising, compressive sensing, inpainting, and super-resolution. While this line of works has predominantly been driven by practical algorithms and experiments, it has also given rise to a variety of intriguing theoretical problems. In this paper, we survey some of the prominent t…
▽ More
In recent years, there have been significant advances in the use of deep learning methods in inverse problems such as denoising, compressive sensing, inpainting, and super-resolution. While this line of works has predominantly been driven by practical algorithms and experiments, it has also given rise to a variety of intriguing theoretical problems. In this paper, we survey some of the prominent theoretical developments in this line of works, focusing in particular on generative priors, untrained neural network priors, and unfolding algorithms. In addition to summarizing existing results in these topics, we highlight several ongoing challenges and open problems.
△ Less
Submitted 29 January, 2023; v1 submitted 28 June, 2022;
originally announced June 2022.
-
An Information-theoretical Approach to Semi-supervised Learning under Covariate-shift
Authors:
Gholamali Aminian,
Mahed Abroshan,
Mohammad Mahdi Khalili,
Laura Toni,
Miguel R. D. Rodrigues
Abstract:
A common assumption in semi-supervised learning is that the labeled, unlabeled, and test data are drawn from the same distribution. However, this assumption is not satisfied in many applications. In many scenarios, the data is collected sequentially (e.g., healthcare) and the distribution of the data may change over time often exhibiting so-called covariate shifts. In this paper, we propose an app…
▽ More
A common assumption in semi-supervised learning is that the labeled, unlabeled, and test data are drawn from the same distribution. However, this assumption is not satisfied in many applications. In many scenarios, the data is collected sequentially (e.g., healthcare) and the distribution of the data may change over time often exhibiting so-called covariate shifts. In this paper, we propose an approach for semi-supervised learning algorithms that is capable of addressing this issue. Our framework also recovers some popular methods, including entropy minimization and pseudo-labeling. We provide new information-theoretical based generalization error upper bounds inspired by our novel framework. Our bounds are applicable to both general semi-supervised learning and the covariate-shift scenario. Finally, we show numerically that our method outperforms previous approaches proposed for semi-supervised learning under the covariate shift.
△ Less
Submitted 24 February, 2022;
originally announced February 2022.
-
Robust lEarned Shrinkage-Thresholding (REST): Robust unrolling for sparse recover
Authors:
Wei Pu,
Chao Zhou,
Yonina C. Eldar,
Miguel R. D. Rodrigues
Abstract:
In this paper, we consider deep neural networks for solving inverse problems that are robust to forward model mis-specifications. Specifically, we treat sensing problems with model mismatch where one wishes to recover a sparse high-dimensional vector from low-dimensional observations subject to uncertainty in the measurement operator. We then design a new robust deep neural network architecture by…
▽ More
In this paper, we consider deep neural networks for solving inverse problems that are robust to forward model mis-specifications. Specifically, we treat sensing problems with model mismatch where one wishes to recover a sparse high-dimensional vector from low-dimensional observations subject to uncertainty in the measurement operator. We then design a new robust deep neural network architecture by applying algorithm unfolding techniques to a robust version of the underlying recovery problem. Our proposed network - named Robust lEarned Shrinkage-Thresholding (REST) - exhibits an additional normalization processing compared to Learned Iterative Shrinkage-Thresholding Algorithm (LISTA), leading to reliable recovery of the signal under sample-wise varying model mismatch. The proposed REST network is shown to outperform state-of-the-art model-based and data-driven algorithms in both compressive sensing and radar imaging problems wherein model mismatch is taken into consideration.
△ Less
Submitted 20 October, 2021;
originally announced October 2021.
-
Characterizing the Generalization Error of Gibbs Algorithm with Symmetrized KL information
Authors:
Gholamali Aminian,
Yuheng Bu,
Laura Toni,
Miguel R. D. Rodrigues,
Gregory Wornell
Abstract:
Bounding the generalization error of a supervised learning algorithm is one of the most important problems in learning theory, and various approaches have been developed. However, existing bounds are often loose and lack of guarantees. As a result, they may fail to characterize the exact generalization ability of a learning algorithm. Our main contribution is an exact characterization of the expec…
▽ More
Bounding the generalization error of a supervised learning algorithm is one of the most important problems in learning theory, and various approaches have been developed. However, existing bounds are often loose and lack of guarantees. As a result, they may fail to characterize the exact generalization ability of a learning algorithm. Our main contribution is an exact characterization of the expected generalization error of the well-known Gibbs algorithm in terms of symmetrized KL information between the input training samples and the output hypothesis. Such a result can be applied to tighten existing expected generalization error bound. Our analysis provides more insight on the fundamental role the symmetrized KL information plays in controlling the generalization error of the Gibbs algorithm.
△ Less
Submitted 28 July, 2021;
originally announced July 2021.
-
Information-Theoretic Bounds on the Moments of the Generalization Error of Learning Algorithms
Authors:
Gholamali Aminian,
Laura Toni,
Miguel R. D. Rodrigues
Abstract:
Generalization error bounds are critical to understanding the performance of machine learning models. In this work, building upon a new bound of the expected value of an arbitrary function of the population and empirical risk of a learning algorithm, we offer a more refined analysis of the generalization behaviour of a machine learning models based on a characterization of (bounds) to their genera…
▽ More
Generalization error bounds are critical to understanding the performance of machine learning models. In this work, building upon a new bound of the expected value of an arbitrary function of the population and empirical risk of a learning algorithm, we offer a more refined analysis of the generalization behaviour of a machine learning models based on a characterization of (bounds) to their generalization error moments. We discuss how the proposed bounds -- which also encompass new bounds to the expected generalization error -- relate to existing bounds in the literature. We also discuss how the proposed generalization error moment bounds can be used to construct new generalization error high-probability bounds.
△ Less
Submitted 5 May, 2021; v1 submitted 3 February, 2021;
originally announced February 2021.
-
Jensen-Shannon Information Based Characterization of the Generalization Error of Learning Algorithms
Authors:
Gholamali Aminian,
Laura Toni,
Miguel R. D. Rodrigues
Abstract:
Generalization error bounds are critical to understanding the performance of machine learning models. In this work, we propose a new information-theoretic based generalization error upper bound applicable to supervised learning scenarios. We show that our general bound can specialize in various previous bounds. We also show that our general bound can be specialized under some conditions to a new b…
▽ More
Generalization error bounds are critical to understanding the performance of machine learning models. In this work, we propose a new information-theoretic based generalization error upper bound applicable to supervised learning scenarios. We show that our general bound can specialize in various previous bounds. We also show that our general bound can be specialized under some conditions to a new bound involving the Jensen-Shannon information between a random variable modelling the set of training samples and another random variable modelling the hypothesis. We also prove that our bound can be tighter than mutual information-based bounds under some conditions.
△ Less
Submitted 8 January, 2021; v1 submitted 23 October, 2020;
originally announced October 2020.
-
Image Separation with Side Information: A Connected Auto-Encoders Based Approach
Authors:
Wei Pu,
Barak Sober,
Nathan Daly,
Zahra Sabetsarvestani,
Catherine Higgitt,
Ingrid Daubechies,
Miguel R. D. Rodrigues
Abstract:
X-radiography (X-ray imaging) is a widely used imaging technique in art investigation. It can provide information about the condition of a painting as well as insights into an artist's techniques and working methods, often revealing hidden information invisible to the naked eye. In this paper, we deal with the problem of separating mixed X-ray images originating from the radiography of double-side…
▽ More
X-radiography (X-ray imaging) is a widely used imaging technique in art investigation. It can provide information about the condition of a painting as well as insights into an artist's techniques and working methods, often revealing hidden information invisible to the naked eye. In this paper, we deal with the problem of separating mixed X-ray images originating from the radiography of double-sided paintings. Using the visible color images (RGB images) from each side of the painting, we propose a new Neural Network architecture, based upon 'connected' auto-encoders, designed to separate the mixed X-ray image into two simulated X-ray images corresponding to each side. In this proposed architecture, the convolutional auto encoders extract features from the RGB images. These features are then used to (1) reproduce both of the original RGB images, (2) reconstruct the hypothetical separated X-ray images, and (3) regenerate the mixed X-ray image. The algorithm operates in a totally self-supervised fashion without requiring a sample set that contains both the mixed X-ray images and the separated ones. The methodology was tested on images from the double-sided wing panels of the \textsl{Ghent Altarpiece}, painted in 1432 by the brothers Hubert and Jan van Eyck. These tests show that the proposed approach outperforms other state-of-the-art X-ray image separation methods for art investigation applications.
△ Less
Submitted 16 September, 2020;
originally announced September 2020.
-
Model-Aware Regularization For Learning Approaches To Inverse Problems
Authors:
Jaweria Amjad,
Zhaoyan Lyu,
Miguel R. D. Rodrigues
Abstract:
There are various inverse problems -- including reconstruction problems arising in medical imaging -- where one is often aware of the forward operator that maps variables of interest to the observations. It is therefore natural to ask whether such knowledge of the forward operator can be exploited in deep learning approaches increasingly used to solve inverse problems.
In this paper, we provide…
▽ More
There are various inverse problems -- including reconstruction problems arising in medical imaging -- where one is often aware of the forward operator that maps variables of interest to the observations. It is therefore natural to ask whether such knowledge of the forward operator can be exploited in deep learning approaches increasingly used to solve inverse problems.
In this paper, we provide one such way via an analysis of the generalisation error of deep learning methods applicable to inverse problems. In particular, by building on the algorithmic robustness framework, we offer a generalisation error bound that encapsulates key ingredients associated with the learning problem such as the complexity of the data space, the size of the training set, the Jacobian of the deep neural network and the Jacobian of the composition of the forward operator with the neural network. We then propose a 'plug-and-play' regulariser that leverages the knowledge of the forward map to improve the generalization of the network. We likewise also propose a new method allowing us to tightly upper bound the Lipschitz constants of the relevant functions that is much more computational efficient than existing ones. We demonstrate the efficacy of our model-aware regularised deep learning algorithms against other state-of-the-art approaches on inverse problems involving various sub-sampling operators such as those used in classical compressed sensing setup and accelerated Magnetic Resonance Imaging (MRI).
△ Less
Submitted 18 June, 2020;
originally announced June 2020.
-
Lautum Regularization for Semi-supervised Transfer Learning
Authors:
Daniel Jakubovitz,
Miguel R. D. Rodrigues,
Raja Giryes
Abstract:
Transfer learning is a very important tool in deep learning as it allows propagating information from one "source dataset" to another "target dataset", especially in the case of a small number of training examples in the latter. Yet, discrepancies between the underlying distributions of the source and target data are commonplace and are known to have a substantial impact on algorithm performance.…
▽ More
Transfer learning is a very important tool in deep learning as it allows propagating information from one "source dataset" to another "target dataset", especially in the case of a small number of training examples in the latter. Yet, discrepancies between the underlying distributions of the source and target data are commonplace and are known to have a substantial impact on algorithm performance. In this work we suggest a novel information theoretic approach for the analysis of the performance of deep neural networks in the context of transfer learning. We focus on the task of semi-supervised transfer learning, in which unlabeled samples from the target dataset are available during the network training on the source dataset. Our theory suggests that one may improve the transferability of a deep neural network by imposing a Lautum information based regularization that relates the network weights to the target data. We demonstrate the effectiveness of the proposed approach in various transfer learning experiments.
△ Less
Submitted 23 January, 2020; v1 submitted 2 April, 2019;
originally announced April 2019.
-
Deep Learning for Inverse Problems: Bounds and Regularizers
Authors:
Jaweria Amjad,
Zhaoyan Lyu,
Miguel R. D. Rodrigues
Abstract:
Inverse problems arise in a number of domains such as medical imaging, remote sensing, and many more, relying on the use of advanced signal and image processing approaches -- such as sparsity-driven techniques -- to determine their solution. This paper instead studies the use of deep learning approaches to approximate the solution of inverse problems. In particular, the paper provides a new genera…
▽ More
Inverse problems arise in a number of domains such as medical imaging, remote sensing, and many more, relying on the use of advanced signal and image processing approaches -- such as sparsity-driven techniques -- to determine their solution. This paper instead studies the use of deep learning approaches to approximate the solution of inverse problems. In particular, the paper provides a new generalization bound, depending on key quantity associated with a deep neural network -- its Jacobian matrix -- that also leads to a number of computationally efficient regularization strategies applicable to inverse problems. The paper also tests the proposed regularization strategies in a number of inverse problems including image super-resolution ones. Our numerical results conducted on various datasets show that both fully connected and convolutional neural networks regularized using the regularization or proxy regularization strategies originating from our theory exhibit much better performance than deep networks regularized with standard approaches such as weight-decay.
△ Less
Submitted 31 January, 2019;
originally announced January 2019.
-
Generalization Error in Deep Learning
Authors:
Daniel Jakubovitz,
Raja Giryes,
Miguel R. D. Rodrigues
Abstract:
Deep learning models have lately shown great performance in various fields such as computer vision, speech recognition, speech translation, and natural language processing. However, alongside their state-of-the-art performance, it is still generally unclear what is the source of their generalization ability. Thus, an important question is what makes deep neural networks able to generalize well fro…
▽ More
Deep learning models have lately shown great performance in various fields such as computer vision, speech recognition, speech translation, and natural language processing. However, alongside their state-of-the-art performance, it is still generally unclear what is the source of their generalization ability. Thus, an important question is what makes deep neural networks able to generalize well from the training set to new data. In this article, we provide an overview of the existing theory and bounds for the characterization of the generalization error of deep neural networks, combining both classical and more recent theoretical and empirical results.
△ Less
Submitted 6 April, 2019; v1 submitted 3 August, 2018;
originally announced August 2018.
-
Hardware-Limited Task-Based Quantization
Authors:
Nir Shlezinger,
Yonina C. Eldar,
Miguel R. D. Rodrigues
Abstract:
Quantization plays a critical role in digital signal processing systems. Quantizers are typically designed to obtain an accurate digital representation of the input signal, operating independently of the system task, and are commonly implemented using serial scalar analog-to-digital converters (ADCs). In this work, we study hardware-limited task-based quantization, where a system utilizing a seria…
▽ More
Quantization plays a critical role in digital signal processing systems. Quantizers are typically designed to obtain an accurate digital representation of the input signal, operating independently of the system task, and are commonly implemented using serial scalar analog-to-digital converters (ADCs). In this work, we study hardware-limited task-based quantization, where a system utilizing a serial scalar ADC is designed to provide a suitable representation in order to allow the recovery of a parameter vector underlying the input signal. We propose hardware-limited task-based quantization systems for a fixed and finite quantization resolution, and characterize their achievable distortion. We then apply the analysis to the practical setups of channel estimation and eigen-spectrum recovery from quantized measurements. Our results illustrate that properly designed hardware-limited systems can approach the optimal performance achievable with vector quantizers, and that by taking the underlying task into account, the quantization error can be made negligible with a relatively small number of bits.
△ Less
Submitted 1 August, 2019; v1 submitted 22 July, 2018;
originally announced July 2018.
-
Multimodal Image Denoising based on Coupled Dictionary Learning
Authors:
Pingfan Song,
Miguel R. D. Rodrigues
Abstract:
In this paper, we propose a new multimodal image denoising approach to attenuate white Gaussian additive noise in a given image modality under the aid of a guidance image modality. The proposed coupled image denoising approach consists of two stages: coupled sparse coding and reconstruction. The first stage performs joint sparse transform for multimodal images with respect to a group of learned co…
▽ More
In this paper, we propose a new multimodal image denoising approach to attenuate white Gaussian additive noise in a given image modality under the aid of a guidance image modality. The proposed coupled image denoising approach consists of two stages: coupled sparse coding and reconstruction. The first stage performs joint sparse transform for multimodal images with respect to a group of learned coupled dictionaries, followed by a shrinkage operation on the sparse representations. Then, in the second stage, the shrunken representations, together with coupled dictionaries, contribute to the reconstruction of the denoised image via an inverse transform. The proposed denoising scheme demonstrates the capability to capture both the common and distinct features of different data modalities. This capability makes our approach more robust to inconsistencies between the guidance and the target images, thereby overcoming drawbacks such as the texture copying artifacts. Experiments on real multimodal images demonstrate that the proposed approach is able to better employ guidance information to bring notable benefits in the image denoising task with respect to the state-of-the-art.
△ Less
Submitted 26 June, 2018;
originally announced June 2018.
-
Coupled Dictionary Learning for Multi-contrast MRI Reconstruction
Authors:
Pingfan Song,
Lior Weizman,
Joao F. C. Mota,
Yonina C. Eldar,
Miguel R. D. Rodrigues
Abstract:
Medical imaging tasks often involve multiple contrasts, such as T1- and T2-weighted magnetic resonance imaging (MRI) data. These contrasts capture information associated with the same underlying anatomy and thus exhibit similarities. In this paper, we propose a Coupled Dictionary Learning based multi-contrast MRI reconstruction (CDLMRI) approach to leverage an available guidance contrast to restor…
▽ More
Medical imaging tasks often involve multiple contrasts, such as T1- and T2-weighted magnetic resonance imaging (MRI) data. These contrasts capture information associated with the same underlying anatomy and thus exhibit similarities. In this paper, we propose a Coupled Dictionary Learning based multi-contrast MRI reconstruction (CDLMRI) approach to leverage an available guidance contrast to restore the target contrast. Our approach consists of three stages: coupled dictionary learning, coupled sparse denoising, and $k$-space consistency enforcing. The first stage learns a group of dictionaries that capture correlations among multiple contrasts. By capitalizing on the learned adaptive dictionaries, the second stage performs joint sparse coding to denoise the corrupted target image with the aid of a guidance contrast. The third stage enforces consistency between the denoised image and the measurements in the $k$-space domain. Numerical experiments on the retrospective under-sampling of clinical MR images demonstrate that incorporating additional guidance contrast via our design improves MRI reconstruction, compared to state-of-the-art approaches.
△ Less
Submitted 26 June, 2018;
originally announced June 2018.
-
Multi-modal Image Processing based on Coupled Dictionary Learning
Authors:
Pingfan Song,
Miguel R. D. Rodrigues
Abstract:
In real-world scenarios, many data processing problems often involve heterogeneous images associated with different imaging modalities. Since these multimodal images originate from the same phenomenon, it is realistic to assume that they share common attributes or characteristics. In this paper, we propose a multi-modal image processing framework based on coupled dictionary learning to capture sim…
▽ More
In real-world scenarios, many data processing problems often involve heterogeneous images associated with different imaging modalities. Since these multimodal images originate from the same phenomenon, it is realistic to assume that they share common attributes or characteristics. In this paper, we propose a multi-modal image processing framework based on coupled dictionary learning to capture similarities and disparities between different image modalities. In particular, our framework can capture favorable structure similarities across different image modalities such as edges, corners, and other elementary primitives in a learned sparse transform domain, instead of the original pixel domain, that can be used to improve a number of image processing tasks such as denoising, inpainting, or super-resolution. Practical experiments demonstrate that incorporating multimodal information using our framework brings notable benefits.
△ Less
Submitted 26 June, 2018;
originally announced June 2018.
-
Multimodal Image Super-resolution via Joint Sparse Representations induced by Coupled Dictionaries
Authors:
Pingfan Song,
Xin Deng,
João F. C. Mota,
Nikos Deligiannis,
Pier Luigi Dragotti,
Miguel R. D. Rodrigues
Abstract:
Real-world data processing problems often involve various image modalities associated with a certain scene, including RGB images, infrared images or multi-spectral images. The fact that different image modalities often share certain attributes, such as certain edges, textures and other structure primitives, represents an opportunity to enhance various image processing tasks. This paper proposes a…
▽ More
Real-world data processing problems often involve various image modalities associated with a certain scene, including RGB images, infrared images or multi-spectral images. The fact that different image modalities often share certain attributes, such as certain edges, textures and other structure primitives, represents an opportunity to enhance various image processing tasks. This paper proposes a new approach to construct a high-resolution (HR) version of a low-resolution (LR) image given another HR image modality as reference, based on joint sparse representations induced by coupled dictionaries. Our approach, which captures the similarities and disparities between different image modalities in a learned sparse feature domain in \emph{lieu} of the original image domain, consists of two phases. The coupled dictionary learning phase is used to learn a set of dictionaries that couple different image modalities in the sparse feature domain given a set of training data. In turn, the coupled super-resolution phase leverages such coupled dictionaries to construct a HR version of the LR target image given another related image modality. One of the merits of our sparsity-driven approach relates to the fact that it overcomes drawbacks such as the texture copying artifacts commonly resulting from inconsistency between the guidance and target images. Experiments on real multimodal images demonstrate that incorporating appropriate guidance information via joint sparse representation induced by coupled dictionary learning brings notable benefits in the super-resolution task with respect to the state-of-the-art. Of particular relevance, the proposed approach also demonstrates better robustness than competing deep-learning-based methods in the presence of noise.
△ Less
Submitted 8 March, 2018; v1 submitted 25 September, 2017;
originally announced September 2017.
-
Learning to Succeed while Teaching to Fail: Privacy in Closed Machine Learning Systems
Authors:
Jure Sokolic,
Qiang Qiu,
Miguel R. D. Rodrigues,
Guillermo Sapiro
Abstract:
Security, privacy, and fairness have become critical in the era of data science and machine learning. More and more we see that achieving universally secure, private, and fair systems is practically impossible. We have seen for example how generative adversarial networks can be used to learn about the expected private training data; how the exploitation of additional data can reveal private inform…
▽ More
Security, privacy, and fairness have become critical in the era of data science and machine learning. More and more we see that achieving universally secure, private, and fair systems is practically impossible. We have seen for example how generative adversarial networks can be used to learn about the expected private training data; how the exploitation of additional data can reveal private information in the original one; and how what looks like unrelated features can teach us about each other. Confronted with this challenge, in this paper we open a new line of research, where the security, privacy, and fairness is learned and used in a closed environment. The goal is to ensure that a given entity (e.g., the company or the government), trusted to infer certain information with our data, is blocked from inferring protected information from it. For example, a hospital might be allowed to produce diagnosis on the patient (the positive task), without being able to infer the gender of the subject (negative task). Similarly, a company can guarantee that internally it is not using the provided data for any undesired task, an important goal that is not contradicting the virtually impossible challenge of blocking everybody from the undesired task. We design a system that learns to succeed on the positive task while simultaneously fail at the negative one, and illustrate this with challenging cases where the positive task is actually harder than the negative one being blocked. Fairness, to the information in the negative task, is often automatically obtained as a result of this proposed approach. The particular framework and examples open the door to security, privacy, and fairness in very important closed scenarios, ranging from private data accumulation companies like social networks to law-enforcement and hospitals.
△ Less
Submitted 23 May, 2017;
originally announced May 2017.
-
Generalization Error of Invariant Classifiers
Authors:
Jure Sokolic,
Raja Giryes,
Guillermo Sapiro,
Miguel R. D. Rodrigues
Abstract:
This paper studies the generalization error of invariant classifiers. In particular, we consider the common scenario where the classification task is invariant to certain transformations of the input, and that the classifier is constructed (or learned) to be invariant to these transformations. Our approach relies on factoring the input space into a product of a base space and a set of transformati…
▽ More
This paper studies the generalization error of invariant classifiers. In particular, we consider the common scenario where the classification task is invariant to certain transformations of the input, and that the classifier is constructed (or learned) to be invariant to these transformations. Our approach relies on factoring the input space into a product of a base space and a set of transformations. We show that whereas the generalization error of a non-invariant classifier is proportional to the complexity of the input space, the generalization error of an invariant classifier is proportional to the complexity of the base space. We also derive a set of sufficient conditions on the geometry of the base space and the set of transformations that ensure that the complexity of the base space is much smaller than the complexity of the input space. Our analysis applies to general classifiers such as convolutional neural networks. We demonstrate the implications of the developed theory for such classifiers with experiments on the MNIST and CIFAR-10 datasets.
△ Less
Submitted 2 July, 2017; v1 submitted 14 October, 2016;
originally announced October 2016.
-
Multi-modal dictionary learning for image separation with application in art investigation
Authors:
Nikos Deligiannis,
Joao F. C. Mota,
Bruno Cornelis,
Miguel R. D. Rodrigues,
Ingrid Daubechies
Abstract:
In support of art investigation, we propose a new source separation method that unmixes a single X-ray scan acquired from double-sided paintings. In this problem, the X-ray signals to be separated have similar morphological characteristics, which brings previous source separation methods to their limits. Our solution is to use photographs taken from the front and back-side of the panel to drive th…
▽ More
In support of art investigation, we propose a new source separation method that unmixes a single X-ray scan acquired from double-sided paintings. In this problem, the X-ray signals to be separated have similar morphological characteristics, which brings previous source separation methods to their limits. Our solution is to use photographs taken from the front and back-side of the panel to drive the separation process. The crux of our approach relies on the coupling of the two imaging modalities (photographs and X-rays) using a novel coupled dictionary learning framework able to capture both common and disparate features across the modalities using parsimonious representations; the common component models features shared by the multi-modal images, whereas the innovation component captures modality-specific information. As such, our model enables the formulation of appropriately regularized convex optimization procedures that lead to the accurate separation of the X-rays. Our dictionary learning framework can be tailored both to a single- and a multi-scale framework, with the latter leading to a significant performance improvement. Moreover, to improve further on the visual quality of the separated images, we propose to train coupled dictionaries that ignore certain parts of the painting corresponding to craquelure. Experimentation on synthetic and real data - taken from digital acquisition of the Ghent Altarpiece (1432) - confirms the superiority of our method against the state-of-the-art morphological component analysis technique that uses either fixed or trained dictionaries to perform image separation.
△ Less
Submitted 14 July, 2016;
originally announced July 2016.
-
Bounds on the Number of Measurements for Reliable Compressive Classification
Authors:
Hugo Reboredo,
Francesco Renna,
Robert Calderbank,
Miguel R. D. Rodrigues
Abstract:
This paper studies the classification of high-dimensional Gaussian signals from low-dimensional noisy, linear measurements. In particular, it provides upper bounds (sufficient conditions) on the number of measurements required to drive the probability of misclassification to zero in the low-noise regime, both for random measurements and designed ones. Such bounds reveal two important operational r…
▽ More
This paper studies the classification of high-dimensional Gaussian signals from low-dimensional noisy, linear measurements. In particular, it provides upper bounds (sufficient conditions) on the number of measurements required to drive the probability of misclassification to zero in the low-noise regime, both for random measurements and designed ones. Such bounds reveal two important operational regimes that are a function of the characteristics of the source: i) when the number of classes is less than or equal to the dimension of the space spanned by signals in each class, reliable classification is possible in the low-noise regime by using a one-vs-all measurement design; ii) when the dimension of the spaces spanned by signals in each class is lower than the number of classes, reliable classification is guaranteed in the low-noise regime by using a simple random measurement design. Simulation results both with synthetic and real data show that our analysis is sharp, in the sense that it is able to gauge the number of measurements required to drive the misclassification probability to zero in the low-noise regime.
△ Less
Submitted 2 August, 2016; v1 submitted 10 July, 2016;
originally announced July 2016.
-
Robust Large Margin Deep Neural Networks
Authors:
Jure Sokolic,
Raja Giryes,
Guillermo Sapiro,
Miguel R. D. Rodrigues
Abstract:
The generalization error of deep neural networks via their classification margin is studied in this work. Our approach is based on the Jacobian matrix of a deep neural network and can be applied to networks with arbitrary non-linearities and pooling layers, and to networks with different architectures such as feed forward networks and residual networks. Our analysis leads to the conclusion that a…
▽ More
The generalization error of deep neural networks via their classification margin is studied in this work. Our approach is based on the Jacobian matrix of a deep neural network and can be applied to networks with arbitrary non-linearities and pooling layers, and to networks with different architectures such as feed forward networks and residual networks. Our analysis leads to the conclusion that a bounded spectral norm of the network's Jacobian matrix in the neighbourhood of the training samples is crucial for a deep neural network of arbitrary depth and width to generalize well. This is a significant improvement over the current bounds in the literature, which imply that the generalization error grows with either the width or the depth of the network. Moreover, it shows that the recently proposed batch normalization and weight normalization re-parametrizations enjoy good generalization properties, and leads to a novel network regularizer based on the network's Jacobian matrix. The analysis is supported with experimental results on the MNIST, CIFAR-10, LaRED and ImageNet datasets.
△ Less
Submitted 23 May, 2017; v1 submitted 26 May, 2016;
originally announced May 2016.
-
X-ray image separation via coupled dictionary learning
Authors:
Nikos Deligiannis,
João F. C. Mota,
Bruno Cornelis,
Miguel R. D. Rodrigues,
Ingrid Daubechies
Abstract:
In support of art investigation, we propose a new source sepa- ration method that unmixes a single X-ray scan acquired from double-sided paintings. Unlike prior source separation meth- ods, which are based on statistical or structural incoherence of the sources, we use visual images taken from the front- and back-side of the panel to drive the separation process. The coupling of the two imaging mo…
▽ More
In support of art investigation, we propose a new source sepa- ration method that unmixes a single X-ray scan acquired from double-sided paintings. Unlike prior source separation meth- ods, which are based on statistical or structural incoherence of the sources, we use visual images taken from the front- and back-side of the panel to drive the separation process. The coupling of the two imaging modalities is achieved via a new multi-scale dictionary learning method. Experimental results demonstrate that our method succeeds in the discrimination of the sources, while state-of-the-art methods fail to do so.
△ Less
Submitted 20 May, 2016;
originally announced May 2016.
-
Mismatch in the Classification of Linear Subspaces: Sufficient Conditions for Reliable Classification
Authors:
Jure Sokolic,
Francesco Renna,
Robert Calderbank,
Miguel R. D. Rodrigues
Abstract:
This paper considers the classification of linear subspaces with mismatched classifiers. In particular, we assume a model where one observes signals in the presence of isotropic Gaussian noise and the distribution of the signals conditioned on a given class is Gaussian with a zero mean and a low-rank covariance matrix. We also assume that the classifier knows only a mismatched version of the param…
▽ More
This paper considers the classification of linear subspaces with mismatched classifiers. In particular, we assume a model where one observes signals in the presence of isotropic Gaussian noise and the distribution of the signals conditioned on a given class is Gaussian with a zero mean and a low-rank covariance matrix. We also assume that the classifier knows only a mismatched version of the parameters of input distribution in lieu of the true parameters. By constructing an asymptotic low-noise expansion of an upper bound to the error probability of such a mismatched classifier, we provide sufficient conditions for reliable classification in the low-noise regime that are able to sharply predict the absence of a classification error floor. Such conditions are a function of the geometry of the true signal distribution, the geometry of the mismatched signal distributions as well as the interplay between such geometries, namely, the principal angles and the overlap between the true and the mismatched signal subspaces. Numerical results demonstrate that our conditions for reliable classification can sharply predict the behavior of a mismatched classifier both with synthetic data and in a motion segmentation and a hand-written digit classification applications.
△ Less
Submitted 18 February, 2016; v1 submitted 7 August, 2015;
originally announced August 2015.
-
Adaptive-Rate Sparse Signal Reconstruction With Application in Compressive Background Subtraction
Authors:
Joao F. C. Mota,
Nikos Deligiannis,
Aswin C. Sankaranarayanan,
Volkan Cevher,
Miguel R. D. Rodrigues
Abstract:
We propose and analyze an online algorithm for reconstructing a sequence of signals from a limited number of linear measurements. The signals are assumed sparse, with unknown support, and evolve over time according to a generic nonlinear dynamical model. Our algorithm, based on recent theoretical results for $\ell_1$-$\ell_1$ minimization, is recursive and computes the number of measurements to be…
▽ More
We propose and analyze an online algorithm for reconstructing a sequence of signals from a limited number of linear measurements. The signals are assumed sparse, with unknown support, and evolve over time according to a generic nonlinear dynamical model. Our algorithm, based on recent theoretical results for $\ell_1$-$\ell_1$ minimization, is recursive and computes the number of measurements to be taken at each time on-the-fly. As an example, we apply the algorithm to compressive video background subtraction, a problem that can be stated as follows: given a set of measurements of a sequence of images with a static background, simultaneously reconstruct each image while separating its foreground from the background. The performance of our method is illustrated on sequences of real images: we observe that it allows a dramatic reduction in the number of measurements with respect to state-of-the-art compressive background subtraction schemes.
△ Less
Submitted 11 March, 2015;
originally announced March 2015.
-
Classification and Reconstruction of High-Dimensional Signals from Low-Dimensional Features in the Presence of Side Information
Authors:
Francesco Renna,
Liming Wang,
Xin Yuan,
Jianbo Yang,
Galen Reeves,
Robert Calderbank,
Lawrence Carin,
Miguel R. D. Rodrigues
Abstract:
This paper offers a characterization of fundamental limits on the classification and reconstruction of high-dimensional signals from low-dimensional features, in the presence of side information. We consider a scenario where a decoder has access both to linear features of the signal of interest and to linear features of the side information signal; while the side information may be in a compressed…
▽ More
This paper offers a characterization of fundamental limits on the classification and reconstruction of high-dimensional signals from low-dimensional features, in the presence of side information. We consider a scenario where a decoder has access both to linear features of the signal of interest and to linear features of the side information signal; while the side information may be in a compressed form, the objective is recovery or classification of the primary signal, not the side information. The signal of interest and the side information are each assumed to have (distinct) latent discrete labels; conditioned on these two labels, the signal of interest and side information are drawn from a multivariate Gaussian distribution. With joint probabilities on the latent labels, the overall signal-(side information) representation is defined by a Gaussian mixture model. We then provide sharp sufficient and/or necessary conditions for these quantities to approach zero when the covariance matrices of the Gaussians are nearly low-rank. These conditions, which are reminiscent of the well-known Slepian-Wolf and Wyner-Ziv conditions, are a function of the number of linear features extracted from the signal of interest, the number of linear features extracted from the side information signal, and the geometry of these signals and their interplay. Moreover, on assuming that the signal of interest and the side information obey such an approximately low-rank model, we derive expansions of the reconstruction error as a function of the deviation from an exactly low-rank model; such expansions also allow identification of operational regimes where the impact of side information on signal reconstruction is most relevant. Our framework, which offers a principled mechanism to integrate side information in high-dimensional data problems, is also tested in the context of imaging applications.
△ Less
Submitted 17 March, 2016; v1 submitted 1 December, 2014;
originally announced December 2014.
-
Compressed Sensing With Side Information: Geometrical Interpretation and Performance Bounds
Authors:
João F. C. Mota,
Nikos Deligiannis,
Miguel R. D. Rodrigues
Abstract:
We address the problem of Compressed Sensing (CS) with side information. Namely, when reconstructing a target CS signal, we assume access to a similar signal. This additional knowledge, the side information, is integrated into CS via L1-L1 and L1-L2 minimization. We then provide lower bounds on the number of measurements that these problems require for successful reconstruction of the target signa…
▽ More
We address the problem of Compressed Sensing (CS) with side information. Namely, when reconstructing a target CS signal, we assume access to a similar signal. This additional knowledge, the side information, is integrated into CS via L1-L1 and L1-L2 minimization. We then provide lower bounds on the number of measurements that these problems require for successful reconstruction of the target signal. If the side information has good quality, the number of measurements is significantly reduced via L1-L1 minimization, but not so much via L1-L2 minimization. We provide geometrical interpretations and experimental results illustrating our findings.
△ Less
Submitted 10 October, 2014;
originally announced October 2014.
-
Compressed Sensing with Prior Information: Optimal Strategies, Geometry, and Bounds
Authors:
Joao F. C. Mota,
Nikos Deligiannis,
Miguel R. D. Rodrigues
Abstract:
We address the problem of compressed sensing (CS) with prior information: reconstruct a target CS signal with the aid of a similar signal that is known beforehand, our prior information. We integrate the additional knowledge of the similar signal into CS via L1-L1 and L1-L2 minimization. We then establish bounds on the number of measurements required by these problems to successfully reconstruct t…
▽ More
We address the problem of compressed sensing (CS) with prior information: reconstruct a target CS signal with the aid of a similar signal that is known beforehand, our prior information. We integrate the additional knowledge of the similar signal into CS via L1-L1 and L1-L2 minimization. We then establish bounds on the number of measurements required by these problems to successfully reconstruct the original signal. Our bounds and geometrical interpretations reveal that if the prior information has good enough quality, L1-L1 minimization improves the performance of CS dramatically. In contrast, L1-L2 minimization has a performance very similar to classical CS and brings no significant benefits. All our findings are illustrated with experimental results.
△ Less
Submitted 22 August, 2014;
originally announced August 2014.
-
Compressive Classification of a Mixture of Gaussians: Analysis, Designs and Geometrical Interpretation
Authors:
Hugo Reboredo,
Francesco Renna,
Robert Calderbank,
Miguel R. D. Rodrigues
Abstract:
This paper derives fundamental limits on the performance of compressive classification when the source is a mixture of Gaussians. It provides an asymptotic analysis of a Bhattacharya based upper bound on the misclassification probability for the optimal Maximum-A-Posteriori (MAP) classifier that depends on quantities that are dual to the concepts of diversity-order and coding gain in multi-antenna…
▽ More
This paper derives fundamental limits on the performance of compressive classification when the source is a mixture of Gaussians. It provides an asymptotic analysis of a Bhattacharya based upper bound on the misclassification probability for the optimal Maximum-A-Posteriori (MAP) classifier that depends on quantities that are dual to the concepts of diversity-order and coding gain in multi-antenna communications. The diversity-order of the measurement system determines the rate at which the probability of misclassification decays with signal-to-noise ratio (SNR) in the low-noise regime. The counterpart of coding gain is the measurement gain which determines the power offset of the probability of misclassification in the low-noise regime. These two quantities make it possible to quantify differences in misclassification probability between random measurement and (diversity-order) optimized measurement. Results are presented for two-class classification problems first with zero-mean Gaussians then with nonzero-mean Gaussians, and finally for multiple-class Gaussian classification problems. The behavior of misclassification probability is revealed to be intimately related to certain fundamental geometric quantities determined by the measurement system, the source and their interplay. Numerical results, representative of compressive classification of a mixture of Gaussians, demonstrate alignment of the actual misclassification probability with the Bhattacharya based upper bound. The connection between the misclassification performance and the alignment between source and measurement geometry may be used to guide the design of dictionaries for compressive classification.
△ Less
Submitted 27 January, 2014;
originally announced January 2014.
-
Towards Energy Neutrality in Energy Harvesting Wireless Sensor Networks: A Case for Distributed Compressive Sensing?
Authors:
Wei Chen,
Yiannis Andreopoulos,
Ian J. Wassell,
Miguel R. D. Rodrigues
Abstract:
This paper advocates the use of the emerging distributed compressive sensing (DCS) paradigm in order to deploy energy harvesting (EH) wireless sensor networks (WSN) with practical network lifetime and data gathering rates that are substantially higher than the state-of-the-art. In particular, we argue that there are two fundamental mechanisms in an EH WSN: i) the energy diversity associated with t…
▽ More
This paper advocates the use of the emerging distributed compressive sensing (DCS) paradigm in order to deploy energy harvesting (EH) wireless sensor networks (WSN) with practical network lifetime and data gathering rates that are substantially higher than the state-of-the-art. In particular, we argue that there are two fundamental mechanisms in an EH WSN: i) the energy diversity associated with the EH process that entails that the harvested energy can vary from sensor node to sensor node, and ii) the sensing diversity associated with the DCS process that entails that the energy consumption can also vary across the sensor nodes without compromising data recovery. We also argue that such mechanisms offer the means to match closely the energy demand to the energy supply in order to unlock the possibility for energy-neutral WSNs that leverage EH capability. A number of analytic and simulation results are presented in order to illustrate the potential of the approach.
△ Less
Submitted 17 October, 2013;
originally announced October 2013.
-
Reconstruction of Signals Drawn from a Gaussian Mixture from Noisy Compressive Measurements
Authors:
Francesco Renna,
Robert Calderbank,
Lawrence Carin,
Miguel R. D. Rodrigues
Abstract:
This paper determines to within a single measurement the minimum number of measurements required to successfully reconstruct a signal drawn from a Gaussian mixture model in the low-noise regime. The method is to develop upper and lower bounds that are a function of the maximum dimension of the linear subspaces spanned by the Gaussian mixture components. The method not only reveals the existence or…
▽ More
This paper determines to within a single measurement the minimum number of measurements required to successfully reconstruct a signal drawn from a Gaussian mixture model in the low-noise regime. The method is to develop upper and lower bounds that are a function of the maximum dimension of the linear subspaces spanned by the Gaussian mixture components. The method not only reveals the existence or absence of a minimum mean-squared error (MMSE) error floor (phase transition) but also provides insight into the MMSE decay via multivariate generalizations of the MMSE dimension and the MMSE power offset, which are a function of the interaction between the geometrical properties of the kernel and the Gaussian mixture. These results apply not only to standard linear random Gaussian measurements but also to linear kernels that minimize the MMSE. It is shown that optimal kernels do not change the number of measurements associated with the MMSE phase transition, rather they affect the sensed power required to achieve a target MMSE in the low-noise regime. Overall, our bounds are tighter and sharper than standard bounds on the minimum number of measurements needed to recover sparse signals associated with a union of subspaces model, as they are not asymptotic in the signal dimension or signal sparsity.
△ Less
Submitted 17 March, 2014; v1 submitted 2 July, 2013;
originally announced July 2013.
-
Filter Design with Secrecy Constraints: The MIMO Gaussian Wiretap Channel
Authors:
Hugo Reboredo,
João Xavier,
Miguel R. D. Rodrigues
Abstract:
This paper considers the problem of filter design with secrecy constraints, where two legitimate parties (Alice and Bob) communicate in the presence of an eavesdropper (Eve), over a Gaussian multiple-input-multiple-output (MIMO) wiretap channel. This problem involves designing, subject to a power constraint, the transmit and the receive filters which minimize the mean-squared error (MSE) between t…
▽ More
This paper considers the problem of filter design with secrecy constraints, where two legitimate parties (Alice and Bob) communicate in the presence of an eavesdropper (Eve), over a Gaussian multiple-input-multiple-output (MIMO) wiretap channel. This problem involves designing, subject to a power constraint, the transmit and the receive filters which minimize the mean-squared error (MSE) between the legitimate parties whilst assuring that the eavesdropper MSE remains above a certain threshold. We consider a general MIMO Gaussian wiretap scenario, where the legitimate receiver uses a linear Zero-Forcing (ZF) filter and the eavesdropper receiver uses either a ZF or an optimal linear Wiener filter. We provide a characterization of the optimal filter designs by demonstrating the convexity of the optimization problems. We also provide generalizations of the filter designs from the scenario where the channel state is known to all the parties to the scenario where there is uncertainty in the channel state. A set of numerical results illustrates the performance of the novel filter designs, including the robustness to channel modeling errors. In particular, we assess the efficacy of the designs in guaranteeing not only a certain MSE level at the eavesdropper, but also in limiting the error probability at the eavesdropper. We also assess the impact of the filter designs on the achievable secrecy rates. The penalty induced by the fact that the eavesdropper may use the optimal non-linear receive filter rather than the optimal linear one is also explored in the paper.
△ Less
Submitted 2 May, 2013;
originally announced May 2013.
-
Compressive Classification
Authors:
Hugo Reboredo,
Francesco Renna,
Robert Calderbank,
Miguel R. D. Rodrigues
Abstract:
This paper derives fundamental limits associated with compressive classification of Gaussian mixture source models. In particular, we offer an asymptotic characterization of the behavior of the (upper bound to the) misclassification probability associated with the optimal Maximum-A-Posteriori (MAP) classifier that depends on quantities that are dual to the concepts of diversity gain and coding gai…
▽ More
This paper derives fundamental limits associated with compressive classification of Gaussian mixture source models. In particular, we offer an asymptotic characterization of the behavior of the (upper bound to the) misclassification probability associated with the optimal Maximum-A-Posteriori (MAP) classifier that depends on quantities that are dual to the concepts of diversity gain and coding gain in multi-antenna communications. The diversity, which is shown to determine the rate at which the probability of misclassification decays in the low noise regime, is shown to depend on the geometry of the source, the geometry of the measurement system and their interplay. The measurement gain, which represents the counterpart of the coding gain, is also shown to depend on geometrical quantities. It is argued that the diversity order and the measurement gain also offer an optimization criterion to perform dictionary learning for compressive classification applications.
△ Less
Submitted 19 February, 2013;
originally announced February 2013.
-
Projection Design For Statistical Compressive Sensing: A Tight Frame Based Approach
Authors:
Wei Chen,
Miguel R. D. Rodrigues,
Ian Wassell
Abstract:
In this paper, we develop a framework to design sensing matrices for compressive sensing applications that lead to good mean squared error (MSE) performance subject to sensing cost constraints. By capitalizing on the MSE of the oracle estimator, whose performance has been shown to act as a benchmark to the performance of standard sparse recovery algorithms, we use the fact that a Parseval tight fr…
▽ More
In this paper, we develop a framework to design sensing matrices for compressive sensing applications that lead to good mean squared error (MSE) performance subject to sensing cost constraints. By capitalizing on the MSE of the oracle estimator, whose performance has been shown to act as a benchmark to the performance of standard sparse recovery algorithms, we use the fact that a Parseval tight frame is the closest design - in the Frobenius norm sense - to the solution of a convex relaxation of the optimization problem that relates to the minimization of the MSE of the oracle estimator with respect to the equivalent sensing matrix, subject to sensing energy constraints. Based on this result, we then propose two sensing matrix designs that exhibit two key properties: i) the designs are closed form rather than iterative; ii) the designs exhibit superior performance in relation to other designs in the literature, which is revealed by our numerical investigation in various scenarios with different sparse recovery algorithms including basis pursuit de-noise (BPDN), the Dantzig selector and orthogonal matching pursuit (OMP).
△ Less
Submitted 4 February, 2013;
originally announced February 2013.
-
Coherent Fading Channels Driven by Arbitrary Inputs: Asymptotic Characterization of the Constrained Capacity and Related Information- and Estimation-Theoretic Quantities
Authors:
Alberto Gil C. P. Ramos,
Miguel R. D. Rodrigues
Abstract:
We consider the characterization of the asymptotic behavior of the average minimum mean-squared error (MMSE) and the average mutual information in scalar and vector fading coherent channels, where the receiver knows the exact fading channel state but the transmitter knows only the fading channel distribution, driven by a range of inputs. We construct low-snr and -- at the heart of the novelty of t…
▽ More
We consider the characterization of the asymptotic behavior of the average minimum mean-squared error (MMSE) and the average mutual information in scalar and vector fading coherent channels, where the receiver knows the exact fading channel state but the transmitter knows only the fading channel distribution, driven by a range of inputs. We construct low-snr and -- at the heart of the novelty of the contribution -- high-snr asymptotic expansions for the average MMSE and the average mutual information for coherent channels subject to Rayleigh fading, Ricean fading or Nakagami fading and driven by discrete inputs (with finite support) or various continuous inputs. We reveal the role that the so-called canonical MMSE in a standard additive white Gaussian noise (AWGN) channel plays in the characterization of the asymptotic behavior of the average MMSE and the average mutual information in a fading coherent channel. We also reveal connections to and generalizations of the MMSE dimension. The most relevant element that enables the construction of these non-trivial expansions is the realization that the integral representation of the estimation- and information- theoretic quantities can be seen as an h-transform of a kernel with a monotonic argument: this enables the use of a novel asymptotic expansion of integrals technique -- the Mellin transform method -- that leads immediately to not only the high-snr but also the low-snr expansions of the average MMSE and -- via the I-MMSE relationship -- to expansions of the average mutual information. We conclude with applications of the results to the characterization and optimization of the constrained capacity of a bank of parallel independent coherent fading channels driven by arbitrary discrete inputs.
△ Less
Submitted 16 October, 2012;
originally announced October 2012.
-
Communications-Inspired Projection Design with Application to Compressive Sensing
Authors:
William R. Carson,
Minhua Chen,
Miguel R. D. Rodrigues,
Robert Calderbank,
Lawrence Carin
Abstract:
We consider the recovery of an underlying signal x \in C^m based on projection measurements of the form y=Mx+w, where y \in C^l and w is measurement noise; we are interested in the case l < m. It is assumed that the signal model p(x) is known, and w CN(w;0,S_w), for known S_W. The objective is to design a projection matrix M \in C^(l x m) to maximize key information-theoretic quantities with opera…
▽ More
We consider the recovery of an underlying signal x \in C^m based on projection measurements of the form y=Mx+w, where y \in C^l and w is measurement noise; we are interested in the case l < m. It is assumed that the signal model p(x) is known, and w CN(w;0,S_w), for known S_W. The objective is to design a projection matrix M \in C^(l x m) to maximize key information-theoretic quantities with operational significance, including the mutual information between the signal and the projections I(x;y) or the Renyi entropy of the projections h_a(y) (Shannon entropy is a special case). By capitalizing on explicit characterizations of the gradients of the information measures with respect to the projections matrix, where we also partially extend the well-known results of Palomar and Verdu from the mutual information to the Renyi entropy domain, we unveil the key operations carried out by the optimal projections designs: mode exposure and mode alignment. Experiments are considered for the case of compressive sensing (CS) applied to imagery. In this context, we provide a demonstration of the performance improvement possible through the application of the novel projection designs in relation to conventional ones, as well as justification for a fast online projections design method with which state-of-the-art adaptive CS signal recovery is achieved.
△ Less
Submitted 9 June, 2012;
originally announced June 2012.
-
Pseudo-random Puncturing: A Technique to Lower the Error Floor of Turbo Codes
Authors:
Ioannis Chatzigeorgiou,
Miguel R. D. Rodrigues,
Ian J. Wassell,
Rolando Carrasco
Abstract:
It has been observed that particular rate-1/2 partially systematic parallel concatenated convolutional codes (PCCCs) can achieve a lower error floor than that of their rate-1/3 parent codes. Nevertheless, good puncturing patterns can only be identified by means of an exhaustive search, whilst convergence towards low bit error probabilities can be problematic when the systematic output of a rate-…
▽ More
It has been observed that particular rate-1/2 partially systematic parallel concatenated convolutional codes (PCCCs) can achieve a lower error floor than that of their rate-1/3 parent codes. Nevertheless, good puncturing patterns can only be identified by means of an exhaustive search, whilst convergence towards low bit error probabilities can be problematic when the systematic output of a rate-1/2 partially systematic PCCC is heavily punctured. In this paper, we present and study a family of rate-1/2 partially systematic PCCCs, which we call pseudo-randomly punctured codes. We evaluate their bit error rate performance and we show that they always yield a lower error floor than that of their rate-1/3 parent codes. Furthermore, we compare analytic results to simulations and we demonstrate that their performance converges towards the error floor region, owning to the moderate puncturing of their systematic output. Consequently, we propose pseudo-random puncturing as a means of improving the bandwidth efficiency of a PCCC and simultaneously lowering its error floor.
△ Less
Submitted 3 April, 2007;
originally announced April 2007.
-
A Union Bound Approximation for Rapid Performance Evaluation of Punctured Turbo Codes
Authors:
Ioannis Chatzigeorgiou,
Miguel R. D. Rodrigues,
Ian J. Wassell,
Rolando Carrasco
Abstract:
In this paper, we present a simple technique to approximate the performance union bound of a punctured turbo code. The bound approximation exploits only those terms of the transfer function that have a major impact on the overall performance. We revisit the structure of the constituent convolutional encoder and we develop a rapid method to calculate the most significant terms of the transfer fun…
▽ More
In this paper, we present a simple technique to approximate the performance union bound of a punctured turbo code. The bound approximation exploits only those terms of the transfer function that have a major impact on the overall performance. We revisit the structure of the constituent convolutional encoder and we develop a rapid method to calculate the most significant terms of the transfer function of a turbo encoder. We demonstrate that, for a large interleaver size, this approximation is very accurate. Furthermore, we apply our proposed method to a family of punctured turbo codes, which we call pseudo-randomly punctured codes. We conclude by emphasizing the benefits of our approach compared to those employed previously. We also highlight the advantages of pseudo-random puncturing over other puncturing schemes.
△ Less
Submitted 19 February, 2007;
originally announced February 2007.
-
Can Punctured Rate-1/2 Turbo Codes Achieve a Lower Error Floor than their Rate-1/3 Parent Codes?
Authors:
I. Chatzigeorgiou,
M. R. D. Rodrigues,
I. J. Wassell,
R. Carrasco
Abstract:
In this paper we concentrate on rate-1/3 systematic parallel concatenated convolutional codes and their rate-1/2 punctured child codes. Assuming maximum-likelihood decoding over an additive white Gaussian channel, we demonstrate that a rate-1/2 non-systematic child code can exhibit a lower error floor than that of its rate-1/3 parent code, if a particular condition is met. However, assuming iter…
▽ More
In this paper we concentrate on rate-1/3 systematic parallel concatenated convolutional codes and their rate-1/2 punctured child codes. Assuming maximum-likelihood decoding over an additive white Gaussian channel, we demonstrate that a rate-1/2 non-systematic child code can exhibit a lower error floor than that of its rate-1/3 parent code, if a particular condition is met. However, assuming iterative decoding, convergence of the non-systematic code towards low bit-error rates is problematic. To alleviate this problem, we propose rate-1/2 partially-systematic codes that can still achieve a lower error floor than that of their rate-1/3 parent codes. Results obtained from extrinsic information transfer charts and simulations support our conclusion.
△ Less
Submitted 9 January, 2007;
originally announced January 2007.
-
Wireless Information-Theoretic Security - Part II: Practical Implementation
Authors:
Matthieu Bloch,
Joao Barros,
Miguel R. D. Rodrigues,
Steven W. McLaughlin
Abstract:
In Part I of this two-part paper on confidential communication over wireless channels, we studied the fundamental security limits of quasi-static fading channels from the point of view of outage secrecy capacity with perfect and imperfect channel state information. In Part II, we develop a practical secret key agreement protocol for Gaussian and quasi-static fading wiretap channels. The protocol…
▽ More
In Part I of this two-part paper on confidential communication over wireless channels, we studied the fundamental security limits of quasi-static fading channels from the point of view of outage secrecy capacity with perfect and imperfect channel state information. In Part II, we develop a practical secret key agreement protocol for Gaussian and quasi-static fading wiretap channels. The protocol uses a four-step procedure to secure communications: establish common randomness via an opportunistic transmission, perform message reconciliation, establish a common key via privacy amplification, and use of the key. We introduce a new reconciliation procedure that uses multilevel coding and optimized low density parity check codes which in some cases comes close to achieving the secrecy capacity limits established in Part I. Finally, we develop new metrics for assessing average secure key generation rates and show that our protocol is effective in secure key renewal.
△ Less
Submitted 23 November, 2006;
originally announced November 2006.
-
Wireless Information-Theoretic Security - Part I: Theoretical Aspects
Authors:
Matthieu Bloch,
Joao Barros,
Miguel R. D. Rodrigues,
Steven W. McLaughlin
Abstract:
In this two-part paper, we consider the transmission of confidential data over wireless wiretap channels. The first part presents an information-theoretic problem formulation in which two legitimate partners communicate over a quasi-static fading channel and an eavesdropper observes their transmissions through another independent quasi-static fading channel. We define the secrecy capacity in ter…
▽ More
In this two-part paper, we consider the transmission of confidential data over wireless wiretap channels. The first part presents an information-theoretic problem formulation in which two legitimate partners communicate over a quasi-static fading channel and an eavesdropper observes their transmissions through another independent quasi-static fading channel. We define the secrecy capacity in terms of outage probability and provide a complete characterization of the maximum transmission rate at which the eavesdropper is unable to decode any information. In sharp contrast with known results for Gaussian wiretap channels (without feedback), our contribution shows that in the presence of fading information-theoretic security is achievable even when the eavesdropper has a better average signal-to-noise ratio (SNR) than the legitimate receiver - fading thus turns out to be a friend and not a foe. The issue of imperfect channel state information is also addressed. Practical schemes for wireless information-theoretic security are presented in Part II, which in some cases comes close to the secrecy capacity limits given in this paper.
△ Less
Submitted 22 November, 2006;
originally announced November 2006.
-
On the Performance of Turbo Codes in Quasi-Static Fading Channels
Authors:
M. R. D. Rodrigues,
I. Chatzigeorgiou,
I. J. Wassell,
R. Carrasco
Abstract:
In this paper, we investigate in detail the performance of turbo codes in quasi-static fading channels both with and without antenna diversity. First, we develop a simple and accurate analytic technique to evaluate the performance of turbo codes in quasi-static fading channels. The proposed analytic technique relates the frame error rate of a turbo code to the iterative decoder convergence thres…
▽ More
In this paper, we investigate in detail the performance of turbo codes in quasi-static fading channels both with and without antenna diversity. First, we develop a simple and accurate analytic technique to evaluate the performance of turbo codes in quasi-static fading channels. The proposed analytic technique relates the frame error rate of a turbo code to the iterative decoder convergence threshold, rather than to the turbo code distance spectrum. Subsequently, we compare the performance of various turbo codes in quasi-static fading channels. We show that, in contrast to the situation in the AWGN channel, turbo codes with different interleaver sizes or turbo codes based on RSC codes with different constraint lengths and generator polynomials exhibit identical performance. Moreover, we also compare the performance of turbo codes and convolutional codes in quasi-static fading channels under the condition of identical decoding complexity. In particular, we show that turbo codes do not outperform convolutional codes in quasi-static fading channels with no antenna diversity; and that turbo codes only outperform convolutional codes in quasi-static fading channels with antenna diversity.
△ Less
Submitted 11 August, 2005;
originally announced August 2005.