subscribe to arXiv mailings

arXiv:2310.05146 [pdf, other]

Large Language Model (LLM) as a System of Multiple Expert Agents: An Approach to solve the Abstraction and Reasoning Corpus (ARC) Challenge

Authors: John Chong Min Tan, Mehul Motani

Abstract: We attempt to solve the Abstraction and Reasoning Corpus (ARC) Challenge using Large Language Models (LLMs) as a system of multiple expert agents. Using the flexibility of LLMs to be prompted to do various novel tasks using zero-shot, few-shot, context-grounded prompting, we explore the feasibility of using LLMs to solve the ARC Challenge. We firstly convert the input image into multiple suitable… ▽ More We attempt to solve the Abstraction and Reasoning Corpus (ARC) Challenge using Large Language Models (LLMs) as a system of multiple expert agents. Using the flexibility of LLMs to be prompted to do various novel tasks using zero-shot, few-shot, context-grounded prompting, we explore the feasibility of using LLMs to solve the ARC Challenge. We firstly convert the input image into multiple suitable text-based abstraction spaces. We then utilise the associative power of LLMs to derive the input-output relationship and map this to actions in the form of a working program, similar to Voyager / Ghost in the MineCraft. In addition, we use iterative environmental feedback in order to guide LLMs to solve the task. Our proposed approach achieves 50 solves out of 111 training set problems (45%) with just three abstraction spaces - grid, object and pixel - and we believe that with more abstraction spaces and learnable actions, we will be able to solve more. △ Less

Submitted 8 October, 2023; originally announced October 2023.

Comments: 6 main pages, 1 page references, 18 pages appendix

arXiv:2304.02223 [pdf, other]

Local Intrinsic Dimensional Entropy

Authors: Rohan Ghosh, Mehul Motani

Abstract: Most entropy measures depend on the spread of the probability distribution over the sample space $\mathcal{X}$, and the maximum entropy achievable scales proportionately with the sample space cardinality $|\mathcal{X}|$. For a finite $|\mathcal{X}|$, this yields robust entropy measures which satisfy many important properties, such as invariance to bijections, while the same is not true for continu… ▽ More Most entropy measures depend on the spread of the probability distribution over the sample space $\mathcal{X}$, and the maximum entropy achievable scales proportionately with the sample space cardinality $|\mathcal{X}|$. For a finite $|\mathcal{X}|$, this yields robust entropy measures which satisfy many important properties, such as invariance to bijections, while the same is not true for continuous spaces (where $|\mathcal{X}|=\infty$). Furthermore, since $\mathbb{R}$ and $\mathbb{R}^d$ ($d\in \mathbb{Z}^+$) have the same cardinality (from Cantor's correspondence argument), cardinality-dependent entropy measures cannot encode the data dimensionality. In this work, we question the role of cardinality and distribution spread in defining entropy measures for continuous spaces, which can undergo multiple rounds of transformations and distortions, e.g., in neural networks. We find that the average value of the local intrinsic dimension of a distribution, denoted as ID-Entropy, can serve as a robust entropy measure for continuous spaces, while capturing the data dimensionality. We find that ID-Entropy satisfies many desirable properties and can be extended to conditional entropy, joint entropy and mutual-information variants. ID-Entropy also yields new information bottleneck principles and also links to causality. In the context of deep learning, for feedforward architectures, we show, theoretically and empirically, that the ID-Entropy of a hidden layer directly controls the generalization gap for both classifiers and auto-encoders, when the target function is Lipschitz continuous. Our work primarily shows that, for continuous spaces, taking a structural rather than a statistical approach yields entropy measures which preserve intrinsic data dimensionality, while being relevant for studying various architectures. △ Less

Submitted 24 May, 2023; v1 submitted 5 April, 2023; originally announced April 2023.

Comments: Proceedings of the AAAI Conference on Artificial Intelligence 2023

arXiv:2301.13758 [pdf, other]

Learning, Fast and Slow: A Goal-Directed Memory-Based Approach for Dynamic Environments

Authors: John Chong Min Tan, Mehul Motani

Abstract: Model-based next state prediction and state value prediction are slow to converge. To address these challenges, we do the following: i) Instead of a neural network, we do model-based planning using a parallel memory retrieval system (which we term the slow mechanism); ii) Instead of learning state values, we guide the agent's actions using goal-directed exploration, by using a neural network to ch… ▽ More Model-based next state prediction and state value prediction are slow to converge. To address these challenges, we do the following: i) Instead of a neural network, we do model-based planning using a parallel memory retrieval system (which we term the slow mechanism); ii) Instead of learning state values, we guide the agent's actions using goal-directed exploration, by using a neural network to choose the next action given the current state and the goal state (which we term the fast mechanism). The goal-directed exploration is trained online using hippocampal replay of visited states and future imagined states every single time step, leading to fast and efficient training. Empirical studies show that our proposed method has a 92% solve rate across 100 episodes in a dynamically changing grid world, significantly outperforming state-of-the-art actor critic mechanisms such as PPO (54%), TRPO (50%) and A2C (24%). Ablation studies demonstrate that both mechanisms are crucial. We posit that the future of Reinforcement Learning (RL) will be to model goals and sub-goals for various tasks, and plan it out in a goal-directed memory-based approach. △ Less

Submitted 1 February, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

Comments: 22 pages

arXiv:2301.07871 [pdf, other]

doi 10.1561/0100000134

On Finite Blocklength Lossy Source Coding

Authors: Lin Zhou, Mehul Motani

Abstract: In this monograph, we review recent advances in second-order asymptotics for lossy source coding, which provides approximations to the finite blocklength performance of optimal codes. The monograph is divided into three parts. In part I, we motivate the monograph, present basic definitions, introduce mathematical tools and illustrate the motivation of non-asymptotic and second-order asymptotics vi… ▽ More In this monograph, we review recent advances in second-order asymptotics for lossy source coding, which provides approximations to the finite blocklength performance of optimal codes. The monograph is divided into three parts. In part I, we motivate the monograph, present basic definitions, introduce mathematical tools and illustrate the motivation of non-asymptotic and second-order asymptotics via the example of lossless source coding. In part II, we first present existing results for the rate-distortion problem with proof sketches. Subsequently, we present five generations of the rate-distortion problem to tackle various aspects of practical quantization tasks: noisy source, noisy channel, mismatched code, Gauss-Markov source and fixed-to-variable length compression. By presenting theoretical bounds for these settings, we illustrate the effect of noisy observation of the source, the influence of noisy transmission of the compressed information, the effect of using a fixed coding scheme for an arbitrary source and the roles of source memory and variable rate. In part III, we present four multiterminal generalizations of the rate-distortion problem to consider multiple encoders, decoders or source sequences: the Kaspi problem, the successive refinement problem, the Fu-Yeung problem and the Gray-Wyner problem. By presenting theoretical bounds for these multiterminal problems, we illustrate the role of side information, the optimality of stop and transmit, the effect of simultaneous lossless and lossy compression, and the tradeoff between encoders' rates in compressing correlated sources. Finally, we conclude the monograph, mention related results and discuss future directions. △ Less

Submitted 18 January, 2023; originally announced January 2023.

Comments: Submitted to FnT CIT. Comments are welcome

Journal ref: Foundations and Trends in Communications and Information Theory, 2023

arXiv:2212.06145 [pdf, other]

AP: Selective Activation for De-sparsifying Pruned Neural Networks

Authors: Shiyu Liu, Rohan Ghosh, Dylan Tan, Mehul Motani

Abstract: The rectified linear unit (ReLU) is a highly successful activation function in neural networks as it allows networks to easily obtain sparse representations, which reduces overfitting in overparameterized networks. However, in network pruning, we find that the sparsity introduced by ReLU, which we quantify by a term called dynamic dead neuron rate (DNR), is not beneficial for the pruned network. I… ▽ More The rectified linear unit (ReLU) is a highly successful activation function in neural networks as it allows networks to easily obtain sparse representations, which reduces overfitting in overparameterized networks. However, in network pruning, we find that the sparsity introduced by ReLU, which we quantify by a term called dynamic dead neuron rate (DNR), is not beneficial for the pruned network. Interestingly, the more the network is pruned, the smaller the dynamic DNR becomes during optimization. This motivates us to propose a method to explicitly reduce the dynamic DNR for the pruned network, i.e., de-sparsify the network. We refer to our method as Activating-while-Pruning (AP). We note that AP does not function as a stand-alone method, as it does not evaluate the importance of weights. Instead, it works in tandem with existing pruning methods and aims to improve their performance by selective activation of nodes to reduce the dynamic DNR. We conduct extensive experiments using popular networks (e.g., ResNet, VGG) via two classical and three state-of-the-art pruning methods. The experimental results on public datasets (e.g., CIFAR-10/100) suggest that AP works well with existing pruning methods and improves the performance by 3% - 4%. For larger scale datasets (e.g., ImageNet) and state-of-the-art networks (e.g., vision transformer), we observe an improvement of 2% - 3% with AP as opposed to without. Lastly, we conduct an ablation study to examine the effectiveness of the components comprising AP. △ Less

Submitted 9 December, 2022; originally announced December 2022.

Comments: 16 Pages

arXiv:2212.06144 [pdf, other]

Optimizing Learning Rate Schedules for Iterative Pruning of Deep Neural Networks

Authors: Shiyu Liu, Rohan Ghosh, John Tan Chong Min, Mehul Motani

Abstract: The importance of learning rate (LR) schedules on network pruning has been observed in a few recent works. As an example, Frankle and Carbin (2019) highlighted that winning tickets (i.e., accuracy preserving subnetworks) can not be found without applying a LR warmup schedule and Renda, Frankle and Carbin (2020) demonstrated that rewinding the LR to its initial state at the end of each pruning cycl… ▽ More The importance of learning rate (LR) schedules on network pruning has been observed in a few recent works. As an example, Frankle and Carbin (2019) highlighted that winning tickets (i.e., accuracy preserving subnetworks) can not be found without applying a LR warmup schedule and Renda, Frankle and Carbin (2020) demonstrated that rewinding the LR to its initial state at the end of each pruning cycle improves performance. In this paper, we go one step further by first providing a theoretical justification for the surprising effect of LR schedules. Next, we propose a LR schedule for network pruning called SILO, which stands for S-shaped Improved Learning rate Optimization. The advantages of SILO over existing state-of-the-art (SOTA) LR schedules are two-fold: (i) SILO has a strong theoretical motivation and dynamically adjusts the LR during pruning to improve generalization. Specifically, SILO increases the LR upper bound (max_lr) in an S-shape. This leads to an improvement of 2% - 4% in extensive experiments with various types of networks (e.g., Vision Transformers, ResNet) on popular datasets such as ImageNet, CIFAR-10/100. (ii) In addition to the strong theoretical motivation, SILO is empirically optimal in the sense of matching an Oracle, which exhaustively searches for the optimal value of max_lr via grid search. We find that SILO is able to precisely adjust the value of max_lr to be within the Oracle optimized interval, resulting in performance competitive with the Oracle with significantly lower complexity. △ Less

Submitted 30 December, 2022; v1 submitted 9 December, 2022; originally announced December 2022.

Comments: 23 Pages. arXiv admin note: text overlap with arXiv:2110.08764

arXiv:2212.06143 [pdf, other]

Improving Mutual Information based Feature Selection by Boosting Unique Relevance

Authors: Shiyu Liu, Mehul Motani

Abstract: Mutual Information (MI) based feature selection makes use of MI to evaluate each feature and eventually shortlists a relevant feature subset, in order to address issues associated with high-dimensional datasets. Despite the effectiveness of MI in feature selection, we notice that many state-of-the-art algorithms disregard the so-called unique relevance (UR) of features, and arrive at a suboptimal… ▽ More Mutual Information (MI) based feature selection makes use of MI to evaluate each feature and eventually shortlists a relevant feature subset, in order to address issues associated with high-dimensional datasets. Despite the effectiveness of MI in feature selection, we notice that many state-of-the-art algorithms disregard the so-called unique relevance (UR) of features, and arrive at a suboptimal selected feature subset which contains a non-negligible number of redundant features. We point out that the heart of the problem is that all these MIBFS algorithms follow the criterion of Maximize Relevance with Minimum Redundancy (MRwMR), which does not explicitly target UR. This motivates us to augment the existing criterion with the objective of boosting unique relevance (BUR), leading to a new criterion called MRwMR-BUR. Depending on the task being addressed, MRwMR-BUR has two variants, termed MRwMR-BUR-KSG and MRwMR-BUR-CLF, which estimate UR differently. MRwMR-BUR-KSG estimates UR via a nearest-neighbor based approach called the KSG estimator and is designed for three major tasks: (i) Classification Performance. (ii) Feature Interpretability. (iii) Classifier Generalization. MRwMR-BUR-CLF estimates UR via a classifier based approach. It adapts UR to different classifiers, further improving the competitiveness of MRwMR-BUR for classification performance oriented tasks. The performance of both MRwMR-BUR-KSG and MRwMR-BUR-CLF is validated via experiments using six public datasets and three popular classifiers. Specifically, as compared to MRwMR, the proposed MRwMR-BUR-KSG improves the test accuracy by 2% - 3% with 25% - 30% fewer features being selected, without increasing the algorithm complexity. MRwMR-BUR-CLF further improves the classification performance by 3.8%- 5.5% (relative to MRwMR), and it also outperforms three popular classifier dependent feature selection methods. △ Less

Submitted 17 December, 2022; v1 submitted 9 December, 2022; originally announced December 2022.

Comments: 13 pages

arXiv:2212.06142 [pdf, other]

Towards Better Long-range Time Series Forecasting using Generative Forecasting

Authors: Shiyu Liu, Rohan Ghosh, Mehul Motani

Abstract: Long-range time series forecasting is usually based on one of two existing forecasting strategies: Direct Forecasting and Iterative Forecasting, where the former provides low bias, high variance forecasts and the latter leads to low variance, high bias forecasts. In this paper, we propose a new forecasting strategy called Generative Forecasting (GenF), which generates synthetic data for the next f… ▽ More Long-range time series forecasting is usually based on one of two existing forecasting strategies: Direct Forecasting and Iterative Forecasting, where the former provides low bias, high variance forecasts and the latter leads to low variance, high bias forecasts. In this paper, we propose a new forecasting strategy called Generative Forecasting (GenF), which generates synthetic data for the next few time steps and then makes long-range forecasts based on generated and observed data. We theoretically prove that GenF is able to better balance the forecasting variance and bias, leading to a much smaller forecasting error. We implement GenF via three components: (i) a novel conditional Wasserstein Generative Adversarial Network (GAN) based generator for synthetic time series data generation, called CWGAN-TS. (ii) a transformer based predictor, which makes long-range predictions using both generated and observed data. (iii) an information theoretic clustering algorithm to improve the training of both the CWGAN-TS and the transformer based predictor. The experimental results on five public datasets demonstrate that GenF significantly outperforms a diverse range of state-of-the-art benchmarks and classical approaches. Specifically, we find a 5% - 11% improvement in predictive performance (mean absolute error) while having a 15% - 50% reduction in parameters compared to the benchmarks. Lastly, we conduct an ablation study to further explore and demonstrate the effectiveness of the components comprising GenF. △ Less

Submitted 9 December, 2022; originally announced December 2022.

Comments: 14 pages. arXiv admin note: substantial text overlap with arXiv:2110.08770

arXiv:2207.06646 [pdf, other]

doi 10.5555/3524938.3525805

DropNet: Reducing Neural Network Complexity via Iterative Pruning

Authors: John Tan Chong Min, Mehul Motani

Abstract: Modern deep neural networks require a significant amount of computing time and power to train and deploy, which limits their usage on edge devices. Inspired by the iterative weight pruning in the Lottery Ticket Hypothesis, we propose DropNet, an iterative pruning method which prunes nodes/filters to reduce network complexity. DropNet iteratively removes nodes/filters with the lowest average post-a… ▽ More Modern deep neural networks require a significant amount of computing time and power to train and deploy, which limits their usage on edge devices. Inspired by the iterative weight pruning in the Lottery Ticket Hypothesis, we propose DropNet, an iterative pruning method which prunes nodes/filters to reduce network complexity. DropNet iteratively removes nodes/filters with the lowest average post-activation value across all training samples. Empirically, we show that DropNet is robust across diverse scenarios, including MLPs and CNNs using the MNIST, CIFAR-10 and Tiny ImageNet datasets. We show that up to 90% of the nodes/filters can be removed without any significant loss of accuracy. The final pruned network performs well even with reinitialization of the weights and biases. DropNet also has similar accuracy to an oracle which greedily removes nodes/filters one at a time to minimise training loss, highlighting its effectiveness. △ Less

Submitted 13 July, 2022; originally announced July 2022.

Comments: Published at ICML 2020. Code can be found at https://github.com/tanchongmin/DropNet

Journal ref: Proceedings of the 37th International Conference on Machine Learning, PMLR 119:9356-9366, 2020 https://proceedings.mlr.press/v119/tan20a.html

arXiv:2207.05991 [pdf, other]

Brick Tic-Tac-Toe: Exploring the Generalizability of AlphaZero to Novel Test Environments

Authors: John Tan Chong Min, Mehul Motani

Abstract: Traditional reinforcement learning (RL) environments typically are the same for both the training and testing phases. Hence, current RL methods are largely not generalizable to a test environment which is conceptually similar but different from what the method has been trained on, which we term the novel test environment. As an effort to push RL research towards algorithms which can generalize to… ▽ More Traditional reinforcement learning (RL) environments typically are the same for both the training and testing phases. Hence, current RL methods are largely not generalizable to a test environment which is conceptually similar but different from what the method has been trained on, which we term the novel test environment. As an effort to push RL research towards algorithms which can generalize to novel test environments, we introduce the Brick Tic-Tac-Toe (BTTT) test bed, where the brick position in the test environment is different from that in the training environment. Using a round-robin tournament on the BTTT environment, we show that traditional RL state-search approaches such as Monte Carlo Tree Search (MCTS) and Minimax are more generalizable to novel test environments than AlphaZero is. This is surprising because AlphaZero has been shown to achieve superhuman performance in environments such as Go, Chess and Shogi, which may lead one to think that it performs well in novel test environments. Our results show that BTTT, though simple, is rich enough to explore the generalizability of AlphaZero. We find that merely increasing MCTS lookahead iterations was insufficient for AlphaZero to generalize to some novel test environments. Rather, increasing the variety of training environments helps to progressively improve generalizability across all possible starting brick configurations. △ Less

Submitted 13 July, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

Comments: AlphaZero, Generalization

arXiv:2112.06044 [pdf, other]

Achieving Low Complexity Neural Decoders via Iterative Pruning

Authors: Vikrant Malik, Rohan Ghosh, Mehul Motani

Abstract: The advancement of deep learning has led to the development of neural decoders for low latency communications. However, neural decoders can be very complex which can lead to increased computation and latency. We consider iterative pruning approaches (such as the lottery ticket hypothesis algorithm) to prune weights in neural decoders. Decoders with fewer number of weights can have lower latency an… ▽ More The advancement of deep learning has led to the development of neural decoders for low latency communications. However, neural decoders can be very complex which can lead to increased computation and latency. We consider iterative pruning approaches (such as the lottery ticket hypothesis algorithm) to prune weights in neural decoders. Decoders with fewer number of weights can have lower latency and lower complexity while retaining the accuracy of the original model. This will make neural decoders more suitable for mobile and other edge devices with limited computational power. We also propose semi-soft decision decoding for neural decoders which can be used to improve the bit error rate performance of the pruned network. △ Less

Submitted 16 November, 2022; v1 submitted 11 December, 2021; originally announced December 2021.

Comments: Machine Learning For Systems Workshop at NeurIPS 2021

arXiv:2110.08770 [pdf, ps, other]

Towards Better Long-range Time Series Forecasting using Generative Adversarial Networks

Authors: Shiyu Liu, Rohan Ghosh, Mehul Motani

Abstract: Long-range time series forecasting is usually based on one of two existing forecasting strategies: Direct Forecasting and Iterative Forecasting, where the former provides low bias, high variance forecasts and the later leads to low variance, high bias forecasts. In this paper, we propose a new forecasting strategy called Generative Forecasting (GenF), which generates synthetic data for the next fe… ▽ More Long-range time series forecasting is usually based on one of two existing forecasting strategies: Direct Forecasting and Iterative Forecasting, where the former provides low bias, high variance forecasts and the later leads to low variance, high bias forecasts. In this paper, we propose a new forecasting strategy called Generative Forecasting (GenF), which generates synthetic data for the next few time steps and then makes long-range forecasts based on generated and observed data. We theoretically prove that GenF is able to better balance the forecasting variance and bias, leading to a much smaller forecasting error. We implement GenF via three components: (i) a novel conditional Wasserstein Generative Adversarial Network (GAN) based generator for synthetic time series data generation, called CWGAN-TS. (ii) a transformer based predictor, which makes long-range predictions using both generated and observed data. (iii) an information theoretic clustering algorithm to improve the training of both the CWGAN-TS and the transformer based predictor. The experimental results on five public datasets demonstrate that GenF significantly outperforms a diverse range of state-of-the-art benchmarks and classical approaches. Specifically, we find a 5% - 11% improvement in predictive performance (mean absolute error) while having a 15% - 50% reduction in parameters compared to the benchmarks. Lastly, we conduct an ablation study to demonstrate the effectiveness of the components comprising GenF. △ Less

Submitted 4 August, 2022; v1 submitted 17 October, 2021; originally announced October 2021.

Comments: 20 pages paper including 9 pages appendix

arXiv:2110.08764 [pdf, other]

S-Cyc: A Learning Rate Schedule for Iterative Pruning of ReLU-based Networks

Authors: Shiyu Liu, Chong Min John Tan, Mehul Motani

Abstract: We explore a new perspective on adapting the learning rate (LR) schedule to improve the performance of the ReLU-based network as it is iteratively pruned. Our work and contribution consist of four parts: (i) We find that, as the ReLU-based network is iteratively pruned, the distribution of weight gradients tends to become narrower. This leads to the finding that as the network becomes more sparse,… ▽ More We explore a new perspective on adapting the learning rate (LR) schedule to improve the performance of the ReLU-based network as it is iteratively pruned. Our work and contribution consist of four parts: (i) We find that, as the ReLU-based network is iteratively pruned, the distribution of weight gradients tends to become narrower. This leads to the finding that as the network becomes more sparse, a larger value of LR should be used to train the pruned network. (ii) Motivated by this finding, we propose a novel LR schedule, called S-Cyclical (S-Cyc) which adapts the conventional cyclical LR schedule by gradually increasing the LR upper bound (max_lr) in an S-shape as the network is iteratively pruned.We highlight that S-Cyc is a method agnostic LR schedule that applies to many iterative pruning methods. (iii) We evaluate the performance of the proposed S-Cyc and compare it to four LR schedule benchmarks. Our experimental results on three state-of-the-art networks (e.g., VGG-19, ResNet-20, ResNet-50) and two popular datasets (e.g., CIFAR-10, ImageNet-200) demonstrate that S-Cyc consistently outperforms the best performing benchmark with an improvement of 2.1% - 3.4%, without substantial increase in complexity. (iv) We evaluate S-Cyc against an oracle and show that S-Cyc achieves comparable performance to the oracle, which carefully tunes max_lr via grid search. △ Less

Submitted 17 October, 2021; originally announced October 2021.

Comments: 7 pages main paper with 5 pages appendix

arXiv:1911.07499 [pdf, ps, other]

Throughput Maximization with an Average Age of Information Constraint in Fading Channels

Authors: Rajshekhar Vishweshwar Bhat, Rahul Vaze, Mehul Motani

Abstract: In the emerging fifth generation (5G) technology, communication nodes are expected to support two crucial classes of information traffic, namely, the enhanced mobile broadband (eMBB) traffic with high data rate requirements, and ultra-reliable low-latency communications (URLLC) traffic with strict requirements on latency and reliability. The URLLC traffic, which is usually analyzed by a metric cal… ▽ More In the emerging fifth generation (5G) technology, communication nodes are expected to support two crucial classes of information traffic, namely, the enhanced mobile broadband (eMBB) traffic with high data rate requirements, and ultra-reliable low-latency communications (URLLC) traffic with strict requirements on latency and reliability. The URLLC traffic, which is usually analyzed by a metric called the age of information (AoI), is assigned the first priority over the resources at a node. Motivated by this, we consider long-term average throughput maximization problems subject to average AoI and power constraints in a single user fading channel, when (i) perfect and (ii) no channel state information at the transmitter (CSIT) is available. We propose simple age-independent stationary randomized policies (AI-SRP), which allocate powers at the transmitter based only on the channel state and/or distribution information, without any knowledge of the AoI. We show that the optimal throughputs achieved by the AI-SRPs for scenarios (i) and (ii) are at least equal to the half of the respective optimal long-term average throughputs, independent of all the parameters of the problem, and that they are within additive gaps, expressed in terms of the optimal dual variable corresponding to their average AoI constraints, from the respective optimal long-term average throughputs. △ Less

Submitted 18 November, 2019; originally announced November 2019.

arXiv:1911.06621 [pdf, ps, other]

Long-range Prediction of Vital Signs Using Generative Boosting via LSTM Networks

Authors: Shiyu Liu, Mehul Motani

Abstract: Vital signs including heart rate, respiratory rate, body temperature and blood pressure, are critical in the clinical decision making process. Effective early prediction of vital signs help to alert medical practitioner ahead of time and may prevent adverse health outcomes. In this paper, we suggest a new approach called generative boosting, in order to effectively perform early prediction of vita… ▽ More Vital signs including heart rate, respiratory rate, body temperature and blood pressure, are critical in the clinical decision making process. Effective early prediction of vital signs help to alert medical practitioner ahead of time and may prevent adverse health outcomes. In this paper, we suggest a new approach called generative boosting, in order to effectively perform early prediction of vital signs. Generative boosting consists of a generative model, to generate synthetic data for next few time steps, and several predictive models, to directly make long-range predictions based on observed and generated data. We explore generative boosting via long short-term memory (LSTM) for both the predictive and generative models, leading to a scheme called generative LSTM (GLSTM). Our experiments indicate that GLSTM outperforms a diverse range of strong benchmark models, with and without generative boosting. Finally, we use a mutual information based clustering algorithm to select a more representative dataset to train the generative model of GLSTM. This significantly improves the long-range predictive performance of high variation vital signs such as heart rate and systolic blood pressure. △ Less

Submitted 14 November, 2019; originally announced November 2019.

Comments: Machine Learning for Health (ML4H) at NeurIPS 2019 - Extended Abstract

arXiv:1908.06416 [pdf, other]

Investigating Convolutional Neural Networks using Spatial Orderness

Authors: Rohan Ghosh, Anupam K. Gupta, Mehul Motani

Abstract: Convolutional Neural Networks (CNN) have been pivotal to the success of many state-of-the-art classification problems, in a wide variety of domains (for e.g. vision, speech, graphs and medical imaging). A commonality within those domains is the presence of hierarchical, spatially agglomerative local-to-global interactions within the data. For two-dimensional images, such interactions may induce an… ▽ More Convolutional Neural Networks (CNN) have been pivotal to the success of many state-of-the-art classification problems, in a wide variety of domains (for e.g. vision, speech, graphs and medical imaging). A commonality within those domains is the presence of hierarchical, spatially agglomerative local-to-global interactions within the data. For two-dimensional images, such interactions may induce an a priori relationship between the pixel data and the underlying spatial ordering of the pixels. For instance in natural images, neighboring pixels are more likely contain similar values than non-neighboring pixels which are further apart. To that end, we propose a statistical metric called spatial orderness, which quantifies the extent to which the input data (2D) obeys the underlying spatial ordering at various scales. In our experiments, we mainly find that adding convolutional layers to a CNN could be counterproductive for data bereft of spatial order at higher scales. We also observe, quite counter-intuitively, that the spatial orderness of CNN feature maps show a synchronized increase during the intial stages of training, and validation performance only improves after spatial orderness of feature maps start decreasing. Lastly, we present a theoretical analysis (and empirical validation) of the spatial orderness of network weights, where we find that using smaller kernel sizes leads to kernels of greater spatial orderness and vice-versa. △ Less

Submitted 29 November, 2019; v1 submitted 18 August, 2019; originally announced August 2019.

Comments: Presented at BMVC 2019: Workshop on Interpretable and Explainable Machine Vision, Cardiff, UK

arXiv:1901.00387 [pdf, ps, other]

Generalized Sphere-Packing Bound for Subblock-Constrained Codes

Authors: Han Mao Kiah, Anshoo Tandon, Mehul Motani

Abstract: We apply the generalized sphere-packing bound to two classes of subblock-constrained codes. A la Fazeli et al. (2015), we made use of automorphism to significantly reduce the number of variables in the associated linear programming problem. In particular, we study binary constant subblock-composition codes (CSCCs), characterized by the property that the number of ones in each subblock is constant,… ▽ More We apply the generalized sphere-packing bound to two classes of subblock-constrained codes. A la Fazeli et al. (2015), we made use of automorphism to significantly reduce the number of variables in the associated linear programming problem. In particular, we study binary constant subblock-composition codes (CSCCs), characterized by the property that the number of ones in each subblock is constant, and binary subblock energy-constrained codes (SECCs), characterized by the property that the number of ones in each subblock exceeds a certain threshold. For CSCCs, we show that the optimization problem is equivalent to finding the minimum of $N$ variables, where $N$ is independent of the number of subblocks. We then provide closed-form solutions for the generalized sphere-packing bounds for single- and double-error correcting CSCCs. For SECCs, we provide closed-form solutions for the generalized sphere-packing bounds for single errors in certain special cases. We also obtain improved bounds on the optimal asymptotic rate for CSCCs and SECCs, and provide numerical examples to highlight the improvement. △ Less

Submitted 2 January, 2019; originally announced January 2019.

arXiv:1812.00415 [pdf, other]

Feature Selection Based on Unique Relevant Information for Health Data

Authors: Shiyu Liu, Mehul Motani

Abstract: Feature selection, which searches for the most representative features in observed data, is critical for health data analysis. Unlike feature extraction, such as PCA and autoencoder based methods, feature selection preserves interpretability, meaning that the selected features provide direct information about certain health conditions (i.e., the label). Thus, feature selection allows domain expert… ▽ More Feature selection, which searches for the most representative features in observed data, is critical for health data analysis. Unlike feature extraction, such as PCA and autoencoder based methods, feature selection preserves interpretability, meaning that the selected features provide direct information about certain health conditions (i.e., the label). Thus, feature selection allows domain experts, such as clinicians, to understand the predictions made by machine learning based systems, as well as improve their own diagnostic skills. Mutual information is often used as a basis for feature selection since it measures dependencies between features and labels. In this paper, we introduce a novel mutual information based feature selection (MIBFS) method called SURI, which boosts features with high unique relevant information. We compare SURI to existing MIBFS methods using 3 different classifiers on 6 publicly available healthcare data sets. The results indicate that, in addition to preserving interpretability, SURI selects more relevant feature subsets which lead to higher classification performance. More importantly, we explore the dynamics of mutual information on a public low-dimensional health data set via exhaustive search. The results suggest the important role of unique relevant information in feature selection and verify the principles behind SURI. △ Less

Submitted 2 December, 2018; originally announced December 2018.

Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216

Report number: ML4H/2018/172

arXiv:1807.09040 [pdf, ps, other]

Are RLL Codes Suitable for Simultaneous Energy and Information Transfer?

Authors: Anshoo Tandon, Mehul Motani, Lav R. Varshney

Abstract: Run-length limited (RLL) codes are a well-studied class of constrained codes having application in diverse areas such as optical and magnetic data recording systems, DNA-based storage, and visible light communication. RLL codes have also been proposed for the emerging area of simultaneous energy and information transfer, where the receiver uses the received signal for decoding information as well… ▽ More Run-length limited (RLL) codes are a well-studied class of constrained codes having application in diverse areas such as optical and magnetic data recording systems, DNA-based storage, and visible light communication. RLL codes have also been proposed for the emerging area of simultaneous energy and information transfer, where the receiver uses the received signal for decoding information as well as for harvesting energy to run its circuitry. In this paper, we show that RLL codes are not the best codes for simultaneous energy and information transfer, in terms of the maximum number of codewords which avoid energy outage, i.e., outage-constrained capacity. Specifically, we show that sliding window constrained (SWC) codes and subblock energy constrained (SEC) codes have significantly higher outage-constrained capacities than RLL codes. △ Less

Submitted 24 July, 2018; originally announced July 2018.

arXiv:1806.11271 [pdf, other]

Multicasting Energy and Information Simultaneously

Authors: Ting-Yi Wu, Anshoo Tandon, Lav R. Varshney, Mehul Motani

Abstract: Communication systems for multicasting information and energy simultaneously to more than one user are investigated. In the system under study, a transmitter sends the same message and signal to multiple receivers over distinct and independent channels. The fundamental communication limit under a received energy constraint, called the multicast capacity-energy function, is studied and a single-let… ▽ More Communication systems for multicasting information and energy simultaneously to more than one user are investigated. In the system under study, a transmitter sends the same message and signal to multiple receivers over distinct and independent channels. The fundamental communication limit under a received energy constraint, called the multicast capacity-energy function, is studied and a single-letter expression is derived. This is based on coding theorems for compound channels. The problem of receiver segmentation, where receivers are divided into related groups, is also considered. △ Less

Submitted 13 March, 2019; v1 submitted 29 June, 2018; originally announced June 2018.

Comments: 6 pages, 5 figures, parts of results have submitted to 2019 IEEE International Symposium on Information Theory

arXiv:1806.00739 [pdf, ps, other]

Second-Order Asymptotically Optimal Statistical Classification

Authors: Lin Zhou, Vincent Y. F. Tan, Mehul Motani

Abstract: Motivated by real-world machine learning applications, we analyze approximations to the non-asymptotic fundamental limits of statistical classification. In the binary version of this problem, given two training sequences generated according to two {\em unknown} distributions $P_1$ and $P_2$, one is tasked to classify a test sequence which is known to be generated according to either $P_1$ or… ▽ More Motivated by real-world machine learning applications, we analyze approximations to the non-asymptotic fundamental limits of statistical classification. In the binary version of this problem, given two training sequences generated according to two {\em unknown} distributions $P_1$ and $P_2$, one is tasked to classify a test sequence which is known to be generated according to either $P_1$ or $P_2$. This problem can be thought of as an analogue of the binary hypothesis testing problem but in the present setting, the generating distributions are unknown. Due to finite sample considerations, we consider the second-order asymptotics (or dispersion-type) tradeoff between type-I and type-II error probabilities for tests which ensure that (i) the type-I error probability for {\em all} pairs of distributions decays exponentially fast and (ii) the type-II error probability for a {\em particular} pair of distributions is non-vanishing. We generalize our results to classification of multiple hypotheses with the rejection option. △ Less

Submitted 6 December, 2018; v1 submitted 3 June, 2018; originally announced June 2018.

Comments: To appear in Information and Inference: A Journal of the IMA (https://academic.oup.com/imaiai)

arXiv:1801.03813 [pdf, ps, other]

Energy Harvesting Communications Using Dual Alternating Batteries

Authors: Rajshekhar Vishweshwar Bhat, Mehul Motani, Chandra R Murthy, Rahul Vaze

Abstract: Practical energy harvesting (EH) based communication systems typically use a battery to temporarily store the harvested energy prior to its use for communication. The batteries can be damaged when they are repeatedly charged (discharged) after being partially discharged (charged), overcharged or deeply discharged. This motivates the cycle constraint which says that a battery must be charged (disch… ▽ More Practical energy harvesting (EH) based communication systems typically use a battery to temporarily store the harvested energy prior to its use for communication. The batteries can be damaged when they are repeatedly charged (discharged) after being partially discharged (charged), overcharged or deeply discharged. This motivates the cycle constraint which says that a battery must be charged (discharged) only after it is sufficiently discharged (charged). We also assume Bernoulli energy arrivals, and a half-duplex constraint due to which the batteries are not charged and discharged simultaneously. In this context, we study EH communication systems with: (a) a single-battery with capacity 2B units and (b) dual-batteries, each having capacity of B units. The aim is to obtain the best possible long-term average throughputs and throughput regions in point-to-point (P2P) channels and multiple access channels (MAC), respectively. For the P2P channel, we obtain an analytical optimal solution in the single-battery case, and propose optimal and sub-optimal power allocation policies for the dual-battery case. We extend these policies to obtain achievable throughput regions in MACs by jointly allocating rates and powers. From numerical simulations, we find that the optimal throughput in the dual-battery case is significantly higher than that in the single-battery case, although the total storage capacity in both cases is 2B units. Further, in the proposed policies, the largest throughput region in the single-battery case is contained within that of the dual-battery case. △ Less

Submitted 18 December, 2018; v1 submitted 11 January, 2018; originally announced January 2018.

Comments: A single battery case is added and its performance is compared with that of the dual-battery case, with additional simulation results

arXiv:1801.03794 [pdf, ps, other]

Hybrid NOMA-TDMA for Multiple Access Channels with Non-Ideal Batteries and Circuit Cost

Authors: Rajshekhar Vishweshwar Bhat, Mehul Motani, Teng Joon Lim

Abstract: We consider a multiple-access channel where the users are powered from batteries having non-negligible internal resistance. When power is drawn from the battery, a variable fraction of the power, which is a function of the power drawn from the battery, is lost across the internal resistance. Hence, the power delivered to the load is less than the power drawn from the battery. The users consume a c… ▽ More We consider a multiple-access channel where the users are powered from batteries having non-negligible internal resistance. When power is drawn from the battery, a variable fraction of the power, which is a function of the power drawn from the battery, is lost across the internal resistance. Hence, the power delivered to the load is less than the power drawn from the battery. The users consume a constant power for the circuit operation during transmission but do not consume any power when not transmitting. In this setting, we obtain the maximum sum-rates and achievable rate regions under various cases. We show that, unlike in the ideal battery case, the TDMA (time-division multiple access) strategy, wherein the users transmit orthogonally in time, may not always achieve the maximum sum-rate when the internal resistance is non-zero. The users may need to adopt a hybrid NOMA-TDMA strategy which combines the features of NOMA (non-orthogonal multiple access) and TDMA, wherein a set of users are allocated fixed time windows for orthogonal single-user and non-orthogonal joint transmissions, respectively. We also numerically show that the maximum achievable rate regions in NOMA and TDMA strategies are contained within the maximum achievable rate region of the hybrid NOMA-TDMA strategy. △ Less

Submitted 15 January, 2018; v1 submitted 11 January, 2018; originally announced January 2018.

arXiv:1711.11206 [pdf, ps, other]

The Dispersion of Mismatched Joint Source-Channel Coding for Arbitrary Sources and Additive Channels

Authors: Lin Zhou, Vincent Y. F. Tan, Mehul Motani

Abstract: We consider a joint source channel coding (JSCC) problem in which we desire to transmit an arbitrary memoryless source over an arbitrary additive channel. We propose a mismatched coding architecture that consists of Gaussian codebooks for both the source reproduction sequences and channel codewords. The natural nearest neighbor encoder and decoder, however, need to be judiciously modified to obtai… ▽ More We consider a joint source channel coding (JSCC) problem in which we desire to transmit an arbitrary memoryless source over an arbitrary additive channel. We propose a mismatched coding architecture that consists of Gaussian codebooks for both the source reproduction sequences and channel codewords. The natural nearest neighbor encoder and decoder, however, need to be judiciously modified to obtain the highest communication rates at finite blocklength. In particular, we consider a unequal error protection (UEP) scheme in which all sources are partitioned into disjoint power type classes. We also regularize the nearest neighbor decoder so that an appropriate measure of the size of each power type class is taken into account in the decoding strategy. For such an architecture, we derive ensemble-tight second-order and moderate deviations results. Our first-order (optimal bandwidth expansion ratio) result generalizes the seminal results by Lapidoth (1996, 1997). The dispersion of our JSCC scheme is a linear combination of the mismatched dispersions for the channel coding saddle-point problem by Scarlett, Tan and Durisi (2017) and the rate-distortion saddle-point problem by the present authors, thus also generalizing these results. △ Less

Submitted 31 August, 2018; v1 submitted 29 November, 2017; originally announced November 2017.

arXiv:1711.09494 [pdf, ps, other]

Skip-Sliding Window Codes

Authors: Ting-Yi Wu, Anshoo Tandon, Lav R. Varshney, Mehul Motani

Abstract: Constrained coding is used widely in digital communication and storage systems. In this paper, we study a generalized sliding window constraint called the skip-sliding window. A skip-sliding window (SSW) code is defined in terms of the length $L$ of a sliding window, skip length $J$, and cost constraint $E$ in each sliding window. Each valid codeword of length $L + kJ$ is determined by $k+1$ windo… ▽ More Constrained coding is used widely in digital communication and storage systems. In this paper, we study a generalized sliding window constraint called the skip-sliding window. A skip-sliding window (SSW) code is defined in terms of the length $L$ of a sliding window, skip length $J$, and cost constraint $E$ in each sliding window. Each valid codeword of length $L + kJ$ is determined by $k+1$ windows of length $L$ where window $i$ starts at $(iJ + 1)$th symbol for all non-negative integers $i$ such that $i \leq k$; and the cost constraint $E$ in each window must be satisfied. In this work, two methods are given to enumerate the size of SSW codes and further refinements are made to reduce the enumeration complexity. Using the proposed enumeration methods, the noiseless capacity of binary SSW codes is determined and observations such as greater capacity than other classes of codes are made. Moreover, some noisy capacity bounds are given. SSW coding constraints arise in various applications including simultaneous energy and information transfer. △ Less

Submitted 10 January, 2018; v1 submitted 26 November, 2017; originally announced November 2017.

Comments: 28 pages, 11 figures

arXiv:1708.05496 [pdf, ps, other]

Non-Asymptotic Converse Bounds and Refined Asymptotics for Two Lossy Source Coding Problems

Authors: Lin Zhou, Mehul Motani

Abstract: In this paper, we revisit two multi-terminal lossy source coding problems: the lossy source coding problem with side information available at the encoder and one of the two decoders, which we term as the Kaspi problem (Kaspi, 1994), and the multiple description coding problem with one semi-deterministic distortion measure, which we refer to as the Fu-Yeung problem (Fu and Yeung, 2002). For the Kas… ▽ More In this paper, we revisit two multi-terminal lossy source coding problems: the lossy source coding problem with side information available at the encoder and one of the two decoders, which we term as the Kaspi problem (Kaspi, 1994), and the multiple description coding problem with one semi-deterministic distortion measure, which we refer to as the Fu-Yeung problem (Fu and Yeung, 2002). For the Kaspi problem, we first present the properties of optimal test channels. Subsequently, we generalize the notion of the distortion-tilted information density for the lossy source coding problem to the Kaspi problem and prove a non-asymptotic converse bound using the properties of optimal test channels and the well-defined distortion-tilted information density. Finally, for discrete memoryless sources, we derive refined asymptotics which includes the second-order, large and moderate deviations asymptotics. In the converse proof of second-order asymptotics, we apply the Berry-Esseen theorem to the derived non-asymptotic converse bound. The achievability proof follows by first proving a type-covering lemma tailored to the Kaspi problem, then properly Taylor expanding the well-defined distortion-tilted information densities and finally applying the Berry-Esseen theorem. We then generalize the methods used in the Kaspi problem to the Fu-Yeung problem. As a result, we obtain the properties of optimal test channels for the minimum sum-rate function, a non-asymptotic converse bound and refined asymptotics for discrete memoryless sources. Since the successive refinement problem is a special case of the Fu-Yeung problem, as a by-product, we obtain a non-asymptotic converse bound for the successive refinement problem, which is a strict generalization of the non-asymptotic converse bound for successively refinable sources (Zhou, Tan and Motani, 2017). △ Less

Submitted 17 August, 2017; originally announced August 2017.

Comments: 34 pages, to be submitted to IEEE Transactions on Information Theory, extended version of two papers accepted by Globecom 2017

arXiv:1708.04778 [pdf, ps, other]

Refined Asymptotics for Rate-Distortion using Gaussian Codebooks for Arbitrary Sources

Authors: Lin Zhou, Vincent Y. F. Tan, Mehul Motani

Abstract: The rate-distortion saddle-point problem considered by Lapidoth (1997) consists in finding the minimum rate to compress an arbitrary ergodic source when one is constrained to use a random Gaussian codebook and minimum (Euclidean) distance encoding is employed. We extend Lapidoth's analysis in several directions in this paper. Firstly, we consider refined asymptotics. In particular, when the source… ▽ More The rate-distortion saddle-point problem considered by Lapidoth (1997) consists in finding the minimum rate to compress an arbitrary ergodic source when one is constrained to use a random Gaussian codebook and minimum (Euclidean) distance encoding is employed. We extend Lapidoth's analysis in several directions in this paper. Firstly, we consider refined asymptotics. In particular, when the source is stationary and memoryless, we establish the second-order, moderate, and large deviation asymptotics of the problem. Secondly, by "random Gaussian codebook", Lapidoth referred to a collection of random codewords, each of which is drawn independently and uniformly from the surface of an $n$-dimensional sphere. To be more precise, we term this as a spherical codebook. We also consider i.i.d.\ Gaussian codebooks in which each random codeword is drawn independently from a product Gaussian distribution. We derive the second-order, moderate, and large deviation asymptotics when i.i.d.\ Gaussian codebooks are employed. In contrast to the recent work on the channel coding counterpart by Scarlett, Tan and Durisi (2017), the dispersions for spherical and i.i.d.\ Gaussian codebooks are identical. The ensemble excess-distortion exponents for both spherical and i.i.d.\ Gaussian codebooks are established for all rates. Furthermore, we show that the i.i.d.\ Gaussian codebook has a strictly larger excess-distortion exponent than its spherical counterpart for any rate greater than the ensemble rate-distortion function derived by Lapidoth. △ Less

Submitted 31 August, 2018; v1 submitted 16 August, 2017; originally announced August 2017.

Comments: Under review with TIT

arXiv:1704.04584 [pdf, other]

Performance of Energy Harvesting Receivers with Power Optimization

Authors: Zhengwei Ni, Mehul Motani

Abstract: The difficulty of modeling energy consumption in communication systems leads to challenges in energy harvesting (EH) systems, in which nodes scavenge energy from their environment. An EH receiver must harvest enough energy for demodulating and decoding. The energy required depends upon factors, like code rate and signal-to-noise ratio, which can be adjusted dynamically. We consider a receiver whic… ▽ More The difficulty of modeling energy consumption in communication systems leads to challenges in energy harvesting (EH) systems, in which nodes scavenge energy from their environment. An EH receiver must harvest enough energy for demodulating and decoding. The energy required depends upon factors, like code rate and signal-to-noise ratio, which can be adjusted dynamically. We consider a receiver which harvests energy from ambient sources and the transmitter, meaning the received signal is used for both EH and information decoding. Assuming a generalized function for energy consumption, we maximize the total number of information bits decoded, under both average and peak power constraints at the transmitter, by carefully optimizing the power used for EH, power used for information transmission, fraction of time for EH, and code rate. For transmission over a single block, we find there exist problem parameters for which either maximizing power for information transmission or maximizing power for EH is optimal. In the general case, the optimal solution is a tradeoff of the two. For transmission over multiple blocks, we give an upper bound on performance and give sufficient and necessary conditions to achieve this bound. Finally, we give some numerical results to illustrate our results and analysis. △ Less

Submitted 15 April, 2017; originally announced April 2017.

Comments: 28 pages

arXiv:1702.06649 [pdf, ps, other]

Exponential Strong Converse for Content Identification with Lossy Recovery

Authors: Lin Zhou, Vincent Y. F. Tan, Lei Yu, Mehul Motani

Abstract: We revisit the high-dimensional content identification with lossy recovery problem (Tuncel and Gündüz, 2014) and establish an exponential strong converse theorem. As a corollary of the exponential strong converse theorem, we derive an upper bound on the joint identification-error and excess-distortion exponent for the problem. Our main results can be specialized to the biometrical identification p… ▽ More We revisit the high-dimensional content identification with lossy recovery problem (Tuncel and Gündüz, 2014) and establish an exponential strong converse theorem. As a corollary of the exponential strong converse theorem, we derive an upper bound on the joint identification-error and excess-distortion exponent for the problem. Our main results can be specialized to the biometrical identification problem~(Willems, 2003) and the content identification problem~(Tuncel, 2009) since these two problems are both special cases of the content identification with lossy recovery problem. We leverage the information spectrum method introduced by Oohama and adapt the strong converse techniques therein to be applicable to the problem at hand. △ Less

Submitted 25 December, 2017; v1 submitted 21 February, 2017; originally announced February 2017.

Comments: Changed parts compared to v1 are in blue. Under revision with TIT

arXiv:1702.04258 [pdf, other]

Layered Coding for Energy Harvesting Communication Without CSIT

Authors: Rajshekhar Vishweshwar Bhat, Mehul Motani, Teng Joon Lim

Abstract: Due to stringent constraints on resources, it may be infeasible to acquire the current channel state information at the transmitter in energy harvesting communication systems. In this paper, we optimize an energy harvesting transmitter, communicating over a slow fading channel, using layered coding. The transmitter has access to the channel statistics, but does not know the exact channel state. In… ▽ More Due to stringent constraints on resources, it may be infeasible to acquire the current channel state information at the transmitter in energy harvesting communication systems. In this paper, we optimize an energy harvesting transmitter, communicating over a slow fading channel, using layered coding. The transmitter has access to the channel statistics, but does not know the exact channel state. In layered coding, the codewords are first designed for each of the channel states at different rates, and then the codewords are either time-multiplexed or superimposed before the transmission, leading to two transmission strategies. The receiver then decodes the information adaptively based on the realized channel state. The transmitter is equipped with a finite-capacity battery having non-zero internal resistance. In each of the transmission strategies, we first formulate and study an average rate maximization problem with non-causal knowledge of the harvested power variations. Further, assuming statistical knowledge and causal information of the harvested power variations, we propose a sub-optimal algorithm, and compare with the stochastic dynamic programming based solution and a greedy policy. △ Less

Submitted 14 April, 2017; v1 submitted 14 February, 2017; originally announced February 2017.

Comments: Elaborated on the system model, added a result (Lemma 1), added 2 more references

arXiv:1701.04954 [pdf, ps, other]

Bounds on the Size and Asymptotic Rate of Subblock-Constrained Codes

Authors: Anshoo Tandon, Han Mao Kiah, Mehul Motani

Abstract: The study of subblock-constrained codes has recently gained attention due to their application in diverse fields. We present bounds on the size and asymptotic rate for two classes of subblock-constrained codes. The first class is binary constant subblock-composition codes (CSCCs), where each codeword is partitioned into equal sized subblocks, and every subblock has the same fixed weight. The secon… ▽ More The study of subblock-constrained codes has recently gained attention due to their application in diverse fields. We present bounds on the size and asymptotic rate for two classes of subblock-constrained codes. The first class is binary constant subblock-composition codes (CSCCs), where each codeword is partitioned into equal sized subblocks, and every subblock has the same fixed weight. The second class is binary subblock energy-constrained codes (SECCs), where the weight of every subblock exceeds a given threshold. We present novel upper and lower bounds on the code sizes and asymptotic rates for binary CSCCs and SECCs. For a fixed subblock length and small relative distance, we show that the asymptotic rate for CSCCs (resp. SECCs) is strictly lower than the corresponding rate for constant weight codes (CWCs) (resp. heavy weight codes (HWCs)). Further, for codes with high weight and low relative distance, we show that the asymptotic rates for CSCCs is strictly lower than that of SECCs, which contrasts that the asymptotic rate for CWCs is equal to that of HWCs. We also provide a correction to an earlier result by Chee et al. (2014) on the asymptotic CSCC rate. Additionally, we present several numerical examples comparing the rates for CSCCs and SECCs with those for constant weight codes and heavy weight codes. △ Less

Submitted 18 January, 2017; originally announced January 2017.

arXiv:1701.02444 [pdf, ps, other]

Energy Harvesting Communication Using Finite-Capacity Batteries with Internal Resistance

Authors: Rajshekhar Vishweshwar Bhat, Mehul Motani, Teng Joon Lim

Abstract: Modern systems will increasingly rely on energy harvested from their environment. Such systems utilize batteries to smoothen out the random fluctuations in harvested energy. These fluctuations induce highly variable battery charge and discharge rates, which affect the efficiencies of practical batteries that typically have non-zero internal resistances. In this paper, we study an energy harvesting… ▽ More Modern systems will increasingly rely on energy harvested from their environment. Such systems utilize batteries to smoothen out the random fluctuations in harvested energy. These fluctuations induce highly variable battery charge and discharge rates, which affect the efficiencies of practical batteries that typically have non-zero internal resistances. In this paper, we study an energy harvesting communication system using a finite battery with non-zero internal resistance. We adopt a dual-path architecture, in which harvested energy can be directly used, or stored and then used. In a frame, both time and power can be split between energy storage and data transmission. For a single frame, we derive an analytical expression for the rate optimal time and power splitting ratios between harvesting energy and transmitting data. We then optimize the time and power splitting ratios for a group of frames, assuming non-causal knowledge of harvested power and fading channel gains, by giving an approximate solution. When only the statistics of the energy arrivals and channel gains are known, we derive a dynamic programming based policy and, propose three sub-optimal policies, which are shown to perform competitively. In summary, our study suggests that battery internal resistance significantly impacts the design and performance of energy harvesting communication systems and must be taken into account. △ Less

Submitted 10 January, 2017; originally announced January 2017.

Comments: 30 single column pages

arXiv:1701.01938 [pdf, other]

Polar Coding for the Binary Erasure Channel with Deletions

Authors: Eldho K. Thomas, Vincent Y. F. Tan, Alexander Vardy, Mehul Motani

Abstract: We study the application of polar codes in deletion channels by analyzing the cascade of a binary erasure channel (BEC) and a deletion channel. We show how polar codes can be used effectively on a BEC with a single deletion, and propose a list decoding algorithm with a cyclic redundancy check for this case. The decoding complexity is $O(N^2\log N)$, where $N$ is the blocklength of the code. An imp… ▽ More We study the application of polar codes in deletion channels by analyzing the cascade of a binary erasure channel (BEC) and a deletion channel. We show how polar codes can be used effectively on a BEC with a single deletion, and propose a list decoding algorithm with a cyclic redundancy check for this case. The decoding complexity is $O(N^2\log N)$, where $N$ is the blocklength of the code. An important contribution is an optimization of the amount of redundancy added to minimize the overall error probability. Our theoretical results are corroborated by numerical simulations which show that the list size can be reduced to one and the original message can be recovered with high probability as the length of the code grows. △ Less

Submitted 8 January, 2017; originally announced January 2017.

Comments: 4 pages; 1 figure; To appear in the IEEE Communication Letters

arXiv:1604.07151 [pdf, ps, other]

Achievable Moderate Deviations Asymptotics for Streaming Compression of Correlated Sources

Authors: Lin Zhou, Vincent Y. F. Tan, Mehul Motani

Abstract: Motivated by streaming multi-view video coding and wireless sensor networks, we consider the problem of blockwise streaming compression of a pair of correlated sources, which we term streaming Slepian-Wolf coding. We study the moderate deviations regime in which the rate pairs of a sequence of codes converge, along a straight line, to various points on the boundary of the Slepian-Wolf region at a… ▽ More Motivated by streaming multi-view video coding and wireless sensor networks, we consider the problem of blockwise streaming compression of a pair of correlated sources, which we term streaming Slepian-Wolf coding. We study the moderate deviations regime in which the rate pairs of a sequence of codes converge, along a straight line, to various points on the boundary of the Slepian-Wolf region at a speed slower than the inverse square root of the blocklength $n$, while the error probability decays subexponentially fast in $n$. Our main result focuses on directions of approaches to corner points of the Slepian-Wolf region. It states that for each correlated source and all corner points, there exists a non-empty subset of directions of approaches such that the moderate deviations constant (the constant of proportionality for the subexponential decay of the error probability) is enhanced (over the non-streaming case) by at least a factor of $T$, the block delay of decoding source block pairs. We specialize our main result to the setting of streaming lossless source coding and generalize this result to the setting where we have different delay requirements for each of the two source blocks. The proof of our main result involves the use of various analytical tools and amalgamates several ideas from the recent information-theoretic streaming literature. We adapt the so-called truncated memory encoding idea from Draper and Khisti (2011) and Lee, Tan, and Khisti (2016) to ensure that the effect of error accumulation is nullified in the limit of large blocklengths. We also adapt the use of the so-called minimum weighted empirical suffix entropy decoder which was used by Draper, Chang, and Sahai (2014) to derive achievable error exponents for symbolwise streaming Slepian-Wolf coding. △ Less

Submitted 20 September, 2017; v1 submitted 25 April, 2016; originally announced April 2016.

Comments: 31 pages, 6 figures, under revision with IEEE Transactions on Information Theory; short version presented ISIT 2017

arXiv:1601.04500 [pdf, ps, other]

Second-Order and Moderate Deviation Asymptotics for Successive Refinement

Authors: Lin Zhou, Vincent Y. F. Tan, Mehul Motani

Abstract: We derive the optimal second-order coding region and moderate deviations constant for successive refinement source coding with a joint excess-distortion probability constraint. We consider two scenarios: (i) a discrete memoryless source (DMS) and arbitrary distortion measures at the decoders and (ii) a Gaussian memoryless source (GMS) and quadratic distortion measures at the decoders. For a DMS wi… ▽ More We derive the optimal second-order coding region and moderate deviations constant for successive refinement source coding with a joint excess-distortion probability constraint. We consider two scenarios: (i) a discrete memoryless source (DMS) and arbitrary distortion measures at the decoders and (ii) a Gaussian memoryless source (GMS) and quadratic distortion measures at the decoders. For a DMS with arbitrary distortion measures, we prove an achievable second-order coding region, using type covering lemmas by Kanlis and Narayan and by No, Ingber and Weissman. We prove the converse using the perturbation approach by Gu and Effros. When the DMS is successively refinable, the expressions for the second-order coding region and the moderate deviations constant are simplified and easily computable. For this case, we also obtain new insights on the second-order behavior compared to the scenario where separate excess-distortion proabilities are considered. For example, we describe a DMS, for which the optimal second-order region transitions from being characterizable by a bivariate Gaussian to a univariate Gaussian, as the distortion levels are varied. We then consider a GMS with quadratic distortion measures. To prove the direct part, we make use of the sphere covering theorem by Verger-Gaugry, together with appropriately-defined Gaussian type classes. To prove the converse, we generalize Kostina and Verdú's one-shot converse bound for point-to-point lossy source coding. We remark that this proof is applicable to general successively refinable sources. In the proofs of the moderate deviations results for both scenarios, we follow a strategy similar to that for the second-order asymptotics and use the moderate deviations principle. △ Less

Submitted 26 August, 2016; v1 submitted 18 January, 2016; originally announced January 2016.

Comments: Part of this paper has beed presented at ISIT 2016. Submitted to IEEE Transactions on Information Theory in Jan, 2016. Revised in Aug. 2016

arXiv:1512.01090 [pdf, ps, other]

doi 10.1109/TIT.2016.2644670

Discrete Lossy Gray-Wyner Revisited: Second-Order Asymptotics, Large and Moderate Deviations

Authors: Lin Zhou, Vincent Y. F. Tan, Mehul Motani

Abstract: In this paper, we revisit the discrete lossy Gray-Wyner problem. In particular, we derive its optimal second-order coding rate region, its error exponent (reliability function) and its moderate deviations constant under mild conditions on the source. To obtain the second-order asymptotics, we extend some ideas from Watanabe's work (2015). In particular, we leverage the properties of an appropriate… ▽ More In this paper, we revisit the discrete lossy Gray-Wyner problem. In particular, we derive its optimal second-order coding rate region, its error exponent (reliability function) and its moderate deviations constant under mild conditions on the source. To obtain the second-order asymptotics, we extend some ideas from Watanabe's work (2015). In particular, we leverage the properties of an appropriate generalization of the conditional distortion-tilted information density, which was first introduced by Kostina and Verdú (2012). The converse part uses a perturbation argument by Gu and Effros (2009) in their strong converse proof of the discrete Gray-Wyner problem. The achievability part uses two novel elements: (i) a generalization of various type covering lemmas; and (ii) the uniform continuity of the conditional rate-distortion function in both the source (joint) distribution and the distortion level. To obtain the error exponent, for the achievability part, we use the same generalized type covering lemma and for the converse, we use the strong converse together with a change-of-measure technique. Finally, to obtain the moderate deviations constant, we apply the moderate deviations theorem to probabilities defined in terms of information spectrum quantities. △ Less

Submitted 4 January, 2017; v1 submitted 3 December, 2015; originally announced December 2015.

Comments: 33 pages, accepted to IEEE Transactions on Information Theory; Short version v_1 presented at IEEE ISIT 2016, Barcelona, Spain

arXiv:1507.03190

On Error Exponents and Moderate Deviations for Lossless Streaming Compression of Correlated Sources

Authors: Lin Zhou, Vincent Yan Fu Tan, Mehul Motani

Abstract: We derive upper and lower bounds for the error exponents of lossless streaming compression of two correlated sources under the blockwise and symbolwise settings. We consider the linear scaling regime in which the delay is a scalar multiple of the number of symbol pairs of interest. We show that for rate pairs satisfying certain constraints, the upper and lower bounds for the error exponent of bloc… ▽ More We derive upper and lower bounds for the error exponents of lossless streaming compression of two correlated sources under the blockwise and symbolwise settings. We consider the linear scaling regime in which the delay is a scalar multiple of the number of symbol pairs of interest. We show that for rate pairs satisfying certain constraints, the upper and lower bounds for the error exponent of blockwise codes coincide. For symbolwise codes, the bounds coincide for rate pairs satisfying the aforementioned constraints and a certain condition on the symbol pairs we wish to decode---namely, that their indices are asymptotically comparable to the blocklength. We also derive moderate deviations constants for blockwise and symbolwise codes, leveraging the error exponent results, and using appropriate Taylor series expansions. In particular, for blockwise codes, we derive an information spectrum-type strong converse, giving the complete characterization of the moderate deviations constants. For symbolwise codes, under an additional requirement on the backoff from the first-order fundamental limit, we can show that the moderate deviations constants are the same as the blockwise setting. △ Less

Submitted 22 February, 2016; v1 submitted 12 July, 2015; originally announced July 2015.

Comments: The paper has been withdrawn due to an error in the achievability proof

arXiv:1506.00213 [pdf, ps, other]

Subblock-Constrained Codes for Real-Time Simultaneous Energy and Information Transfer

Authors: Anshoo Tandon, Mehul Motani, Lav R. Varshney

Abstract: Consider an energy-harvesting receiver that uses the same received signal both for decoding information and for harvesting energy, which is employed to power its circuitry. In the scenario where the receiver has limited battery size, a signal with bursty energy content may cause power outage at the receiver since the battery will drain during intervals with low signal energy. In this paper, we con… ▽ More Consider an energy-harvesting receiver that uses the same received signal both for decoding information and for harvesting energy, which is employed to power its circuitry. In the scenario where the receiver has limited battery size, a signal with bursty energy content may cause power outage at the receiver since the battery will drain during intervals with low signal energy. In this paper, we consider a discrete memoryless channel and characterize achievable information rates when the energy content in each codeword is regularized by ensuring that sufficient energy is carried within every subblock duration. In particular, we study constant subblock-composition codes (CSCCs) where all subblocks in every codeword have the same fixed composition, and this subblock-composition is chosen to maximize the rate of information transfer while meeting the energy requirement. Compared to constant composition codes (CCCs), we show that CSCCs incur a rate loss and that the error exponent for CSCCs is also related to the error exponent for CCCs by the same rate loss term. We show that CSCC capacity can be improved by allowing different subblocks to have different composition while still meeting the subblock energy constraint. We provide numerical examples highlighting the tradeoff between delivery of sufficient energy to the receiver and achieving high information transfer rates. It is observed that the ability to use energy in real-time imposes less of penalty than the ability to use information in real-time. △ Less

Submitted 31 May, 2015; originally announced June 2015.

arXiv:1411.6791 [pdf, other]

doi 10.1109/TMC.2009.138

A Metric for DISH Networks: Analysis, Implications, and Applications

Authors: Tie Luo, Vikram Srinivasan, Mehul Motani

Abstract: In wireless networks, node cooperation has been exploited as a data relaying mechanism for decades. However, the wireless channel allows for much richer interaction among nodes. In particular, Distributed Information SHaring (DISH) represents a new improvement to multi-channel MAC protocol design by using a cooperative element at the control plane. In this approach, nodes exchange control informat… ▽ More In wireless networks, node cooperation has been exploited as a data relaying mechanism for decades. However, the wireless channel allows for much richer interaction among nodes. In particular, Distributed Information SHaring (DISH) represents a new improvement to multi-channel MAC protocol design by using a cooperative element at the control plane. In this approach, nodes exchange control information to make up for other nodes' insufficient knowledge about the environment, and thereby aid in their decision making. To date, what is lacking is a theoretical understanding of DISH. In this paper, we view cooperation as a network resource and evaluate the availability of cooperation, $p_{co}$. We first analyze $p_{co}$ in the context of a multi-channel multi-hop wireless network, and then perform simulations which show that the analysis accurately characterizes $p_{co}$ as a function of underlying network parameters. Next, we investigate the correlation between $p_{co}$ and network metrics such as collision rate, packet delay, and throughput. We find a near-linear relationship between $p_{co}$ and the metrics, which suggests that $p_{co}$ can be used as an appropriate performance indicator itself. Finally, we apply our analysis to solving a channel bandwidth allocation problem, where we derive optimal schemes and provide general guidelines on bandwidth allocation for DISH networks. △ Less

Submitted 25 November, 2014; originally announced November 2014.

Comments: Cooperative protocol, availability of cooperation, multi-channel multi-hop wireless network, multi-channel MAC protocols, distributed information sharing (DISH), channel bandwidth allocation

Journal ref: IEEE Transactions on Mobile Computing, vol. 9, no. 3, pp. 376--389, March 2010

arXiv:1411.6749 [pdf, other]

Analyzing DISH for Multi-Channel MAC Protocols in Wireless Networks

Authors: Tie Luo, Mehul Motani, Vikram Srinivasan

Abstract: For long, node cooperation has been exploited as a data relaying mechanism. However, the wireless channel allows for much richer interaction between nodes. One such scenario is in a multi-channel environment, where transmitter-receiver pairs may make incorrect decisions (e.g., in selecting channels) but idle neighbors could help by sharing information to prevent undesirable consequences (e.g., dat… ▽ More For long, node cooperation has been exploited as a data relaying mechanism. However, the wireless channel allows for much richer interaction between nodes. One such scenario is in a multi-channel environment, where transmitter-receiver pairs may make incorrect decisions (e.g., in selecting channels) but idle neighbors could help by sharing information to prevent undesirable consequences (e.g., data collisions). This represents a Distributed Information SHaring (DISH) mechanism for cooperation and suggests new ways of designing cooperative protocols. However, what is lacking is a theoretical understanding of this new notion of cooperation. In this paper, we view cooperation as a network resource and evaluate the availability of cooperation via a metric, $p_{co}$, the probability of obtaining cooperation. First, we analytically evaluate $p_{co}$ in the context of multi-channel multi-hop wireless networks. Second, we verify our analysis via simulations and the results show that our analysis accurately characterizes the behavior of $p_{co}$ as a function of underlying network parameters. This step also yields important insights into DISH with respect to network dynamics. Third, we investigate the correlation between $p_{co}$ and network performance in terms of collision rate, packet delay, and throughput. The results indicate a near-linear relationship, which may significantly simplify performance analysis for cooperative networks and suggests that $p_{co}$ be used as an appropriate performance indicator itself. Throughout this work, we utilize, as appropriate, three different DISH contexts --- model-based DISH, ideal DISH, and real DISH --- to explore $p_{co}$. △ Less

Submitted 25 November, 2014; originally announced November 2014.

Comments: Multi-channel multi-hop networks, availability of cooperation, cooperative protocol, distributed information sharing, ACM MobiHoc, May 2008

arXiv:1411.6521 [pdf, other]

doi 10.1109/TMC.2011.60

Energy-Efficient Strategies for Cooperative Multi-Channel MAC Protocols

Authors: Tie Luo, Mehul Motani, Vikram Srinivasan

Abstract: Distributed Information SHaring (DISH) is a new cooperative approach to designing multi-channel MAC protocols. It aids nodes in their decision making processes by compensating for their missing information via information sharing through other neighboring nodes. This approach was recently shown to significantly boost the throughput of multi-channel MAC protocols. However, a critical issue for ad h… ▽ More Distributed Information SHaring (DISH) is a new cooperative approach to designing multi-channel MAC protocols. It aids nodes in their decision making processes by compensating for their missing information via information sharing through other neighboring nodes. This approach was recently shown to significantly boost the throughput of multi-channel MAC protocols. However, a critical issue for ad hoc communication devices, i.e., energy efficiency, has yet to be addressed. In this paper, we address this issue by developing simple solutions which (1) reduce the energy consumption (2) without compromising the throughput performance, and meanwhile (3) maximize cost efficiency. We propose two energy-efficient strategies: in-situ energy conscious DISH which uses existing nodes only, and altruistic DISH which needs additional nodes called altruists. We compare five protocols with respect to the strategies and identify altruistic DISH to be the right choice in general: it (1) conserves 40-80% of energy, (2) maintains the throughput advantage gained from the DISH approach, and (3) more than doubles the cost efficiency compared to protocols without applying the strategy. On the other hand, our study shows that in-situ energy conscious DISH is suitable only in certain limited scenarios. △ Less

Submitted 24 November, 2014; v1 submitted 24 November, 2014; originally announced November 2014.

Comments: Energy efficiency, cost efficiency, distributed information sharing, DISH, altruistic cooperation

Journal ref: IEEE Transactions on Mobile Computing (TMC), vol. 11, no. 4, pp. 553-566, April 2012

arXiv:1410.2687 [pdf, other]

Second-Order Coding Rates for Conditional Rate-Distortion

Authors: Sy-Quoc Le, Vincent Y. F. Tan, Mehul Motani

Abstract: This paper characterizes the second-order coding rates for lossy source coding with side information available at both the encoder and the decoder. We first provide non-asymptotic bounds for this problem and then specialize the non-asymptotic bounds for three different scenarios: discrete memoryless sources, Gaussian sources, and Markov sources. We obtain the second-order coding rates for these se… ▽ More This paper characterizes the second-order coding rates for lossy source coding with side information available at both the encoder and the decoder. We first provide non-asymptotic bounds for this problem and then specialize the non-asymptotic bounds for three different scenarios: discrete memoryless sources, Gaussian sources, and Markov sources. We obtain the second-order coding rates for these settings. It is interesting to observe that the second-order coding rate for Gaussian source coding with Gaussian side information available at both the encoder and the decoder is the same as that for Gaussian source coding without side information. Furthermore, regardless of the variance of the side information, the dispersion is $1/2$ nats squared per source symbol. △ Less

Submitted 10 October, 2014; originally announced October 2014.

Comments: 20 pages, 2 figures, second-order coding rates, finite blocklength, network information theory

arXiv:1406.1140 [pdf, ps, other]

Fading Two-Way Relay Channels: Physical-Layer Versus Digital Network Coding

Authors: Zhi Chen, Tengjoon Lim, Mehul Motani

Abstract: In this paper, we consider three transmit strategies for the fading three-node, two-way relay network (TWRN) -- physical-layer network coding (PNC), digital network coding (DNC) and codeword superposition (CW-Sup). The aim is to minimize the total average energy needed to deliver a given pair of required average rates. Full channel state information is assumed to be available at all transmitters a… ▽ More In this paper, we consider three transmit strategies for the fading three-node, two-way relay network (TWRN) -- physical-layer network coding (PNC), digital network coding (DNC) and codeword superposition (CW-Sup). The aim is to minimize the total average energy needed to deliver a given pair of required average rates. Full channel state information is assumed to be available at all transmitters and receivers. The optimization problems corresponding to the various strategies in fading channels are formulated, solved and compared. For the DNC-based strategies, a simple time sharing of transmission of the network-coded message and the remaining bits of the larger message (DNC-TS) is considered first. We extend this approach to include a superposition strategy (DNC-Sup), in which the network-coded message and the remainder of the longer source message are superimposed before transmission. It is demonstrated theoretically that DNC-Sup outperforms DNC-TS and CW-Sup in terms of total average energy usage. More importantly, it is shown in simulation that DNC-Sup performs better than PNC if the required rate is low and worse otherwise. Finally, an algorithm to select the optimal strategy in terms of energy usage subject to different rate pair requirements is presented. △ Less

Submitted 4 June, 2014; originally announced June 2014.

Comments: 25 pages, 5 figures

arXiv:1404.0255 [pdf, other]

A Case Where Interference Does Not Affect The Channel Dispersion

Authors: Sy-Quoc Le, Vincent Y. F. Tan, Mehul Motani

Abstract: In 1975, Carleial presented a special case of an interference channel in which the interference does not reduce the capacity of the constituent point-to-point Gaussian channels. In this work, we show that if the inequalities in the conditions that Carleial stated are strict, the dispersions are similarly unaffected. More precisely, in this work, we characterize the second-order coding rates of the… ▽ More In 1975, Carleial presented a special case of an interference channel in which the interference does not reduce the capacity of the constituent point-to-point Gaussian channels. In this work, we show that if the inequalities in the conditions that Carleial stated are strict, the dispersions are similarly unaffected. More precisely, in this work, we characterize the second-order coding rates of the Gaussian interference channel in the strictly very strong interference regime. In other words, we characterize the speed of convergence of rates of optimal block codes towards a boundary point of the (rectangular) capacity region. These second-order rates are expressed in terms of the average probability of error and variances of some modified information densities which coincide with the dispersion of the (single-user) Gaussian channel. We thus conclude that the dispersions are unaffected by interference in this channel model. △ Less

Submitted 1 April, 2014; originally announced April 2014.

Comments: Submitted to Transactions on Information Theory

arXiv:1210.3512 [pdf, ps, other]

Digital Network Coding Aided Two-way Relaying: Energy Minimization and Queue Analysis

Authors: Zhi Chen, Teng Joon Lim, Mehul Motani

Abstract: In this paper, we consider a three node, two-way relay system with digital network coding over static channels where all link gains are assumed to be constant during transmission. The aim is to minimize total energy consumption while ensuring queue stability at all nodes, for a given pair of random packet arrival rates. Specifically, we allow for a set of transmission modes and solve for the optim… ▽ More In this paper, we consider a three node, two-way relay system with digital network coding over static channels where all link gains are assumed to be constant during transmission. The aim is to minimize total energy consumption while ensuring queue stability at all nodes, for a given pair of random packet arrival rates. Specifically, we allow for a set of transmission modes and solve for the optimal fraction of resources allocated to each mode, including multiaccess uplink transmission mode and network coding broadcasting mode. In addition, for the downlink, we find the condition to determine whether superposition coding with excess data over the better link and network coded data for both users is energy efficient and the corresponding optimization is formulated and solved. To tackle the queue evolution in this network, we present a detailed analysis of the queues at each node using a random scheduling method that closely approximates the theoretical design, through a two-dimensional Markov chain model. △ Less

Submitted 25 October, 2012; v1 submitted 12 October, 2012; originally announced October 2012.

arXiv:1207.3850 [pdf, other]

doi 10.1109/TIT.2012.2204489

On Capacity and Optimal Scheduling for the Half-Duplex Multiple-Relay Channel

Authors: Lawrence Ong, Mehul Motani, Sarah J. Johnson

Abstract: We study the half-duplex multiple-relay channel (HD-MRC) where every node can either transmit or listen but cannot do both at the same time. We obtain a capacity upper bound based on a max-flow min-cut argument and achievable transmission rates based on the decode-forward (DF) coding strategy, for both the discrete memoryless HD-MRC and the phase-fading HD-MRC. We discover that both the upper boun… ▽ More We study the half-duplex multiple-relay channel (HD-MRC) where every node can either transmit or listen but cannot do both at the same time. We obtain a capacity upper bound based on a max-flow min-cut argument and achievable transmission rates based on the decode-forward (DF) coding strategy, for both the discrete memoryless HD-MRC and the phase-fading HD-MRC. We discover that both the upper bound and the achievable rates are functions of the transmit/listen state (a description of which nodes transmit and which receive). More precisely, they are functions of the time fraction of the different states, which we term a schedule. We formulate the optimal scheduling problem to find an optimal schedule that maximizes the DF rate. The optimal scheduling problem turns out to be a maximin optimization, for which we propose an algorithmic solution. We demonstrate our approach on a four-node multiple-relay channel, obtaining closed-form solutions in certain scenarios. Furthermore, we show that for the received signal-to-noise ratio degraded phase-fading HD-MRC, the optimal scheduling problem can be simplified to a max optimization. △ Less

Submitted 16 July, 2012; originally announced July 2012.

Comments: Author's final version (to appear in IEEE Transactions on Information Theory)

Journal ref: IEEE Transactions on Information Theory, Vol. 58, No. 9, pp. 5770-5784, 2012

arXiv:1103.2240 [pdf, ps, other]

Price-Based Resource Allocation for Spectrum-Sharing Femtocell Networks: A Stackelberg Game Approach

Authors: Xin Kang, Rui Zhang, Mehul Motani

Abstract: This paper investigates the price-based resource allocation strategies for the uplink transmission of a spectrum-sharing femtocell network, in which a central macrocell is underlaid with distributed femtocells, all operating over the same frequency band as the macrocell. Assuming that the macrocell base station (MBS) protects itself by pricing the interference from the femtocell users, a Stackelbe… ▽ More This paper investigates the price-based resource allocation strategies for the uplink transmission of a spectrum-sharing femtocell network, in which a central macrocell is underlaid with distributed femtocells, all operating over the same frequency band as the macrocell. Assuming that the macrocell base station (MBS) protects itself by pricing the interference from the femtocell users, a Stackelberg game is formulated to study the joint utility maximization of the macrocell and the femtocells subject to a maximum tolerable interference power constraint at the MBS. Especially, two practical femtocell channel models: sparsely deployed scenario for rural areas and densely deployed scenario for urban areas, are investigated. For each scenario, two pricing schemes: uniform pricing and non-uniform pricing, are proposed. Then, the Stackelberg equilibriums for these proposed games are studied, and an effective distributed interference price bargaining algorithm with guaranteed convergence is proposed for the uniform-pricing case. Finally, numerical examples are presented to verify the proposed studies. It is shown that the proposed algorithms are effective in resource allocation and macrocell protection requiring minimal network overhead for spectrum-sharing-based two-tier femtocell networks. △ Less

Submitted 11 March, 2011; originally announced March 2011.

Comments: 27 pages, 7 figures, Submitted to JSAC

arXiv:0807.2328 [pdf, ps, other]

Avatar Mobility in Networked Virtual Environments: Measurements, Analysis, and Implications

Authors: Huiguang Liang, Ian Tay, Ming Feng Neo, Wei Tsang Ooi, Mehul Motani

Abstract: We collected mobility traces of 84,208 avatars spanning 22 regions over two months in Second Life, a popular networked virtual environment. We analyzed the traces to characterize the dynamics of the avatars mobility and behavior, both temporally and spatially. We discuss the implications of the our findings to the design of peer-to-peer networked virtual environments, interest management, mobili… ▽ More We collected mobility traces of 84,208 avatars spanning 22 regions over two months in Second Life, a popular networked virtual environment. We analyzed the traces to characterize the dynamics of the avatars mobility and behavior, both temporally and spatially. We discuss the implications of the our findings to the design of peer-to-peer networked virtual environments, interest management, mobility modeling of avatars, server load balancing and zone partitioning, client-side caching, and prefetching. △ Less

Submitted 15 July, 2008; originally announced July 2008.

ACM Class: H.5.1; C.2.4

arXiv:0806.4874 [pdf, ps, other]

doi 10.1109/TIT.2008.924675

Myopic Coding in Multiterminal Networks

Authors: Lawrence Ong, Mehul Motani

Abstract: This paper investigates the interplay between cooperation and achievable rates in multi-terminal networks. Cooperation refers to the process of nodes working together to relay data toward the destination. There is an inherent tradeoff between achievable information transmission rates and the level of cooperation, which is determined by how many nodes are involved and how the nodes encode/decode… ▽ More This paper investigates the interplay between cooperation and achievable rates in multi-terminal networks. Cooperation refers to the process of nodes working together to relay data toward the destination. There is an inherent tradeoff between achievable information transmission rates and the level of cooperation, which is determined by how many nodes are involved and how the nodes encode/decode the data. We illustrate this trade-off by studying information-theoretic decode-forward based coding strategies for data transmission in multi-terminal networks. Decode-forward strategies are usually discussed in the context of omniscient coding, in which all nodes in the network fully cooperate with each other, both in encoding and decoding. In this paper, we investigate myopic coding, in which each node cooperates with only a few neighboring nodes. We show that achievable rates of myopic decode-forward can be as large as that of omniscient decode-forward in the low SNR regime. We also show that when each node has only a few cooperating neighbors, adding one node into the cooperation increases the transmission rate significantly. Furthermore, we show that myopic decode-forward can achieve non-zero rates as the network size grows without bound. △ Less

Submitted 30 June, 2008; originally announced June 2008.

Journal ref: IEEE Transactions on Information Theory, Vol. 54, No. 7, pp. 3295-3314, Jul. 2008

arXiv:0704.2926 [pdf, ps, other]

doi 10.1109/ISIT.2007.4557364

Optimal Routing for the Gaussian Multiple-Relay Channel with Decode-and-Forward

Authors: Lawrence Ong, Mehul Motani

Abstract: In this paper, we study a routing problem on the Gaussian multiple relay channel, in which nodes employ a decode-and-forward coding strategy. We are interested in routes for the information flow through the relays that achieve the highest DF rate. We first construct an algorithm that provably finds optimal DF routes. As the algorithm runs in factorial time in the worst case, we propose a polynom… ▽ More In this paper, we study a routing problem on the Gaussian multiple relay channel, in which nodes employ a decode-and-forward coding strategy. We are interested in routes for the information flow through the relays that achieve the highest DF rate. We first construct an algorithm that provably finds optimal DF routes. As the algorithm runs in factorial time in the worst case, we propose a polynomial time heuristic algorithm that finds an optimal route with high probability. We demonstrate that that the optimal (and near optimal) DF routes are good in practice by simulating a distributed DF coding scheme using low density parity check codes with puncturing and incremental redundancy. △ Less

Submitted 23 April, 2007; originally announced April 2007.

Comments: Accepted and to be presented at the 2007 IEEE International Symposium on Information Theory (ISIT 2007), Acropolis Congress and Exhibition Center, Nice, France, June 24-29 2007

Journal ref: Proceedings of the 2007 IEEE International Symposium on Information Theory (ISIT 2007), Acropolis Congress and Exhibition Center, Nice, France, pp. 1061-1065, Jun. 24-29 2007.

Showing 1–50 of 57 results for author: Motani, M