
The infrastructure powering IBM's Gen AI model development

Authors: Talia Gershon, Seetharami Seelam, Brian Belgodere, Milton Bonilla, Lan Hoang, Danny Barnett, I-Hsin Chung, Apoorve Mohan, Ming-Hung Chen, Lixiang Luo, Robert Walkup, Constantinos Evangelinos, Shweta Salaria, Marc Dombrowa, Yoonho Park, Apo Kayi, Liran Schour, Alim Alim, Ali Sydney, Pavlos Maniotis, Laurent Schares, Bernard Metzler, Bengi Karacali-Akyamac, Sophia Wen, Tatsuhiro Chiba, et al. (121 additional authors not shown)

Abstract: AI Infrastructure plays a key role in the speed and cost-competitiveness of developing and deploying advanced AI models. The current demand for powerful AI infrastructure for model training is driven by the emergence of generative AI and foundational models, where on occasion thousands of GPUs must cooperate on a single training job for the model to be trained in a reasonable time. Delivering efficient and high-performing AI training requires an end-to-end solution that combines hardware, software and holistic telemetry to cater for multiple types of AI workloads. In this report, we describe IBM's hybrid cloud infrastructure that powers our generative AI model development. This infrastructure includes (1) Vela: an AI-optimized supercomputing capability directly integrated into the IBM Cloud, delivering scalable, dynamic, multi-tenant and geographically distributed infrastructure for large-scale model training and other AI workflow steps and (2) Blue Vela: a large-scale, purpose-built, on-premises hosting environment that is optimized to support our largest and most ambitious AI model training tasks. Vela provides IBM with the dual benefit of high performance for internal use along with the flexibility to adapt to an evolving commercial landscape. Blue Vela provides us with the benefits of rapid development of our largest and most ambitious models, as well as future-proofing against the evolving model landscape in the industry. Taken together, they provide IBM with the ability to rapidly innovate in the development of both AI models and commercial offerings.

Submitted 7 July, 2024; originally announced July 2024.

Comments: Corresponding Authors: Talia Gershon, Seetharami Seelam, Brian Belgodere, Milton Bonilla

arXiv:2406.12053 [pdf, other]

InternalInspector $I^2$: Robust Confidence Estimation in LLMs through Internal States

Authors: Mohammad Beigi, Ying Shen, Runing Yang, Zihao Lin, Qifan Wang, Ankith Mohan, Jianfeng He, Ming Jin, Chang-Tien Lu, Lifu Huang

Abstract: Despite their vast capabilities, Large Language Models (LLMs) often struggle with generating reliable outputs, frequently producing high-confidence inaccuracies known as hallucinations. Addressing this challenge, our research introduces InternalInspector, a novel framework designed to enhance confidence estimation in LLMs by leveraging contrastive learning on internal states including attention states, feed-forward states, and activation states of all layers. Unlike existing methods that primarily focus on the final activation state, InternalInspector conducts a comprehensive analysis across all internal states of every layer to accurately identify both correct and incorrect prediction processes. By benchmarking InternalInspector against existing confidence estimation methods across various natural language understanding and generation tasks, including factual question answering, commonsense reasoning, and reading comprehension, InternalInspector achieves significantly higher accuracy in aligning the estimated confidence scores with the correctness of the LLM's predictions and lower calibration error. Furthermore, InternalInspector excels at HaluEval, a hallucination detection benchmark, outperforming other internal-based confidence estimation methods in this task.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 8 pages

arXiv:2404.19075 [pdf, other]

Distributed Stochastic Optimization of a Neural Representation Network for Time-Space Tomography Reconstruction

Authors: K. Aditya Mohan, Massimiliano Ferrucci, Chuck Divin, Garrett A. Stevenson, Hyojin Kim

Abstract: 4D time-space reconstruction of dynamic events or deforming objects using X-ray computed tomography (CT) is an extremely ill-posed inverse problem. Existing approaches assume that the object remains static for the duration of several tens or hundreds of X-ray projection measurement images (reconstruction of consecutive limited-angle CT scans). However, this is an unrealistic assumption for many in-situ experiments that causes spurious artifacts and inaccurate morphological reconstructions of the object. To solve this problem, we propose to perform a 4D time-space reconstruction using a distributed implicit neural representation (DINR) network that is trained using a novel distributed stochastic training algorithm. Our DINR network learns to reconstruct the object at its output by iterative optimization of its network parameters such that the measured projection images best match the output of the CT forward measurement model. We use a continuous time and space forward measurement model that is a function of the DINR outputs at a sparsely sampled set of continuous valued object coordinates. Unlike existing state-of-the-art neural representation architectures that forward and back propagate through dense voxel grids that sample the object's entire time-space coordinates, we only propagate through the DINR at a small subset of object coordinates in each iteration resulting in an order-of-magnitude reduction in memory and compute for training. DINR leverages distributed computation across several compute nodes and GPUs to produce high-fidelity 4D time-space reconstructions even for extremely large CT data sizes. We use both simulated parallel-beam and experimental cone-beam X-ray CT datasets to demonstrate the superior performance of our approach.

Submitted 29 April, 2024; originally announced April 2024.

Comments: submitted to Nature Machine Intelligence

arXiv:2404.16268 [pdf, other]

Lacunarity Pooling Layers for Plant Image Classification using Texture Analysis

Authors: Akshatha Mohan, Joshua Peeples

Abstract: Pooling layers (e.g., max and average) may overlook important information encoded in the spatial arrangement of pixel intensity and/or feature values. We propose a novel lacunarity pooling layer that aims to capture the spatial heterogeneity of the feature maps by evaluating the variability within local windows. The layer operates at multiple scales, allowing the network to adaptively learn hierarchical features. The lacunarity pooling layer can be seamlessly integrated into any artificial neural network architecture. Experimental results demonstrate the layer's effectiveness in capturing intricate spatial patterns, leading to improved feature extraction capabilities. The proposed approach holds promise in various domains, especially in agricultural image analysis tasks. This work contributes to the evolving landscape of artificial neural network architectures by introducing a novel pooling layer that enriches the representation of spatial features. Our code is publicly available.

Submitted 6 July, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

Comments: 9 pages, 7 figures, accepted at 2024 IEEE/CVF Computer Vision and Pattern Recognition Vision for Agriculture Workshop

arXiv:2404.16053 [pdf, other]

Human Latency Conversational Turns for Spoken Avatar Systems

Authors: Derek Jacoby, Tianyi Zhang, Aanchan Mohan, Yvonne Coady

Abstract: A problem with many current Large Language Model (LLM) driven spoken dialogues is the response time. Some efforts such as Groq address this issue by lightning fast processing of the LLM, but we know from the cognitive psychology literature that in human-to-human dialogue often responses occur prior to the speaker completing their utterance. No amount of delay for LLM processing is acceptable if we wish to maintain human dialogue latencies. In this paper, we discuss methods for understanding an utterance in close to real time and generating a response so that the system can comply with human-level conversational turn delays. This means that the information content of the final part of the speaker's utterance is lost to the LLM. Using the Google NaturalQuestions (NQ) database, our results show GPT-4 can effectively fill in missing context from a dropped word at the end of a question over 60% of the time. We also provide some examples of utterances and the impacts of this information loss on the quality of LLM response in the context of an avatar that is currently under development. These results indicate that a simple classifier could be used to determine whether a question is semantically complete, or requires a filler phrase to allow a response to be generated within human dialogue time constraints.

Submitted 11 April, 2024; originally announced April 2024.

arXiv:2403.05530 [pdf, other]

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.

Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

arXiv:2402.09474 [pdf, other]

Deciphering Heartbeat Signatures: A Vision Transformer Approach to Explainable Atrial Fibrillation Detection from ECG Signals

Authors: Aruna Mohan, Danne Elbers, Or Zilbershot, Fatemeh Afghah, David Vorchheimer

Abstract: Remote patient monitoring based on wearable single-lead electrocardiogram (ECG) devices has significant potential for enabling the early detection of heart disease, especially in combination with artificial intelligence (AI) approaches for automated heart disease detection. There have been prior studies applying AI approaches based on deep learning for heart disease detection. However, these models are yet to be widely accepted as a reliable aid for clinical diagnostics, in part due to the current black-box perception surrounding many AI algorithms. In particular, there is a need to identify the key features of the ECG signal that contribute toward making an accurate diagnosis, thereby enhancing the interpretability of the model. In the present study, we develop a vision transformer approach to identify atrial fibrillation based on single-lead ECG data. A residual network (ResNet) approach is also developed for comparison with the vision transformer approach. These models are applied to the Chapman-Shaoxing dataset to classify atrial fibrillation, as well as another common arrhythmia, sinus bradycardia, and normal sinus rhythm heartbeats. The models enable the identification of the key regions of the heartbeat that determine the resulting classification, and highlight the importance of P-waves and T-waves, as well as heartbeat duration and signal amplitude, in distinguishing normal sinus rhythm from atrial fibrillation and sinus bradycardia.

Submitted 28 April, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

Comments: Accepted for publication at the 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, IEEE EMBC 2024

arXiv:2401.10298 [pdf, other]

Machine learning approach to detect dynamical states from recurrence measures

Authors: Dheeraja Thakur, Athul Mohan, G. Ambika, Chandrakala Meena

Abstract: We integrate machine learning approaches with nonlinear time series analysis, specifically utilizing recurrence measures to classify various dynamical states emerging from time series. We implement three machine learning algorithms (Logistic Regression, Random Forest, and Support Vector Machine) for this study. The input features are derived from the recurrence quantification of nonlinear time series and characteristic measures of the corresponding recurrence networks. For training and testing we generate synthetic data from standard nonlinear dynamical systems and evaluate the efficiency and performance of the machine learning algorithms in classifying time series into periodic, chaotic, hyper-chaotic, or noisy categories. Additionally, we explore the significance of input features in the classification scheme and find that the features quantifying the density of recurrence points are the most relevant. Furthermore, we illustrate how the trained algorithms can successfully predict the dynamical states of two variable stars, SX Her and AC Her, from the data of their light curves.

Submitted 20 March, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

arXiv:2312.16300 [pdf, other]

Unifying Static and Dynamic Intermediate Languages for Accelerator Generators

Authors: Caleb Kim, Pai Li, Anshuman Mohan, Andrew Butt, Adrian Sampson, Rachit Nigam

Abstract: Compilers for accelerator design languages (ADLs) translate high-level languages into application-specific hardware. ADL compilers rely on a hardware control interface to compose hardware units. There are two choices: static control, which relies on cycle-level timing; or dynamic control, which uses explicit signalling to avoid depending on timing details. Static control is efficient but brittle; dynamic control incurs hardware costs to support compositional reasoning. Piezo is an ADL compiler that unifies static and dynamic control in a single intermediate language (IL). Its key insight is that the IL's static fragment is a refinement of its dynamic fragment: static code admits a subset of the run-time behaviors of the dynamic equivalent. Piezo can optimize code by combining facts from static and dynamic submodules, and it opportunistically converts code from dynamic to static control styles. We implement Piezo as an extension to an existing dynamic ADL compiler, Calyx. We use Piezo to implement an MLIR frontend, a systolic array generator, and a packet-scheduling hardware generator to demonstrate its optimizations and the static-dynamic interactions it enables.

Submitted 26 December, 2023; originally announced December 2023.

Comments: 12 pages, 9 figures

arXiv:2312.01005 [pdf, other]

Generating Images of the M87* Black Hole Using GANs

Authors: Arya Mohan, Pavlos Protopapas, Keerthi Kunnumkai, Cecilia Garraffo, Lindy Blackburn, Koushik Chatterjee, Sheperd S. Doeleman, Razieh Emami, Christian M. Fromm, Yosuke Mizuno, Angelo Ricarte

Abstract: In this paper, we introduce a novel data augmentation methodology based on Conditional Progressive Generative Adversarial Networks (CPGAN) to generate diverse black hole (BH) images, accounting for variations in spin and electron temperature prescriptions. These generated images are valuable resources for training deep learning algorithms to accurately estimate black hole parameters from observational data. Our model can generate BH images for any spin value within the range of [-1, 1], given an electron temperature distribution. To validate the effectiveness of our approach, we employ a convolutional neural network to predict the BH spin using both the GRMHD images and the images generated by our proposed model. Our results demonstrate a significant performance improvement when training is conducted with the augmented dataset while testing is performed using GRMHD simulated data, as indicated by the high $R^2$ score. Consequently, we propose that GANs can be employed as cost effective models for black hole image generation and reliably augment training datasets for other parameterization algorithms.

Submitted 1 December, 2023; originally announced December 2023.

Comments: 11 pages, 7 figures. Accepted by Monthly Notices of the Royal Astronomical Society Journal

arXiv:2311.17801 [pdf, other]

Towards Efficient Hyperdimensional Computing Using Photonics

Authors: Farbin Fayza, Cansu Demirkiran, Hanning Chen, Che-Kai Liu, Avi Mohan, Hamza Errahmouni, Sanggeon Yun, Mohsen Imani, David Zhang, Darius Bunandar, Ajay Joshi

Abstract: Over the past few years, silicon photonics-based computing has emerged as a promising alternative to CMOS-based computing for Deep Neural Networks (DNN). Unfortunately, the non-linear operations and the high-precision requirements of DNNs make it extremely challenging to design efficient silicon photonics-based systems for DNN inference and training. Hyperdimensional Computing (HDC) is an emerging, brain-inspired machine learning technique that enjoys several advantages over existing DNNs, including being lightweight, requiring low-precision operands, and being robust to noise introduced by the nonidealities in the hardware. For HDC, computing in-memory (CiM) approaches have been widely used, as CiM reduces the data transfer cost if the operands can fit into the memory. However, inefficient multi-bit operations, high write latency, and low endurance make CiM ill-suited for HDC. On the other hand, the existing electro-photonic DNN accelerators are inefficient for HDC because they are specifically optimized for matrix multiplication in DNNs and consume a lot of power with high-precision data converters. In this paper, we argue that photonic computing and HDC complement each other better than photonic computing and DNNs, or CiM and HDC. We propose PhotoHDC, the first-ever electro-photonic accelerator for HDC training and inference, supporting the basic, record-based, and graph encoding schemes. Evaluating with popular datasets, we show that our accelerator can achieve two to five orders of magnitude lower EDP than the state-of-the-art electro-photonic DNN accelerators for implementing HDC training and inference. PhotoHDC also achieves four orders of magnitude lower energy-delay product than CiM-based accelerators for both HDC training and inference.

Submitted 29 November, 2023; originally announced November 2023.

arXiv:2306.16021 [pdf, other]

doi 10.1613/jair.1.15703

Structure in Deep Reinforcement Learning: A Survey and Open Problems

Authors: Aditya Mohan, Amy Zhang, Marius Lindauer

Abstract: Reinforcement Learning (RL), bolstered by the expressive capabilities of Deep Neural Networks (DNNs) for function approximation, has demonstrated considerable success in numerous applications. However, its practicality in addressing various real-world scenarios, characterized by diverse and unpredictable dynamics, noisy signals, and large state and action spaces, remains limited. This limitation stems from poor data efficiency, limited generalization capabilities, a lack of safety guarantees, and the absence of interpretability, among other factors. To overcome these challenges and improve performance across these crucial metrics, one promising avenue is to incorporate additional structural information about the problem into the RL learning process. Various sub-fields of RL have proposed methods for incorporating such inductive biases. We amalgamate these diverse methodologies under a unified framework, shedding light on the role of structure in the learning problem, and classify these methods into distinct patterns of incorporating structure. By leveraging this comprehensive framework, we provide valuable insights into the challenges of structured RL and lay the groundwork for a design pattern perspective on RL research. This novel perspective paves the way for future advancements and aids in developing more effective and efficient RL algorithms that can potentially handle real-world scenarios better.

Submitted 25 April, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

Comments: Published at the Journal of Artificial Intelligence Research, Volume 79, Pages 1167-1236

arXiv:2306.08107 [pdf, other]

AutoML in the Age of Large Language Models: Current Challenges, Future Opportunities and Risks

Authors: Alexander Tornede, Difan Deng, Theresa Eimer, Joseph Giovanelli, Aditya Mohan, Tim Ruhkopf, Sarah Segel, Daphne Theodorakopoulos, Tanja Tornede, Henning Wachsmuth, Marius Lindauer

Abstract: The fields of both Natural Language Processing (NLP) and Automated Machine Learning (AutoML) have achieved remarkable results over the past years. In NLP, especially Large Language Models (LLMs) have experienced a rapid series of breakthroughs very recently. We envision that the two fields can radically push the boundaries of each other through tight integration. To showcase this vision, we explore the potential of a symbiotic relationship between AutoML and LLMs, shedding light on how they can benefit each other. In particular, we investigate both the opportunities to enhance AutoML approaches with LLMs from different perspectives and the challenges of leveraging AutoML to further improve LLMs. To this end, we survey existing work, and we critically assess risks. We strongly believe that the integration of the two fields has the potential to disrupt both fields, NLP and AutoML. By highlighting conceivable synergies, but also risks, we aim to foster further exploration at the intersection of AutoML and LLMs.

Submitted 21 February, 2024; v1 submitted 13 June, 2023; originally announced June 2023.

Comments: Submitted and accepted at TMLR: https://openreview.net/forum?id=cAthubStyG

arXiv:2306.04037 [pdf, other]

doi 10.1109/IGARSS52108.2023.10281981

Quantitative Analysis of Primary Attribution Explainable Artificial Intelligence Methods for Remote Sensing Image Classification

Authors: Akshatha Mohan, Joshua Peeples

Abstract: We present a comprehensive analysis of quantitatively evaluating explainable artificial intelligence (XAI) techniques for remote sensing image classification. Our approach leverages state-of-the-art machine learning approaches to perform remote sensing image classification across multiple modalities. We investigate the results of the models qualitatively through XAI methods. Additionally, we compare the XAI methods quantitatively through various categories of desired properties. Through our analysis, we offer insights and recommendations for selecting the most appropriate XAI method(s) to gain a deeper understanding of the models' decision-making processes. The code for this work is publicly available.

Submitted 4 December, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

Comments: 4 pages, 3 figures, Accepted to 2023 IGARSS Community-Contributed Sessions - Opening the Black Box: Explainable AI/ML in Remote Sensing Analysis

arXiv:2305.10964 [pdf, other]

Learning Activation Functions for Sparse Neural Networks

Authors: Mohammad Loni, Aditya Mohan, Mehdi Asadi, Marius Lindauer

Abstract: Sparse Neural Networks (SNNs) can potentially demonstrate similar performance to their dense counterparts while saving significant energy and memory at inference. However, the accuracy drop incurred by SNNs, especially at high pruning ratios, can be an issue in critical deployment conditions. While recent works mitigate this issue through sophisticated pruning techniques, we shift our focus to an overlooked factor: hyperparameters and activation functions. Our analyses have shown that the accuracy drop can additionally be attributed to (i) Using ReLU as the default choice for activation functions unanimously, and (ii) Fine-tuning SNNs with the same hyperparameters as dense counterparts. Thus, we focus on learning a novel way to tune activation functions for sparse networks and combining these with a separate hyperparameter optimization (HPO) regime for sparse networks. By conducting experiments on popular DNN models (LeNet-5, VGG-16, ResNet-18, and EfficientNet-B0) trained on MNIST, CIFAR-10, and ImageNet-16 datasets, we show that the novel combination of these two approaches, dubbed Sparse Activation Function Search, short: SAFS, results in up to 15.53%, 8.88%, and 6.33% absolute improvement in the accuracy for LeNet-5, VGG-16, and ResNet-18 over the default training protocols, especially at high pruning ratios. Our code can be found at https://github.com/automl/SAFS

Submitted 5 June, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

arXiv:2304.02396 [pdf, other]

AutoRL Hyperparameter Landscapes

Authors: Aditya Mohan, Carolin Benjamins, Konrad Wienecke, Alexander Dockhorn, Marius Lindauer

Abstract: Although Reinforcement Learning (RL) has been shown to be capable of producing impressive results, its use is limited by the impact of its hyperparameters on performance. This often makes it difficult to achieve good results in practice. Automated RL (AutoRL) addresses this difficulty, yet little is known about the dynamics of the hyperparameter landscapes that hyperparameter optimization (HPO) methods traverse in search of optimal configurations. In view of existing AutoRL approaches dynamically adjusting hyperparameter configurations, we propose an approach to build and analyze these hyperparameter landscapes not just for one point in time but at multiple points in time throughout training. Addressing an important open question on the legitimacy of such dynamic AutoRL approaches, we provide thorough empirical evidence that the hyperparameter landscapes strongly vary over time across representative algorithms from RL literature (DQN, PPO, and SAC) in different kinds of environments (Cartpole, Bipedal Walker, and Hopper). This supports the theory that hyperparameters should be dynamically adjusted during training and shows the potential for more insights on AutoRL problems that can be gained through landscape analyses. Our code can be found at https://github.com/automl/AutoRL-Landscape

Submitted 5 June, 2023; v1 submitted 5 April, 2023; originally announced April 2023.

Comments: Version updated after acceptance

arXiv:2303.08232 [pdf, other]

Generating Humanoid Multi-Contact through Feasibility Visualization

Authors: Stephen McCrory, Sylvain Bertrand, Achintya Mohan, Duncan Calvert, Jerry Pratt, Robert Griffin

Abstract: We present a feasibility-driven teleoperation framework designed to generate humanoid multi-contact maneuvers for use in unstructured environments. Our framework is designed for motions with arbitrary contact modes and postures. The operator configures a pre-execution preview robot through contact points and kinematic tasks. A fast estimation of the preview robot's quasi-static feasibility is performed by checking contact stability and collisions along an interpolated trajectory. A visualization of Center of Mass (CoM) stability margin, based on friction and actuation constraints, is displayed and can be previewed if the operator chooses to add or remove contacts. Contact points can be placed anywhere on a mesh approximation of the robot surface, enabling motions with knee or forearm contacts. We demonstrate our approach in simulation and hardware on a NASA Valkyrie humanoid, focusing on multi-contact trajectories which are challenging to generate autonomously or through alternative teleoperation approaches.

Submitted 10 November, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

arXiv:2301.01415 [pdf, other]

doi 10.1016/j.nima.2023.168409

Machine Learning technique for isotopic determination of radioisotopes using HPGe $\gamma$-ray spectra

Authors: Ajeeta Khatiwada, Marc Klasky, Marcie Lombardi, Jason Matheny, Arvind Mohan

Abstract: $\gamma$-ray spectroscopy is a quantitative, non-destructive technique that may be utilized for the identification and quantitative isotopic estimation of radionuclides. Traditional methods of isotopic determination have various challenges that contribute to statistical and systematic uncertainties in the estimated isotopics. Furthermore, these methods typically require numerous pre-processing steps, and have only been rigorously tested in laboratory settings with limited shielding. In this work, we examine the application of a number of machine learning based regression algorithms as alternatives to conventional approaches for analyzing $\gamma$-ray spectroscopy data in the Emergency Response arena. This approach not only eliminates many steps in the analysis procedure, and therefore offers potential to reduce this source of systematic uncertainty, but is also shown to offer comparable performance to conventional approaches in the Emergency Response Application.

Submitted 3 January, 2023; originally announced January 2023.

arXiv:2212.00217 [pdf, other]

Physics-Constrained Generative Adversarial Networks for 3D Turbulence

Authors: Dima Tretiak, Arvind T. Mohan, Daniel Livescu

Abstract: Generative Adversarial Networks (GANs) have received wide acclaim among the machine learning (ML) community for their ability to generate realistic 2D images. ML is being applied more often to complex problems beyond those of computer vision. However, current frameworks often serve as black boxes and lack physics embeddings, leading to poor ability in enforcing constraints and unreliable models. In this work, we develop physics embeddings that can be stringently imposed, referred to as hard constraints, in the neural network architecture. We demonstrate their capability for 3D turbulence by embedding them in GANs, particularly to enforce the mass conservation constraint in incompressible fluid turbulence. In doing so, we also explore and contrast the effects of other methods of imposing physics constraints within the GANs framework, especially penalty-based physics constraints popular in literature. By using physics-informed diagnostics and statistics, we evaluate the strengths and weaknesses of our approach and demonstrate its feasibility.

Submitted 30 November, 2022; originally announced December 2022.

Report number: LA-UR-22-32475

arXiv:2211.12340 [pdf, other]

DOLCE: A Model-Based Probabilistic Diffusion Framework for Limited-Angle CT Reconstruction

Authors: Jiaming Liu, Rushil Anirudh, Jayaraman J. Thiagarajan, Stewart He, K. Aditya Mohan, Ulugbek S. Kamilov, Hyojin Kim

Abstract: Limited-Angle Computed Tomography (LACT) is a non-destructive evaluation technique used in a variety of applications ranging from security to medicine. The limited angle coverage in LACT is often a dominant source of severe artifacts in the reconstructed images, making it a challenging inverse problem. We present DOLCE, a new deep model-based framework for LACT that uses a conditional diffusion model as an image prior. Diffusion models are a recent class of deep generative models that are relatively easy to train due to their implementation as image denoisers. DOLCE can form high-quality images from severely under-sampled data by integrating data-consistency updates with the sampling updates of a diffusion model, which is conditioned on the transformed limited-angle data. We show through extensive experimentation on several challenging real LACT datasets that the same pre-trained DOLCE model achieves state-of-the-art (SOTA) performance on drastically different types of images. Additionally, we show that, unlike standard LACT reconstruction methods, DOLCE naturally enables the quantification of the reconstruction uncertainty by generating multiple samples consistent with the measured data.

Submitted 22 November, 2022; originally announced November 2022.

Comments: 29 pages, 21 figures

arXiv:2211.11659 [pdf, other]

Formal Abstractions for Packet Scheduling

Authors: Anshuman Mohan, Yunhe Liu, Nate Foster, Tobias Kappé, Dexter Kozen

Abstract: Early programming models for software-defined networking (SDN) focused on basic features for controlling network-wide forwarding paths, but more recent work has considered richer features, such as packet scheduling and queueing, that affect performance. In particular, PIFO trees, proposed by Sivaraman et al., offer a flexible and efficient primitive for programmable packet scheduling. Prior work has shown that PIFO trees can express a wide range of practical algorithms including strict priority, weighted fair queueing, and hierarchical schemes. However, the semantic properties of PIFO trees are not well understood. This paper studies PIFO trees from a programming language perspective. We formalize the syntax and semantics of PIFO trees in an operational model that decouples the scheduling policy running on a tree from the topology of the tree. Building on this formalization, we develop compilation algorithms that allow the behavior of a PIFO tree written against one topology to be realized using a tree with a different topology. Such a compiler could be used to optimize an implementation of PIFO trees, or realize a logical PIFO tree on a target with a fixed topology baked into the hardware. To support experimentation, we develop a software simulator for PIFO trees, and we present case studies illustrating its behavior on standard and custom algorithms.

Submitted 19 October, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

ACM Class: C.2.1; E.1

arXiv:2210.06950 [pdf, other]

doi 10.48550/arXiv.2210.06950

Interference-Managed Local Service Insertion for 5G Broadcast

Authors: M. V. Abhay Mohan, K. Giridhar

Abstract: Broadcast of localized TV content enables tailored content delivery catering to the requirements of a regional user base. 5G multicast-broadcast service (MBS) requires a spectrally efficient broadcast solution that enables the change of content from one local service area (LSA) to another. A frequency reuse factor of unity between two adjacent LSAs causes their boundary region to become saturated with co-channel interference (CCI). Increasing the reuse factor will reduce the CCI at the cost of degrading the spectral efficiency. This paper addresses the frequency and transmit power planning that manages the CCI at the LSA boundary to achieve a satisfactory trade-off between spectral efficiency and broadcast coverage.

Submitted 12 March, 2023; v1 submitted 1 September, 2022; originally announced October 2022.

Comments: Newer version of our unpublished work

arXiv:2207.09090 [pdf, other]

Actor-Critic based Improper Reinforcement Learning

Authors: Mohammadi Zaki, Avinash Mohan, Aditya Gopalan, Shie Mannor

Abstract: We consider an improper reinforcement learning setting where a learner is given $M$ base controllers for an unknown Markov decision process, and wishes to combine them optimally to produce a potentially new controller that can outperform each of the base ones. This can be useful in tuning across controllers, learnt possibly in mismatched or simulated environments, to obtain a good controller for a given target environment with relatively few trials. Towards this, we propose two algorithms: (1) a Policy Gradient-based approach; and (2) an algorithm that can switch between a simple Actor-Critic (AC) based scheme and a Natural Actor-Critic (NAC) scheme depending on the available information. Both algorithms operate over a class of improper mixtures of the given controllers. For the first case, we derive convergence rate guarantees assuming access to a gradient oracle. For the AC-based approach we provide convergence rate guarantees to a stationary point in the basic AC case and to a global optimum in the NAC case. Numerical results on (i) the standard control theoretic benchmark of stabilizing a cartpole; and (ii) a constrained queueing task show that our improper policy optimization algorithm can stabilize the system even when the base policies at its disposal are unstable.

Submitted 19 July, 2022; originally announced July 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2102.08201

arXiv:2206.03130 [pdf, other]

Towards Meta-learned Algorithm Selection using Implicit Fidelity Information

Authors: Aditya Mohan, Tim Ruhkopf, Marius Lindauer

Abstract: Automatically selecting the best performing algorithm for a given dataset or ranking multiple algorithms by their expected performance supports users in developing new machine learning applications. Most approaches for this problem rely on pre-computed dataset meta-features and landmarking performances to capture the salient topology of the datasets and those topologies that the algorithms attend to. Landmarking usually exploits cheap algorithms not necessarily in the pool of candidate algorithms to get inexpensive approximations of the topology. While somewhat indicative, hand-crafted dataset meta-features and landmarks are likely insufficient descriptors, strongly depending on the alignment of the topologies that the landmarks and the candidate algorithms search for. We propose IMFAS, a method to exploit multi-fidelity landmarking information directly from the candidate algorithms in the form of non-parametrically non-myopic meta-learned learning curves via LSTMs in a few-shot setting during testing. Using this mechanism, IMFAS jointly learns the topology of the datasets and the inductive biases of the candidate algorithms, without the need to expensively train them to convergence. Our approach produces informative landmarks, easily enriched by arbitrary meta-features at a low computational cost, capable of producing the desired ranking using cheaper fidelities. We additionally show that IMFAS is able to beat Successive Halving with at most 50% of the fidelity sequence during test time.

Submitted 13 July, 2022; v1 submitted 7 June, 2022; originally announced June 2022.

Comments: Camera-ready version

arXiv:2204.00096 [pdf, other]

Iterative Reconstruction of the Electron Density and Effective Atomic Number using a Non-Linear Forward Model

Authors: K. Aditya Mohan, Kyle M. Champley, Albert W. Reed, Steven M. Glenn, Harry E. Martz Jr

Abstract: For material identification, characterization, and quantification, it is useful to estimate system-independent material properties that do not depend on the detailed specifications of the X-ray computed tomography (CT) system such as spectral response. System independent rho-e and Z-e (SIRZ) refers to a suite of methods for estimating the system independent material properties of electron density, rho-e, and effective atomic number, Z-e, of an object scanned using dual-energy X-ray CT (DECT). The current state-of-the-art approach, SIRZ-2, makes certain approximations that lead to inaccurate estimates for large atomic numbered (Z-e) materials. In this paper, we present an extension, SIRZ-3, which iteratively reconstructs the unknown rho-e and Z-e while avoiding the limiting approximations made by SIRZ-2. Unlike SIRZ-2, this allows SIRZ-3 to accurately reconstruct rho-e and Z-e even at large values of Z-e. SIRZ-3 relies on the use of a new non-linear differentiable forward measurement model that expresses the DECT measurement data as a direct analytical function of rho-e and Z-e. Leveraging this new forward model, we use an iterative optimization algorithm to reconstruct (or solve for) rho-e and Z-e directly from the DECT data. Compared to SIRZ-2, we show that the magnitude of performance improvement using SIRZ-3 increases with increasing values for Z-e.

Submitted 31 March, 2022; originally announced April 2022.

arXiv:2202.04500 [pdf, other]

Contextualize Me -- The Case for Context in Reinforcement Learning

Authors: Carolin Benjamins, Theresa Eimer, Frederik Schubert, Aditya Mohan, Sebastian Döhler, André Biedenkapp, Bodo Rosenhahn, Frank Hutter, Marius Lindauer

Abstract: While Reinforcement Learning (RL) has made great strides towards solving increasingly complicated problems, many algorithms are still brittle to even slight environmental changes. Contextual Reinforcement Learning (cRL) provides a framework to model such changes in a principled manner, thereby enabling flexible, precise and interpretable task specification and generation. Our goal is to show how the framework of cRL contributes to improving zero-shot generalization in RL through meaningful benchmarks and structured reasoning about generalization tasks. We confirm the insight that optimal behavior in cRL requires context information, as in other related areas of partial observability. To empirically validate this in the cRL framework, we provide various context-extended versions of common RL environments. They are part of the first benchmark library, CARL, designed for generalization based on cRL extensions of popular benchmarks, which we propose as a testbed to further study general agents. We show that in the contextual setting, even simple RL environments become challenging - and that naive solutions are not enough to generalize across complex context spaces.

Submitted 2 June, 2023; v1 submitted 9 February, 2022; originally announced February 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2110.02102

arXiv:2110.13745 [pdf, other]

PARIS: Personalized Activity Recommendation for Improving Sleep Quality

Authors: Meghna Singh, Saksham Goel, Abhiraj Mohan, Jaideep Srivastava

Abstract: The quality of sleep has a deep impact on people's physical and mental health. People with insufficient sleep are more likely to report physical and mental distress, activity limitation, anxiety, and pain. Moreover, in the past few years, there has been an explosion of applications and devices for activity monitoring and health tracking. Signals collected from these wearable devices can be used to study and improve sleep quality. In this paper, we utilize the relationship between physical activity and sleep quality to find ways of helping people improve their sleep using machine learning techniques. People usually have several behavior modes that their bio-functions can be divided into. Performing time series clustering on activity data, we find cluster centers that would correlate to the most evident behavior modes for a specific subject. Activity recipes are then generated for good sleep quality for each behavior mode within each cluster. These activity recipes are supplied to an activity recommendation engine for suggesting a mix of relaxed to intense activities to subjects during their daily routines. The recommendations are further personalized based on the subjects' lifestyle constraints, i.e., their age, gender, body mass index (BMI), resting heart rate, etc., with the objective of the recommendation being the improvement of that night's quality of sleep. This would in turn serve a longer-term health objective, like lowering heart rate, improving the overall quality of sleep, etc.

Submitted 28 May, 2024; v1 submitted 26 October, 2021; originally announced October 2021.

Comments: 18 pages, 7 figures, Submitted to UMUAI: Special Issue on Recommender Systems for Health and Wellbeing, 2020

arXiv:2105.11213 [pdf, other]

A Low-Delay MAC for IoT Applications: Decentralized Optimal Scheduling of Queues without Explicit State Information Sharing

Authors: Avinash Mohan, Arpan Chattopadhyay, Shivam Vinayak Vatsa, Anurag Kumar

Abstract: We consider a system of several collocated nodes sharing a time slotted wireless channel, and seek a MAC (medium access control) that (i) provides low mean delay, (ii) has distributed control (i.e., there is no central scheduler), and (iii) does not require explicit exchange of state information or control signals. The design of such MAC protocols must keep in mind the need for contention access at light traffic, and scheduled access in heavy traffic, leading to the long-standing interest in hybrid, adaptive MACs. Working in the discrete time setting, for the distributed MAC design, we consider a practical information structure where each node has local information and some common information obtained from overhearing. In this setting, "ZMAC" is an existing protocol that is hybrid and adaptive. We approach the problem in two steps: (1) We show that it is sufficient for the policy to be "greedy" and "exhaustive". Limiting the policy to this class reduces the problem to obtaining a queue switching policy at queue emptiness instants. (2) Formulating the delay optimal scheduling as a POMDP (partially observed Markov decision process), we show that the optimal switching rule is Stochastic Largest Queue (SLQ). Using this theory as the basis, we then develop a practical distributed scheduler, QZMAC, which is also tunable. We implement QZMAC on standard off-the-shelf TelosB motes and also use simulations to compare QZMAC with the full-knowledge centralized scheduler, and with ZMAC. We use our implementation to study the impact of false detection while overhearing the common information, and the efficiency of QZMAC. Our simulation results show that the mean delay with QZMAC is close to that of the full-knowledge centralized scheduler.

Submitted 20 June, 2023; v1 submitted 24 May, 2021; originally announced May 2021.

Comments: 28 pages, 19 figures

arXiv:2105.09046 [pdf, other]

Music Generation using Three-layered LSTM

Authors: Vaishali Ingale, Anush Mohan, Divit Adlakha, Krishan Kumar, Mohit Gupta

Abstract: This paper explores the idea of utilising Long Short-Term Memory neural networks (LSTMNN) for the generation of musical sequences in ABC notation. The proposed approach takes ABC notations from the Nottingham dataset and encodes them to be fed as input to the neural networks. The primary objective is to feed the neural networks an arbitrary note and let the network process and augment a sequence based on that note until a good piece of music is produced. Multiple calibrations have been done to amend the parameters of the network for optimal generation. The output is assessed on the basis of rhythm, harmony, and grammar accuracy.

Submitted 9 June, 2021; v1 submitted 19 May, 2021; originally announced May 2021.

arXiv:2105.00210 [pdf, other]

Better than the Best: Gradient-based Improper Reinforcement Learning for Network Scheduling

Authors: Mohammani Zaki, Avi Mohan, Aditya Gopalan, Shie Mannor

Abstract: We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay. Modern communication systems are becoming increasingly complex, and are required to handle multiple types of traffic with widely varying characteristics such as arrival rates and service times. This, coupled with the need for rapid network deployment, renders a bottom-up approach of first characterizing the traffic and then devising an appropriate scheduling protocol infeasible. In contrast, we formulate a top-down approach to scheduling where, given an unknown network and a set of scheduling policies, we use a policy gradient based reinforcement learning algorithm that produces a scheduler that performs better than the available atomic policies. We derive convergence results and analyze finite time performance of the algorithm. Simulation results show that the algorithm performs well even when the arrival rates are nonstationary and can stabilize the system even when the constituent policies are unstable.

Submitted 1 May, 2021; originally announced May 2021.

Comments: 4 pages, 5 figures, RLNQ workshop at the SIGMETRICS 2021

arXiv:2104.05405 [pdf, other]

Additive Tridiagonal Codes over $\mathbb{F}_{4}$

Authors: N. Annamalai, Anandhu Mohan, C. Durairajan

Abstract: In this paper, we introduce additive Tridiagonal and Double-Tridiagonal codes over $\mathbb{F}_4$ and then study the properties of these codes. Also, we find the number of additive Tridiagonal codes over $\mathbb{F}_4$. Finally, we study the applications of Double-Tridiagonal codes to a secret sharing scheme based on matrix projection.

Submitted 25 March, 2021; originally announced April 2021.

MSC Class: 94B05; 94A62; 94A15

arXiv:2102.08201 [pdf, other]

Improper Reinforcement Learning with Gradient-based Policy Optimization

Authors: Mohammadi Zaki, Avinash Mohan, Aditya Gopalan, Shie Mannor

Abstract: We consider an improper reinforcement learning setting where a learner is given $M$ base controllers for an unknown Markov decision process, and wishes to combine them optimally to produce a potentially new controller that can outperform each of the base ones. This can be useful in tuning across controllers, learnt possibly in mismatched or simulated environments, to obtain a good controller for a given target environment with relatively few trials. We propose a gradient-based approach that operates over a class of improper mixtures of the controllers. We derive convergence rate guarantees for the approach assuming access to a gradient oracle. The value function of the mixture and its gradient may not be available in closed-form; however, we show that we can employ rollouts and simultaneous perturbation stochastic approximation (SPSA) for explicit gradient descent optimization. Numerical results on (i) the standard control theoretic benchmark of stabilizing an inverted pendulum and (ii) a constrained queueing task show that our improper policy optimization algorithm can stabilize the system even when the base policies at its disposal are unstable. (Under review. Please do not distribute.)

Submitted 3 July, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

arXiv:2012.12310 [pdf, other]

doi 10.1109/JBHI.2021.3099745

Mixture Model Framework for Traumatic Brain Injury Prognosis Using Heterogeneous Clinical and Outcome Data

Authors: Alan D. Kaplan, Qi Cheng, K. Aditya Mohan, Lindsay D. Nelson, Sonia Jain, Harvey Levin, Abel Torres-Espin, Austin Chou, J. Russell Huie, Adam R. Ferguson, Michael McCrea, Joseph Giacino, Shivshankar Sundaram, Amy J. Markowitz, Geoffrey T. Manley

Abstract: Prognoses of Traumatic Brain Injury (TBI) outcomes are neither easily nor accurately determined from clinical indicators. This is due in part to the heterogeneity of damage inflicted to the brain, ultimately resulting in diverse and complex outcomes. Using a data-driven approach on many distinct data elements may be necessary to describe this large set of outcomes and thereby robustly depict the nuanced differences among TBI patients' recovery. In this work, we develop a method for modeling large heterogeneous data types relevant to TBI. Our approach is geared toward the probabilistic representation of mixed continuous and discrete variables with missing values. The model is trained on a dataset encompassing a variety of data types, including demographics, blood-based biomarkers, and imaging findings. In addition, it includes a set of clinical outcome assessments at 3, 6, and 12 months post-injury. The model is used to stratify patients into distinct groups in an unsupervised learning setting. We use the model to infer outcomes using input data, and show that the collection of input data reduces uncertainty of outcomes over a baseline approach. In addition, we quantify the performance of a likelihood scoring technique that can be used to self-evaluate the extrapolation risk of prognosis on unseen patients.

Submitted 20 July, 2021; v1 submitted 22 December, 2020; originally announced December 2020.

Comments: 12 pages, 5 figures

arXiv:2012.02955 [pdf, other]

Implementing QZMAC (a Decentralized Delay Optimal MAC) over 6TiSCH under the Contiki OS in an IEEE 802.15.4 Network

Authors: Shivam Vinayak Vatsa, Avi Mohan, Anurag Kumar

Abstract: Motivated by the emerging delay-sensitive applications of the Internet of Things (IoT), there has been a resurgence of interest in developing medium access control (MAC) protocols in a time-slotted framework. The resource-constrained, ad-hoc nature of wireless networks typical of the IoT also forces the amount of control information exchanged across the network -- required to make scheduling decisions -- to a minimum. In a previous article we proposed a protocol called QZMAC that (i) provides provably low mean delay, (ii) has distributed control (i.e., there is no central scheduler), and (iii) does not require explicit exchange of state information or control signals. In the present article, we implement and demonstrate the performance of QZMAC on a test bed consisting of CC2420 based Crossbow telosB motes, running the 6TiSCH communication stack on the Contiki operating system over the 2.4GHz ISM band. QZMAC achieves its near-optimal delay performance using a clever combination of polling and contention modes. We demonstrate the polling and the contention modes of QZMAC separately. We use an Adaptive Synchronization Technique in our implementation which we also demonstrate. Our network shows good delay performance even in the presence of heavy interference from ambient WiFi networks.

Submitted 5 December, 2020; originally announced December 2020.

Comments: 4 pages, 3 figures, Comsnets 2021 (submitted)

arXiv:2011.10549 [pdf, other]

Graph Signal Recovery Using Restricted Boltzmann Machines

Authors: Ankith Mohan, Aiichiro Nakano, Emilio Ferrara

Abstract: We propose a model-agnostic pipeline to recover graph signals from an expert system by exploiting the content addressable memory property of restricted Boltzmann machines and the representational ability of a neural network. The proposed pipeline requires a deep neural network that is trained on a downward machine learning task with clean data, that is, data free from any form of corruption or incompletion. We show that denoising the representations learned by the deep neural networks is usually more effective than denoising the data itself. Although this pipeline can deal with noise in any dataset, it is particularly effective for graph-structured datasets.

Submitted 20 November, 2020; originally announced November 2020.

Comments: Paper: 27 pages, 9 figures. Appendix: 5 pages, 12 figures. Submitted to Expert Systems with Applications

arXiv:2010.15987 [pdf, other]

doi 10.1109/JBHI.2021.3124733

AutoAtlas: Neural Network for 3D Unsupervised Partitioning and Representation Learning

Authors: K. Aditya Mohan, Alan D. Kaplan

Abstract: We present a novel neural network architecture called AutoAtlas for fully unsupervised partitioning and representation learning of 3D brain Magnetic Resonance Imaging (MRI) volumes. AutoAtlas consists of two neural network components: one neural network to perform multi-label partitioning based on local texture in the volume, and a second neural network to compress the information contained within each partition. We train both of these components simultaneously by optimizing a loss function that is designed to promote accurate reconstruction of each partition, while encouraging spatially smooth and contiguous partitioning, and discouraging relatively small partitions. We show that the partitions adapt to the subject specific structural variations of brain tissue while consistently appearing at similar spatial locations across subjects. AutoAtlas also produces very low dimensional features that represent local texture of each partition. We demonstrate prediction of metadata associated with each subject using the derived feature representations and compare the results to prediction using features derived from FreeSurfer anatomical parcellation. Since our features are intrinsically linked to distinct partitions, we can then map values of interest, such as partition-specific feature importance scores onto the brain for visualization.

Submitted 11 November, 2021; v1 submitted 29 October, 2020; originally announced October 2020.

Comments: IEEE Journal of Biomedical and Health Informatics

arXiv:2010.15668 [pdf]

PeopleXploit -- A hybrid tool to collect public data

Authors: Arjun Anand V, Buvanasri A K, Meenakshi R, Dr. Karthika S, Ashok Kumar Mohan

Abstract: This paper introduces the concept of Open Source Intelligence (OSINT) as an important application in intelligent profiling of individuals. With a variety of tools available, significant data can be obtained on an individual by analyzing his/her internet presence, but all of this comes at the cost of low relevance. To increase the relevance score in profiling, PeopleXploit is being introduced. PeopleXploit is a hybrid tool which helps in collecting publicly available information that is reliable and relevant to the given input. This tool is used to track and trace the given target through their digital footprints such as Name, Email, Phone Number, User IDs etc.; the tool scans and searches other associated data from publicly available records on the internet and creates a summary report on the target. PeopleXploit profiles a person using authorship analysis and finds the best matching guess. Also, the type of analysis performed (professional/matrimonial/criminal entity) varies with the requirement of the user.

Submitted 28 October, 2020; originally announced October 2020.

Comments: 8 pages, 3 images, ICCCSP 2020

arXiv:2010.01499 [pdf]

A New Mask R-CNN Based Method for Improved Landslide Detection

Authors: Silvia Liberata Ullo, Amrita Mohan, Alessandro Sebastianelli, Shaik Ejaz Ahamed, Basant Kumar, Ramji Dwivedi, G. R. Sinha

Abstract: This paper presents a novel method of landslide detection by exploiting the Mask R-CNN capability of identifying an object layout by using a pixel-based segmentation, along with transfer learning used to train the proposed model. A data set of 160 elements is created containing landslide and non-landslide images. The proposed method consists of three steps: (i) augmenting training image samples to increase the volume of the training data, (ii) fine tuning with limited image samples, and (iii) performance evaluation of the algorithm in terms of precision, recall and F1 measure, on the considered landslide images, by adopting ResNet-50 and 101 as backbone models. The experimental results are quite encouraging, as the proposed method achieves a precision of 1.00, a recall of 0.93 and an F1 measure of 0.97 when ResNet-101 is used as the backbone model, with a low number of landslide photographs used as training samples. The proposed algorithm can be potentially useful for land use planners and policy makers of hilly areas where intermittent slope deformations necessitate landslide detection as a prerequisite before planning.

Submitted 4 October, 2020; originally announced October 2020.

Comments: 9 pages, 8 figures, 6 tables, submitted to JSTARS special issue on Cultural Heritage

arXiv:2009.10990 [pdf, other]

Accurate and Interpretable Machine Learning for Transparent Pricing of Health Insurance Plans

Authors: Rohun Kshirsagar, Li-Yen Hsu, Vatshank Chaturvedi, Charles H. Greenberg, Matthew McClelland, Anushadevi Mohan, Wideet Shende, Nicolas P. Tilmans, Renzo Frigato, Min Guo, Ankit Chheda, Meredith Trotter, Shonket Ray, Arnold Lee, Miguel Alvarado

Abstract: Health insurance companies cover half of the United States population through commercial employer-sponsored health plans and pay 1.2 trillion US dollars every year to cover medical expenses for their members. The actuary and underwriter roles at a health insurance company serve to assess which risks to take on and how to price those risks to ensure profitability of the organization. While Bayesian hierarchical models are the current standard in the industry to estimate risk, interest in machine learning as a way to improve upon these existing methods is increasing. Lumiata, a healthcare analytics company, ran a study with a large health insurance company in the United States. We evaluated the ability of machine learning models to predict the per member per month cost of employer groups in their next renewal period, especially those groups who will cost less than 95% of what an actuarial model predicts (groups with "concession opportunities"). We developed a sequence of two models, an individual patient-level and an employer-group-level model, to predict the annual per member per month allowed amount for employer groups, based on a population of 14 million patients. Our models performed 20% better than the insurance carrier's existing pricing model, and identified 84% of the concession opportunities. This study demonstrates the application of a machine learning system to compute an accurate and fair price for health insurance products and analyzes how explainable machine learning models can exceed actuarial models' predictive accuracy while maintaining interpretability.

Submitted 27 February, 2021; v1 submitted 23 September, 2020; originally announced September 2020.

Comments: Accepted for publication in The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21), in the Innovative Applications of Artificial Intelligence track. This is the extended version with some stylistic fixes from the first posting and complete author list

arXiv:2006.07562 [pdf, other]

Explicit Best Arm Identification in Linear Bandits Using No-Regret Learners

Authors: Mohammadi Zaki, Avi Mohan, Aditya Gopalan

Abstract: We study the problem of best arm identification in linearly parameterised multi-armed bandits. Given a set of feature vectors $\mathcal{X}\subset\mathbb{R}^d,$ a confidence parameter $δ$ and an unknown vector $θ^*,$ the goal is to identify $\arg\max_{x\in\mathcal{X}}x^Tθ^*$, with probability at least $1-δ,$ using noisy measurements of the form $x^Tθ^*.$ For this fixed confidence ($δ$-PAC) setting, we propose an explicitly implementable and provably order-optimal sample-complexity algorithm to solve this problem. Previous approaches rely on access to minimax optimization oracles. The algorithm, which we call the \textit{Phased Elimination Linear Exploration Game} (PELEG), maintains a high-probability confidence ellipsoid containing $θ^*$ in each round and uses it to eliminate suboptimal arms in phases. PELEG achieves fast shrinkage of this confidence ellipsoid along the most confusing (i.e., close to, but not optimal) directions by interpreting the problem as a two player zero-sum game, and sequentially converging to its saddle point using low-regret learners to compute players' strategies in each round. We analyze the sample complexity of PELEG and show that it matches, up to order, an instance-dependent lower bound on sample complexity in the linear bandit setting. We also provide numerical results for the proposed algorithm consistent with its theoretical guarantees.

Submitted 13 June, 2020; originally announced June 2020.

arXiv:2005.04790 [pdf, other]

The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes

Authors: Douwe Kiela, Hamed Firooz, Aravind Mohan, Vedanuj Goswami, Amanpreet Singh, Pratik Ringshia, Davide Testuggine

Abstract: This work proposes a new challenge set for multimodal classification, focusing on detecting hate speech in multimodal memes. It is constructed such that unimodal models struggle and only multimodal models can succeed: difficult examples ("benign confounders") are added to the dataset to make it hard to rely on unimodal signals. The task requires subtle reasoning, yet is straightforward to evaluate as a binary classification problem. We provide baseline performance numbers for unimodal models, as well as for multimodal models with various degrees of sophistication. We find that state-of-the-art methods perform poorly compared to humans (64.73% vs. 84.7% accuracy), illustrating the difficulty of the task and highlighting the challenge that this important problem poses to the community.

Submitted 7 April, 2021; v1 submitted 10 May, 2020; originally announced May 2020.

Comments: NeurIPS 2020

arXiv:2004.01221 [pdf, other]

Towards Relevance and Sequence Modeling in Language Recognition

Authors: Bharat Padi, Anand Mohan, Sriram Ganapathy

Abstract: The task of automatic language identification (LID) involving multiple dialects of the same language family in the presence of noise is a challenging problem. In these scenarios, the identity of the language/dialect may be reliably present only in parts of the temporal sequence of the speech signal. The conventional approaches to LID (and for speaker recognition) ignore the sequence information by extracting a long-term statistical summary of the recording, assuming independence of the feature frames. In this paper, we propose a neural network framework utilizing short-sequence information in language recognition. In particular, a new model is proposed for incorporating relevance in language recognition, where parts of speech data are weighted more based on their relevance for the language recognition task. This relevance weighting is achieved using the bidirectional long short-term memory (BLSTM) network with attention modeling. We explore two approaches: the first uses segment level i-vector/x-vector representations that are aggregated in the neural model, and the second models the acoustic features directly in an end-to-end neural model. Experiments are performed using the language recognition task in NIST LRE 2017 Challenge using clean, noisy and multi-speaker speech data as well as in the RATS language recognition corpus. In these experiments on noisy LRE tasks as well as the RATS dataset, the proposed approach yields significant improvements over the conventional i-vector/x-vector based language recognition approaches as well as over other previous models incorporating sequence information.

Submitted 2 April, 2020; originally announced April 2020.

Comments: https://github.com/iiscleap/lre-relevance-weighting Accepted to IEEE Transactions on Audio, Speech and Language Processing

arXiv:2002.08141 [pdf, other]

doi 10.1109/TNET.2020.2976923

Throughput Optimal Decentralized Scheduling with Single-bit State Feedback for a Class of Queueing Systems

Authors: Avinash Mohan, Aditya Gopalan, Anurag Kumar

Abstract: Motivated by medium access control for resource-challenged wireless Internet of Things (IoT), we consider the problem of queue scheduling with reduced queue state information. In particular, we consider a time-slotted scheduling model with $N$ sensor nodes, with pair-wise dependence, such that Nodes $i$ and $i + 1,~0 < i < N$ cannot transmit together. We develop new throughput-optimal scheduling policies requiring only the empty-nonempty state of each queue that we term Queue Nonemptiness-Based (QNB) policies. We propose a Policy Splicing technique to combine scheduling policies for small networks in order to construct throughput-optimal policies for larger networks, some of which also aim for low delay. For $N = 3,$ there exists a sum-queue length optimal QNB scheduling policy. We show, however, that for $N > 4,$ there is no QNB policy that is sum-queue length optimal over all arrival rate vectors in the capacity region. We then extend our results to a more general class of interference constraints that we call cluster-of-cliques (CoC) conflict graphs. We consider two types of CoC networks, namely, Linear Arrays of Cliques (LAoC) and Star-of-Cliques (SoC) networks. We develop QNB policies for these classes of networks, study their stability and delay properties, and propose and analyze techniques to reduce the amount of state information to be disseminated across the network for scheduling. In the SoC setting, we propose a throughput-optimal policy that only uses information that nodes in the network can glean by sensing activity (or lack thereof) on the channel. Our throughput-optimality results rely on two new arguments: a Lyapunov drift lemma specially adapted to policies that are queue length-agnostic, and a priority queueing analysis for showing strong stability.

Submitted 19 February, 2020; originally announced February 2020.

Comments: 53 pages, 18 figures, IEEE/ACM Transactions on Networking

Journal ref: IEEE/ACM Transactions on Networking, April 2020

arXiv:1911.01695 [pdf, other]

Towards Optimal and Efficient Best Arm Identification in Linear Bandits

Authors: Mohammadi Zaki, Avinash Mohan, Aditya Gopalan

Abstract: We give a new algorithm for best arm identification in linearly parameterised bandits in the fixed confidence setting. The algorithm generalises the well-known LUCB algorithm of Kalyanakrishnan et al. (2012) by playing an arm which minimises a suitable notion of geometric overlap of the statistical confidence set for the unknown parameter, and is fully adaptive and computationally efficient as compared to several state-of-the-art methods. We theoretically analyse the sample complexity of the algorithm for problems with two and three arms, showing optimality in many cases. Numerical results indicate favourable performance over other algorithms with which we compare.

Submitted 7 November, 2019; v1 submitted 5 November, 2019; originally announced November 2019.

arXiv:1910.05375 [pdf, other]

Extreme Few-view CT Reconstruction using Deep Inference

Authors: Hyojin Kim, Rushil Anirudh, K. Aditya Mohan, Kyle Champley

Abstract: Reconstruction of few-view x-ray Computed Tomography (CT) data is a highly ill-posed problem. It is often used in applications that require low radiation dose in clinical CT, rapid industrial scanning, or fixed-gantry CT. Existing analytic or iterative algorithms generally produce poorly reconstructed images, severely deteriorated by artifacts and noise, especially when the number of x-ray projections is considerably low. This paper presents a deep network-driven approach to address extreme few-view CT by incorporating convolutional neural network-based inference into state-of-the-art iterative reconstruction. The proposed method interprets few-view sinogram data using attention-based deep networks to infer the reconstructed image. The predicted image is then used as prior knowledge in the iterative algorithm for final reconstruction. We demonstrate effectiveness of the proposed approach by performing reconstruction experiments on a chest CT dataset.

Submitted 11 October, 2019; originally announced October 2019.

Comments: Deep Inverse NeurIPS 2019 Workshop

arXiv:1910.01634 [pdf, other]

Improving Limited Angle CT Reconstruction with a Robust GAN Prior

Authors: Rushil Anirudh, Hyojin Kim, Jayaraman J. Thiagarajan, K. Aditya Mohan, Kyle M. Champley

Abstract: Limited angle CT reconstruction is an under-determined linear inverse problem that requires appropriate regularization techniques to be solved. In this work we study how pre-trained generative adversarial networks (GANs) can be used to clean noisy, highly artifact-laden reconstructions from conventional techniques, by effectively projecting onto the inferred image manifold. In particular, we use a robust version of the popularly used GAN prior for inverse problems, based on a recent technique called corruption mimicking, that significantly improves the reconstruction quality. The proposed approach operates in the image space directly, as a result of which it does not need to be trained or require access to the measurement model, is scanner agnostic, and can work over a wide range of sensing scenarios.

Submitted 29 January, 2020; v1 submitted 3 October, 2019; originally announced October 2019.

Comments: NeurIPS 2019 Workshop on Deep Inverse Problems

arXiv:1908.08920 [pdf, other]

doi 10.1038/s41560-020-0644-3

Automation is no barrier to light vehicle electrification

Authors: Aniruddh Mohan, Shashank Sripad, Parth Vaishnav, Venkatasubramanian Viswanathan

Abstract: Weight, computational load, sensor load, and possibly higher drag may increase the energy use of automated electric vehicles (AEVs) relative to human-driven electric vehicles (EVs), although this increase may be offset by smoother driving. We use a vehicle dynamics model to show that automation is likely to impose a minor penalty on EV range and have negligible effect on battery longevity. As such, while some commentators have suggested that the power and energy requirements of automation mean that the first automated vehicles (AVs) will be gas-electric hybrids, we conclude that this need not be the case. We also find that drivers need to place only a modest value on the time saved by automation for its benefits to exceed direct costs.

Submitted 6 February, 2020; v1 submitted 7 August, 2019; originally announced August 2019.

Comments: 25 pages, 4 figures, 14 pages of Supporting Information

Journal ref: Nature Energy, (2020). Direct access link: https://rdcu.be/b5iR6

arXiv:1907.07627 [pdf, other]

A Secure Cloud with Minimal Provider Trust

Authors: Amin Mosayyebzadeh, Gerardo Ravago, Apoorve Mohan, Ali Raza, Sahil Tikale, Nabil Schear, Trammell Hudson, Jason Hennessey, Naved Ansari, Kyle Hogan, Charles Munson, Larry Rudolph, Gene Cooperman, Peter Desnoyers, Orran Krieger

Abstract: Bolted is a new architecture for a bare metal cloud with the goal of providing security-sensitive customers of a cloud the same level of security and control that they can obtain in their own private data centers. It allows tenants to elastically allocate secure resources within a cloud while being protected from other previous, current, and future tenants of the cloud. The provisioning of a new server to a tenant isolates a bare metal server, only allowing it to communicate with other tenant's servers once its critical firmware and software have been attested to the tenant. Tenants, rather than the provider, control the tradeoffs between security, price, and performance. A prototype demonstrates scalable end-to-end security with small overhead compared to a less secure alternative.

Submitted 13 July, 2019; originally announced July 2019.

Comments: 7 Pages, 10th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud '18). arXiv admin note: text overlap with arXiv:1907.06110

arXiv:1907.06110 [pdf, other]

Supporting Security Sensitive Tenants in a Bare-Metal Cloud

Authors: Amin Mosayyebzadeh, Apoorve Mohan, Sahil Tikale, Mania Abdi, Nabil Schear, Charles Munson, Trammell Hudson, Larry Rudolph, Gene Cooperman, Peter Desnoyers, Orran Krieger

Abstract: Bolted is a new architecture for bare-metal clouds that enables tenants to control tradeoffs between security, price, and performance. Security-sensitive tenants can minimize their trust in the public cloud provider and achieve similar levels of security and control that they can obtain in their own private data centers. At the same time, Bolted neither imposes overhead on tenants that are security insensitive nor compromises the flexibility or operational efficiency of the provider. Our prototype exploits a novel provisioning system and specialized firmware to enable elasticity similar to virtualized clouds. Experimentally we quantify the cost of different levels of security for a variety of workloads and demonstrate the value of giving control to the tenant.

Submitted 13 July, 2019; originally announced July 2019.

Comments: 16 Pages, 2019 USENIX Annual Technical Conference (ATC'19)

arXiv:1901.06347 [pdf, other]

doi 10.1109/MMUL.2018.2890255

Cloud Resource Optimization for Processing Multiple Streams of Visual Data

Authors: Zohar Kapach, Andrew Ulmer, Daniel Merrick, Arshad Alikhan, Yung-Hsiang Lu, Anup Mohan, Ahmed S. Kaseb, George K. Thiruvathukal

Abstract: Hundreds of millions of network cameras have been installed throughout the world. Each is capable of providing a vast amount of real-time data. Analyzing the massive data generated by these cameras requires significant computational resources and the demands may vary over time. Cloud computing shows the most promise to provide the needed resources on demand. In this article, we investigate how to allocate cloud resources when analyzing real-time data streams from network cameras. A resource manager considers many factors that affect its decisions, including the types of analysis, the number of data streams, and the locations of the cameras. The manager then selects the most cost-efficient types of cloud instances (e.g. CPU vs. GPGPU) to meet the computational demands for analyzing streams. We evaluate the effectiveness of our approach using Amazon Web Services. Experiments demonstrate more than 50% cost reduction for real workloads.

Submitted 18 January, 2019; originally announced January 2019.

Comments: IEEE MultiMedia Magazine

Showing 1–50 of 57 results for author: Mohan, A