ColBERT-XM: A Modular Multi-Vector Representation Model for Zero-Shot Multilingual Information Retrieval

Authors: Antoine Louis, Vageesh Saxena, Gijs van Dijck, Gerasimos Spanakis

Abstract: State-of-the-art neural retrievers predominantly focus on high-resource languages like English, which impedes their adoption in retrieval scenarios involving other languages. Current approaches circumvent the lack of high-quality labeled data in non-English languages by leveraging multilingual pretrained language models capable of cross-lingual transfer. However, these models require substantial task-specific fine-tuning across multiple languages, often perform poorly in languages with minimal representation in the pretraining corpus, and struggle to incorporate new languages after the pretraining phase. In this work, we present a novel modular dense retrieval model that learns from the rich data of a single high-resource language and effectively zero-shot transfers to a wide array of languages, thereby eliminating the need for language-specific labeled data. Our model, ColBERT-XM, demonstrates competitive performance against existing state-of-the-art multilingual retrievers trained on more extensive datasets in various languages. Further analysis reveals that our modular approach is highly data-efficient, effectively adapts to out-of-distribution data, and significantly reduces energy consumption and carbon emissions. By demonstrating its proficiency in zero-shot scenarios, ColBERT-XM marks a shift towards more sustainable and inclusive retrieval systems, enabling effective information accessibility in numerous languages. We publicly release our code and models for the community.

Submitted 22 February, 2024; originally announced February 2024.

Comments: Under review. Code is available at https://github.com/ant-louis/xm-retrievers

arXiv:2401.08987 [pdf, other]

The Quantum Cryptography Approach: Unleashing the Potential of Quantum Key Reconciliation Protocol for Secure Communication

Authors: Neha Sharma, Vikas Saxena

Abstract: Quantum cryptography is the study of delivering secret communications across a quantum channel. Recently, Quantum Key Distribution (QKD) has been recognized as the most important breakthrough in quantum cryptography. This process enables two distant parties to share secure communications based on physical laws. The BB84 protocol was developed in 1984 and remains the most widely used among BB92, Ekert91, COW, and SARG04 protocols. However, the practical security of QKD with imperfect devices has been widely discussed, and there are many ways to guarantee that the key generated by QKD still provides unconditional security. This paper proposes a novel method that allows users to communicate while generating the secure keys as well as securing the transmission without any leakage of the data. In this approach, the sender never reveals her basis, hence neither the receiver nor the intruder gains knowledge of the fundamental basis. Further, to detect Eve, polynomial interpolation is also used as a key verification technique. In order to fully utilize the quantum computing capabilities provided by IBM quantum computers, the protocol is executed using the Qiskit backend for 45 qubits. This article discusses a plot of % error against alpha (strength of eavesdropping). As a result, different types of noise have been included, and the success probability of the desired key bits has been determined. Furthermore, the success probability under depolarizing noise is explained for different qubit counts. Last but not least, even when the applied noise is increased to maximum capacity, a 50% probability of successful key generation is still observed in an experiment.

Submitted 17 January, 2024; originally announced January 2024.

arXiv:2311.01419 [pdf, other]

Constrained-Context Conditional Diffusion Models for Imitation Learning

Authors: Vaibhav Saxena, Yotto Koga, Danfei Xu

Abstract: Offline Imitation Learning (IL) is a powerful paradigm to learn visuomotor skills, especially for high-precision manipulation tasks. However, IL methods are prone to spurious correlation - expressive models may focus on distractors that are irrelevant to action prediction - and are thus fragile in real-world deployment. Prior methods have addressed this challenge by exploring different model architectures and action representations. However, none were able to balance between sample efficiency, robustness against distractors, and solving high-precision manipulation tasks with complex action space. To this end, we present $\textbf{C}$onstrained-$\textbf{C}$ontext $\textbf{C}$onditional $\textbf{D}$iffusion $\textbf{M}$odel (C3DM), a diffusion model policy for solving 6-DoF robotic manipulation tasks with high precision and ability to ignore distractions. A key component of C3DM is a fixation step that helps the action denoiser to focus on task-relevant regions around the predicted action while ignoring distractors in the context. We empirically show that C3DM is able to consistently achieve high success rate on a wide array of tasks, ranging from table top manipulation to industrial kitting, that require varying levels of precision and robustness to distractors. For details, please visit https://sites.google.com/view/c3dm-imitation-learning

Submitted 2 November, 2023; originally announced November 2023.

arXiv:2310.05484 [pdf, other]

IDTraffickers: An Authorship Attribution Dataset to link and connect Potential Human-Trafficking Operations on Text Escort Advertisements

Authors: Vageesh Saxena, Benjamin Bashpole, Gijs Van Dijck, Gerasimos Spanakis

Abstract: Human trafficking (HT) is a pervasive global issue affecting vulnerable individuals, violating their fundamental human rights. Investigations reveal that a significant number of HT cases are associated with online advertisements (ads), particularly in escort markets. Consequently, identifying and connecting HT vendors has become increasingly challenging for Law Enforcement Agencies (LEAs). To address this issue, we introduce IDTraffickers, an extensive dataset consisting of 87,595 text ads and 5,244 vendor labels to enable the verification and identification of potential HT vendors on online escort markets. To establish a benchmark for authorship identification, we train a DeCLUTR-small model, achieving a macro-F1 score of 0.8656 in a closed-set classification environment. Next, we leverage the style representations extracted from the trained classifier to conduct authorship verification, resulting in a mean r-precision score of 0.8852 in an open-set ranking environment. Finally, to encourage further research and ensure responsible data sharing, we plan to release IDTraffickers for the authorship attribution task to researchers under specific conditions, considering the sensitive nature of the data. We believe that the availability of our dataset and benchmarks will empower future researchers to utilize our findings, thereby facilitating the effective linkage of escort ads and the development of more robust approaches for identifying HT indicators.

Submitted 9 October, 2023; originally announced October 2023.

arXiv:2305.17252 [pdf, other]

Generalizable Pose Estimation Using Implicit Scene Representations

Authors: Vaibhav Saxena, Kamal Rahimi Malekshan, Linh Tran, Yotto Koga

Abstract: 6-DoF pose estimation is an essential component of robotic manipulation pipelines. However, it usually suffers from a lack of generalization to new instances and object types. Most widely used methods learn to infer the object pose in a discriminative setup where the model filters useful information to infer the exact pose of the object. While such methods offer accurate poses, the model does not store enough information to generalize to new objects. In this work, we address the generalization capability of pose estimation using models that contain enough information about the object to render it in different poses. We follow the line of work that inverts neural renderers to infer the pose. We propose i-$σ$SRN to maximize the information flowing from the input pose to the rendered scene and invert them to infer the pose given an input image. Specifically, we extend Scene Representation Networks (SRNs) by incorporating a separate network for density estimation and introduce a new way of obtaining a weighted scene representation. We investigate several ways of initial pose estimates and losses for the neural renderer. Our final evaluation shows a significant improvement in inference performance and speed compared to existing approaches.

Submitted 26 May, 2023; originally announced May 2023.

arXiv:2305.02763 [pdf, other]

doi 10.18653/v1/2023.acl-long.481

VendorLink: An NLP approach for Identifying & Linking Vendor Migrants & Potential Aliases on Darknet Markets

Authors: Vageesh Saxena, Nils Rethmeier, Gijs Van Dijck, Gerasimos Spanakis

Abstract: The anonymity on the Darknet allows vendors to stay undetected by using multiple vendor aliases or frequently migrating between markets. Consequently, illegal markets and their connections are challenging to uncover on the Darknet. To identify relationships between illegal markets and their vendors, we propose VendorLink, an NLP-based approach that examines writing patterns to verify, identify, and link unique vendor accounts across text advertisements (ads) on seven public Darknet markets. In contrast to existing literature, VendorLink utilizes the strength of supervised pre-training to perform closed-set vendor verification, open-set vendor identification, and low-resource market adaption tasks. Through VendorLink, we uncover (i) 15 migrants and 71 potential aliases in the Alphabay-Dreams-Silk dataset, (ii) 17 migrants and 3 potential aliases in the Valhalla-Berlusconi dataset, and (iii) 75 migrants and 10 potential aliases in the Traderoute-Agora dataset. Altogether, our approach can help Law Enforcement Agencies (LEA) make more informed decisions by verifying and identifying migrating vendors and their potential aliases on existing and Low-Resource (LR) emerging Darknet markets.

Submitted 4 May, 2023; originally announced May 2023.

arXiv:2110.02737 [pdf, ps, other]

Analysis of Trade-offs in RF Photonic Links based on Multi-Bias Tuning of Silicon Photonic Ring-Assisted Mach Zehnder Modulators

Authors: Md Jubayer Shawon, Vishal Saxena

Abstract: Recent progress in silicon-based photonic integrated circuits (PICs) has opened new avenues for analog circuit designers to explore hybrid integration of photonics with CMOS ICs. Traditionally, optoelectronic systems are designed using discrete optics and electronics. Silicon photonic (SiP) platforms provide the opportunity to realize these systems in a compact chip-scale form factor and alleviate long-standing challenges in optoelectronics. In this work, we analyze multi-bias tuning in the Ring-Assisted Mach Zehnder Modulator (RAMZM) and the resulting trade-offs in analog RF photonic links realized using RAMZMs. Multi-bias tuning in the rings and the Mach-Zehnder arms allows informed trade-offs between link noise figure and linearity. We derive performance metrics including gain, noise figure, and linearity metrics associated with tuning of multiple bias settings in RAMZM-based links and present the resulting design optimization. Compared to MZM, an improvement of 18 dB/Hz$^{\frac{2}{3}}$ in SFDR is noted when RAMZM is linearized. We also propose a biasing scheme for RAMZM that provides 6x improvement in slope efficiency, or equivalently, 15.56 dB in power gain over MZMs (single drive) while still providing similar SFDR performance ($\sim$ 109 dB/Hz$^{\frac{2}{3}}$) as MZMs. Moreover, a method to improve gain in photodiode saturation limited links is presented and studied.

Submitted 27 September, 2021; originally announced October 2021.

Comments: 11 pages, 21 figures, Updated version of this work with more experimental results will be published in other relevant journals

arXiv:2107.06570 [pdf, other]

doi 10.1109/PIMRC50174.2021.9569581

QoS-Aware Scheduling in New Radio Using Deep Reinforcement Learning

Authors: Jakob Stigenberg, Vidit Saxena, Soma Tayamon, Euhanna Ghadimi

Abstract: Fifth-generation (5G) New Radio (NR) cellular networks support a wide range of new services, many of which require an application-specific quality of service (QoS), e.g. in terms of a guaranteed minimum bit-rate or a maximum tolerable delay. Therefore, scheduling multiple parallel data flows, each serving a unique application instance, is bound to become an even more challenging task compared to the previous generations. Leveraging recent advances in deep reinforcement learning, in this paper, we propose a QoS-Aware Deep Reinforcement learning Agent (QADRA) scheduler for NR networks. In contrast to state-of-the-art scheduling heuristics, the QADRA scheduler explicitly optimizes for the QoS satisfaction rate while simultaneously maximizing the network performance. Moreover, we train our algorithm end-to-end on these objectives. We evaluate QADRA in a full scale, near-product, system level NR simulator and demonstrate a significant boost in network performance. In our particular evaluation scenario, the QADRA scheduler improves network throughput by 30% while simultaneously maintaining the QoS satisfaction rate of VoIP users served by the network, compared to state-of-the-art baselines.

Submitted 14 July, 2021; originally announced July 2021.

arXiv:2102.09532 [pdf, other]

Clockwork Variational Autoencoders

Authors: Vaibhav Saxena, Jimmy Ba, Danijar Hafner

Abstract: Deep learning has enabled algorithms to generate realistic images. However, accurately predicting long video sequences requires understanding long-term dependencies and remains an open challenge. While existing video prediction models succeed at generating sharp images, they tend to fail at accurately predicting far into the future. We introduce the Clockwork VAE (CW-VAE), a video prediction model that leverages a hierarchy of latent sequences, where higher levels tick at slower intervals. We demonstrate the benefits of both hierarchical latents and temporal abstraction on 4 diverse video prediction datasets with sequences of up to 1000 frames, where CW-VAE outperforms top video prediction models. Additionally, we propose a Minecraft benchmark for long-term video prediction. We conduct several experiments to gain insights into CW-VAE and confirm that slower levels learn to represent objects that change more slowly in the video, and faster levels learn to represent faster objects.

Submitted 20 February, 2021; v1 submitted 18 February, 2021; originally announced February 2021.

Comments: 17 pages, 12 figures, 4 tables

arXiv:2101.06665 [pdf, other]

Brightening the Optical Flow through Posit Arithmetic

Authors: Vinay Saxena, Ankitha Reddy, Jonathan Neudorfer, John Gustafson, Sangeeth Nambiar, Rainer Leupers, Farhad Merchant

Abstract: As new technologies are invented, their commercial viability needs to be carefully examined along with their technical merits and demerits. The posit data format, proposed as a drop-in replacement for IEEE 754 float format, is one such invention that requires extensive theoretical and experimental study to identify products that can benefit from the advantages of posits for specific market segments. In this paper, we present an extensive empirical study of posit-based arithmetic vis-à-vis IEEE 754 compliant arithmetic for the optical flow estimation method called Lucas-Kanade (LuKa). First, we use SoftPosit and SoftFloat format emulators to perform an empirical error analysis of the LuKa method. Our study shows that the average error in LuKa with SoftPosit is an order of magnitude lower than LuKa with SoftFloat. We then present the integration of the hardware implementation of a posit adder and multiplier in a RISC-V open-source platform. We make several recommendations, along with the analysis of LuKa in the RISC-V context, for future generation platforms incorporating posit arithmetic units.

Submitted 17 January, 2021; originally announced January 2021.

Comments: To appear in ISQED 2021

arXiv:2010.08651 [pdf, other]

Reinforcement Learning for Efficient and Tuning-Free Link Adaptation

Authors: Vidit Saxena, Hugo Tullberg, Joakim Jaldén

Abstract: Wireless links adapt the data transmission parameters to the dynamic channel state -- this is called link adaptation. Classical link adaptation relies on tuning parameters that are challenging to configure for optimal link performance. Recently, reinforcement learning has been proposed to automate link adaptation, where the transmission parameters are modeled as discrete arms of a multi-armed bandit. In this context, we propose a latent learning model for link adaptation that exploits the correlation between data transmission parameters. Further, motivated by the recent success of Thompson sampling for multi-armed bandit problems, we propose a latent Thompson sampling (LTS) algorithm that quickly learns the optimal parameters for a given channel state. We extend LTS to fading wireless channels through a tuning-free mechanism that automatically tracks the channel dynamics. In numerical evaluations with fading wireless channels, LTS improves the link throughput by up to 100% compared to the state-of-the-art link adaptation algorithms.

Submitted 4 May, 2021; v1 submitted 16 October, 2020; originally announced October 2020.

Comments: 30 pages, 4 figures

arXiv:2006.13878 [pdf, other]

Effective Elastic Scaling of Deep Learning Workloads

Authors: Vaibhav Saxena, K. R. Jayaram, Saurav Basu, Yogish Sabharwal, Ashish Verma

Abstract: The increased use of deep learning (DL) in academia, government and industry has, in turn, led to the popularity of on-premise and cloud-hosted deep learning platforms, whose goals are to enable organizations to utilize expensive resources effectively, and to share said resources among multiple teams in a fair and effective manner. In this paper, we examine the elastic scaling of Deep Learning (DL) jobs over large-scale training platforms and propose a novel resource allocation strategy for DL training jobs, resulting in improved job run time performance as well as increased cluster utilization. We begin by analyzing DL workloads and exploit the fact that DL jobs can be run with a range of batch sizes without affecting their final accuracy. We formulate an optimization problem that explores a dynamic batch size allocation to individual DL jobs based on their scaling efficiency, when running on multiple nodes. We design a fast dynamic programming based optimizer to solve this problem in real-time to determine jobs that can be scaled up/down, and use this optimizer in an autoscaler to dynamically change the allocated resources and batch sizes of individual DL jobs. We demonstrate empirically that our elastic scaling algorithm can complete up to $\approx 2 \times$ as many jobs as compared to a strong baseline algorithm that also scales the number of GPUs but does not change the batch size. We also demonstrate that the average completion time with our algorithm is up to $\approx 10 \times$ faster than that of the baseline.

Submitted 24 June, 2020; originally announced June 2020.

arXiv:2005.04167 [pdf, ps, other]

doi 10.1145/3407197.3407213

Continuous Learning in a Single-Incremental-Task Scenario with Spike Features

Authors: Ruthvik Vaila, John Chiasson, Vishal Saxena

Abstract: Deep Neural Networks (DNNs) have two key deficiencies: their dependence on high precision computing and their inability to perform sequential learning, that is, when a DNN is trained on a first task and the same DNN is trained on the next task, it forgets the first task. This phenomenon of forgetting previous tasks is also referred to as catastrophic forgetting. On the other hand, a mammalian brain outperforms DNNs in terms of energy efficiency and the ability to learn sequentially without catastrophically forgetting. Here, we use bio-inspired Spike Timing Dependent Plasticity (STDP) in the feature extraction layers of the network with instantaneous neurons to extract meaningful features. In the classification sections of the network we use a modified synaptic intelligence that we refer to as cost per synapse metric as a regularizer to immunize the network against catastrophic forgetting in a Single-Incremental-Task scenario (SIT). In this study, we use the MNIST handwritten digits dataset that was divided into five sub-tasks.

Submitted 3 May, 2020; originally announced May 2020.

Comments: Submitted to ICONS 2020

Journal ref: International Conference on Neuromorphic Systems 2020

arXiv:2004.09258 [pdf, other]

Thompson Sampling for Linearly Constrained Bandits

Authors: Vidit Saxena, Joseph E. Gonzalez, Joakim Jaldén

Abstract: We address multi-armed bandits (MAB) where the objective is to maximize the cumulative reward under a probabilistic linear constraint. For a few real-world instances of this problem, constrained extensions of the well-known Thompson Sampling (TS) heuristic have recently been proposed. However, finite-time analysis of constrained TS is challenging; as a result, only $O(\sqrt{T})$ bounds on the cumulative reward loss (i.e., the regret) are available. In this paper, we describe LinConTS, a TS-based algorithm for bandits that place a linear constraint on the probability of earning a reward in every round. We show that for LinConTS, the regret as well as the cumulative constraint violations are upper bounded by $O(\log T)$ for the suboptimal arms. We develop a proof technique that relies on careful analysis of the dual problem and combine it with recent theoretical work on unconstrained TS. Through numerical experiments on two real-world datasets, we demonstrate that LinConTS outperforms an asymptotically optimal upper confidence bound (UCB) scheme in terms of simultaneously minimizing the regret and the violation.

Submitted 12 May, 2020; v1 submitted 20 April, 2020; originally announced April 2020.

Comments: 10 pages, 2 figures, updated version of paper accepted at AISTATS2020

arXiv:2002.11843 [pdf, ps, other]

A Deep Unsupervised Feature Learning Spiking Neural Network with Binarized Classification Layers for EMNIST Classification using SpykeFlow

Authors: Ruthvik Vaila, John Chiasson, Vishal Saxena

Abstract: End user AI is trained on large server farms with data collected from the users. With ever increasing demand for IoT devices, there is a need for deep learning approaches that can be implemented (at the edge) in an energy efficient manner. In this work we approach this using spiking neural networks. The unsupervised learning technique of spike timing dependent plasticity (STDP) using binary activations is used to extract features from spiking input data. Gradient descent (backpropagation) is used only on the output layer to perform the training for classification. The accuracies obtained for the balanced EMNIST data set compare favorably with other approaches. The effect of stochastic gradient descent (SGD) approximations on the learning capabilities of our network is also explored.

Submitted 28 October, 2020; v1 submitted 26 February, 2020; originally announced February 2020.

Comments: A section of this work is submitted to IEEE TETCI 2020 Journal

arXiv:1912.00982 [pdf, other]

TX-Ray: Quantifying and Explaining Model-Knowledge Transfer in (Un-)Supervised NLP

Authors: Nils Rethmeier, Vageesh Kumar Saxena, Isabelle Augenstein

Abstract: While state-of-the-art NLP explainability (XAI) methods focus on explaining per-sample decisions in supervised end or probing tasks, this is insufficient to explain and quantify model knowledge transfer during (un-)supervised training. Thus, for TX-Ray, we modify the established computer vision explainability principle of 'visualizing preferred inputs of neurons' to make it usable for transfer analysis and NLP. This allows one to analyze, track and quantify how self- or supervised NLP models first build knowledge abstractions in pretraining (1), and then transfer these abstractions to a new domain (2), or adapt them during supervised fine-tuning (3). TX-Ray expresses neurons as feature preference distributions to quantify fine-grained knowledge transfer or adaptation and guide human analysis. We find that, similar to Lottery Ticket based pruning, TX-Ray based pruning can improve test set generalization and that it can reveal how early stages of self-supervision automatically learn linguistic abstractions like parts-of-speech.

Submitted 19 June, 2020; v1 submitted 2 December, 2019; originally announced December 2019.

arXiv:1906.11902 [pdf, other]

doi 10.1145/3372278.3390694

PredNet and Predictive Coding: A Critical Review

Authors: Roshan Rane, Edit Szügyi, Vageesh Saxena, André Ofner, Sebastian Stober

Abstract: PredNet, a deep predictive coding network developed by Lotter et al., combines a biologically inspired architecture based on the propagation of prediction error with self-supervised representation learning in video. While the architecture has drawn a lot of attention and various extensions of the model exist, there is a lack of a critical analysis. We fill in the gap by evaluating PredNet both as an implementation of the predictive coding theory and as a self-supervised video prediction model using a challenging video action classification dataset. We design an extended model to test if conditioning future frame predictions on the action class of the video improves the model performance. We show that PredNet does not yet completely follow the principles of predictive coding. The proposed top-down conditioning leads to a performance gain on synthetic data, but does not scale up to the more complex real-world action classification dataset. Our analysis is aimed at guiding future research on similar architectures based on the predictive coding theory.

Submitted 18 May, 2020; v1 submitted 14 June, 2019; originally announced June 2019.

arXiv:1903.12272 [pdf, other]

Deep Convolutional Spiking Neural Networks for Image Classification

Authors: Ruthvik Vaila, John Chiasson, Vishal Saxena

Abstract: Spiking neural networks are biologically plausible counterparts of artificial neural networks. Artificial neural networks are usually trained with stochastic gradient descent, whereas spiking neural networks are trained with spike timing dependent plasticity. Training deep convolutional neural networks is a memory and power intensive job. Spiking networks could potentially help in reducing the power usage. There is a large pool of tools for one to choose from to train artificial neural networks of any size. On the other hand, all the available tools to simulate spiking neural networks are geared towards computational neuroscience applications and are not suitable for real life applications. In this work we focus on implementing a spiking CNN using Tensorflow to examine the behaviour of the network and empirically study the effect of various parameters on learning capabilities. We also study catastrophic forgetting in the spiking CNN and the weight initialization problem in R-STDP using the MNIST and N-MNIST data sets.

Submitted 25 September, 2019; v1 submitted 28 March, 2019; originally announced March 2019.

arXiv:1903.03234 [pdf, other]

Dyna-AIL : Adversarial Imitation Learning by Planning

Authors: Vaibhav Saxena, Srinivasan Sivanandan, Pulkit Mathur

Abstract: Adversarial methods for imitation learning have been shown to perform well on various control tasks. However, they require a large number of environment interactions for convergence. In this paper, we propose an end-to-end differentiable adversarial imitation learning algorithm in a Dyna-like framework for switching between model-based planning and model-free learning from expert data. Our results on both discrete and continuous environments show that our approach of using model-based planning along with model-free learning converges to an optimal policy with fewer environment interactions in comparison to the state-of-the-art learning methods.

Submitted 7 March, 2019; originally announced March 2019.

Comments: 8 pages, 6 figures, pre-print

arXiv:1902.11102 [pdf, other]

Constrained Thompson Sampling for Wireless Link Optimization

Authors: Vidit Saxena, Joseph E. Gonzalez, Ion Stoica, Hugo Tullberg, Joakim Jaldén

Abstract: Wireless communication systems operate in complex time-varying environments. Therefore, selecting the optimal configuration parameters in these systems is a challenging problem. For wireless links, rate selection is used to select the optimal data transmission rate that maximizes the link throughput subject to an application-defined latency constraint. We model rate selection as a stochastic multi-armed bandit (MAB) problem, where a finite set of transmission rates are modeled as independent bandit arms. For this setup, we propose Con-TS, a novel constrained version of the Thompson sampling algorithm, where the latency requirement is modeled by a high-probability linear constraint. We show that for Con-TS, the expected number of constraint violations over $T$ transmission intervals is upper bounded by $O(\sqrt{KT})$, where $K$ is the number of available rates. Further, the expected loss in cumulative throughput compared to the optimal rate selection scheme (i.e., the regret) is also upper bounded by $O(\sqrt{KT \log K})$. Through numerical simulations, we demonstrate that Con-TS significantly outperforms state-of-the-art bandit schemes for rate selection.

Submitted 18 April, 2020; v1 submitted 28 February, 2019; originally announced February 2019.

Comments: 11 pages, 2 figures. Revised version containing theoretical performance bounds

arXiv:1802.02342 [pdf, other]

Energy-Efficient CMOS Memristive Synapses for Mixed-Signal Neuromorphic System-on-a-Chip

Authors: Vishal Saxena, Xinyu Wu, Kehan Zhu

Abstract: Emerging non-volatile memory (NVM), or memristive, devices promise energy-efficient realization of deep learning, when efficiently integrated with mixed-signal integrated circuits on a CMOS substrate. Even though several algorithmic challenges need to be addressed to turn the vision of memristive Neuromorphic Systems-on-a-Chip (NeuSoCs) into reality, issues at the device and circuit interface need immediate attention from the community. In this work, we perform energy-estimation of a NeuSoC system and predict the desirable circuit and device parameters for energy-efficiency optimization. Also, CMOS synapse circuits based on the concept of CMOS memristor emulator are presented as a system prototyping methodology, while practical memristor devices are being developed and integrated with general-purpose CMOS. The proposed mixed-signal memristive synapse can be designed and fabricated using standard CMOS technologies and opens doors to interesting applications in cognitive computing circuits.

Submitted 20 April, 2018; v1 submitted 7 February, 2018; originally announced February 2018.

Comments: This is a preprint of proceedings in IEEE International Symposium on Circuits and Systems (ISCAS), May 2018. Copyright 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information

arXiv:1801.02797 [pdf]

doi 10.1109/TNANO.2018.2871680

Dendritic-Inspired Processing Enables Bio-Plausible STDP in Compound Binary Synapses

Authors: Xinyu Wu, Vishal Saxena

Abstract: Brain-inspired learning mechanisms, e.g. spike timing dependent plasticity (STDP), enable agile and fast on-the-fly adaptation capability in a spiking neural network. When incorporating emerging nanoscale resistive non-volatile memory (NVM) devices, with ultra-low power consumption and high-density integration capability, a spiking neural network hardware would result in several orders of magnitude reduction in energy consumption at a very small form factor and potentially herald autonomous learning machines. However, actual memory devices have been shown to be intrinsically binary with stochastic switching, and thus impede the realization of ideal STDP with continuous analog values. In this work, a dendritic-inspired processing architecture is proposed in addition to novel CMOS neuron circuits. The utilization of spike attenuations and delays transforms the traditionally undesired stochastic behavior of binary NVMs into a useful leverage that enables biologically-plausible STDP learning. As a result, this work paves a pathway to adopt practical binary emerging NVM devices in brain-inspired neuromorphic computing.

Submitted 9 January, 2018; originally announced January 2018.

arXiv:1711.06819 [pdf, other]

A Compact CMOS Memristor Emulator Circuit and its Applications

Authors: Vishal Saxena

Abstract: Conceptual memristors have recently gathered wider interest due to their diverse application in non-von Neumann computing, machine learning, neuromorphic computing, and chaotic circuits. We introduce a compact CMOS circuit that emulates idealized memristor characteristics and can bridge the gap between concepts and chip-scale realization by transcending device challenges. The CMOS memristor circuit embodies a two-terminal variable resistor whose resistance is controlled by the voltage applied across its terminals. The memristor 'state' is held in a capacitor that controls the resistor value. This work presents the design and simulation of the memristor emulation circuit, and applies it to a memcomputing application of maze solving using analog parallelism. Furthermore, the memristor emulator circuit can be designed and fabricated using standard commercial CMOS technologies and opens doors to interesting applications in neuromorphic and machine learning circuits.

Submitted 18 November, 2017; originally announced November 2017.

Comments: Submitted to International Symposium of Circuits and Systems (ISCAS) 2018

arXiv:1711.00705 [pdf, other]

Efficient Training of Convolutional Neural Nets on Large Distributed Systems

Authors: Sameer Kumar, Dheeraj Sreedhar, Vaibhav Saxena, Yogish Sabharwal, Ashish Verma

Abstract: Deep Neural Networks (DNNs) have achieved impressive accuracy in many application domains including image classification. Training of DNNs is an extremely compute-intensive process and is solved using variants of the stochastic gradient descent (SGD) algorithm. A lot of recent research has focussed on improving the performance of DNN training. In this paper, we present optimization techniques to improve the performance of the data parallel synchronous SGD algorithm using the Torch framework: (i) we maintain data in-memory to avoid file I/O overheads, (ii) we present a multi-color based MPI Allreduce algorithm to minimize communication overheads, and (iii) we propose optimizations to the Torch data parallel table framework that handles multi-threading. We evaluate the performance of our optimizations on a Power 8 Minsky cluster with 32 nodes and 128 NVidia Pascal P100 GPUs. With our optimizations, we are able to train 90 epochs of the ResNet-50 model on the Imagenet-1k dataset using 256 GPUs in just 48 minutes. This significantly improves on the previously best known performance of training 90 epochs of the ResNet-50 model on the same dataset using 256 GPUs in 65 minutes. To the best of our knowledge, this is the best known training performance demonstrated for the Imagenet-1k dataset.

Submitted 2 November, 2017; originally announced November 2017.

arXiv:1708.02188 [pdf, ps, other]

PowerAI DDL

Authors: Minsik Cho, Ulrich Finkler, Sameer Kumar, David Kung, Vaibhav Saxena, Dheeraj Sreedhar

Abstract: As deep neural networks become more complex and input datasets grow larger, it can take days or even weeks to train a deep neural network to the desired accuracy. Therefore, distributed Deep Learning at a massive scale is a critical capability, since it offers the potential to reduce the training time from weeks to hours. In this paper, we present a software-hardware co-optimized distributed Deep Learning system that can achieve near-linear scaling up to hundreds of GPUs. The core algorithm is a multi-ring communication pattern that provides a good tradeoff between latency and bandwidth and adapts to a variety of system configurations. The communication algorithm is implemented as a library for easy use. This library has been integrated into Tensorflow, Caffe, and Torch. We train Resnet-101 on Imagenet 22K with 64 IBM Power8 S822LC servers (256 GPUs) in about 7 hours to a validation accuracy of 33.8%. Microsoft's ADAM and Google's DistBelief results did not reach 30% validation accuracy for Imagenet 22K. Compared to Facebook AI Research's recent paper on 256 GPU training, we use a different communication algorithm, and our combined software and hardware system offers better communication overhead for Resnet-50. A PowerAI DDL enabled version of Torch completed 90 epochs of training on Resnet 50 for 1K classes in 50 minutes using 64 IBM Power8 S822LC servers (256 GPUs).

Submitted 7 August, 2017; originally announced August 2017.

arXiv:1612.01491 [pdf]

Enabling Bio-Plausible Multi-level STDP using CMOS Neurons with Dendrites and Bistable RRAMs

Authors: Xinyu Wu, Vishal Saxena

Abstract: Large-scale integration of emerging nanoscale non-volatile memory devices, e.g. resistive random-access memory (RRAM), can enable a new generation of neuromorphic computers that can solve a wide range of machine learning problems. Such hybrid CMOS-RRAM neuromorphic architectures will result in several orders of magnitude reduction in energy consumption at a very small form factor, and herald autonomous learning machines capable of self-adapting to their environment. However, the progress in this area has been impeded by the realization that the actual memory devices fall well short of their expected behavior. In this work, we discuss the challenges associated with these memory devices and their use in neuromorphic computing circuits, and propose pathways to overcome these limitations by introducing 'dendritic learning'.

Submitted 18 December, 2016; v1 submitted 5 December, 2016; originally announced December 2016.

arXiv:1506.01072 [pdf]

doi 10.1109/JETCAS.2015.2433552

Homogeneous Spiking Neuromorphic System for Real-World Pattern Recognition

Authors: Xinyu Wu, Vishal Saxena, Kehan Zhu

Abstract: A neuromorphic chip that combines CMOS analog spiking neurons and memristive synapses offers a promising solution to brain-inspired computing, as it can provide massive neural network parallelism and density. Previous hybrid analog CMOS-memristor approaches required extensive CMOS circuitry for training, and thus eliminated most of the density advantages gained by the adoption of memristor synapses. Further, they used different waveforms for pre- and post-synaptic spikes that added undesirable circuit overhead. Here we describe a hardware architecture that can feature a large number of memristor synapses to learn real-world patterns. We present a versatile CMOS neuron that combines integrate-and-fire behavior, drives passive memristors and implements competitive learning in a compact circuit module, and enables in-situ plasticity in the memristor synapses. We demonstrate handwritten-digits recognition with the proposed architecture using transistor-level circuit simulations. As the described neuromorphic architecture is homogeneous, it realizes a fundamental building block for large-scale energy-efficient brain-inspired silicon chips that could lead to next-generation cognitive computing.

Submitted 8 June, 2015; v1 submitted 2 June, 2015; originally announced June 2015.

Comments: This is a preprint of an article accepted for publication in IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol 5, no. 2, June 2015

Journal ref: IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol 5, no. 2, June 2015

arXiv:1506.01069 [pdf]

A CMOS Spiking Neuron for Dense Memristor-Synapse Connectivity for Brain-Inspired Computing

Authors: Xinyu Wu, Vishal Saxena, Kehan Zhu

Abstract: Neuromorphic systems that densely integrate CMOS spiking neurons and nano-scale memristor synapses open a new avenue of brain-inspired computing. Existing silicon neurons have modeled neural biophysical dynamics but are incompatible with memristor synapses, or used extra training circuitry thus eliminating much of the density advantages gained by using memristors, or were energy inefficient. Here we describe a novel CMOS spiking leaky integrate-and-fire neuron circuit. Building on a reconfigurable architecture with a single opamp, the described neuron accommodates a large number of memristor synapses, and enables online spike timing dependent plasticity (STDP) learning with optimized power consumption. Simulation results of a 180 nm CMOS design showed 97% power efficiency metric when realizing STDP learning in 10,000 memristor synapses with a nominal 1 MΩ memristance, and only 13 μA current consumption when integrating input spikes. Therefore, the described CMOS neuron contributes a generalized building block for large-scale brain-inspired neuromorphic systems.

Submitted 8 June, 2015; v1 submitted 2 June, 2015; originally announced June 2015.

Comments: This is a preprint of an article accepted for publication in International Joint Conference on Neural Networks (IJCNN) 2015

arXiv:1506.00768 [pdf]

Soft Computing Techniques for Change Detection in remotely sensed images : A Review

Authors: Madhu Khurana, Vikas Saxena

Abstract: With the advent of remote sensing satellites, a huge repository of remotely sensed images is available. Change detection in remotely sensed images has been an active research area as it helps us understand the transitions that are taking place on the Earth's surface. This paper discusses the methods and their classifications proposed by various researchers for change detection. Since the use of soft computing based techniques is now very popular in the research community, this paper also presents a classification based on learning techniques used in soft-computing methods for change detection.

Submitted 25 September, 2018; v1 submitted 2 June, 2015; originally announced June 2015.

Comments: 9 pages, 1 table, 1 figure

Journal ref: International Journal of Computer Science Issues, Volume 12, Issue 2, March 2015, pp 245-253

arXiv:1505.07814 [pdf]

doi 10.1109/TCSII.2015.2456372

A CMOS Spiking Neuron for Brain-Inspired Neural Networks with Resistive Synapses and In-Situ Learning

Authors: Xinyu Wu, Vishal Saxena, Kehan Zhu, Sakkarapani Balagopal

Abstract: Nanoscale resistive memories are expected to fuel dense integration of electronic synapses for large-scale neuromorphic systems. To realize such a brain-inspired computing chip, a compact CMOS spiking neuron that performs in-situ learning and computing while driving a large number of resistive synapses is desired. This work presents a novel leaky integrate-and-fire neuron design which implements the dual-mode operation of current integration and synaptic drive with a single opamp, and enables in-situ learning with crossbar resistive synapses. The proposed design was implemented in a 0.18 μm CMOS technology. Measurements show the neuron's ability to drive a thousand resistive synapses, and demonstrate in-situ associative learning. The neuron circuit occupies a small area of 0.01 mm$^2$ and has an energy-efficiency of 9.3 pJ/spike/synapse.

Submitted 24 November, 2015; v1 submitted 28 May, 2015; originally announced May 2015.

Journal ref: IEEE Transactions on Circuits and Systems II: Express Briefs, 62(11), 1088-1092, 2015

arXiv:1405.7771 [pdf]

DEM Registration and Error Analysis using ASCII values

Authors: Suma Dawn, Vikas Saxena, Bhu Dev Sharma

Abstract: Digital Elevation Models (DEMs), while providing a bare earth look, are heavily used in many applications including construction modeling, visualization, and GIS. Their registration techniques have not been explored much. Methods like coarse-to-fine or pyramid making are common in DEM-to-image or DEM-to-map registration. A self-consistency measure is used to detect any change in terrain elevation and hence has been used for DEM-to-DEM registration. But these methods, apart from being time and complexity intensive, lack error matrix evaluation. This paper gives a method of registration of DEMs using specified height values as control points by initially converting these DEMs to ASCII files. These control points may be found in two ways: either by direct detection of appropriate height data in ASCII files, or by edge matching along the congruous quadrangle of the control point, followed by sub-graph matching. Error analysis for the same has also been done.

Submitted 30 May, 2014; originally announced May 2014.

Comments: 10 pages, 4 figures, 1 table, Proceeding of International Conference on Signal Processing and Imaging Engineering 2010, San Francisco, USA, 20-22 October 2010

arXiv:1405.6662 [pdf]

Cognitive-mapping and contextual pyramid based Digital Elevation Model Registration and its effective storage using fractal based compression

Authors: Suma Dawn, Vikas Saxena, Bhudev Sharma

Abstract: Digital Elevation Models (DEMs) are images having terrain information embedded into them. Using cognitive mapping concepts for DEM registration has evolved from this basic idea of using the mapping between the space to objects and defining their relationships to form the basic landmarks that need to be marked, stored and manipulated in and about the environment or other candidate environments, namely, in our case, the DEMs. The progressive two-level encapsulation of methods of geo-spatial cognition includes landmark knowledge and layout knowledge and can be useful for DEM registration. The space-based approach, which emphasizes the explicit extent of the environment under consideration, and the object-based approach, which emphasizes the relationships between objects in the local environment, are the two paradigms of cognitive mapping and can be methodically integrated in this three-architecture for DEM registration. Initially, P-model based segmentation is performed, followed by landmark formation for contextual mapping that uses contextual pyramid formation. Apart from landmarks being used for registration key-point finding, Euclidean distance based deformation calculation has been used for transformation and change detection. Landmarks have been categorized as flat-plain areas without much variation in the land heights; peaks, found where there is a gradual increase in height as compared to the flat areas; valleys, marked by a gradual decrease in the height seen in the DEM; and finally, ripple areas with very shallow crests and nadirs.
Fractal based compression was used for storage of co-registered DEMs. This method may further be extended for DEM-to-topographic map and DEM-to-remotely sensed image registration. Experimental results further cement the fact that DEM registration may be effectively done using the proposed method.

Submitted 9 May, 2014; originally announced May 2014.

Comments: 17 pages, 8 tables, and 3 figures; IJCSI International Journal of Computer Science Issues, Vol. 10, Issue 3, No 1, May 2013, ISSN (Print): 1694-0814 | ISSN (Online): 1694-0784 (www.IJCSI.org)

arXiv:1405.5340 [pdf]

A hybrid video quality metric for analyzing quality degradation due to frame drop

Authors: Manish K Thakur, Vikas Saxena, J P Gupta

Abstract: In the last decade, ever-growing internet technologies have provided a platform to share multimedia data among different communities. As the ultimate users are human subjects who are concerned about the quality of visual information, it is often desired to have good resumed perceptual quality of videos, and thus arises the need for quality assessment. This paper presents a full reference hybrid video quality metric which is capable of analysing the video quality for spatially or temporally (frame drop) or spatio-temporally distorted video sequences. Simulated results show that the metric efficiently analyses the quality degradation and is closer to the developed human visual system.

Submitted 21 May, 2014; originally announced May 2014.

Comments: 7 pages, 9 figures

Journal ref: IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 6, No 1, November 2012

arXiv:1403.7455 [pdf]

Hybrid Approach to English-Hindi Name Entity Transliteration

Authors: Shruti Mathur, Varun Prakash Saxena

Abstract: Machine translation (MT) research in Indian languages is still in its infancy. Not much work has been done in proper transliteration of name entities in this domain. In this paper we address this issue. We have used the English-Hindi language pair for our experiments and have used a hybrid approach. At first we have processed English words using a rule based approach which extracts individual phonemes from the words, and then we have applied a statistical approach which converts the English into its equivalent Hindi phoneme and in turn the corresponding Hindi word. Through this approach we have attained 83.40% accuracy.

Submitted 28 March, 2014; originally announced March 2014.

Comments: Proceedings of IEEE Students' Conference on Electrical, Electronics and Computer Sciences 2014

arXiv:1112.0836 [pdf]

Performance Study on Image Encryption Schemes

Authors: Jolly Shah, Vikas Saxena

Abstract: Image applications have been increasing in recent years. Encryption is used to provide the security needed for image applications. In this paper, we classify various image encryption schemes and analyze them with respect to various parameters like tunability, visual degradation, compression friendliness, format compliance, encryption ratio, speed, and cryptographic security.

Submitted 5 December, 2011; originally announced December 2011.

Journal ref: IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 4, No 1, July 2011, 349-355

arXiv:1104.0800 [pdf]

Video Encryption: A Survey

Authors: Jolly Shah, Dr. Vikas Saxena

Abstract: Multimedia data security is becoming important with the continuous increase of digital communications on the internet. The encryption algorithms developed to secure text data are not suitable for multimedia applications because of the large data size and real-time constraints. In this paper, classification and description of various video encryption algorithms are presented. Analysis and comparison of these algorithms with respect to various parameters like visual degradation, encryption ratio, speed, compression friendliness, format compliance and cryptographic security are presented.

Submitted 5 April, 2011; originally announced April 2011.

Journal ref: International Journal of Computer Science Issues, Volume 8, Issue 2, 2011

Showing 1–36 of 36 results for author: Saxena, V