arXiv:2012.05948 [pdf, other]

GNNUnlock: Graph Neural Networks-based Oracle-less Unlocking Scheme for Provably Secure Logic Locking

Authors: Lilas Alrahis, Satwik Patnaik, Faiq Khalid, Muhammad Abdullah Hanif, Hani Saleh, Muhammad Shafique, Ozgur Sinanoglu

Abstract: In this paper, we propose GNNUnlock, the first-of-its-kind oracle-less machine learning-based attack on provably secure logic locking that can identify any desired protection logic without focusing on a specific syntactic topology. The key is to leverage a well-trained graph neural network (GNN) to identify all the gates in a given locked netlist that belong to the targeted protection logic, without requiring an oracle. This approach fits perfectly with the targeted problem since a circuit is a graph with an inherent structure and the protection logic is a sub-graph of nodes (gates) with specific and common characteristics. GNNs are powerful in capturing the nodes' neighborhood properties, facilitating the detection of the protection logic. To rectify any misclassifications induced by the GNN, we additionally propose a connectivity analysis-based post-processing algorithm to successfully remove the predicted protection logic, thereby retrieving the original design. Our extensive experimental evaluation demonstrates that GNNUnlock is 99.24%-100% successful in breaking various benchmarks locked using stripped-functionality logic locking, tenacious and traceless logic locking, and Anti-SAT. Our proposed post-processing enhances the detection accuracy, reaching 100% for all of our tested locked benchmarks. Analysis of the results corroborates that GNNUnlock is powerful enough to break the considered schemes under different parameters, synthesis settings, and technology nodes. The evaluation further shows that GNNUnlock successfully breaks corner cases where even the most advanced state-of-the-art attacks fail.

Submitted 10 December, 2020; originally announced December 2020.

Comments: 6 pages, 4 figures, 6 tables, conference

arXiv:2010.05754 [pdf, other]

doi 10.1109/TCAD.2020.3030610

DESCNet: Developing Efficient Scratchpad Memories for Capsule Network Hardware

Authors: Alberto Marchisio, Vojtech Mrazek, Muhammad Abdullah Hanif, Muhammad Shafique

Abstract: Deep Neural Networks (DNNs) have been established as the state-of-the-art algorithm for advanced machine learning applications. Recently proposed by the Google Brain team, Capsule Networks (CapsNets) have improved the generalization ability compared to DNNs, due to their multi-dimensional capsules and their preservation of the spatial relationships between different objects. However, they pose significantly high computational and memory requirements, making their energy-efficient inference a challenging task. This paper provides, for the first time, an in-depth analysis to highlight the design- and management-related challenges for the (on-chip) memories deployed in hardware accelerators executing fast CapsNets inference. To enable an efficient design, we propose an application-specific memory hierarchy, which minimizes the off-chip memory accesses while efficiently feeding the data to the hardware accelerator. We analyze the corresponding on-chip memory requirements and leverage this analysis to propose a novel methodology to explore different scratchpad memory designs and their energy/area trade-offs. Afterwards, an application-specific power-gating technique is proposed to further reduce the energy consumption, depending upon the utilization across different operations of the CapsNets. Our results for a selected Pareto-optimal solution demonstrate no performance loss and an energy reduction of 79% for the complete accelerator, including computational units and memories, when compared to a state-of-the-art design executing Google's CapsNet model for the MNIST dataset.

Submitted 12 October, 2020; originally announced October 2020.

Comments: Accepted for publication at the IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

arXiv:2008.01191 [pdf, other]

Deep Learning Techniques for Future Intelligent Cross-Media Retrieval

Authors: Sadaqat ur Rehman, Muhammad Waqas, Shanshan Tu, Anis Koubaa, Obaid ur Rehman, Jawad Ahmad, Muhammad Hanif, Zhu Han

Abstract: With the advancement in technology and the expansion of broadcasting, cross-media retrieval has gained much attention. It plays a significant role in big data applications and consists of searching and finding data from different types of media. In this paper, we provide a novel taxonomy according to the challenges faced by multi-modal deep learning approaches in solving cross-media retrieval, namely: representation, alignment, and translation. These challenges are evaluated on deep learning (DL) based methods, which are categorized into four main groups: 1) unsupervised methods, 2) supervised methods, 3) pairwise based methods, and 4) rank based methods. Then, we present some well-known cross-media datasets used for retrieval, considering the importance of these datasets in the context of deep learning based cross-media retrieval approaches. Moreover, we also present an extensive review of the state-of-the-art problems and their corresponding solutions for encouraging deep learning in cross-media retrieval. The fundamental objective of this work is to exploit Deep Neural Networks (DNNs) for bridging the "media gap", and to provide researchers and developers with a better understanding of the underlying problems and the potential solutions of deep learning assisted cross-media retrieval. To the best of our knowledge, this is the first comprehensive survey to address cross-media retrieval under deep learning methods.

Submitted 21 July, 2020; originally announced August 2020.

Comments: arXiv admin note: text overlap with arXiv:1804.09539 by other authors

arXiv:2004.10341 [pdf, other]

doi 10.1109/DAC18072.2020.9218672

DRMap: A Generic DRAM Data Mapping Policy for Energy-Efficient Processing of Convolutional Neural Networks

Authors: Rachmad Vidya Wicaksana Putra, Muhammad Abdullah Hanif, Muhammad Shafique

Abstract: Many convolutional neural network (CNN) accelerators face performance- and energy-efficiency challenges, which are crucial for embedded implementations, due to high DRAM access latency and energy. Recently, some DRAM architectures have been proposed to exploit subarray-level parallelism for decreasing the access latency. Towards this, we present a design space exploration methodology to study the latency and energy of different mapping policies on different DRAM architectures, and identify the Pareto-optimal design choices. The results show that energy-efficient DRAM accesses can be achieved by a mapping policy that prioritizes, in order, maximizing row buffer hits, bank-level parallelism, and subarray-level parallelism.

Submitted 21 April, 2020; originally announced April 2020.

Comments: To appear at the 57th Design Automation Conference (DAC), July 2020, San Francisco, CA, USA

arXiv:1912.01978 [pdf, other]

FANNet: Formal Analysis of Noise Tolerance, Training Bias and Input Sensitivity in Neural Networks

Authors: Mahum Naseer, Mishal Fatima Minhas, Faiq Khalid, Muhammad Abdullah Hanif, Osman Hasan, Muhammad Shafique

Abstract: With a constant improvement in the network architectures and training methodologies, Neural Networks (NNs) are increasingly being deployed in real-world Machine Learning systems. However, despite their impressive performance on "known inputs", these NNs can fail absurdly on the "unseen inputs", especially if these real-time inputs deviate from the training dataset distributions, or contain certain types of input noise. This indicates the low noise tolerance of NNs, which is a major reason for the recent increase of adversarial attacks. This is a serious concern, particularly for safety-critical applications, where inaccurate results lead to dire consequences. We propose a novel methodology that leverages model checking for the Formal Analysis of Neural Network (FANNet) under different input noise ranges. Our methodology allows us to rigorously analyze the noise tolerance of NNs, their input node sensitivity, and the effects of training bias on their performance, e.g., in terms of classification accuracy. For evaluation, we use a feed-forward fully-connected NN architecture trained for the Leukemia classification. Our experimental results show $\pm 11\%$ noise tolerance for the given trained network, identify the most sensitive input nodes, and confirm the bias of the available training dataset.

Submitted 14 May, 2020; v1 submitted 3 December, 2019; originally announced December 2019.

Comments: To appear at the 23rd Design, Automation and Test in Europe (DATE 2020). Grenoble, France

arXiv:1912.00941 [pdf, other]

FT-ClipAct: Resilience Analysis of Deep Neural Networks and Improving their Fault Tolerance using Clipped Activation

Authors: Le-Ha Hoang, Muhammad Abdullah Hanif, Muhammad Shafique

Abstract: Deep Neural Networks (DNNs) are widely being adopted for safety-critical applications, e.g., healthcare and autonomous driving. Inherently, they are considered to be highly error-tolerant. However, recent studies have shown that hardware faults that impact the parameters of a DNN (e.g., weights) can have drastic impacts on its classification accuracy. In this paper, we perform a comprehensive error resilience analysis of DNNs subjected to hardware faults (e.g., permanent faults) in the weight memory. The outcome of this analysis is leveraged to propose a novel error mitigation technique which squashes the high-intensity faulty activation values to alleviate their impact. We achieve this by replacing the unbounded activation functions with their clipped versions. We also present a method to systematically define the clipping values of the activation functions that result in increased resilience of the networks against faults. We evaluate our technique on the AlexNet and the VGG-16 DNNs trained for the CIFAR-10 dataset. The experimental results show that our mitigation technique significantly improves the resilience of the DNNs to faults. For example, the proposed technique offers on average 68.92% improvement in the classification accuracy of resilience-optimized VGG-16 model at 1e-5 fault rate, when compared to the base network without any fault mitigation.

Submitted 2 December, 2019; originally announced December 2019.

Comments: The 23rd Design, Automation and Test in Europe (DATE 2020)
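
The clipped-activation idea described in the FT-ClipAct abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `clipped_relu`, the threshold value of 6.0, and the choice to saturate at the threshold (rather than, say, mapping out-of-range values to zero) are all assumptions made only for illustration.

```python
def clipped_relu(x, threshold):
    """ReLU whose output is capped at `threshold`.

    Assumption: values above the threshold saturate at the threshold.
    The abstract only says that unbounded activations are replaced by
    clipped versions, so the exact clipping rule here is hypothetical.
    """
    return min(max(x, 0.0), threshold)


# Hypothetical activations; 1.0e6 mimics a value corrupted by a weight fault.
activations = [-1.5, 0.3, 2.0, 1.0e6]
clipped = [clipped_relu(a, threshold=6.0) for a in activations]
```

With an unbounded ReLU, the corrupted value would propagate unchanged to later layers; bounding the output limits how far a single faulty activation can skew the result.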

arXiv:1912.00700 [pdf, other]

doi 10.23919/DATE48585.2020.9116393

ReD-CaNe: A Systematic Methodology for Resilience Analysis and Design of Capsule Networks under Approximations

Authors: Alberto Marchisio, Vojtech Mrazek, Muhammad Abdullah Hanif, Muhammad Shafique

Abstract: Recent advances in Capsule Networks (CapsNets) have shown their superior learning capability, compared to the traditional Convolutional Neural Networks (CNNs). However, the extremely high complexity of CapsNets limits their fast deployment in real-world applications. Moreover, while the resilience of CNNs has been extensively investigated to enable their energy-efficient implementations, the analysis of CapsNets' resilience is a largely unexplored area that can provide a strong foundation to investigate techniques to overcome the CapsNets' complexity challenge. Following the trend of Approximate Computing to enable energy-efficient designs, we perform an extensive resilience analysis of the CapsNets inference subjected to the approximation errors. Our methodology models the errors arising from the approximate components (like multipliers), and analyzes their impact on the classification accuracy of CapsNets. This enables the selection of approximate components based on the resilience of each operation of the CapsNet inference. We modify the TensorFlow framework to simulate the injection of approximation noise (based on the models of the approximate components) at different computational operations of the CapsNet inference. Our results show that the CapsNets are more resilient to the errors injected in the computations that occur during the dynamic routing (the softmax and the update of the coefficients), rather than other stages like convolutions and activation functions. Our analysis is extremely useful towards designing efficient CapsNet hardware accelerators with approximate components. To the best of our knowledge, this is the first proof-of-concept for employing approximations on the specialized CapsNet hardware.

Submitted 2 December, 2019; originally announced December 2019.

Comments: To appear at the 23rd Design, Automation and Test in Europe (DATE 2020). Grenoble, France

arXiv:1907.07229 [pdf, other]

doi 10.1109/ICCAD45719.2019.8942068

ALWANN: Automatic Layer-Wise Approximation of Deep Neural Network Accelerators without Retraining

Authors: Vojtech Mrazek, Zdenek Vasicek, Lukas Sekanina, Muhammad Abdullah Hanif, Muhammad Shafique

Abstract: The state-of-the-art approaches employ approximate computing to reduce the energy consumption of DNN hardware. Approximate DNNs then require extensive retraining afterwards to recover from the accuracy loss caused by the use of approximate operations. However, retraining of complex DNNs does not scale well. In this paper, we demonstrate that efficient approximations can be introduced into the computational path of DNN accelerators while retraining can completely be avoided. ALWANN provides highly optimized implementations of DNNs for custom low-power accelerators in which the number of computing units is lower than the number of DNN layers. First, a fully trained DNN is converted to operate with 8-bit weights and 8-bit multipliers in convolutional layers. A suitable approximate multiplier is then selected for each computing element from a library of approximate multipliers in such a way that (i) one approximate multiplier serves several layers, and (ii) the overall classification error and energy consumption are minimized. The optimizations including the multiplier selection problem are solved by means of a multiobjective optimization NSGA-II algorithm. In order to completely avoid the computationally expensive retraining of DNN, which is usually employed to improve the classification accuracy, we propose a simple weight updating scheme that compensates for the inaccuracy introduced by employing approximate multipliers. The proposed approach is evaluated for two architectures of DNN accelerators with approximate multipliers from the open-source "EvoApprox" library. We report that the proposed approach saves 30% of energy needed for multiplication in convolutional layers of ResNet-50 while the accuracy is degraded by only 0.6%. The proposed technique and approximate layers are available as an open-source extension of TensorFlow at https://github.com/ehw-fit/tf-approximate.

Submitted 25 July, 2019; v1 submitted 11 June, 2019; originally announced July 2019.

Comments: Accepted for 2019 IEEE/ACM International Conference On Computer-Aided Design (ICCAD'19)

arXiv:1905.10142 [pdf, other]

doi 10.1109/IJCNN48605.2020.9207533

FasTrCaps: An Integrated Framework for Fast yet Accurate Training of Capsule Networks

Authors: Alberto Marchisio, Beatrice Bussolino, Alessio Colucci, Muhammad Abdullah Hanif, Maurizio Martina, Guido Masera, Muhammad Shafique

Abstract: Recently, Capsule Networks (CapsNets) have shown improved performance compared to the traditional Convolutional Neural Networks (CNNs), by encoding and preserving spatial relationships between the detected features in a better way. This is achieved through the so-called Capsules (i.e., groups of neurons) that encode both the instantiation probability and the spatial information. However, one of the major hurdles in the wide adoption of CapsNets is their gigantic training time, which is primarily due to the relatively higher complexity of their new constituting elements that are different from CNNs. In this paper, we implement different optimizations in the training loop of the CapsNets, and investigate how these optimizations affect their training speed and the accuracy. Towards this, we propose a novel framework FasTrCaps that integrates multiple lightweight optimizations and a novel learning rate policy called WarmAdaBatch (that jointly performs warm restarts and adaptive batch size), and steers them in an appropriate way to provide high training-loop speedup at minimal accuracy loss. We also propose weight sharing for capsule layers. The goal is to reduce the hardware requirements of CapsNets by removing unused/redundant connections and capsules, while keeping high accuracy through tests of different learning rate policies and batch sizes. We demonstrate that one of the solutions generated by the FasTrCaps framework can achieve 58.6% reduction in the training time, while preserving the accuracy (even 0.12% accuracy improvement for the MNIST dataset), compared to the CapsNet by Google Brain. The Pareto-optimal solutions generated by FasTrCaps can be leveraged to realize trade-offs between training time and achieved accuracy. We have open-sourced our framework on https://github.com/Alexei95/FasTrCaps.

Submitted 18 May, 2020; v1 submitted 24 May, 2019; originally announced May 2019.

Comments: Accepted for publication at the 2020 International Joint Conference on Neural Networks (IJCNN)

arXiv:1902.10807 [pdf, other]

doi 10.1145/3316781.3317781

autoAx: An Automatic Design Space Exploration and Circuit Building Methodology utilizing Libraries of Approximate Components

Authors: Vojtech Mrazek, Muhammad Abdullah Hanif, Zdenek Vasicek, Lukas Sekanina, Muhammad Shafique

Abstract: Approximate computing is an emerging paradigm for developing highly energy-efficient computing systems such as various accelerators. In the literature, many libraries of elementary approximate circuits have already been proposed to simplify the design process of approximate accelerators. Because these libraries contain from tens to thousands of approximate implementations for a single arithmetic operation, it is intractable to find an optimal combination of approximate circuits in the library even for an application consisting of a few operations. An open problem is "how to effectively combine circuits from these libraries to construct complex approximate accelerators". This paper proposes a novel methodology for searching, selecting and combining the most suitable approximate circuits from a set of available libraries to generate an approximate accelerator for a given application. To enable fast design space generation and exploration, the methodology utilizes machine learning techniques to create computational models estimating the overall quality of processing and hardware cost without performing full synthesis at the accelerator level. Using the methodology, we construct hundreds of approximate accelerators (for a Sobel edge detector) showing different but relevant tradeoffs between the quality of processing and hardware cost and identify a corresponding Pareto-frontier. Furthermore, when searching for approximate implementations of a generic Gaussian filter consisting of 17 arithmetic operations, the proposed approach allows us to identify approximately $10^3$ highly important implementations from $10^{23}$ possible solutions in a few hours, while the exhaustive search would take four months on a high-end processor.

Submitted 1 April, 2019; v1 submitted 22 February, 2019; originally announced February 2019.

Comments: Accepted for publication at the Design Automation Conference 2019 (DAC'19), Las Vegas, Nevada, USA
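
The Pareto-frontier step mentioned in the autoAx abstract can be sketched as a simple dominance filter. This is a minimal illustration under stated assumptions, not the paper's method: the function name `pareto_front`, the two-objective formulation (quality error and hardware cost, both minimized), and the sample design points are hypothetical.

```python
def pareto_front(points):
    """Return the points that no other point dominates.

    Each point is a (quality_error, hardware_cost) tuple and both
    objectives are minimized. A point q dominates p when q is no worse
    in both objectives and is not identical to p.
    """
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical candidate accelerators: (quality error, hardware cost).
designs = [(0.10, 50.0), (0.05, 80.0), (0.20, 40.0), (0.15, 60.0), (0.05, 90.0)]
front = pareto_front(designs)
```

This brute-force filter compares every pair of candidates, which is adequate for hundreds of designs; it says nothing about how the candidates themselves are generated or estimated.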

arXiv:1902.10222 [pdf, other]

doi 10.1109/TVLSI.2021.3060509

ROMANet: Fine-Grained Reuse-Driven Off-Chip Memory Access Management and Data Organization for Deep Neural Network Accelerators

Authors: Rachmad Vidya Wicaksana Putra, Muhammad Abdullah Hanif, Muhammad Shafique

Abstract: Enabling high energy efficiency is crucial for embedded implementations of deep learning. Several studies have shown that the DRAM-based off-chip memory accesses are one of the most energy-consuming operations in deep neural network (DNN) accelerators, and thereby limit the designs from achieving efficiency gains at the full potential. DRAM access energy varies depending upon the number of accesses required as well as the energy consumed per access. Therefore, searching for a solution towards the minimum DRAM access energy is an important optimization problem. Towards this, we propose the ROMANet methodology that aims at reducing the number of memory accesses, by searching for the appropriate data partitioning and scheduling for each layer of a network using a design space exploration, based on the knowledge of the available on-chip memory and the data reuse factors. Moreover, ROMANet also targets decreasing the number of DRAM row buffer conflicts and misses, by exploiting the DRAM multi-bank burst feature to improve the energy-per-access. Besides providing the energy benefits, our proposed DRAM data mapping also results in an increased effective DRAM throughput, which is useful for latency-constrained scenarios. Our experimental results show that the ROMANet saves DRAM access energy by 12% for the AlexNet, by 36% for the VGG-16, and by 46% for the MobileNet, while also improving the DRAM throughput by 10%, as compared to the state-of-the-art.

Submitted 2 August, 2020; v1 submitted 4 February, 2019; originally announced February 2019.

Comments: Submitted to the IEEE-TVLSI journal, 14 pages, 26 figures

arXiv:1902.01151 [pdf, other]

CapStore: Energy-Efficient Design and Management of the On-Chip Memory for CapsuleNet Inference Accelerators

Authors: Alberto Marchisio, Muhammad Abdullah Hanif, Mohammad Taghi Teimoori, Muhammad Shafique

Abstract: Deep Neural Networks (DNNs) have been established as the state-of-the-art algorithm for advanced machine learning applications. Recently, CapsuleNets have improved the generalization ability, as compared to DNNs, due to their multi-dimensional capsules. However, they pose high computational and memory requirements, which makes energy-efficient inference a challenging task. In this paper, we perform an extensive analysis to demonstrate their key limitations due to intense memory accesses and large on-chip memory requirements. To enable efficient CapsuleNet inference accelerators, we propose a specialized on-chip memory hierarchy which minimizes the off-chip memory accesses, while efficiently feeding the data to the accelerator. We analyze the on-chip memory requirements for each memory component of the architecture. By leveraging this analysis, we propose a methodology to explore different on-chip memory designs and a power-gating technique to further reduce the energy consumption, depending upon the utilization across different operations of a CapsuleNet. Our memory designs can significantly reduce the energy consumption of the on-chip memory by up to 86%, when compared to a state-of-the-art memory design. Since the power consumption of the memory elements is the major contributor in the power breakdown of the CapsuleNet accelerator, as we will also show in our analyses, the proposed memory design can effectively reduce the overall energy consumption of the complete CapsuleNet accelerator architecture.

Submitted 12 April, 2019; v1 submitted 4 February, 2019; originally announced February 2019.

arXiv:1902.01147 [pdf, other]

doi 10.1109/IJCNN48605.2020.9207297

Is Spiking Secure? A Comparative Study on the Security Vulnerabilities of Spiking and Deep Neural Networks

Authors: Alberto Marchisio, Giorgio Nanfa, Faiq Khalid, Muhammad Abdullah Hanif, Maurizio Martina, Muhammad Shafique

Abstract: Spiking Neural Networks (SNNs) claim to present many advantages in terms of biological plausibility and energy efficiency compared to standard Deep Neural Networks (DNNs). Recent works have shown that DNNs are vulnerable to adversarial attacks, i.e., small perturbations added to the input data can lead to targeted or random misclassifications. In this paper, we aim at investigating the key research question: "Are SNNs secure?" Towards this, we perform a comparative study of the security vulnerabilities in SNNs and DNNs w.r.t. the adversarial noise. Afterwards, we propose a novel black-box attack methodology, i.e., without the knowledge of the internal structure of the SNN, which employs a greedy heuristic to automatically generate imperceptible and robust adversarial examples (i.e., attack images) for the given SNN. We perform an in-depth evaluation for a Spiking Deep Belief Network (SDBN) and a DNN having the same number of layers and neurons (to obtain a fair comparison), in order to study the efficiency of our methodology and to understand the differences between SNNs and DNNs w.r.t. the adversarial examples. Our work opens new avenues of research towards the robustness of the SNNs, considering their similarities to the human brain's functionality.

Submitted 18 May, 2020; v1 submitted 4 February, 2019; originally announced February 2019.

Comments: Accepted for publication at the 2020 International Joint Conference on Neural Networks (IJCNN)

arXiv:1901.10258 [pdf, other]

RED-Attack: Resource Efficient Decision based Attack for Machine Learning

Authors: Faiq Khalid, Hassan Ali, Muhammad Abdullah Hanif, Semeen Rehman, Rehan Ahmed, Muhammad Shafique

Abstract: Due to data dependency and model leakage properties, Deep Neural Networks (DNNs) exhibit several security vulnerabilities. Several security attacks have exploited them, but most of them require the output probability vector. These attacks can be mitigated by concealing the output probability vector. To address this limitation, decision-based attacks have been proposed which can estimate the model, but they require several thousand queries to generate a single untargeted attack image. However, in real-time attacks, resources and attack time are very crucial parameters. Therefore, in resource-constrained systems, e.g., autonomous vehicles where an untargeted attack can have a catastrophic effect, these attacks may not work efficiently. To address this limitation, we propose a resource efficient decision-based methodology which generates the imperceptible attack, i.e., the RED-Attack, for a given black-box model. The proposed methodology follows two main steps to generate the imperceptible attack, i.e., classification boundary estimation and adversarial noise optimization. Firstly, we propose a half-interval search-based algorithm for estimating a sample on the classification boundary using a target image and a randomly selected image from another class. Secondly, we propose an optimization algorithm which first introduces a small perturbation in some randomly selected pixels of the estimated sample. Then, to ensure imperceptibility, it optimizes the distance between the perturbed and target samples. For illustration, we evaluate it for CIFAR-10 and German Traffic Sign Recognition (GTSR) using state-of-the-art networks.

Submitted 30 January, 2019; v1 submitted 29 January, 2019; originally announced January 2019.
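
The boundary-estimation step described in the abstract above (a half-interval search between a target image and an image from another class) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the flat list-of-floats image representation, the stopping tolerance, and the toy classifier `toy_predict` are all assumptions made for illustration. The only access to the model is a label-returning callable, which matches the decision-based setting.

```python
from typing import Callable, List, Sequence


def toy_predict(image: Sequence[float]) -> int:
    """Hypothetical black-box model: class 1 if the mean intensity exceeds 0.5."""
    return 1 if sum(image) / len(image) > 0.5 else 0


def blend(target: Sequence[float], other: Sequence[float], alpha: float) -> List[float]:
    """Linear interpolation: alpha=0 gives `target`, alpha=1 gives `other`."""
    return [(1.0 - alpha) * t + alpha * o for t, o in zip(target, other)]


def boundary_sample(
    target: Sequence[float],
    other: Sequence[float],
    predict: Callable[[Sequence[float]], int],
    tol: float = 1e-3,
    max_queries: int = 50,
) -> List[float]:
    """Half-interval (bisection) search along the segment target -> other.

    Returns a sample that is classified differently from `target` and lies
    within `tol` (in interpolation weight) of the decision boundary.
    """
    target_label = predict(target)
    if predict(other) == target_label:
        raise ValueError("`other` must be classified differently from `target`")

    lo, hi = 0.0, 1.0  # lo: still the target's class, hi: already another class
    queries = 0
    while hi - lo > tol and queries < max_queries:
        mid = (lo + hi) / 2.0
        if predict(blend(target, other, mid)) == target_label:
            lo = mid  # boundary lies further towards `other`
        else:
            hi = mid  # boundary lies closer to `target`
        queries += 1
    return blend(target, other, hi)


if __name__ == "__main__":
    sample = boundary_sample([0.0] * 4, [1.0] * 4, toy_predict)
    print(sample, toy_predict(sample))
```

Each iteration costs one query and halves the search interval, so the number of queries grows only logarithmically with the required precision. The second step from the abstract (perturbing random pixels and optimizing the distance to the target) is not covered by this sketch.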

arXiv:1901.09878 [pdf, other]

CapsAttacks: Robust and Imperceptible Adversarial Attacks on Capsule Networks

Authors: Alberto Marchisio, Giorgio Nanfa, Faiq Khalid, Muhammad Abdullah Hanif, Maurizio Martina, Muhammad Shafique

Abstract: Capsule Networks preserve the hierarchical spatial relationships between objects, and thereby bear a potential to surpass the performance of traditional Convolutional Neural Networks (CNNs) in performing tasks like image classification. A large body of work has explored adversarial examples for CNNs, but their effectiveness on Capsule Networks has not yet been well studied. In our work, we perform an analysis to study the vulnerabilities in Capsule Networks to adversarial attacks. These perturbations, added to the test inputs, are small and imperceptible to humans, but can fool the network to mispredict. We propose a greedy algorithm to automatically generate targeted imperceptible adversarial examples in a black-box attack scenario. We show that attacks of this kind, when applied to the German Traffic Sign Recognition Benchmark (GTSRB), mislead Capsule Networks. Moreover, we apply the same kind of adversarial attacks to a 5-layer CNN and a 9-layer CNN, and analyze the outcome, compared to the Capsule Networks to study differences in their behavior.

Submitted 24 May, 2019; v1 submitted 28 January, 2019; originally announced January 2019.

arXiv:1901.04986 [pdf]

Systimator: A Design Space Exploration Methodology for Systolic Array based CNNs Acceleration on the FPGA-based Edge Nodes

Authors: Hazoor Ahmad, Muhammad Tanvir, Muhammad Abdullah Hanif, Muhammad Usama Javed, Rehan Hafiz, Muhammad Shafique

Abstract: The evolution of IoT based smart applications demands porting of artificial intelligence algorithms to the edge computing devices. CNNs form a large part of these AI algorithms. Systolic array based CNN acceleration is being widely advocated due to its ability to allow scalable architectures. However, CNNs are inherently memory and compute intensive algorithms, and hence pose significant challenges to be implemented on the resource-constrained edge computing devices. Memory-constrained low-cost FPGA based devices form a substantial fraction of these edge computing devices. Thus, when porting to such edge-computing devices, the designer is left unguided as to how to select a suitable systolic array configuration that could fit in the available hardware resources. In this paper we propose Systimator, a design space exploration based methodology that provides a set of design points that can be mapped within the memory bounds of the target FPGA device. The methodology is based upon an analytical model that is formulated to estimate the required resources for systolic arrays, assuming multiple data reuse patterns. The methodology further provides the performance estimates for each of the candidate design points. We show that Systimator provides an in-depth analysis of resource-requirement of systolic array based CNNs. We provide our resource estimation results for porting of convolutional layers of TINY YOLO, a CNN based object detector, on a Xilinx ARTIX 7 FPGA.

Submitted 8 February, 2019; v1 submitted 15 December, 2018; originally announced January 2019.

Comments: 5 Pages, 3 Figures, work in progress

arXiv:1812.08034 [pdf, other]

doi 10.1063/1.5074130

Inelastic scattering of photoelectrons from He nanodroplets

Authors: M. V. Shcherbinin, F. Vad Westergaard, M. Hanif, S. R. Krishnan, A. C. LaForge, R. Richter, T. Pfeifer, M. Mudrich

Abstract: We present a detailed study of inelastic energy-loss collisions of photoelectrons emitted from He nanodroplets by tunable extreme ultraviolet (XUV) radiation. Using coincidence imaging detection of electrons and ions, we probe the lowest He droplet excited states up to the electron impact ionization threshold. We find significant signal contributions from photoelectrons emitted from free He atoms accompanying the He nanodroplet beam. Furthermore, signal contributions from photoionization and electron impact excitation/ionization occurring in pairs of nearest-neighbor atoms in the He droplets are detected. This work highlights the importance of inelastic electron scattering in the interaction of nanoparticles with XUV radiation.

Submitted 19 December, 2018; originally announced December 2018.

arXiv:1811.08932 [pdf, other]

doi 10.23919/DATE.2019.8714922

CapsAcc: An Efficient Hardware Accelerator for CapsuleNets with Data Reuse

Authors: Alberto Marchisio, Muhammad Abdullah Hanif, Muhammad Shafique

Abstract: Deep Neural Networks (DNNs) have been widely deployed for many Machine Learning applications. Recently, CapsuleNets have overtaken traditional DNNs, because of their improved generalization ability due to the multi-dimensional capsules, in contrast to the single-dimensional neurons. Consequently, CapsuleNets also require extremely intense matrix computations, making it a gigantic challenge to achieve high performance. In this paper, we propose CapsAcc, the first specialized CMOS-based hardware architecture to perform CapsuleNets inference with high performance and energy efficiency. State-of-the-art convolutional DNN accelerators would not work efficiently for CapsuleNets, as their designs do not account for key operations involved in CapsuleNets, like squashing and dynamic routing, as well as multi-dimensional matrix processing. Our CapsAcc architecture targets this problem and achieves significant improvements, when compared to an optimized GPU implementation. Our architecture exploits the massive parallelism by flexibly feeding the data to a specialized systolic array according to the operations required in different layers. It also avoids extensive load and store operations on the on-chip memory, by reusing the data when possible. We further optimize the routing algorithm to reduce the computations needed at this stage. We synthesized the complete CapsAcc architecture in a 32nm CMOS technology using Synopsys design tools, and evaluated it for the MNIST benchmark (as also done by the original CapsuleNet paper) to ensure consistent and fair comparisons. This work enables highly-efficient CapsuleNets inference on embedded platforms.

Submitted 2 November, 2018; originally announced November 2018.

Comments: Accepted for publication at Design, Automation and Test in Europe (DATE 2019). Florence, Italy

arXiv:1811.03980 [pdf, other]

A Methodology for Automatic Selection of Activation Functions to Design Hybrid Deep Neural Networks

Authors: Alberto Marchisio, Muhammad Abdullah Hanif, Semeen Rehman, Maurizio Martina, Muhammad Shafique

Abstract: Activation functions influence behavior and performance of DNNs. Nonlinear activation functions, like Rectified Linear Units (ReLU), Exponential Linear Units (ELU) and Scaled Exponential Linear Units (SELU), outperform the linear counterparts. However, selecting an appropriate activation function is a challenging problem, as it affects the accuracy and the complexity of the given DNN. In this paper, we propose a novel methodology to automatically select the best-possible activation function for each layer of a given DNN, such that the overall DNN accuracy, compared to considering only one type of activation function for the whole DNN, is improved. However, an associated scientific challenge in exploring all the different configurations of activation functions would be time and resource-consuming. Towards this, our methodology identifies the Evaluation Points during learning to evaluate the accuracy in an intermediate step of training and to perform early termination by checking the accuracy gradient of the learning curve. This helps in significantly reducing the exploration time during training. Moreover, our methodology selects, for each layer, the dropout rate that optimizes the accuracy. Experiments show that we are able to achieve on average 7% to 15% Relative Error Reduction on MNIST, CIFAR-10 and CIFAR-100 benchmarks, with limited performance and power penalty on GPUs.

Submitted 27 October, 2018; originally announced November 2018.

arXiv:1811.01463 [pdf]

doi 10.1109/FIT.2018.00064

Security for Machine Learning-based Systems: Attacks and Challenges during Training and Inference

Authors: Faiq Khalid, Muhammad Abdullah Hanif, Semeen Rehman, Muhammad Shafique

Abstract: The exponential increase in dependencies between the cyber and physical world leads to an enormous amount of data which must be efficiently processed and stored. Therefore, computing paradigms are evolving towards machine learning (ML)-based systems because of their ability to efficiently and accurately process the enormous amount of data. Although ML-based solutions address the efficient computing requirements of big data, they introduce (new) security vulnerabilities into the systems, which cannot be addressed by traditional monitoring-based security measures. Therefore, this paper first presents a brief overview of various security threats in machine learning, their respective threat models and associated research challenges to develop robust security measures. To illustrate the security vulnerabilities of ML during training, inferencing and hardware implementation, we demonstrate some key security threats on ML using LeNet and VGGNet for MNIST and German Traffic Sign Recognition Benchmarks (GTSRB), respectively. Moreover, based on the security analysis of ML-training, we also propose an attack that has very little impact on the inference accuracy. Towards the end, we highlight the associated research challenges in developing security measures and provide a brief overview of the techniques used to mitigate such security threats.

Submitted 4 November, 2018; originally announced November 2018.

Report number: INSPEC Accession Number: 18398499

Journal ref: International Conference on Frontiers of Information Technology (FIT) 2018

arXiv:1811.01444 [pdf, other]

FAdeML: Understanding the Impact of Pre-Processing Noise Filtering on Adversarial Machine Learning

Authors: Faiq Khalid, Muhammad Abdullah Hanif, Semeen Rehman, Junaid Qadir, Muhammad Shafique

Abstract: Deep neural networks (DNN)-based machine learning (ML) algorithms have recently emerged as the leading ML paradigm particularly for the task of classification due to their superior capability of learning efficiently from large datasets. The discovery of a number of well-known attacks such as dataset poisoning, adversarial examples, and network manipulation (through the addition of malicious nodes) has, however, put the spotlight squarely on the lack of security in DNN-based ML systems. In particular, malicious actors can use these well-known attacks to cause random/targeted misclassification, or cause a change in the prediction confidence, by only slightly but systematically manipulating the environmental parameters, inference data, or the data acquisition block. Most of the prior adversarial attacks have, however, not accounted for the pre-processing noise filters commonly integrated with the ML-inference module. Our contribution in this work is to show that this is a major omission since these noise filters can render ineffective the majority of the existing attacks, which rely essentially on introducing adversarial noise. Apart from this, we also extend the state of the art by proposing a novel pre-processing noise Filter-aware Adversarial ML attack called FAdeML. To demonstrate the effectiveness of the proposed methodology, we generate an adversarial attack image by exploiting the "VGGNet" DNN trained for the "German Traffic Sign Recognition Benchmarks (GTSRB)" dataset, which despite having no visual noise, can cause a classifier to misclassify even in the presence of pre-processing noise filters.

Submitted 4 November, 2018; originally announced November 2018.

Comments: Accepted in Design, Automation and Test in Europe 2019

arXiv:1811.01443 [pdf, other]

doi 10.1109/MDAT.2019.2961325

SSCNets: Robustifying DNNs using Secure Selective Convolutional Filters

Authors: Hassan Ali, Faiq Khalid, Hammad Tariq, Muhammad Abdullah Hanif, Semeen Rehman, Rehan Ahmed, Muhammad Shafique

Abstract: In this paper, we introduce a novel technique based on the Secure Selective Convolutional (SSC) techniques in the training loop that increases the robustness of a given DNN by allowing it to learn the data distribution based on the important edges in the input image. We validate our technique on Convolutional DNNs against the state-of-the-art attacks from the open-source Cleverhans library using the MNIST, the CIFAR-10, and the CIFAR-100 datasets. Our experimental results show that the attack success rate, as well as the imperceptibility of the adversarial images, can be significantly reduced by adding effective pre-processing functions, i.e., Sobel filtering.

Submitted 14 May, 2020; v1 submitted 4 November, 2018; originally announced November 2018.

Journal ref: IEEE Design & Test, vol. 37, no. 2, pp. 58-65, April 2020

arXiv:1811.01437 [pdf, other]

doi 10.1109/IOLTS.2019.8854377

QuSecNets: Quantization-based Defense Mechanism for Securing Deep Neural Network against Adversarial Attacks

Authors: Faiq Khalid, Hassan Ali, Hammad Tariq, Muhammad Abdullah Hanif, Semeen Rehman, Rehan Ahmed, Muhammad Shafique

Abstract: Adversarial examples have emerged as a significant threat to machine learning algorithms, especially to the convolutional neural networks (CNNs). In this paper, we propose two quantization-based defense mechanisms, Constant Quantization (CQ) and Trainable Quantization (TQ), to increase the robustness of CNNs against adversarial examples. CQ quantizes input pixel intensities based on a "fixed" number of quantization levels, while in TQ, the quantization levels are "iteratively learned during the training phase", thereby providing a stronger defense mechanism. We apply the proposed techniques on undefended CNNs against different state-of-the-art adversarial attacks from the open-source Cleverhans library. The experimental results demonstrate 50%-96% and 10%-50% increase in the classification accuracy of the perturbed images generated from the MNIST and the CIFAR-10 datasets, respectively, on commonly used CNN (Conv2D(64, 8x8) - Conv2D(128, 6x6) - Conv2D(128, 5x5) - Dense(10) - Softmax()) available in Cleverhans library.

Submitted 14 May, 2020; v1 submitted 4 November, 2018; originally announced November 2018.

Journal ref: 2019 IEEE 25th International Symposium on On-Line Testing and Robust System Design (IOLTS), Rhodes, Greece, 2019, pp. 182-187
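
As an illustration of the Constant Quantization (CQ) idea mentioned in the abstract above, the following minimal sketch snaps pixel intensities to a fixed number of levels. It is not taken from the paper: the uniform spacing of levels over [0, 1], the function name `constant_quantize`, and the flat list representation of an image are assumptions. Trainable Quantization, where the levels are learned during training, is not shown.

```python
from typing import List, Sequence


def constant_quantize(pixels: Sequence[float], levels: int) -> List[float]:
    """Snap each intensity in [0, 1] to the nearest of `levels` uniform levels.

    Assumption: levels are evenly spaced at 0, 1/(levels-1), ..., 1.
    Values outside [0, 1] are clipped before quantization.
    """
    if levels < 2:
        raise ValueError("at least two quantization levels are required")
    step = 1.0 / (levels - 1)
    quantized = []
    for p in pixels:
        clipped = min(max(p, 0.0), 1.0)
        quantized.append(round(clipped / step) * step)
    return quantized


if __name__ == "__main__":
    clean = [0.30, 0.70, 0.95]
    perturbed = [0.32, 0.68, 0.97]  # small additive noise on each pixel
    # With 4 levels, both inputs map to the same quantized image.
    print(constant_quantize(clean, 4))
    print(constant_quantize(perturbed, 4))
```

The intuition the sketch tries to convey is that a perturbation smaller than half a quantization step is often erased by the rounding, so the network sees the same input with or without it. A perturbation that pushes a pixel across a level boundary still passes through.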

arXiv:1811.01031 [pdf, other]

doi 10.1109/IOLTS.2019.8854425

TrISec: Training Data-Unaware Imperceptible Security Attacks on Deep Neural Networks

Authors: Faiq Khalid, Muhammad Abdullah Hanif, Semeen Rehman, Rehan Ahmed, Muhammad Shafique

Abstract: Most of the data manipulation attacks on deep neural networks (DNNs) during the training stage introduce a perceptible noise that can be handled by preprocessing during inference or can be identified during the validation phase. Therefore, data poisoning attacks during inference (e.g., adversarial attacks) are becoming more popular. However, many of them do not consider the imperceptibility factor in their optimization algorithms, and can be detected by correlation and structural similarity analysis, or noticeable (e.g., by humans) in a multi-level security system. Moreover, the majority of inference attacks rely on some knowledge about the training dataset. In this paper, we propose a novel methodology which automatically generates imperceptible attack images by using the back-propagation algorithm on pre-trained DNNs, without requiring any information about the training dataset (i.e., completely training data-unaware). We present a case study on traffic sign detection using the VGGNet trained on the German Traffic Sign Recognition Benchmarks dataset in an autonomous driving use case. Our results demonstrate that the generated attack images successfully perform misclassification while remaining imperceptible in both "subjective" and "objective" quality tests.

Submitted 14 May, 2020; v1 submitted 2 November, 2018; originally announced November 2018.

Journal ref: 2019 IEEE 25th International Symposium on On-Line Testing and Robust System Design (IOLTS), Rhodes, Greece, 2019, pp. 188-193

arXiv:1810.12910 [pdf, other]

MPNA: A Massively-Parallel Neural Array Accelerator with Dataflow Optimization for Convolutional Neural Networks

Authors: Muhammad Abdullah Hanif, Rachmad Vidya Wicaksana Putra, Muhammad Tanvir, Rehan Hafiz, Semeen Rehman, Muhammad Shafique

Abstract: The state-of-the-art accelerators for Convolutional Neural Networks (CNNs) typically focus on accelerating only the convolutional layers, but do not prioritize the fully-connected layers much. Hence, they lack a synergistic optimization of the hardware architecture and diverse dataflows for the complete CNN design, which can provide a higher potential for performance/energy efficiency. Towards this, we propose a novel Massively-Parallel Neural Array (MPNA) accelerator that integrates two heterogeneous systolic arrays and respective highly-optimized dataflow patterns to jointly accelerate both the convolutional (CONV) and the fully-connected (FC) layers. Besides fully-exploiting the available off-chip memory bandwidth, these optimized dataflows enable high data-reuse of all the data types (i.e., weights, input and output activations), and thereby enable our MPNA to achieve high energy savings. We synthesized our MPNA architecture using the ASIC design flow for a 28nm technology, and performed functional and timing validation using multiple real-world complex CNNs. MPNA achieves 149.7GOPS/W at 280MHz and consumes 239mW. Experimental results show that our MPNA architecture provides 1.7x overall performance improvement compared to state-of-the-art accelerator, and 51% energy saving compared to the baseline architecture.

Submitted 30 October, 2018; originally announced October 2018.

arXiv:1801.03781 [pdf, other]

doi 10.1021/acs.jpca.7b12506

Penning ionization of acene molecules by He nanodroplets

Authors: Mykola Shcherbinin, Aaron C. LaForge, Muhammad Hanif, Robert Richter, Marcel Mudrich

Abstract: Acene molecules (anthracene, tetracene, pentacene) and fullerene (C$_{60}$) are embedded in He nanodroplets (He$_N$) and probed by EUV synchrotron radiation. When resonantly exciting the He nanodroplets, the embedded molecules M are efficiently ionized by the Penning reaction $\mathrm{He}_N^*+\mathrm{M}\rightarrow\mathrm{He}_N + \mathrm{M}^+ + e^-$. However, the Penning electron spectra are broad and structureless, resembling neither those measured by binary Penning collisions nor those measured for dopants bound to the He droplet surface. The similarity of all four spectra indicates that electron spectra of embedded species are substantially altered by electron-He scattering. Simulations based on elastic binary electron-He collisions qualitatively reproduce the measured spectra, but require the assumption of unexpectedly large He droplets.

Submitted 11 January, 2018; originally announced January 2018.

arXiv:1704.00651 [pdf, other]

Fast Encoding and Decoding of Flexible-Rate and Flexible-Length Polar Codes

Authors: Muhammad Hanif, Masoud Ardakani

Abstract: This work is on fast encoding and decoding of polar codes. We propose and detail 8-bit and 16-bit parallel decoders that can be used to reduce the decoding latency of the successive-cancellation decoder. These decoders are universal and can decode flexible-rate and flexible-length polar codes. We also present fast encoders that can be used to increase the throughput of serially-implemented polar encoders.

Submitted 3 April, 2017; originally announced April 2017.

arXiv:1701.04733 [pdf]

BTAS: A Library for Tropical Algebra

Authors: Ahsan Humayun, Dr. Muhammad Asif, Dr. Muhammad Kashif Hanif

Abstract: GPUs are dedicated processors used for complex calculations and simulations and they can be effectively used for tropical algebra computations. Tropical algebra is based on max-plus algebra and min-plus algebra. In this paper we proposed and designed a library based on Tropical Algebra which is used to provide standard vector and matrix operations namely Basic Tropical Algebra Subroutines (BTAS). The testing of BTAS library is conducted by implementing the sequential version of Floyd Warshall Algorithm on CPU and furthermore parallel version on GPU. The developed library for tropical algebra delivered extensively better results on a less expensive GPU as compared to the same on CPU.

Submitted 17 January, 2017; originally announced January 2017.

Journal ref: International Journal of Computer Science and Information Security 2016 Volume 14 No.12
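
To make the link between min-plus (tropical) algebra and the Floyd Warshall Algorithm mentioned in the abstract above concrete, here is a small CPU-only sketch. It does not reproduce the BTAS API: the function names `min_plus_matmul` and `floyd_warshall` and the nested-list matrix representation are assumptions chosen for illustration. In min-plus algebra, "addition" is `min` and "multiplication" is ordinary `+`, so a tropical matrix product combines path lengths.

```python
from typing import List

INF = float("inf")  # tropical "zero": the identity element for min
Matrix = List[List[float]]


def min_plus_matmul(a: Matrix, b: Matrix) -> Matrix:
    """Tropical matrix product: c[i][j] = min over k of (a[i][k] + b[k][j])."""
    inner = len(b)
    cols = len(b[0])
    return [
        [min(row[k] + b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def floyd_warshall(weights: Matrix) -> Matrix:
    """Sequential all-pairs shortest paths on a weighted adjacency matrix.

    weights[i][j] is the edge weight from i to j, INF if there is no edge,
    and 0 on the diagonal.
    """
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input is left untouched
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist


if __name__ == "__main__":
    w = [
        [0, 3, INF],
        [INF, 0, 1],
        [2, INF, 0],
    ]
    print(floyd_warshall(w))
    # For this 3-node graph, one tropical squaring already covers every
    # shortest path, since none needs more than two edges.
    print(min_plus_matmul(w, w))
```

The inner update in `floyd_warshall` is itself a min-plus operation, and every `(i, j)` pair for a fixed `k` can be updated independently, which is the kind of data parallelism a GPU version can exploit.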

arXiv:1505.05735 [pdf, ps, other]

doi 10.1109/TSP.2015.2480042

A Minorization-Maximization Method for Optimizing Sum Rate in Non-Orthogonal Multiple Access Systems

Authors: Muhammad Fainan Hanif, Zhiguo Ding, Tharmalingam Ratnarajah, George K. Karagiannidis

Abstract: Non-orthogonal multiple access (NOMA) systems have the potential to deliver higher system throughput, compared to contemporary orthogonal multiple access techniques. For a linearly precoded multiple-input single-output (MISO) system, we study the downlink sum rate maximization problem, when the NOMA principles are applied. Being a non-convex and intractable optimization problem, we resort to approximate it with a minorization-maximization algorithm (MMA), which is a widely used tool in statistics. In each step of the MMA, we solve a second-order cone program, such that the feasibility set in each step contains that of the previous one, and is always guaranteed to be a subset of the feasibility set of the original problem. It should be noted that the algorithm takes a few iterations to converge. Furthermore, we study the conditions under which the achievable rates maximization can be further simplified to a low complexity design problem, and we compute the probability of occurrence of this event. Numerical examples are conducted to show a comparison of the proposed approach against conventional multiple access systems. NOMA is reported to provide better spectral and power efficiency with a polynomial time computational complexity.

Submitted 21 May, 2015; originally announced May 2015.

Comments: Submitted for journal publication

Journal ref: IEEE Transactions on Signal Processing, vol. 64, no. 1, pp. 76-88, Jan. 1, 2016

arXiv:1404.5083 [pdf, other]

Transmit Antenna Selection in Underlay Cognitive Radio Environment

Authors: Muhammad Hanif, Hong-Chuan Yang, Mohamed-Slim Alouini

Abstract: Cognitive radio (CR) technology addresses the problem of spectrum under-utilization. In underlay CR mode, the secondary users are allowed to communicate provided that their transmission is not detrimental to primary user communication. Transmit antenna selection is one of the low-complexity methods to increase the capacity of wireless communication systems. In this article, we propose and analyze the performance benefit of a transmit antenna selection scheme for underlay secondary system that ensures the instantaneous interference caused by the secondary transmitter to the primary receiver is below a predetermined level. Closed-form expressions of the outage probability, amount of fading, and ergodic capacity for the secondary network are derived. Monte Carlo simulations are also carried out to confirm various mathematical results presented in this article.

Submitted 30 July, 2014; v1 submitted 20 April, 2014; originally announced April 2014.

Comments: 16 pages, 5 figures

arXiv:1312.3145 [pdf]

Algorithm for spectral response analysis of superconducting microwave transmission-line resonator

Authors: Muhammad Hanif

Abstract: It has always been a challenge for researchers to efficiently and accurately post-process experimental data that is distorted by noise. Superconducting microwave devices, e.g., resonators, directional filters, beam-splitters etc., operate at frequencies of several GHz to THz and temperatures well below the critical temperature (Tc), with few exceptions like transition edge sensors, where devices are operated at temperatures close to Tc. These devices are usually measured with a vector network analyser (VNA) in terms of scattering parameters. Two kinds of errors, systematic and drift, can easily be removed from the measurements taken with a VNA. However, random errors are not easy to address and remove due to their unpredictability and randomness. In this manuscript we present an algorithm to post-process experimental data to cope with measurements that have been corrupted or in which the useful spectral response is buried in spurious signal. We have developed a robust and efficient algorithm, implemented in MATLAB, to detect peaks in the spectral response, remove the baseline and finally estimate parameters of a two-port superconductor resonator using an Improved Nelder-Mead Method for unconstrained multidimensional least square minimization. The algorithm has been successfully tested and verified by processing the spectral response of a half wavelength microwave transmission-line resonator, successfully isolating the resonator response from a noisy background. We were able to compute the loaded quality factor and resonance frequency from response data with high reproducibility, even from those experimental data sets where resonance spikes were hardly visible.

Submitted 11 December, 2013; originally announced December 2013.

Comments: 5 pages, 5 figures

Journal ref: Sci. Int.(Lahore),25(4),813-817,2013

arXiv:1304.3998 [pdf, ps, other]

doi 10.1109/TCOMM.2014.2320913

Computationally Efficient Robust Beamforming for SINR Balancing in Multicell Downlink

Authors: Muhammad Fainan Hanif, Le-Nam Tran, Antti Tölli, Markku Juntti

Abstract: We address the problem of downlink beamformer design for signal-to-interference-plus-noise ratio (SINR) balancing in a multiuser multicell environment with imperfectly estimated channels at base stations (BSs). We first present a semidefinite program (SDP) based approximate solution to the problem. Then, as our main contribution, by exploiting some properties of the robust counterpart of the optimization problem, we arrive at a second-order cone program (SOCP) based approximation of the balancing problem. The advantages of the proposed SOCP-based design are twofold. First, it greatly reduces the computational complexity compared to the SDP-based method. Second, it applies to a wide range of uncertainty models. As a case study, we investigate the performance of proposed formulations when the base station is equipped with a massive antenna array. Numerical experiments are carried out to confirm that the proposed robust designs achieve favorable results in scenarios of practical interest.

Submitted 15 April, 2013; originally announced April 2013.

Comments: 26 pages, 5 figures. Submitted for possible publication

Journal ref: IEEE Transactions on Communications, vol.62, no.6, pp.1908,1920, June 2014

arXiv:1301.0178 [pdf, ps, other]

doi 10.1109/TSP.2013.2278815

Efficient Solutions for Weighted Sum Rate Maximization in Multicellular Networks With Channel Uncertainties

Authors: Muhammad Fainan Hanif, Le-Nam Tran, Antti Tölli, Markku Juntti, Savo Glisic

Abstract: The important problem of weighted sum rate maximization (WSRM) in a multicellular environment is intrinsically sensitive to channel estimation errors. In this paper, we study ways to maximize the weighted sum rate in a linearly precoded multicellular downlink system where the receivers are equipped with a single antenna. With perfect channel information available at the base stations, we first present a novel fast converging algorithm that solves the WSRM problem. Then, the assumption is relaxed to the case where the error vectors in the channel estimates are assumed to lie in an uncertainty set formed by the intersection of finite ellipsoids. As our main contributions, we present two procedures to solve the intractable nonconvex robust designs based on the worst case principle. The proposed iterative algorithms solve the semidefinite programs in each of their steps and provably converge to a locally optimal solution of the robust WSRM problem. The proposed approaches are numerically compared against each other to ascertain their robustness towards channel estimation imperfections. The results clearly indicate the performance gain compared to the case when channel uncertainties are ignored in the design process. For certain scenarios, we also quantify the gap between the proposed approximations and exact solutions.

Submitted 2 January, 2013; originally announced January 2013.

Comments: 31 pages, 8 figures. Submitted for possible publication

Journal ref: IEEE Transactions on Signal Processing, vol.61, no.22, pp.5659--5674, Nov., 2013

arXiv:1211.1969 [pdf, ps, other]

doi 10.1109/LSP.2012.2223211

Fast Converging Algorithm for Weighted Sum Rate Maximization in Multicell MISO Downlink

Authors: Le-Nam Tran, Muhammad Fainan Hanif, Antti Tölli, Markku Juntti

Abstract: The problem of maximizing weighted sum rates in the downlink of a multicell environment is of considerable interest. Unfortunately, this problem is known to be NP-hard. For the case of multi-antenna base stations and single antenna mobile terminals, we devise a low complexity, fast and provably convergent algorithm that locally optimizes the weighted sum rate in the downlink of the system. In particular, we derive an iterative second-order cone program formulation of the weighted sum rate maximization problem. The algorithm converges to a local optimum within a few iterations. Superior performance of the proposed approach is established by numerically comparing it to other known solutions.

Submitted 8 November, 2012; originally announced November 2012.

Comments: 10 pages, 2 figures. The MATLAB code of the proposed algorithm can be downloaded from: https://sites.google.com/site/namletran/publication

Journal ref: IEEE Signal Processing Letters, vol.19, no.12, pp.872-875, Dec. 2012. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6327333&isnumber=6323087

arXiv:1209.5448 [pdf]

A New Compression Based Index Structure for Efficient Information Retrieval

Authors: Md. Abdullah al Mamun, Md. Hanif, Md. Rakib Uddin, Tanvir Ahmed, Md. Mofizul Islam

Abstract: Finding desired information in a large data set is a difficult problem. Information retrieval (IR) is concerned with the structure, analysis, organization, storage, searching, and retrieval of information. The index is the main constituent of an IR system. Nowadays, the exponential growth of information makes the index structure large enough to affect the IR system's quality. Compressing the index structure is therefore our main contribution in this paper. We compressed the document numbers in inverted file entries using a new coding technique based on run-length encoding. Our coding mechanism uses a specified code which acts over run-length coding. We experimented and found that our coding mechanism compresses, on average, 67.34% more than the other techniques.

Submitted 24 September, 2012; originally announced September 2012.

Comments: 5 pages

Journal ref: International Journal of Science and Technology, Volume 2 No.1, pp. 10-14, January 2012

arXiv:0906.1618 [pdf, ps, other]

doi 10.1109/TWC.2010.02.090864

On the Statistics of Cognitive Radio Capacity in Shadowing and Fast Fading Environments (Journal Version)

Authors: Muhammad Fainan Hanif, Peter J. Smith

Abstract: In this paper we consider the capacity of the cognitive radio channel in different fading environments under a low interference regime. First we derive the probability that the low interference regime holds under shadow fading as well as Rayleigh and Rician fast fading conditions. We demonstrate that this is the dominant case, especially in practical cognitive radio deployment scenarios. The capacity of the cognitive radio channel depends critically on a power loss parameter, $α$, which governs how much transmit power the cognitive radio dedicates to relaying the primary message. We derive a simple, accurate approximation to $α$ in Rayleigh and Rician fading environments which gives considerable insight into system capacity. We also investigate the effects of system parameters and propagation environment on $α$ and the cognitive radio capacity. In all cases, the use of the approximation is shown to be extremely accurate.

Submitted 8 June, 2009; originally announced June 2009.

Comments: Submitted to the IEEE Transactions on Wireless Commun. The conference version of this paper appears in Proc. IEEE CrownCom, 2009

Journal ref: IEEE Transactions on Wireless Communications, vol.9, no.2, pp.844-852, 2010

arXiv:0905.3602 [pdf, ps, other]

doi 10.1109/TWC.2010.04.090749

Level Crossing Rates of Interference in Cognitive Radio Networks

Authors: Muhammad Fainan Hanif, Peter J. Smith

Abstract: The future deployment of cognitive radios is critically dependent on the fact that the incumbent primary user system must remain as oblivious as possible to their presence. This in turn heavily relies on the fluctuations of the interfering cognitive radio signals. In this letter we compute the level crossing rates of the cumulative interference created by the cognitive radios. We derive analytical formulae for the level crossing rates in Rayleigh and Rician fast fading conditions. We approximate Rayleigh and Rician level crossing rates using fluctuation rates of gamma and scaled noncentral $χ^2$ processes respectively. The analytical results and the approximations used in their derivations are verified by Monte Carlo simulations and the analysis is applied to a particular CR allocation strategy.

Submitted 22 May, 2009; originally announced May 2009.

Comments: submitted to the IEEE Transactions on Wireless Communications

Journal ref: IEEE Transactions on Wireless Communications, vol.9, no.4, pp.1283-1287, 2010

arXiv:0905.3201 [pdf, ps, other]

On the Statistics of Cognitive Radio Capacity in Shadowing and Fast Fading Environments

Authors: Muhammad Fainan Hanif, Peter J. Smith, Mansoor Shafi

Abstract: In this paper we consider the capacity of the cognitive radio channel in a fading environment under a "low interference regime". This capacity depends critically on a power loss parameter, $α$, which governs how much transmit power the cognitive radio dedicates to relaying the primary message. We derive a simple, accurate approximation to $α$ which gives considerable insight into system capacity. We also investigate the effects of system parameters and propagation environment on $α$ and the cognitive radio capacity. In all cases, the use of the approximation is shown to be extremely accurate. Finally, we derive the probability that the "low interference regime" holds and demonstrate that this is the dominant case, especially in practical cognitive radio deployment scenarios.

Submitted 20 May, 2009; originally announced May 2009.

Comments: to appear in IEEE CrownCom 2009 Proc

arXiv:0905.3030 [pdf, ps, other]

doi 10.1109/AUSCTW.2009.4805601

Performance of Cognitive Radio Systems with Imperfect Radio Environment Map Information

Authors: Muhammad Fainan Hanif, Peter J. Smith, Mansoor Shafi

Abstract: In this paper we describe the effect of imperfections in the radio environment map (REM) information on the performance of cognitive radio (CR) systems. Via simulations we explore the relationship between the required precision of the REM and various channel/system properties. For example, the degree of spatial correlation in the shadow fading is a key factor as is the interference constraint employed by the primary user. Based on the CR interferers obtained from the simulations, we characterize the temporal behavior of such systems by computing the level crossing rates (LCRs) of the cumulative interference represented by these CRs. This evaluates the effect of short term fluctuations above acceptable interference levels due to the fast fading. We derive analytical formulae for the LCRs in Rayleigh and Rician fast fading conditions. The analytical results are verified by Monte Carlo simulations.

Submitted 20 May, 2009; v1 submitted 19 May, 2009; originally announced May 2009.

Comments: presented at IEEE AusCTW 2009. Journal versions are under preparation. This posting is the same as the original one. Only author's list is updated that was unfortunately not correctly mentioned in the first version

arXiv:0905.3023 [pdf, ps, other]

doi 10.1109/ICC.2009.5199089

Interference and Deployment Issues for Cognitive Radio Systems in Shadowing Environments

Authors: Muhammad Fainan Hanif, Mansoor Shafi, Peter J. Smith, Pawel A. Dmochowski

Abstract: In this paper we describe a model for calculating the aggregate interference encountered by primary receivers in the presence of randomly placed cognitive radios (CRs). We show that incorporating the impact of distance attenuation and lognormal fading on each constituent interferer in the aggregate leads to a composite interference that cannot be satisfactorily modeled by a lognormal. Using the interference statistics we determine a number of key parameters needed for the deployment of CRs. Examples of these are the exclusion zone radius, needed to protect the primary receiver under different types of fading environments and acceptable interference levels, and the numbers of CRs that can be deployed. We further show that if the CRs have a priori knowledge of the radio environment map (REM), then a much larger number of CRs can be deployed, especially in a high density environment. Given REM information, we also look at the CR numbers achieved by two different types of techniques to process the scheduling information.

Submitted 20 May, 2009; v1 submitted 19 May, 2009; originally announced May 2009.

Comments: to be presented at IEEE ICC 2009. This posting is the same as the original one. Only author's list is updated that was unfortunately not correctly mentioned in first version

arXiv:0905.0664 [pdf, ps, other]

doi 10.1142/S0217751X10047804

Radiative Corrections in Vector-Tensor Models

Authors: A. Buchel, F. A. Chishtie, M. T. Hanif, S. Homayouni, J. Jia, D. G. C. McKeon

Abstract: We consider a two-form antisymmetric tensor field $φ$ minimally coupled to a non-abelian vector field with a field strength $F$. Canonical analysis suggests that a pseudoscalar mass term $\frac{μ^2}{2}\,\mathrm{tr}(φ \wedge φ)$ for the tensor field eliminates degrees of freedom associated with this field. Explicit one loop calculations show that an additional coupling $m\,\mathrm{tr}(φ \wedge F)$ (which can be eliminated classically by a tensor field shift) reintroduces tensor field degrees of freedom. We attribute this to the lack of renormalizability in our vector-tensor model. We also explore a vector-tensor model with a tensor field scalar mass term $\frac{μ^2}{2}\,\mathrm{tr}(φ \wedge \star φ)$ and coupling $m\,\mathrm{tr}(φ \wedge \star F)$. We comment on the Stueckelberg mechanism for mass generation in the Abelian version of the latter model.

Submitted 5 May, 2009; originally announced May 2009.

Comments: 8 pages

Journal ref: Int.J.Mod.Phys.A25:163-169,2010

arXiv:0901.2039 [pdf]

doi 10.1186/1757-5036-1-3

ATR-FTIR spectroscopy detects alterations induced by organotin(IV) carboxylates in MCF-7 cells at sub-cytotoxic/-genotoxic concentrations

Authors: Muhammad S Ahmad, Bushra Mirza, Mukhtiar Hussain, Muhammad Hanif, Saqib Ali, Michael J Walsh, Francis L Martin

Abstract: The environmental impact of metal complexes such as organotin(IV) compounds is of increasing concern. Genotoxic effects of organotin(IV) compounds (0.01 microg/ml, 0.1 microg/ml or 1.0 microg/ml) were measured using the alkaline single-cell gel electrophoresis (comet) assay to measure DNA single-strand breaks (SSBs) and the cytokinesis-block micronucleus (CBMN) assay to determine micronucleus formation. Biochemical-cell signatures were also ascertained using attenuated total reflection Fourier-transform infrared (ATR-FTIR) spectroscopy. In the comet assay, organotin(IV) carboxylates induced significantly-elevated levels of DNA SSBs. Elevated micronucleus-forming activities were also observed. Following interrogation using ATR-FTIR spectroscopy, infrared spectra in the biomolecular range (900 cm-1 - 1800 cm-1) derived from orga...

Submitted 14 January, 2009; originally announced January 2009.

Comments: 19 pages, 7 figures

Journal ref: PMC Biophysics 2008, 1:3

Showing 51–92 of 92 results for author: Hanif, M