Showing 1–50 of 94 results for author: Kannan, R

  1. arXiv:2407.11215  [pdf, other]

    cs.LG cs.AI cs.CE cs.CL math.NA

    Mechanistic interpretability of large language models with applications to the financial services industry

    Authors: Ashkan Golgoon, Khashayar Filom, Arjun Ravi Kannan

    Abstract: Large Language Models such as GPTs (Generative Pre-trained Transformers) exhibit remarkable capabilities across a broad spectrum of applications. Nevertheless, due to their intrinsic complexity, these models present substantial challenges in interpreting their internal decision-making processes. This lack of transparency poses critical challenges when it comes to their adaptation by financial inst…

    Submitted 15 July, 2024; originally announced July 2024.

    MSC Class: 68T01; ACM Class: I.2.7

  2. arXiv:2406.02778  [pdf, other]

    cs.LG

    MS-IMAP -- A Multi-Scale Graph Embedding Approach for Interpretable Manifold Learning

    Authors: Shay Deutsch, Lionel Yelibi, Alex Tong Lin, Arjun Ravi Kannan

    Abstract: Deriving meaningful representations from complex, high-dimensional data in unsupervised settings is crucial across diverse machine learning applications. This paper introduces a framework for multi-scale graph network embedding based on spectral graph wavelets that employs a contrastive learning approach. A significant feature of the proposed embedding is its capacity to establish a correspondence…

    Submitted 5 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  3. Sparse MTTKRP Acceleration for Tensor Decomposition on GPU

    Authors: Sasindu Wijeratne, Rajgopal Kannan, Viktor Prasanna

    Abstract: Sparse Matricized Tensor Times Khatri-Rao Product (spMTTKRP) is the bottleneck kernel of sparse tensor decomposition. In this work, we propose a GPU-based algorithm design to address the key challenges in accelerating spMTTKRP computation, including (1) eliminating global atomic operations across GPU thread blocks, (2) avoiding the intermediate values being communicated between GPU thread blocks a…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: In 21st ACM International Conference on Computing Frontiers (CF '24), May 7-9, 2024, Ischia, Italy

  4. arXiv:2405.00636  [pdf, other]

    physics.soc-ph cs.LG cs.SI physics.data-an

    Robustness of graph embedding methods for community detection

    Authors: Zhi-Feng Wei, Pablo Moriano, Ramakrishnan Kannan

    Abstract: This study investigates the robustness of graph embedding methods for community detection in the face of network perturbations, specifically edge deletions. Graph embedding techniques, which represent nodes as low-dimensional vectors, are widely used for various graph machine learning tasks due to their ability to capture structural properties of networks effectively. However, the impact of pertur…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 17 pages, 26 figures, 3 tables. Comments are welcome

  5. arXiv:2404.07188  [pdf, other]

    cs.DC cs.CV eess.IV

    GCV-Turbo: End-to-end Acceleration of GNN-based Computer Vision Tasks on FPGA

    Authors: Bingyi Zhang, Rajgopal Kannan, Carl Busart, Viktor Prasanna

    Abstract: Graph neural networks (GNNs) have recently empowered various novel computer vision (CV) tasks. In GNN-based CV tasks, a combination of CNN layers and GNN layers or only GNN layers are employed. This paper introduces GCV-Turbo, a domain-specific accelerator on FPGA for end-to-end acceleration of GNN-based CV tasks. GCV-Turbo consists of two key components: (1) a novel hardware architecture o…

    Submitted 10 April, 2024; originally announced April 2024.

  6. arXiv:2404.04527  [pdf, other]

    cs.CV cs.AI cs.AR cs.DC

    VTR: An Optimized Vision Transformer for SAR ATR Acceleration on FPGA

    Authors: Sachini Wickramasinghe, Dhruv Parikh, Bingyi Zhang, Rajgopal Kannan, Viktor Prasanna, Carl Busart

    Abstract: Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) is a key technique used in military applications like remote-sensing image recognition. Vision Transformers (ViTs) are the current state-of-the-art in various computer vision applications, outperforming their CNN counterparts. However, using ViTs for SAR ATR applications is challenging due to (1) standard ViTs require extensive trai…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: SPIE DCS 2024

  7. arXiv:2404.03225  [pdf, other]

    cs.CV cs.LG

    FACTUAL: A Novel Framework for Contrastive Learning Based Robust SAR Image Classification

    Authors: Xu Wang, Tian Ye, Rajgopal Kannan, Viktor Prasanna

    Abstract: Deep Learning (DL) Models for Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR), while delivering improved performance, have been shown to be quite vulnerable to adversarial attacks. Existing works improve robustness by training models on adversarial samples. However, by focusing mostly on attacks that manipulate images randomly, they neglect the real-world feasibility of such atta…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: 2024 IEEE Radar Conference

  8. arXiv:2403.18318  [pdf, other]

    cs.CV

    Uncertainty-Aware SAR ATR: Defending Against Adversarial Attacks via Bayesian Neural Networks

    Authors: Tian Ye, Rajgopal Kannan, Viktor Prasanna, Carl Busart

    Abstract: Adversarial attacks have demonstrated the vulnerability of Machine Learning (ML) image classifiers in Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) systems. An adversarial attack can deceive the classifier into making incorrect predictions by perturbing the input SAR images, for example, with a few scatterers attached to the on-ground objects. Therefore, it is critical to devel…

    Submitted 27 March, 2024; originally announced March 2024.

  9. arXiv:2403.14047  [pdf, other]

    cs.DC cs.AR cs.CV

    Accelerating ViT Inference on FPGA through Static and Dynamic Pruning

    Authors: Dhruv Parikh, Shouyi Li, Bingyi Zhang, Rajgopal Kannan, Carl Busart, Viktor Prasanna

    Abstract: Vision Transformers (ViTs) have achieved state-of-the-art accuracy on various computer vision tasks. However, their high computational complexity prevents them from being applied to many real-world applications. Weight and token pruning are two well-known methods for reducing complexity: weight pruning reduces the model size and associated computational demands, while token pruning further dynamic…

    Submitted 12 April, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: FCCM 2024

  10. PaCKD: Pattern-Clustered Knowledge Distillation for Compressing Memory Access Prediction Models

    Authors: Neelesh Gupta, Pengmiao Zhang, Rajgopal Kannan, Viktor Prasanna

    Abstract: Deep neural networks (DNNs) have proven to be effective models for accurate Memory Access Prediction (MAP), a critical task in mitigating memory latency through data prefetching. However, existing DNN-based MAP models suffer from the challenges such as significant physical storage space and poor inference latency, primarily due to their large number of parameters. These limitations render them imp…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: 6 pages, 2 figures, HPEC '23

    Journal ref: 2023 IEEE High Performance Extreme Computing Conference (HPEC), 2023, pp. 1-7

  11. arXiv:2402.11760  [pdf, other]

    cs.LG cs.CV

    Reinforcement Learning as a Parsimonious Alternative to Prediction Cascades: A Case Study on Image Segmentation

    Authors: Bharat Srikishan, Anika Tabassum, Srikanth Allu, Ramakrishnan Kannan, Nikhil Muralidhar

    Abstract: Deep learning architectures have achieved state-of-the-art (SOTA) performance on computer vision tasks such as object detection and image segmentation. This may be attributed to the use of over-parameterized, monolithic deep learning architectures executed on large datasets. Although such architectures lead to increased accuracy, this is usually accompanied by a large increase in computation and m…

    Submitted 18 February, 2024; originally announced February 2024.

  12. arXiv:2402.05396  [pdf, other]

    cs.LG cs.AI

    TASER: Temporal Adaptive Sampling for Fast and Accurate Dynamic Graph Representation Learning

    Authors: Gangda Deng, Hongkuan Zhou, Hanqing Zeng, Yinglong Xia, Christopher Leung, Jianbo Li, Rajgopal Kannan, Viktor Prasanna

    Abstract: Recently, Temporal Graph Neural Networks (TGNNs) have demonstrated state-of-the-art performance in various high-impact applications, including fraud detection and content recommendation. Despite the success of TGNNs, they are prone to the prevalent noise found in real-world dynamic graphs like time-deprecated links and skewed interaction distribution. The noise causes two critical issues that sign…

    Submitted 18 February, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: IPDPS 2024

  13. arXiv:2402.00564  [pdf, other]

    cs.CV cs.LG

    A Single Graph Convolution Is All You Need: Efficient Grayscale Image Classification

    Authors: Jacob Fein-Ashley, Sachini Wickramasinghe, Bingyi Zhang, Rajgopal Kannan, Viktor Prasanna

    Abstract: Image classifiers for domain-specific tasks like Synthetic Aperture Radar Automatic Target Recognition (SAR ATR) and chest X-ray classification often rely on convolutional neural networks (CNNs). These networks, while powerful, experience high latency due to the number of operations they perform, which can be problematic in real-time applications. Many image classification models are designed to w…

    Submitted 26 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: Accepted to IEEE ICIP 2024

  14. arXiv:2401.06362  [pdf, other]

    cs.NE cs.AR cs.LG cs.OS

    Attention, Distillation, and Tabularization: Towards Practical Neural Network-Based Prefetching

    Authors: Pengmiao Zhang, Neelesh Gupta, Rajgopal Kannan, Viktor K. Prasanna

    Abstract: Attention-based Neural Networks (NN) have demonstrated their effectiveness in accurate memory access prediction, an essential step in data prefetching. However, the substantial computational overheads associated with these models result in high inference latency, limiting their feasibility as practical prefetchers. To close the gap, we propose a new approach based on tabularization that significan…

    Submitted 21 February, 2024; v1 submitted 23 December, 2023; originally announced January 2024.

  15. arXiv:2401.02687  [pdf, other]

    cs.CV cs.LG eess.IV

    PAHD: Perception-Action based Human Decision Making using Explainable Graph Neural Networks on SAR Images

    Authors: Sasindu Wijeratne, Bingyi Zhang, Rajgopal Kannan, Viktor Prasanna, Carl Busart

    Abstract: Synthetic Aperture Radar (SAR) images are commonly utilized in military applications for automatic target recognition (ATR). Machine learning (ML) methods, such as Convolutional Neural Networks (CNN) and Graph Neural Networks (GNN), are frequently used to identify ground-based objects, including battle tanks, personnel carriers, and missile launchers. Determining the vehicle class, such as the BRD…

    Submitted 5 January, 2024; originally announced January 2024.

  16. Benchmarking Deep Learning Classifiers for SAR Automatic Target Recognition

    Authors: Jacob Fein-Ashley, Tian Ye, Rajgopal Kannan, Viktor Prasanna, Carl Busart

    Abstract: Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) is a key technique of remote-sensing image recognition, which can be supported by deep neural networks. The existing works of SAR ATR mostly focus on improving the accuracy of the target recognition while ignoring the system's performance in terms of speed and storage, which is critical to real-world applications of SAR ATR. For decision-mak…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: 6 Pages

  17. arXiv:2312.02912  [pdf, other]

    cs.CV

    Realistic Scatterer Based Adversarial Attacks on SAR Image Classifiers

    Authors: Tian Ye, Rajgopal Kannan, Viktor Prasanna, Carl Busart, Lance Kaplan

    Abstract: Adversarial attacks have highlighted the vulnerability of classifiers based on machine learning for Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) tasks. An adversarial attack perturbs SAR images of on-ground targets such that the classifiers are misled into making incorrect predictions. However, many existing attacking techniques rely on arbitrary manipulation of SAR images whi…

    Submitted 5 December, 2023; originally announced December 2023.

  18. arXiv:2310.16249  [pdf, other]

    cs.CE cs.AI math.NA

    A clustering tool for interrogating finite element models based on eigenvectors of graph adjacency

    Authors: Ramaseshan Kannan

    Abstract: This note introduces an unsupervised learning algorithm to debug errors in finite element (FE) simulation models and details how it was productionised. The algorithm clusters degrees of freedom in the FE model using numerical properties of the adjacency of its stiffness matrix. The algorithm has been deployed as a tool called 'Model Stability Analysis' tool within the commercial structural FE suit…

    Submitted 24 October, 2023; originally announced October 2023.

  19. arXiv:2310.10902  [pdf, other]

    cs.AR eess.SP

    Reuse Kernels or Activations? A Flexible Dataflow for Low-latency Spectral CNN Acceleration

    Authors: Yue Niu, Rajgopal Kannan, Ajitesh Srivastava, Viktor Prasanna

    Abstract: Spectral-domain CNNs have been shown to be more efficient than traditional spatial CNNs in terms of reducing computation complexity. However, they come with a 'kernel explosion' problem that, even after compression (pruning), imposes a high memory burden and off-chip bandwidth requirement for kernel access. This creates a performance gap between the potential acceleration offered by compression and…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: 11 pages, 11 figures. Accepted to ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA) 2020

  20. arXiv:2309.09142  [pdf, other]

    cs.DC

    Performance of Graph Neural Networks for Point Cloud Applications

    Authors: Dhruv Parikh, Bingyi Zhang, Rajgopal Kannan, Viktor Prasanna, Carl Busart

    Abstract: Graph Neural Networks (GNNs) have gained significant momentum recently due to their capability to learn on unstructured graph data. Dynamic GNNs (DGNNs) are the current state-of-the-art for point cloud applications; such applications (viz. autonomous driving) require real-time processing at the edge with tight latency and memory constraints. Conducting performance analysis on such DGNNs, thus, bec…

    Submitted 16 September, 2023; originally announced September 2023.

    Comments: 27th Annual IEEE High Performance Extreme Computing Conference

  21. arXiv:2309.09131  [pdf, other]

    cs.DC

    Dynasor: A Dynamic Memory Layout for Accelerating Sparse MTTKRP for Tensor Decomposition on Multi-core CPU

    Authors: Sasindu Wijeratne, Rajgopal Kannan, Viktor Prasanna

    Abstract: Sparse Matricized Tensor Times Khatri-Rao Product (spMTTKRP) is the most time-consuming compute kernel in sparse tensor decomposition. In this paper, we introduce a novel algorithm to minimize the execution time of spMTTKRP across all modes of an input tensor on multi-core CPU platform. The proposed algorithm leverages the FLYCOO tensor format to exploit data locality in external memory accesses.…

    Submitted 13 October, 2023; v1 submitted 16 September, 2023; originally announced September 2023.

  22. arXiv:2309.07108  [pdf, other]

    cs.LG cs.AI cs.MA

    Characterizing Speed Performance of Multi-Agent Reinforcement Learning

    Authors: Samuel Wiggins, Yuan Meng, Rajgopal Kannan, Viktor Prasanna

    Abstract: Multi-Agent Reinforcement Learning (MARL) has achieved significant success in large-scale AI systems and big-data applications such as smart grids, surveillance, etc. Existing advancements in MARL algorithms focus on improving the rewards obtained by introducing various mechanisms for inter-agent cooperation. However, these optimizations are usually compute- and memory-intensive, thus leading to s…

    Submitted 13 September, 2023; originally announced September 2023.

    Journal ref: In Proceedings of the 12th International Conference on Data Science, Technology and Applications - DATA (2023) 327-334

  23. arXiv:2307.11371  [pdf, ps, other]

    cs.LG cs.CG

    Random Separating Hyperplane Theorem and Learning Polytopes

    Authors: Chiranjib Bhattacharyya, Ravindran Kannan, Amit Kumar

    Abstract: The Separating Hyperplane theorem is a fundamental result in Convex Geometry with myriad applications. Our first result, Random Separating Hyperplane Theorem (RSH), is a strengthening of this for polytopes. RSH asserts that if the distance between $a$ and a polytope $K$ with $k$ vertices and unit diameter in $\mathbb{R}^d$ is at least $\delta$, where $\delta$ is a fixed constant in $(0,1)$, then a randomly chos…

    Submitted 21 July, 2023; originally announced July 2023.

  24. arXiv:2305.07119  [pdf, other]

    cs.CV cs.DC

    Graph Neural Network for Accurate and Low-complexity SAR ATR

    Authors: Bingyi Zhang, Sasindu Wijeratne, Rajgopal Kannan, Viktor Prasanna, Carl Busart

    Abstract: Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) is the key technique for remote sensing image recognition. The state-of-the-art works exploit the deep convolutional neural networks (CNNs) for SAR ATR, leading to high computation costs. These deep CNN models are unsuitable to be deployed on resource-limited platforms. In this work, we propose a graph neural network (GNN) model to…

    Submitted 11 May, 2023; originally announced May 2023.

  25. arXiv:2304.10013  [pdf, other]

    cs.NI

    HTNet: Dynamic WLAN Performance Prediction using Heterogenous Temporal GNN

    Authors: Hongkuan Zhou, Rajgopal Kannan, Ananthram Swami, Viktor Prasanna

    Abstract: Predicting the throughput of WLAN deployments is a classic problem that occurs in the design of robust and high performance WLAN systems. However, due to the increasingly complex communication protocols and the increase in interference between devices in denser and denser WLAN deployments, traditional methods either have substantial runtime or enormous prediction error and hence cannot be applied…

    Submitted 19 April, 2023; originally announced April 2023.

    Comments: InfoCom'23

  26. arXiv:2303.10216  [pdf, other]

    cs.LG math.PR

    Approximation of group explainers with coalition structure using Monte Carlo sampling on the product space of coalitions and features

    Authors: Konstandinos Kotsiopoulos, Alexey Miroshnikov, Khashayar Filom, Arjun Ravi Kannan

    Abstract: In recent years, many Machine Learning (ML) explanation techniques have been designed using ideas from cooperative game theory. These game-theoretic explainers suffer from high complexity, hindering their exact computation in practical settings. In our work, we focus on a wide class of linear game values, as well as coalitional values, for the marginal game based on a given ML model and predictor…

    Submitted 18 April, 2024; v1 submitted 17 March, 2023; originally announced March 2023.

    Comments: 31 pages, 6 figures

  27. On marginal feature attributions of tree-based models

    Authors: Khashayar Filom, Alexey Miroshnikov, Konstandinos Kotsiopoulos, Arjun Ravi Kannan

    Abstract: Due to their power and ease of use, tree-based machine learning models, such as random forests and gradient-boosted tree ensembles, have become very popular. To interpret them, local feature attributions based on marginal expectations, e.g. marginal (interventional) Shapley, Owen or Banzhaf values, may be employed. Such methods are true to the model and implementation invariant, i.e. dependent onl…

    Submitted 5 May, 2024; v1 submitted 16 February, 2023; originally announced February 2023.

    Comments: Minor corrections. 30 pages+appendix (64 pages in total), 10 figures. To appear in Foundations of Data Science

    MSC Class: Primary: 68T01; 91A12; 91A80; 05A19; Secondary: 91A68; 91A06; 05C05

  28. arXiv:2301.01454  [pdf, other]

    cs.AR cs.CV eess.IV

    Accurate, Low-latency, Efficient SAR Automatic Target Recognition on FPGA

    Authors: Bingyi Zhang, Rajgopal Kannan, Viktor Prasanna, Carl Busart

    Abstract: Synthetic aperture radar (SAR) automatic target recognition (ATR) is the key technique for remote-sensing image recognition. The state-of-the-art convolutional neural networks (CNNs) for SAR ATR suffer from high computation cost and large memory footprint, making them unsuitable to be deployed on resource-limited platforms, such as small/micro satellites. In this paper, we propose a…

    Submitted 4 January, 2023; originally announced January 2023.

  29. arXiv:2212.05250  [pdf, other]

    cs.LG cs.AR

    Phases, Modalities, Temporal and Spatial Locality: Domain Specific ML Prefetcher for Accelerating Graph Analytics

    Authors: Pengmiao Zhang, Rajgopal Kannan, Viktor K. Prasanna

    Abstract: Memory performance is a bottleneck in graph analytics acceleration. Existing Machine Learning (ML) prefetchers struggle with phase transitions and irregular memory accesses in graph processing. We propose MPGraph, an ML-based Prefetcher for Graph analytics using domain specific models. MPGraph introduces three novel optimizations: soft detection for phase transitions, phase-specific multi-modality…

    Submitted 24 September, 2023; v1 submitted 10 December, 2022; originally announced December 2022.

  30. arXiv:2208.11208  [pdf, other]

    cs.DC eess.SY

    Accelerating Monte-Carlo Tree Search on CPU-FPGA Heterogeneous Platform

    Authors: Yuan Meng, Rajgopal Kannan, Viktor Prasanna

    Abstract: Monte Carlo Tree Search (MCTS) methods have achieved great success in many Artificial Intelligence (AI) benchmarks. The in-tree operations become a critical performance bottleneck in realizing parallel MCTS on CPUs. In this work, we develop a scalable CPU-FPGA system for Tree-Parallel MCTS. We propose a novel decomposition and mapping of MCTS data structure and computation onto CPU and FPGA to red…

    Submitted 23 August, 2022; originally announced August 2022.

  31. arXiv:2207.08298  [pdf, other]

    cs.DC cs.AR cs.LG

    Towards Programmable Memory Controller for Tensor Decomposition

    Authors: Sasindu Wijeratne, Ta-Yang Wang, Rajgopal Kannan, Viktor Prasanna

    Abstract: Tensor decomposition has become an essential tool in many data science applications. Sparse Matricized Tensor Times Khatri-Rao Product (MTTKRP) is the pivotal kernel in tensor decomposition algorithms that decompose higher-order real-world large tensors into multiple matrices. Accelerating MTTKRP can speed up the tensor decomposition process immensely. Sparse MTTKRP is a challenging kernel to acce…

    Submitted 17 July, 2022; originally announced July 2022.

  32. arXiv:2205.14778  [pdf, other]

    cs.AR cs.LG

    TransforMAP: Transformer for Memory Access Prediction

    Authors: Pengmiao Zhang, Ajitesh Srivastava, Anant V. Nori, Rajgopal Kannan, Viktor K. Prasanna

    Abstract: Data Prefetching is a technique that can hide memory latency by fetching data before it is needed by a program. Prefetching relies on accurate memory access prediction, to which task machine learning based methods are increasingly applied. Unlike previous approaches that learn from deltas or offsets and perform one access prediction, we develop TransforMAP, based on the powerful Transformer model,…

    Submitted 29 May, 2022; originally announced May 2022.

  33. Fine-Grained Address Segmentation for Attention-Based Variable-Degree Prefetching

    Authors: Pengmiao Zhang, Ajitesh Srivastava, Anant V. Nori, Rajgopal Kannan, Viktor K. Prasanna

    Abstract: Machine learning algorithms have shown potential to improve prefetching performance by accurately predicting future memory accesses. Existing approaches are based on the modeling of text prediction, considering prefetching as a classification problem for sequence prediction. However, the vast and sparse memory address space leads to large vocabulary, which makes this modeling impractical. The numb…

    Submitted 1 May, 2022; originally announced May 2022.

  34. arXiv:2204.08646  [pdf, other]

    cs.LG cs.AI

    Label Efficient Regularization and Propagation for Graph Node Classification

    Authors: Tian Xie, Rajgopal Kannan, C. -C. Jay Kuo

    Abstract: An enhanced label propagation (LP) method called GraphHop was proposed recently. It outperforms graph convolutional networks (GCNs) in the semi-supervised node classification task on various networks. Although the performance of GraphHop was explained intuitively with joint node attribute and label signal smoothening, its rigorous mathematical treatment is lacking. In this paper, we propose a labe…

    Submitted 30 October, 2022; v1 submitted 18 April, 2022; originally announced April 2022.

  35. Design and Implementation of Knowledge Base for Runtime Management of Software Defined Hardware

    Authors: Hongkuan Zhou, Ajitesh Srivastava, Rajgopal Kannan, Viktor Prasanna

    Abstract: Runtime-reconfigurable software coupled with reconfigurable hardware is highly desirable as a means towards maximizing runtime efficiency without compromising programmability. Compilers for such software systems are extremely difficult to design as they must leverage different types of hardware at runtime. To address the need for static and dynamic compiler optimization of workflows matched to dyn…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: HPEC'19

  36. arXiv:2203.05095  [pdf, other]

    cs.AR cs.LG

    Model-Architecture Co-Design for High Performance Temporal GNN Inference on FPGA

    Authors: Hongkuan Zhou, Bingyi Zhang, Rajgopal Kannan, Viktor Prasanna, Carl Busart

    Abstract: Temporal Graph Neural Networks (TGNNs) are powerful models to capture temporal, structural, and contextual information on temporal graphs. The generated temporal node embeddings outperform other methods in many downstream tasks. Real-world applications require high performance inference on real-time streaming dynamic graphs. However, these models usually rely on complex attention mechanisms to cap…

    Submitted 9 March, 2022; originally announced March 2022.

    Comments: IPDPS'22

  37. arXiv:2201.07858  [pdf, ps, other]

    cs.LG cs.AI

    Decoupling the Depth and Scope of Graph Neural Networks

    Authors: Hanqing Zeng, Muhan Zhang, Yinglong Xia, Ajitesh Srivastava, Andrey Malevich, Rajgopal Kannan, Viktor Prasanna, Long Jin, Ren Chen

    Abstract: State-of-the-art Graph Neural Networks (GNNs) have limited scalability with respect to the graph and model sizes. On large graphs, increasing the model depth often means exponential expansion of the scope (i.e., receptive field). Beyond just a few layers, two fundamental challenges emerge: 1. degraded expressivity due to oversmoothing, and 2. expensive computation due to neighborhood explosion. We…

    Submitted 19 January, 2022; originally announced January 2022.

    Comments: Accepted to NeurIPS 2021

    Journal ref: Advances in Neural Information Processing Systems, 2021

  38. arXiv:2111.11259  [pdf, other]

    cs.LG math.PR

    Model-agnostic bias mitigation methods with regressor distribution control for Wasserstein-based fairness metrics

    Authors: Alexey Miroshnikov, Konstandinos Kotsiopoulos, Ryan Franks, Arjun Ravi Kannan

    Abstract: This article is a companion paper to our earlier work Miroshnikov et al. (2021) on fairness interpretability, which introduces bias explanations. In the current work, we propose a bias mitigation methodology based upon the construction of post-processed models with fairer regressor distributions for Wasserstein-based fairness metrics. By identifying the list of predictors contributing the most to…

    Submitted 19 November, 2021; originally announced November 2021.

    Comments: 29 pages, 32 figures

    MSC Class: 49Q22; 91A12; 68T01

  39. arXiv:2110.12511  [pdf, other]

    cs.DC

    Parallel Peeling of Bipartite Networks for Hierarchical Dense Subgraph Discovery

    Authors: Kartik Lakhotia, Rajgopal Kannan, Viktor Prasanna

    Abstract: Wing and Tip decomposition construct a hierarchy of butterfly-dense edge and vertex induced bipartite subgraphs, respectively. They have applications in several domains including e-commerce, recommendation systems and document analysis. Existing decomposition algorithms use a bottom-up approach that constructs the hierarchy in an increasing order of subgraph density. They iteratively peel the enti…

    Submitted 24 October, 2021; originally announced October 2021.

    Comments: 31 pages, 11 figures, 4 tables. Source code available at https://github.com/kartiklakhotia/RECEIPT

  40. arXiv:2109.13956  [pdf, ps, other]

    cs.DS cs.SC math.NA math.OC

    Bit Complexity of Jordan Normal Form and Spectral Factorization

    Authors: Papri Dey, Ravi Kannan, Nick Ryder, Nikhil Srivastava

    Abstract: We study the bit complexity of two related fundamental computational problems in linear algebra and control theory. Our results are: (1) An $\tilde{O}(n^{\omega+3}a+n^4a^2+n^\omega\log(1/\epsilon))$ time algorithm for finding an $\epsilon$-approximation to the Jordan Normal form of an integer matrix with $a$-bit entries, where $\omega$ is the exponent of matrix multiplication. (2) An…

    Submitted 25 November, 2022; v1 submitted 28 September, 2021; originally announced September 2021.

    Comments: 19pp

    Journal ref: ITCS 2023

  41. arXiv:2109.08874  [pdf, other]

    cs.AR cs.AI cs.DC

    Reconfigurable Low-latency Memory System for Sparse Matricized Tensor Times Khatri-Rao Product on FPGA

    Authors: Sasindu Wijeratne, Rajgopal Kannan, Viktor Prasanna

    Abstract: Tensor decomposition has become an essential tool in many applications in various domains, including machine learning. Sparse Matricized Tensor Times Khatri-Rao Product (MTTKRP) is one of the most computationally expensive kernels in tensor computations. Despite having significant computational parallelism, MTTKRP is a challenging kernel to optimize due to its irregular memory access characteristi…

    Submitted 18 September, 2021; originally announced September 2021.

  42. SeDyT: A General Framework for Multi-Step Event Forecasting via Sequence Modeling on Dynamic Entity Embeddings

    Authors: Hongkuan Zhou, James Orme-Rogers, Rajgopal Kannan, Viktor Prasanna

    Abstract: Temporal Knowledge Graphs store events in the form of subjects, relations, objects, and timestamps which are often represented by dynamic heterogeneous graphs. Event forecasting is a critical and challenging task in Temporal Knowledge Graph reasoning that predicts the subject or object of an event in the future. To obtain temporal embeddings multi-step away in the future, existing methods learn ge…

    Submitted 9 September, 2021; originally announced September 2021.

  43. arXiv:2108.09601  [pdf, other]

    cs.AR cs.AI cs.DC

    Programmable FPGA-based Memory Controller

    Authors: Sasindu Wijeratne, Sanket Pattnaik, Zhiyu Chen, Rajgopal Kannan, Viktor Prasanna

    Abstract: Even with generational improvements in DRAM technology, memory access latency still remains the major bottleneck for application accelerators, primarily due to limitations in memory interface IPs which cannot fully account for variations in target applications, the algorithms used, and accelerator architectures. Since developing memory controllers for different applications is time-consuming, this…

    Submitted 21 August, 2021; originally announced August 2021.

  44. arXiv:2105.08005  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Learning a Latent Simplex in Input-Sparsity Time

    Authors: Ainesh Bakshi, Chiranjib Bhattacharyya, Ravi Kannan, David P. Woodruff, Samson Zhou

    Abstract: We consider the problem of learning a latent $k$-vertex simplex $K\subset\mathbb{R}^d$, given access to $A\in\mathbb{R}^{d\times n}$, which can be viewed as a data matrix with $n$ points that are obtained by randomly perturbing latent points in the simplex $K$ (potentially beyond $K$). A large class of latent variable models, such as adversarial clustering, mixed membership stochastic block models…

    Submitted 17 May, 2021; originally announced May 2021.

    Comments: ICLR 2021

  45. Accelerating Large Scale Real-Time GNN Inference using Channel Pruning

    Authors: Hongkuan Zhou, Ajitesh Srivastava, Hanqing Zeng, Rajgopal Kannan, Viktor Prasanna

    Abstract: Graph Neural Networks (GNNs) are proven to be powerful models to generate node embeddings for downstream applications. However, due to the high computational complexity of GNN inference, it is hard to deploy GNNs for large-scale or real-time applications. In this paper, we propose to accelerate GNN inference by pruning the dimensions in each layer with negligible accuracy loss. Our pruning framework…

    Submitted 10 May, 2021; originally announced May 2021.

  46. arXiv:2104.11079  [pdf, other]

    cs.AI cs.CE

    Randomized Algorithms for Scientific Computing (RASC)

    Authors: Aydin Buluc, Tamara G. Kolda, Stefan M. Wild, Mihai Anitescu, Anthony DeGennaro, John Jakeman, Chandrika Kamath, Ramakrishnan Kannan, Miles E. Lopes, Per-Gunnar Martinsson, Kary Myers, Jelani Nelson, Juan M. Restrepo, C. Seshadhri, Draguna Vrabie, Brendt Wohlberg, Stephen J. Wright, Chao Yang, Peter Zwart

    Abstract: Randomized algorithms have propelled advances in artificial intelligence and represent a foundational research area in advancing AI for Science. Future advancements in DOE Office of Science priority areas such as climate science, astrophysics, fusion, advanced materials, combustion, and quantum computing all require randomized algorithms for surmounting challenges of complexity, robustness, and sc…

    Submitted 21 March, 2022; v1 submitted 19 April, 2021; originally announced April 2021.

  47. arXiv:2102.10878  [pdf, other]

    cs.GT math.PR

    Stability theory of game-theoretic group feature explanations for machine learning models

    Authors: Alexey Miroshnikov, Konstandinos Kotsiopoulos, Khashayar Filom, Arjun Ravi Kannan

    Abstract: In this article, we study feature attributions of Machine Learning (ML) models originating from linear game values and coalitional values defined as operators on appropriate functional spaces. The main focus is on random games based on the conditional and marginal expectations. The first part of our work formulates a stability theory for these explanation operators by establishing certain bounds f…

    Submitted 3 April, 2024; v1 submitted 22 February, 2021; originally announced February 2021.

    Comments: 76 pages, 41 figures. Major revision. The title has been changed

    MSC Class: 91A06; 91A12; 91A80; 46N30; 46N99; 68T01

  48. arXiv:2012.04388  [pdf, ps, other]

    cs.DS cs.IR

    Algorithms for finding $k$ in $k$-means

    Authors: Chiranjib Bhattacharyya, Ravindran Kannan, Amit Kumar

    Abstract: $k$-means clustering requires as input the exact value of $k$, the number of clusters. Two challenges are open: (i) Is there a data-determined definition of $k$ which is provably correct and (ii) Is there a polynomial time algorithm to find $k$…

    Submitted 8 December, 2020; originally announced December 2020.

    Comments: 38 pages

    ACM Class: I.5.3

  49. arXiv:2012.01380   

    cs.LG

    Deep Graph Neural Networks with Shallow Subgraph Samplers

    Authors: Hanqing Zeng, Muhan Zhang, Yinglong Xia, Ajitesh Srivastava, Andrey Malevich, Rajgopal Kannan, Viktor Prasanna, Long Jin, Ren Chen

    Abstract: While Graph Neural Networks (GNNs) are powerful models for learning representations on graphs, most state-of-the-art models do not have significant accuracy gain beyond two to three layers. Deep GNNs fundamentally need to address: 1) expressivity challenge due to oversmoothing, and 2) computation challenge due to neighborhood explosion. We propose a simple "deep GNN, shallow sampler" design prin…

    Submitted 23 March, 2022; v1 submitted 2 December, 2020; originally announced December 2020.

    Comments: The complete version of this paper is accepted to NeurIPS 2021, available on arXiv under the new title "Decoupling the depth and scope of graph neural networks" (arXiv:2201.07858). This version, "Deep graph neural networks with shallow subgraph samplers", is a short version and we withdraw it to avoid confusion. Please always refer to arXiv:2201.07858

  50. DYNAMAP: Dynamic Algorithm Mapping Framework for Low Latency CNN Inference

    Authors: Yuan Meng, Sanmukh Kuppannagari, Rajgopal Kannan, Viktor Prasanna

    Abstract: Most of the existing work on FPGA acceleration of Convolutional Neural Networks (CNNs) focuses on employing a single strategy (algorithm, dataflow, etc.) across all the layers. Such an approach does not achieve optimal latency on complex and deep CNNs. Emerging CNNs have diverse per-layer computation characteristics including parallelism, arithmetic intensity, locality, and memory footprint. Per-layer…

    Submitted 13 March, 2021; v1 submitted 1 December, 2020; originally announced December 2020.

    Comments: Published in ACM/SIGDA FPGA '21