subscribe to arXiv mailings

Self-Supervised Learning Based Handwriting Verification

Authors: Mihir Chauhan, Mohammad Abuzar Shaikh, Bina Ramamurthy, Mingchen Gao, Siwei Lyu, Sargur Srihari

Abstract: We present SSL-HV: Self-Supervised Learning approaches applied to the task of Handwriting Verification. This task involves determining whether a given pair of handwritten images originate from the same or different writer distribution. We have compared the performance of multiple generative, contrastive SSL approaches against handcrafted feature extractors and supervised learning on CEDAR AND data… ▽ More We present SSL-HV: Self-Supervised Learning approaches applied to the task of Handwriting Verification. This task involves determining whether a given pair of handwritten images originate from the same or different writer distribution. We have compared the performance of multiple generative, contrastive SSL approaches against handcrafted feature extractors and supervised learning on CEDAR AND dataset. We show that ResNet based Variational Auto-Encoder (VAE) outperforms other generative approaches achieving 76.3% accuracy, while ResNet-18 fine-tuned using Variance-Invariance-Covariance Regularization (VICReg) outperforms other contrastive approaches achieving 78% accuracy. Using a pre-trained VAE and VICReg for the downstream task of writer verification we observed a relative improvement in accuracy of 6.7% and 9% over ResNet-18 supervised baseline with 10% writer labels. △ Less

Submitted 28 May, 2024; originally announced May 2024.

Comments: 14 pages, 6 figures, 2 tables

arXiv:2310.19287 [pdf]

Enhancing Scalability and Reliability in Semi-Decentralized Federated Learning With Blockchain: Trust Penalization and Asynchronous Functionality

Authors: Ajay Kumar Shrestha, Faijan Ahamad Khan, Mohammed Afaan Shaikh, Amir Jaberzadeh, Jason Geng

Abstract: The paper presents an innovative approach to address the challenges of scalability and reliability in Distributed Federated Learning by leveraging the integration of blockchain technology. The paper focuses on enhancing the trustworthiness of participating nodes through a trust penalization mechanism while also enabling asynchronous functionality for efficient and robust model updates. By combinin… ▽ More The paper presents an innovative approach to address the challenges of scalability and reliability in Distributed Federated Learning by leveraging the integration of blockchain technology. The paper focuses on enhancing the trustworthiness of participating nodes through a trust penalization mechanism while also enabling asynchronous functionality for efficient and robust model updates. By combining Semi-Decentralized Federated Learning with Blockchain (SDFL-B), the proposed system aims to create a fair, secure and transparent environment for collaborative machine learning without compromising data privacy. The research presents a comprehensive system architecture, methodologies, experimental results, and discussions that demonstrate the advantages of this novel approach in fostering scalable and reliable SDFL-B systems. △ Less

Submitted 30 October, 2023; originally announced October 2023.

Comments: To appear in 2023 IEEE Ubiquitous Computing, Electronics & Mobile Communication Conference (IEEE UEMCON)

arXiv:2307.10492 [pdf]

Blockchain-Based Federated Learning: Incentivizing Data Sharing and Penalizing Dishonest Behavior

Authors: Amir Jaberzadeh, Ajay Kumar Shrestha, Faijan Ahamad Khan, Mohammed Afaan Shaikh, Bhargav Dave, Jason Geng

Abstract: With the increasing importance of data sharing for collaboration and innovation, it is becoming more important to ensure that data is managed and shared in a secure and trustworthy manner. Data governance is a common approach to managing data, but it faces many challenges such as data silos, data consistency, privacy, security, and access control. To address these challenges, this paper proposes a… ▽ More With the increasing importance of data sharing for collaboration and innovation, it is becoming more important to ensure that data is managed and shared in a secure and trustworthy manner. Data governance is a common approach to managing data, but it faces many challenges such as data silos, data consistency, privacy, security, and access control. To address these challenges, this paper proposes a comprehensive framework that integrates data trust in federated learning with InterPlanetary File System, blockchain, and smart contracts to facilitate secure and mutually beneficial data sharing while providing incentives, access control mechanisms, and penalizing any dishonest behavior. The experimental results demonstrate that the proposed model is effective in improving the accuracy of federated learning models while ensuring the security and fairness of the data-sharing process. The research paper also presents a decentralized federated learning platform that successfully trained a CNN model on the MNIST dataset using blockchain technology. The platform enables multiple workers to train the model simultaneously while maintaining data privacy and security. The decentralized architecture and use of blockchain technology allow for efficient communication and coordination between workers. This platform has the potential to facilitate decentralized machine learning and support privacy-preserving collaboration in various domains. △ Less

Submitted 19 July, 2023; originally announced July 2023.

Comments: To appear in the 5th International Congress on Blockchain and Applications (BLOCKCHAIN'23). Publish by the Lecture Notes in Networks and Systems series of Springer Verlag

arXiv:2302.00995 [pdf, other]

Open-Set Multi-Source Multi-Target Domain Adaptation

Authors: Rohit Lal, Arihant Gaur, Aadhithya Iyer, Muhammed Abdullah Shaikh, Ritik Agrawal

Abstract: Single-Source Single-Target Domain Adaptation (1S1T) aims to bridge the gap between a labelled source domain and an unlabelled target domain. Despite 1S1T being a well-researched topic, they are typically not deployed to the real world. Methods like Multi-Source Domain Adaptation and Multi-Target Domain Adaptation have evolved to model real-world problems but still do not generalise well. The fact… ▽ More Single-Source Single-Target Domain Adaptation (1S1T) aims to bridge the gap between a labelled source domain and an unlabelled target domain. Despite 1S1T being a well-researched topic, they are typically not deployed to the real world. Methods like Multi-Source Domain Adaptation and Multi-Target Domain Adaptation have evolved to model real-world problems but still do not generalise well. The fact that most of these methods assume a common label-set between source and target is very restrictive. Recent Open-Set Domain Adaptation methods handle unknown target labels but fail to generalise in multiple domains. To overcome these difficulties, first, we propose a novel generic domain adaptation (DA) setting named Open-Set Multi-Source Multi-Target Domain Adaptation (OS-nSmT), with n and m being number of source and target domains respectively. Next, we propose a graph attention based framework named DEGAA which can capture information from multiple source and target domains without knowing the exact label-set of the target. We argue that our method, though offered for multiple sources and multiple targets, can also be agnostic to various other DA settings. To check the robustness and versatility of DEGAA, we put forward ample experiments and ablation studies. △ Less

Submitted 3 February, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

Comments: Accepted in NeurIPS 2021 Workshop on Pre-registration in Machine Learning

arXiv:2109.04993 [pdf, other]

LAViTeR: Learning Aligned Visual and Textual Representations Assisted by Image and Caption Generation

Authors: Mohammad Abuzar Shaikh, Zhanghexuan Ji, Dana Moukheiber, Yan Shen, Sargur Srihari, Mingchen Gao

Abstract: Pre-training visual and textual representations from large-scale image-text pairs is becoming a standard approach for many downstream vision-language tasks. The transformer-based models learn inter and intra-modal attention through a list of self-supervised learning tasks. This paper proposes LAViTeR, a novel architecture for visual and textual representation learning. The main module, Visual Text… ▽ More Pre-training visual and textual representations from large-scale image-text pairs is becoming a standard approach for many downstream vision-language tasks. The transformer-based models learn inter and intra-modal attention through a list of self-supervised learning tasks. This paper proposes LAViTeR, a novel architecture for visual and textual representation learning. The main module, Visual Textual Alignment (VTA) will be assisted by two auxiliary tasks, GAN-based image synthesis and Image Captioning. We also propose a new evaluation metric measuring the similarity between the learnt visual and textual embedding. The experimental results on two public datasets, CUB and MS-COCO, demonstrate superior visual and textual representation alignment in the joint feature embedding space △ Less

Submitted 19 October, 2021; v1 submitted 4 September, 2021; originally announced September 2021.

Comments: 14 pages, 10 Figures, 5 Tables

arXiv:2109.01949 [pdf, other]

Improving Joint Learning of Chest X-Ray and Radiology Report by Word Region Alignment

Authors: Zhanghexuan Ji, Mohammad Abuzar Shaikh, Dana Moukheiber, Sargur Srihari, Yifan Peng, Mingchen Gao

Abstract: Self-supervised learning provides an opportunity to explore unlabeled chest X-rays and their associated free-text reports accumulated in clinical routine without manual supervision. This paper proposes a Joint Image Text Representation Learning Network (JoImTeRNet) for pre-training on chest X-ray images and their radiology reports. The model was pre-trained on both the global image-sentence level… ▽ More Self-supervised learning provides an opportunity to explore unlabeled chest X-rays and their associated free-text reports accumulated in clinical routine without manual supervision. This paper proposes a Joint Image Text Representation Learning Network (JoImTeRNet) for pre-training on chest X-ray images and their radiology reports. The model was pre-trained on both the global image-sentence level and the local image region-word level for visual-textual matching. Both are bidirectionally constrained on Cross-Entropy based and ranking-based Triplet Matching Losses. The region-word matching is calculated using the attention mechanism without direct supervision about their mapping. The pre-trained multi-modal representation learning paves the way for downstream tasks concerning image and/or text encoding. We demonstrate the representation learning quality by cross-modality retrievals and multi-label classifications on two datasets: OpenI-IU and MIMIC-CXR △ Less

Submitted 4 September, 2021; originally announced September 2021.

Comments: 10 Pages, 1 Figure, 3 Tables, Accepted in 12th Machine Learning in Medical Imaging (MLMI 2021) workshop

arXiv:2105.03358 [pdf, other]

Soft-Attention Improves Skin Cancer Classification Performance

Authors: Soumyya Kanti Datta, Mohammad Abuzar Shaikh, Sargur N. Srihari, Mingchen Gao

Abstract: In clinical applications, neural networks must focus on and highlight the most important parts of an input image. Soft-Attention mechanism enables a neural network toachieve this goal. This paper investigates the effectiveness of Soft-Attention in deep neural architectures. The central aim of Soft-Attention is to boost the value of important features and suppress the noise-inducing features. We co… ▽ More In clinical applications, neural networks must focus on and highlight the most important parts of an input image. Soft-Attention mechanism enables a neural network toachieve this goal. This paper investigates the effectiveness of Soft-Attention in deep neural architectures. The central aim of Soft-Attention is to boost the value of important features and suppress the noise-inducing features. We compare the performance of VGG, ResNet, InceptionResNetv2 and DenseNet architectures with and without the Soft-Attention mechanism, while classifying skin lesions. The original network when coupled with Soft-Attention outperforms the baseline[16] by 4.7% while achieving a precision of 93.7% on HAM10000 dataset [25]. Additionally, Soft-Attention coupling improves the sensitivity score by 3.8% compared to baseline[31] and achieves 91.6% on ISIC-2017 dataset [2]. The code is publicly available at github. △ Less

Submitted 4 June, 2021; v1 submitted 4 May, 2021; originally announced May 2021.

Comments: 8 pages, 9 figures, 4 tables

arXiv:2102.02335 [pdf, other]

Self-Supervised Claim Identification for Automated Fact Checking

Authors: Archita Pathak, Mohammad Abuzar Shaikh, Rohini Srihari

Abstract: We propose a novel, attention-based self-supervised approach to identify "claim-worthy" sentences in a fake news article, an important first step in automated fact-checking. We leverage "aboutness" of headline and content using attention mechanism for this task. The identified claims can be used for downstream task of claim verification for which we are releasing a benchmark dataset of manually se… ▽ More We propose a novel, attention-based self-supervised approach to identify "claim-worthy" sentences in a fake news article, an important first step in automated fact-checking. We leverage "aboutness" of headline and content using attention mechanism for this task. The identified claims can be used for downstream task of claim verification for which we are releasing a benchmark dataset of manually selected compelling articles with veracity labels and associated evidence. This work goes beyond stylistic analysis to identifying content that influences reader belief. Experiments with three datasets show the strength of our model. Data and code available at https://github.com/architapathak/Self-Supervised-ClaimIdentification △ Less

Submitted 3 February, 2021; originally announced February 2021.

Comments: 15 pages, 4 figures, Accepted at ICON 2020

arXiv:2009.04532 [pdf, other]

doi 10.1109/ICFHR2020.2020.00074

Attention based Writer Independent Handwriting Verification

Authors: Mohammad Abuzar Shaikh, Tiehang Duan, Mihir Chauhan, Sargur Srihari

Abstract: The task of writer verification is to provide a likelihood score for whether the queried and known handwritten image samples belong to the same writer or not. Such a task calls for the neural network to make it's outcome interpretable, i.e. provide a view into the network's decision making process. We implement and integrate cross-attention and soft-attention mechanisms to capture the highly corre… ▽ More The task of writer verification is to provide a likelihood score for whether the queried and known handwritten image samples belong to the same writer or not. Such a task calls for the neural network to make it's outcome interpretable, i.e. provide a view into the network's decision making process. We implement and integrate cross-attention and soft-attention mechanisms to capture the highly correlated and salient points in feature space of 2D inputs. The attention maps serve as an explanation premise for the network's output likelihood score. The attention mechanism also allows the network to focus more on relevant areas of the input, thus improving the classification performance. Our proposed approach achieves a precision of 86\% for detecting intra-writer cases in CEDAR cursive "AND" dataset. Furthermore, we generate meaningful explanations for the provided decision by extracting attention maps from multiple levels of the network. △ Less

Submitted 30 September, 2020; v1 submitted 7 September, 2020; originally announced September 2020.

Comments: 7 pages, 6 figures, Published in 2020 17th International Conference on Frontiers in Handwriting Recognition (ICFHR)

arXiv:2003.06113 [pdf, ps, other]

Ultra Efficient Transfer Learning with Meta Update for Cross Subject EEG Classification

Authors: Tiehang Duan, Mihir Chauhan, Mohammad Abuzar Shaikh, Jun Chu, Sargur Srihari

Abstract: The pattern of Electroencephalogram (EEG) signal differs significantly across different subjects, and poses challenge for EEG classifiers in terms of 1) effectively adapting a learned classifier onto a new subject, 2) retaining knowledge of known subjects after the adaptation. We propose an efficient transfer learning method, named Meta UPdate Strategy (MUPS-EEG), for continuous EEG classification… ▽ More The pattern of Electroencephalogram (EEG) signal differs significantly across different subjects, and poses challenge for EEG classifiers in terms of 1) effectively adapting a learned classifier onto a new subject, 2) retaining knowledge of known subjects after the adaptation. We propose an efficient transfer learning method, named Meta UPdate Strategy (MUPS-EEG), for continuous EEG classification across different subjects. The model learns effective representations with meta update which accelerates adaptation on new subject and mitigate forgetting of knowledge on previous subjects at the same time. The proposed mechanism originates from meta learning and works to 1) find feature representation that is broadly suitable for different subjects, 2) maximizes sensitivity of loss function for fast adaptation on new subject. The method can be applied to all deep learning oriented models. Extensive experiments on two public datasets demonstrate the effectiveness of the proposed model, outperforming current state of the arts by a large margin in terms of both adapting on new subject and retain knowledge of learned subjects. △ Less

Submitted 1 March, 2021; v1 submitted 13 March, 2020; originally announced March 2020.

arXiv:1909.02548 [pdf, other]

Explanation based Handwriting Verification

Authors: Mihir Chauhan, Mohammad Abuzar Shaikh, Sargur N. Srihari

Abstract: Deep learning system have drawback that their output is not accompanied with ex-planation. In a domain such as forensic handwriting verification it is essential to provideexplanation to jurors. The goal of handwriting verification is to find a measure of confi-dence whether the given handwritten samples are written by the same or different writer.We propose a method to generate explanations for th… ▽ More Deep learning system have drawback that their output is not accompanied with ex-planation. In a domain such as forensic handwriting verification it is essential to provideexplanation to jurors. The goal of handwriting verification is to find a measure of confi-dence whether the given handwritten samples are written by the same or different writer.We propose a method to generate explanations for the confidence provided by convolu-tional neural network (CNN) which maps the input image to 15 annotations (features)provided by experts. Our system comprises of: (1) Feature learning network (FLN),a differentiable system, (2) Inference module for providing explanations. Furthermore,inference module provides two types of explanations: (a) Based on cosine similaritybetween categorical probabilities of each feature, (b) Based on Log-Likelihood Ratio(LLR) using directed probabilistic graphical model. We perform experiments using acombination of feature learning network (FLN) and each inference module. We evaluateour system using XAI-AND dataset, containing 13700 handwritten samples and 15 cor-responding expert examined features for each sample. The dataset is released for publicuse and the methods can be extended to provide explanations on other verification taskslike face verification and bio-medical comparison. This dataset can serve as the basis and benchmark for future research in explanation based handwriting verification. The code is available on github. △ Less

Submitted 14 August, 2019; originally announced September 2019.

Comments: Presented at BMVC 2019: Workshop on Interpretable and Explainable Machine Vision, Cardiff, UK

arXiv:1812.02621 [pdf, other]

doi 10.1109/ICFHR-2018.2018.00041

Hybrid Feature Learning for Handwriting Verification

Authors: Mohammad Abuzar Shaikh, Mihir Chauhan, Jun Chu, Sargur Srihari

Abstract: We propose an effective Hybrid Deep Learning (HDL) architecture for the task of determining the probability that a questioned handwritten word has been written by a known writer. HDL is an amalgamation of Auto-Learned Features (ALF) and Human-Engineered Features (HEF). To extract auto-learned features we use two methods: First, Two Channel Convolutional Neural Network (TC-CNN); Second, Two Channel… ▽ More We propose an effective Hybrid Deep Learning (HDL) architecture for the task of determining the probability that a questioned handwritten word has been written by a known writer. HDL is an amalgamation of Auto-Learned Features (ALF) and Human-Engineered Features (HEF). To extract auto-learned features we use two methods: First, Two Channel Convolutional Neural Network (TC-CNN); Second, Two Channel Autoencoder (TC-AE). Furthermore, human-engineered features are extracted by using two methods: First, Gradient Structural Concavity (GSC); Second, Scale Invariant Feature Transform (SIFT). Experiments are performed by complementing one of the HEF methods with one ALF method on 150000 pairs of samples of the word "AND" cropped from handwritten notes written by 1500 writers. Our results indicate that HDL architecture with AE-GSC achieves 99.7% accuracy on seen writer dataset and 92.16% accuracy on shuffled writer dataset which out performs CEDAR-FOX, as for unseen writer dataset, AE-SIFT performs comparable to this sophisticated handwriting comparison tool. △ Less

Submitted 18 November, 2018; originally announced December 2018.

Comments: Accepted and presented in International Conference on Frontiers in Handwriting Recognition (ICFHR) 2018

Showing 1–12 of 12 results for author: Shaikh, M A