-
PrivFED -- A Framework for Privacy-Preserving Federated Learning in Enhanced Breast Cancer Diagnosis
Authors:
Maithili Jha,
S. Maitri,
M. Lohithdakshan,
Shiny Duela J,
K. Raja
Abstract:
In the day-to-day operations of healthcare institutions, a multitude of Personally Identifiable Information (PII) data exchanges occur, exposing the data to a spectrum of cybersecurity threats. This study introduces a federated learning framework, trained on the Wisconsin dataset, to mitigate challenges such as data scarcity and imbalance. Techniques like the Synthetic Minority Over-sampling Techn…
▽ More
In the day-to-day operations of healthcare institutions, a multitude of Personally Identifiable Information (PII) data exchanges occur, exposing the data to a spectrum of cybersecurity threats. This study introduces a federated learning framework, trained on the Wisconsin dataset, to mitigate challenges such as data scarcity and imbalance. Techniques like the Synthetic Minority Over-sampling Technique (SMOTE) are incorporated to bolster robustness, while isolation forests are employed to fortify the model against outliers. Catboost serves as the classification tool across all devices. The identification of optimal features for heightened accuracy is pursued through Principal Component Analysis (PCA),accentuating the significance of hyperparameter tuning, as underscored in a comparative analysis. The model exhibits an average accuracy of 99.95% on edge devices and 98% on the central server.
△ Less
Submitted 13 May, 2024;
originally announced May 2024.
-
Towards Inclusive Face Recognition Through Synthetic Ethnicity Alteration
Authors:
Praveen Kumar Chandaliya,
Kiran Raja,
Raghavendra Ramachandra,
Zahid Akhtar,
Christoph Busch
Abstract:
Numerous studies have shown that existing Face Recognition Systems (FRS), including commercial ones, often exhibit biases toward certain ethnicities due to under-represented data. In this work, we explore ethnicity alteration and skin tone modification using synthetic face image generation methods to increase the diversity of datasets. We conduct a detailed analysis by first constructing a balance…
▽ More
Numerous studies have shown that existing Face Recognition Systems (FRS), including commercial ones, often exhibit biases toward certain ethnicities due to under-represented data. In this work, we explore ethnicity alteration and skin tone modification using synthetic face image generation methods to increase the diversity of datasets. We conduct a detailed analysis by first constructing a balanced face image dataset representing three ethnicities: Asian, Black, and Indian. We then make use of existing Generative Adversarial Network-based (GAN) image-to-image translation and manifold learning models to alter the ethnicity from one to another. A systematic analysis is further conducted to assess the suitability of such datasets for FRS by studying the realistic skin-tone representation using Individual Typology Angle (ITA). Further, we also analyze the quality characteristics using existing Face image quality assessment (FIQA) approaches. We then provide a holistic FRS performance analysis using four different systems. Our findings pave the way for future research works in (i) developing both specific ethnicity and general (any to any) ethnicity alteration models, (ii) expanding such approaches to create databases with diverse skin tones, (iii) creating datasets representing various ethnicities which further can help in mitigating bias while addressing privacy concerns.
△ Less
Submitted 6 May, 2024; v1 submitted 2 May, 2024;
originally announced May 2024.
-
What's in the Flow? Exploiting Temporal Motion Cues for Unsupervised Generic Event Boundary Detection
Authors:
Sourabh Vasant Gothe,
Vibhav Agarwal,
Sourav Ghosh,
Jayesh Rajkumar Vachhani,
Pranay Kashyap,
Barath Raj Kandur Raja
Abstract:
Generic Event Boundary Detection (GEBD) task aims to recognize generic, taxonomy-free boundaries that segment a video into meaningful events. Current methods typically involve a neural model trained on a large volume of data, demanding substantial computational power and storage space. We explore two pivotal questions pertaining to GEBD: Can non-parametric algorithms outperform unsupervised neural…
▽ More
Generic Event Boundary Detection (GEBD) task aims to recognize generic, taxonomy-free boundaries that segment a video into meaningful events. Current methods typically involve a neural model trained on a large volume of data, demanding substantial computational power and storage space. We explore two pivotal questions pertaining to GEBD: Can non-parametric algorithms outperform unsupervised neural methods? Does motion information alone suffice for high performance? This inquiry drives us to algorithmically harness motion cues for identifying generic event boundaries in videos. In this work, we propose FlowGEBD, a non-parametric, unsupervised technique for GEBD. Our approach entails two algorithms utilizing optical flow: (i) Pixel Tracking and (ii) Flow Normalization. By conducting thorough experimentation on the challenging Kinetics-GEBD and TAPOS datasets, our results establish FlowGEBD as the new state-of-the-art (SOTA) among unsupervised methods. FlowGEBD exceeds the neural models on the Kinetics-GEBD dataset by obtaining an F1@0.05 score of 0.713 with an absolute gain of 31.7% compared to the unsupervised baseline and achieves an average F1 score of 0.623 on the TAPOS validation dataset.
△ Less
Submitted 15 February, 2024;
originally announced April 2024.
-
NTIRE 2024 Challenge on Low Light Image Enhancement: Methods and Results
Authors:
Xiaoning Liu,
Zongwei Wu,
Ao Li,
Florin-Alexandru Vasluianu,
Yulun Zhang,
Shuhang Gu,
Le Zhang,
Ce Zhu,
Radu Timofte,
Zhi Jin,
Hongjun Wu,
Chenxi Wang,
Haitao Ling,
Yuanhao Cai,
Hao Bian,
Yuxin Zheng,
Jing Lin,
Alan Yuille,
Ben Shao,
Jin Guo,
Tianli Liu,
Mohao Wu,
Yixu Feng,
Shuo Hou,
Haotian Lin
, et al. (87 additional authors not shown)
Abstract:
This paper reviews the NTIRE 2024 low light image enhancement challenge, highlighting the proposed solutions and results. The aim of this challenge is to discover an effective network design or solution capable of generating brighter, clearer, and visually appealing results when dealing with a variety of conditions, including ultra-high resolution (4K and beyond), non-uniform illumination, backlig…
▽ More
This paper reviews the NTIRE 2024 low light image enhancement challenge, highlighting the proposed solutions and results. The aim of this challenge is to discover an effective network design or solution capable of generating brighter, clearer, and visually appealing results when dealing with a variety of conditions, including ultra-high resolution (4K and beyond), non-uniform illumination, backlighting, extreme darkness, and night scenes. A notable total of 428 participants registered for the challenge, with 22 teams ultimately making valid submissions. This paper meticulously evaluates the state-of-the-art advancements in enhancing low-light images, reflecting the significant progress and creativity in this field.
△ Less
Submitted 22 April, 2024;
originally announced April 2024.
-
E2F-Net: Eyes-to-Face Inpainting via StyleGAN Latent Space
Authors:
Ahmad Hassanpour,
Fatemeh Jamalbafrani,
Bian Yang,
Kiran Raja,
Raymond Veldhuis,
Julian Fierrez
Abstract:
Face inpainting, the technique of restoring missing or damaged regions in facial images, is pivotal for applications like face recognition in occluded scenarios and image analysis with poor-quality captures. This process not only needs to produce realistic visuals but also preserve individual identity characteristics. The aim of this paper is to inpaint a face given periocular region (eyes-to-face…
▽ More
Face inpainting, the technique of restoring missing or damaged regions in facial images, is pivotal for applications like face recognition in occluded scenarios and image analysis with poor-quality captures. This process not only needs to produce realistic visuals but also preserve individual identity characteristics. The aim of this paper is to inpaint a face given periocular region (eyes-to-face) through a proposed new Generative Adversarial Network (GAN)-based model called Eyes-to-Face Network (E2F-Net). The proposed approach extracts identity and non-identity features from the periocular region using two dedicated encoders have been used. The extracted features are then mapped to the latent space of a pre-trained StyleGAN generator to benefit from its state-of-the-art performance and its rich, diverse and expressive latent space without any additional training. We further improve the StyleGAN output to find the optimal code in the latent space using a new optimization for GAN inversion technique. Our E2F-Net requires a minimum training process reducing the computational complexity as a secondary benefit. Through extensive experiments, we show that our method successfully reconstructs the whole face with high quality, surpassing current techniques, despite significantly less training and supervision efforts. We have generated seven eyes-to-face datasets based on well-known public face datasets for training and verifying our proposed methods. The code and datasets are publicly available.
△ Less
Submitted 18 March, 2024;
originally announced March 2024.
-
COPS: A Compact On-device Pipeline for real-time Smishing detection
Authors:
Harichandana B S S,
Sumit Kumar,
Manjunath Bhimappa Ujjinakoppa,
Barath Raj Kandur Raja
Abstract:
Smartphones have become indispensable in our daily lives and can do almost everything, from communication to online shopping. However, with the increased usage, cybercrime aimed at mobile devices is rocketing. Smishing attacks, in particular, have observed a significant upsurge in recent years. This problem is further exacerbated by the perpetrator creating new deceptive websites daily, with an av…
▽ More
Smartphones have become indispensable in our daily lives and can do almost everything, from communication to online shopping. However, with the increased usage, cybercrime aimed at mobile devices is rocketing. Smishing attacks, in particular, have observed a significant upsurge in recent years. This problem is further exacerbated by the perpetrator creating new deceptive websites daily, with an average life cycle of under 15 hours. This renders the standard practice of keeping a database of malicious URLs ineffective. To this end, we propose a novel on-device pipeline: COPS that intelligently identifies features of fraudulent messages and URLs to alert the user in real-time. COPS is a lightweight pipeline with a detection module based on the Disentangled Variational Autoencoder of size 3.46MB for smishing and URL phishing detection, and we benchmark it on open datasets. We achieve an accuracy of 98.15% and 99.5%, respectively, for both tasks, with a false negative and false positive rate of a mere 0.037 and 0.015, outperforming previous works with the added advantage of ensuring real-time alerts on resource-constrained devices.
△ Less
Submitted 6 February, 2024;
originally announced February 2024.
-
TrICy: Trigger-guided Data-to-text Generation with Intent aware Attention-Copy
Authors:
Vibhav Agarwal,
Sourav Ghosh,
Harichandana BSS,
Himanshu Arora,
Barath Raj Kandur Raja
Abstract:
Data-to-text (D2T) generation is a crucial task in many natural language understanding (NLU) applications and forms the foundation of task-oriented dialog systems. In the context of conversational AI solutions that can work directly with local data on the user's device, architectures utilizing large pre-trained language models (PLMs) are impractical for on-device deployment due to a high memory fo…
▽ More
Data-to-text (D2T) generation is a crucial task in many natural language understanding (NLU) applications and forms the foundation of task-oriented dialog systems. In the context of conversational AI solutions that can work directly with local data on the user's device, architectures utilizing large pre-trained language models (PLMs) are impractical for on-device deployment due to a high memory footprint. To this end, we propose TrICy, a novel lightweight framework for an enhanced D2T task that generates text sequences based on the intent in context and may further be guided by user-provided triggers. We leverage an attention-copy mechanism to predict out-of-vocabulary (OOV) words accurately. Performance analyses on E2E NLG dataset (BLEU: 66.43%, ROUGE-L: 70.14%), WebNLG dataset (BLEU: Seen 64.08%, Unseen 52.35%), and our Custom dataset related to text messaging applications, showcase our architecture's effectiveness. Moreover, we show that by leveraging an optional trigger input, data-to-text generation quality increases significantly and achieves the new SOTA score of 69.29% BLEU for E2E NLG. Furthermore, our analyses show that TrICy achieves at least 24% and 3% improvement in BLEU and METEOR respectively over LLMs like GPT-3, ChatGPT, and Llama 2. We also demonstrate that in some scenarios, performance improvement due to triggers is observed even when they are absent in training.
△ Less
Submitted 25 January, 2024;
originally announced February 2024.
-
Robust Sclera Segmentation for Skin-tone Agnostic Face Image Quality Assessment
Authors:
Wassim Kabbani,
Christoph Busch,
Kiran Raja
Abstract:
Face image quality assessment (FIQA) is crucial for obtaining good face recognition performance. FIQA algorithms should be robust and insensitive to demographic factors. The eye sclera has a consistent whitish color in all humans regardless of their age, ethnicity and skin-tone. This work proposes a robust sclera segmentation method that is suitable for face images in the enrolment and the border…
▽ More
Face image quality assessment (FIQA) is crucial for obtaining good face recognition performance. FIQA algorithms should be robust and insensitive to demographic factors. The eye sclera has a consistent whitish color in all humans regardless of their age, ethnicity and skin-tone. This work proposes a robust sclera segmentation method that is suitable for face images in the enrolment and the border control face recognition scenarios. It shows how the statistical analysis of the sclera pixels produces features that are invariant to skin-tone, age and ethnicity and thus can be incorporated into FIQA algorithms to make them agnostic to demographic factors.
△ Less
Submitted 22 December, 2023;
originally announced December 2023.
-
Log-Likelihood Score Level Fusion for Improved Cross-Sensor Smartphone Periocular Recognition
Authors:
Fernando Alonso-Fernandez,
Kiran B. Raja,
Christoph Busch,
Josef Bigun
Abstract:
The proliferation of cameras and personal devices results in a wide variability of imaging conditions, producing large intra-class variations and a significant performance drop when images from heterogeneous environments are compared. However, many applications require to deal with data from different sources regularly, thus needing to overcome these interoperability problems. Here, we employ fusi…
▽ More
The proliferation of cameras and personal devices results in a wide variability of imaging conditions, producing large intra-class variations and a significant performance drop when images from heterogeneous environments are compared. However, many applications require to deal with data from different sources regularly, thus needing to overcome these interoperability problems. Here, we employ fusion of several comparators to improve periocular performance when images from different smartphones are compared. We use a probabilistic fusion framework based on linear logistic regression, in which fused scores tend to be log-likelihood ratios, obtaining a reduction in cross-sensor EER of up to 40% due to the fusion. Our framework also provides an elegant and simple solution to handle signals from different devices, since same-sensor and cross-sensor score distributions are aligned and mapped to a common probabilistic domain. This allows the use of Bayes thresholds for optimal decision-making, eliminating the need of sensor-specific thresholds, which is essential in operational conditions because the threshold setting critically determines the accuracy of the authentication process in many applications.
△ Less
Submitted 2 November, 2023;
originally announced November 2023.
-
Terrain-Informed Self-Supervised Learning: Enhancing Building Footprint Extraction from LiDAR Data with Limited Annotations
Authors:
Anuja Vats,
David Völgyes,
Martijn Vermeer,
Marius Pedersen,
Kiran Raja,
Daniele S. M. Fantin,
Jacob Alexander Hay
Abstract:
Estimating building footprint maps from geospatial data is of paramount importance in urban planning, development, disaster management, and various other applications. Deep learning methodologies have gained prominence in building segmentation maps, offering the promise of precise footprint extraction without extensive post-processing. However, these methods face challenges in generalization and l…
▽ More
Estimating building footprint maps from geospatial data is of paramount importance in urban planning, development, disaster management, and various other applications. Deep learning methodologies have gained prominence in building segmentation maps, offering the promise of precise footprint extraction without extensive post-processing. However, these methods face challenges in generalization and label efficiency, particularly in remote sensing, where obtaining accurate labels can be both expensive and time-consuming. To address these challenges, we propose terrain-aware self-supervised learning, tailored to remote sensing, using digital elevation models from LiDAR data. We propose to learn a model to differentiate between bare Earth and superimposed structures enabling the network to implicitly learn domain-relevant features without the need for extensive pixel-level annotations. We test the effectiveness of our approach by evaluating building segmentation performance on test datasets with varying label fractions. Remarkably, with only 1% of the labels (equivalent to 25 labeled examples), our method improves over ImageNet pre-training, showing the advantage of leveraging unlabeled data for feature extraction in the domain of remote sensing. The performance improvement is more pronounced in few-shot scenarios and gradually closes the gap with ImageNet pre-training as the label fraction increases. We test on a dataset characterized by substantial distribution shifts and labeling errors to demonstrate the generalizability of our approach. When compared to other baselines, including ImageNet pretraining and more complex architectures, our approach consistently performs better, demonstrating the efficiency and effectiveness of self-supervised terrain-aware feature learning.
△ Less
Submitted 18 April, 2024; v1 submitted 2 November, 2023;
originally announced November 2023.
-
Towards minimizing efforts for Morphing Attacks -- Deep embeddings for morphing pair selection and improved Morphing Attack Detection
Authors:
Roman Kessler,
Kiran Raja,
Juan Tapia,
Christoph Busch
Abstract:
Face Morphing Attacks pose a threat to the security of identity documents, especially with respect to a subsequent access control process, because it enables both individuals involved to exploit the same document. In this study, face embeddings serve two purposes: pre-selecting images for large-scale Morphing Attack generation and detecting potential Morphing Attacks. We build upon previous embedd…
▽ More
Face Morphing Attacks pose a threat to the security of identity documents, especially with respect to a subsequent access control process, because it enables both individuals involved to exploit the same document. In this study, face embeddings serve two purposes: pre-selecting images for large-scale Morphing Attack generation and detecting potential Morphing Attacks. We build upon previous embedding studies in both use cases using the MagFace model. For the first objective, we employ an pre-selection algorithm that pairs individuals based on face embedding similarity. We quantify the attack potential of differently morphed face images to compare the usability of pre-selection in automatically generating numerous successful Morphing Attacks. Regarding the second objective, we compare embeddings from two state-of-the-art face recognition systems in terms of their ability to detect Morphing Attacks. Our findings demonstrate that ArcFace and MagFace provide valuable face embeddings for image pre-selection. Both open-source and COTS face recognition systems are susceptible to generated attacks, particularly when pre-selection is based on embeddings rather than random pairing which was only constrained by soft biometrics. More accurate face recognition systems exhibit greater vulnerability to attacks, with COTS systems being the most susceptible. Additionally, MagFace embeddings serve as a robust alternative for detecting morphed face images compared to the previously used ArcFace embeddings. The results endorse the advantages of face embeddings in more effective image pre-selection for face morphing and accurate detection of morphed face images. This is supported by extensive analysis of various designed attacks. The MagFace model proves to be a powerful alternative to the commonly used ArcFace model for both objectives, pre-selection and attack detection.
△ Less
Submitted 30 March, 2024; v1 submitted 29 May, 2023;
originally announced May 2023.
-
Large language models in biomedical natural language processing: benchmarks, baselines, and recommendations
Authors:
Qingyu Chen,
Jingcheng Du,
Yan Hu,
Vipina Kuttichi Keloth,
Xueqing Peng,
Kalpana Raja,
Rui Zhang,
Zhiyong Lu,
Hua Xu
Abstract:
Biomedical literature is growing rapidly, making it challenging to curate and extract knowledge manually. Biomedical natural language processing (BioNLP) techniques that can automatically extract information from biomedical literature help alleviate this burden. Recently, large Language Models (LLMs), such as GPT-3 and GPT-4, have gained significant attention for their impressive performance. Howe…
▽ More
Biomedical literature is growing rapidly, making it challenging to curate and extract knowledge manually. Biomedical natural language processing (BioNLP) techniques that can automatically extract information from biomedical literature help alleviate this burden. Recently, large Language Models (LLMs), such as GPT-3 and GPT-4, have gained significant attention for their impressive performance. However, their effectiveness in BioNLP tasks and impact on method development and downstream users remain understudied. This pilot study (1) establishes the baseline performance of GPT-3 and GPT-4 at both zero-shot and one-shot settings in eight BioNLP datasets across four applications: named entity recognition, relation extraction, multi-label document classification, and semantic similarity and reasoning, (2) examines the errors produced by the LLMs and categorized the errors into three types: missingness, inconsistencies, and unwanted artificial content, and (3) provides suggestions for using LLMs in BioNLP applications. We make the datasets, baselines, and results publicly available to the community via https://github.com/qingyu-qc/gpt_bionlp_benchmark.
△ Less
Submitted 20 January, 2024; v1 submitted 10 May, 2023;
originally announced May 2023.
-
Differential Newborn Face Morphing Attack Detection using Wavelet Scatter Network
Authors:
Raghavendra Ramachandra,
Sushma Venkatesh,
Guoqiang Li,
Kiran Raja
Abstract:
Face Recognition System (FRS) are shown to be vulnerable to morphed images of newborns. Detecting morphing attacks stemming from face images of newborn is important to avoid unwanted consequences, both for security and society. In this paper, we present a new reference-based/Differential Morphing Attack Detection (MAD) method to detect newborn morphing images using Wavelet Scattering Network (WSN)…
▽ More
Face Recognition System (FRS) are shown to be vulnerable to morphed images of newborns. Detecting morphing attacks stemming from face images of newborn is important to avoid unwanted consequences, both for security and society. In this paper, we present a new reference-based/Differential Morphing Attack Detection (MAD) method to detect newborn morphing images using Wavelet Scattering Network (WSN). We propose a two-layer WSN with 250 $\times$ 250 pixels and six rotations of wavelets per layer, resulting in 577 paths. The proposed approach is validated on a dataset of 852 bona fide images and 2460 morphing images constructed using face images of 42 unique newborns. The obtained results indicate a gain of over 10\% in detection accuracy over other existing D-MAD techniques.
△ Less
Submitted 2 May, 2023;
originally announced May 2023.
-
A Latent Fingerprint in the Wild Database
Authors:
Xinwei Liu,
Kiran Raja,
Renfang Wang,
Hong Qiu,
Hucheng Wu,
Dechao Sun,
Qiguang Zheng,
Nian Liu,
Xiaoxia Wang,
Gehang Huang,
Raghavendra Ramachandra,
Christoph Busch
Abstract:
Latent fingerprints are among the most important and widely used evidence in crime scenes, digital forensics and law enforcement worldwide. Despite the number of advancements reported in recent works, we note that significant open issues such as independent benchmarking and lack of large-scale evaluation databases for improving the algorithms are inadequately addressed. The available databases are…
▽ More
Latent fingerprints are among the most important and widely used evidence in crime scenes, digital forensics and law enforcement worldwide. Despite the number of advancements reported in recent works, we note that significant open issues such as independent benchmarking and lack of large-scale evaluation databases for improving the algorithms are inadequately addressed. The available databases are mostly of semi-public nature, lack of acquisition in the wild environment, and post-processing pipelines. Moreover, they do not represent a realistic capture scenario similar to real crime scenes, to benchmark the robustness of the algorithms. Further, existing databases for latent fingerprint recognition do not have a large number of unique subjects/fingerprint instances or do not provide ground truth/reference fingerprint images to conduct a cross-comparison against the latent. In this paper, we introduce a new wild large-scale latent fingerprint database that includes five different acquisition scenarios: reference fingerprints from (1) optical and (2) capacitive sensors, (3) smartphone fingerprints, latent fingerprints captured from (4) wall surface, (5) Ipad surface, and (6) aluminium foil surface. The new database consists of 1,318 unique fingerprint instances captured in all above mentioned settings. A total of 2,636 reference fingerprints from optical and capacitive sensors, 1,318 fingerphotos from smartphones, and 9,224 latent fingerprints from each of the 132 subjects were provided in this work. The dataset is constructed considering various age groups, equal representations of genders and backgrounds. In addition, we provide an extensive set of analysis of various subset evaluations to highlight open challenges for future directions in latent fingerprint recognition research.
△ Less
Submitted 3 April, 2023;
originally announced April 2023.
-
Learning Pairwise Interaction for Generalizable DeepFake Detection
Authors:
Ying Xu,
Kiran Raja,
Luisa Verdoliva,
Marius Pedersen
Abstract:
A fast-paced development of DeepFake generation techniques challenge the detection schemes designed for known type DeepFakes. A reliable Deepfake detection approach must be agnostic to generation types, which can present diverse quality and appearance. Limited generalizability across different generation schemes will restrict the wide-scale deployment of detectors if they fail to handle unseen att…
▽ More
A fast-paced development of DeepFake generation techniques challenge the detection schemes designed for known type DeepFakes. A reliable Deepfake detection approach must be agnostic to generation types, which can present diverse quality and appearance. Limited generalizability across different generation schemes will restrict the wide-scale deployment of detectors if they fail to handle unseen attacks in an open set scenario. We propose a new approach, Multi-Channel Xception Attention Pairwise Interaction (MCX-API), that exploits the power of pairwise learning and complementary information from different color space representations in a fine-grained manner. We first validate our idea on a publicly available dataset in a intra-class setting (closed set) with four different Deepfake schemes. Further, we report all the results using balanced-open-set-classification (BOSC) accuracy in an inter-class setting (open-set) using three public datasets. Our experiments indicate that our proposed method can generalize better than the state-of-the-art Deepfakes detectors. We obtain 98.48% BOSC accuracy on the FF++ dataset and 90.87% BOSC accuracy on the CelebDF dataset suggesting a promising direction for generalization of DeepFake detection. We further utilize t-SNE and attention maps to interpret and visualize the decision-making process of our proposed network. https://github.com/xuyingzhongguo/MCX-API
△ Less
Submitted 26 February, 2023;
originally announced February 2023.
-
SRTGAN: Triplet Loss based Generative Adversarial Network for Real-World Super-Resolution
Authors:
Dhruv Patel,
Abhinav Jain,
Simran Bawkar,
Manav Khorasiya,
Kalpesh Prajapati,
Kishor Upla,
Kiran Raja,
Raghavendra Ramachandra,
Christoph Busch
Abstract:
Many applications such as forensics, surveillance, satellite imaging, medical imaging, etc., demand High-Resolution (HR) images. However, obtaining an HR image is not always possible due to the limitations of optical sensors and their costs. An alternative solution called Single Image Super-Resolution (SISR) is a software-driven approach that aims to take a Low-Resolution (LR) image and obtain the…
▽ More
Many applications such as forensics, surveillance, satellite imaging, medical imaging, etc., demand High-Resolution (HR) images. However, obtaining an HR image is not always possible due to the limitations of optical sensors and their costs. An alternative solution called Single Image Super-Resolution (SISR) is a software-driven approach that aims to take a Low-Resolution (LR) image and obtain the HR image. Most supervised SISR solutions use ground truth HR image as a target and do not include the information provided in the LR image, which could be valuable. In this work, we introduce Triplet Loss-based Generative Adversarial Network hereafter referred as SRTGAN for Image Super-Resolution problem on real-world degradation. We introduce a new triplet-based adversarial loss function that exploits the information provided in the LR image by using it as a negative sample. Allowing the patch-based discriminator with access to both HR and LR images optimizes to better differentiate between HR and LR images; hence, improving the adversary. Further, we propose to fuse the adversarial loss, content loss, perceptual loss, and quality loss to obtain Super-Resolution (SR) image with high perceptual fidelity. We validate the superior performance of the proposed method over the other existing methods on the RealSR dataset in terms of quantitative and qualitative metrics.
△ Less
Submitted 22 November, 2022;
originally announced November 2022.
-
Time flies by: Analyzing the Impact of Face Ageing on the Recognition Performance with Synthetic Data
Authors:
Marcel Grimmer,
Haoyu Zhang,
Raghavendra Ramachandra,
Kiran Raja,
Christoph Busch
Abstract:
The vast progress in synthetic image synthesis enables the generation of facial images in high resolution and photorealism. In biometric applications, the main motivation for using synthetic data is to solve the shortage of publicly-available biometric data while reducing privacy risks when processing such sensitive information. These advantages are exploited in this work by simulating human face…
▽ More
The vast progress in synthetic image synthesis enables the generation of facial images in high resolution and photorealism. In biometric applications, the main motivation for using synthetic data is to solve the shortage of publicly-available biometric data while reducing privacy risks when processing such sensitive information. These advantages are exploited in this work by simulating human face ageing with recent face age modification algorithms to generate mated samples, thereby studying the impact of ageing on the performance of an open-source biometric recognition system. Further, a real dataset is used to evaluate the effects of short-term ageing, comparing the biometric performance to the synthetic domain. The main findings indicate that short-term ageing in the range of 1-5 years has only minor effects on the general recognition performance. However, the correct verification of mated faces with long-term age differences beyond 20 years poses still a significant challenge and requires further investigation.
△ Less
Submitted 17 August, 2022;
originally announced August 2022.
-
SYN-MAD 2022: Competition on Face Morphing Attack Detection Based on Privacy-aware Synthetic Training Data
Authors:
Marco Huber,
Fadi Boutros,
Anh Thi Luu,
Kiran Raja,
Raghavendra Ramachandra,
Naser Damer,
Pedro C. Neto,
Tiago Gonçalves,
Ana F. Sequeira,
Jaime S. Cardoso,
João Tremoço,
Miguel Lourenço,
Sergio Serra,
Eduardo Cermeño,
Marija Ivanovska,
Borut Batagelj,
Andrej Kronovšek,
Peter Peer,
Vitomir Štruc
Abstract:
This paper presents a summary of the Competition on Face Morphing Attack Detection Based on Privacy-aware Synthetic Training Data (SYN-MAD) held at the 2022 International Joint Conference on Biometrics (IJCB 2022). The competition attracted a total of 12 participating teams, both from academia and industry and present in 11 different countries. In the end, seven valid submissions were submitted by…
▽ More
This paper presents a summary of the Competition on Face Morphing Attack Detection Based on Privacy-aware Synthetic Training Data (SYN-MAD) held at the 2022 International Joint Conference on Biometrics (IJCB 2022). The competition attracted a total of 12 participating teams, both from academia and industry and present in 11 different countries. In the end, seven valid submissions were submitted by the participating teams and evaluated by the organizers. The competition was held to present and attract solutions that deal with detecting face morphing attacks while protecting people's privacy for ethical and legal reasons. To ensure this, the training data was limited to synthetic data provided by the organizers. The submitted solutions presented innovations that led to outperforming the considered baseline in many experimental settings. The evaluation benchmark is now available at: https://github.com/marcohuber/SYN-MAD-2022.
△ Less
Submitted 15 August, 2022;
originally announced August 2022.
-
Analyzing Fairness in Deepfake Detection With Massively Annotated Databases
Authors:
Ying Xu,
Philipp Terhörst,
Kiran Raja,
Marius Pedersen
Abstract:
In recent years, image and video manipulations with Deepfake have become a severe concern for security and society. Many detection models and datasets have been proposed to detect Deepfake data reliably. However, there is an increased concern that these models and training databases might be biased and, thus, cause Deepfake detectors to fail. In this work, we investigate factors causing biased det…
▽ More
In recent years, image and video manipulations with Deepfake have become a severe concern for security and society. Many detection models and datasets have been proposed to detect Deepfake data reliably. However, there is an increased concern that these models and training databases might be biased and, thus, cause Deepfake detectors to fail. In this work, we investigate factors causing biased detection in public Deepfake datasets by (a) creating large-scale demographic and non-demographic attribute annotations with 47 different attributes for five popular Deepfake datasets and (b) comprehensively analysing attributes resulting in AI-bias of three state-of-the-art Deepfake detection backbone models on these datasets. The analysis shows how various attributes influence a large variety of distinctive attributes (from over 65M labels) on the detection performance which includes demographic (age, gender, ethnicity) and non-demographic (hair, skin, accessories, etc.) attributes. The results examined datasets show limited diversity and, more importantly, show that the utilised Deepfake detection backbone models are strongly affected by investigated attributes making them not fair across attributes. The Deepfake detection backbone methods trained on such imbalanced/biased datasets result in incorrect detection results leading to generalisability, fairness, and security issues. Our findings and annotated datasets will guide future research to evaluate and mitigate bias in Deepfake detection techniques. The annotated datasets and the corresponding code are publicly available.
△ Less
Submitted 11 March, 2024; v1 submitted 11 August, 2022;
originally announced August 2022.
-
Personalized Detection of Cognitive Biases in Actions of Users from Their Logs: Anchoring and Recency Biases
Authors:
Atanu R Sinha,
Navita Goyal,
Sunny Dhamnani,
Tanay Asija,
Raja K Dubey,
M V Kaarthik Raja,
Georgios Theocharous
Abstract:
Cognitive biases are mental shortcuts humans use in dealing with information and the environment, and which result in biased actions and behaviors (or, actions), unbeknownst to themselves. Biases take many forms, with cognitive biases occupying a central role that inflicts fairness, accountability, transparency, ethics, law, medicine, and discrimination. Detection of biases is considered a necessa…
▽ More
Cognitive biases are mental shortcuts humans use in dealing with information and the environment, and which result in biased actions and behaviors (or, actions), unbeknownst to themselves. Biases take many forms, with cognitive biases occupying a central role that inflicts fairness, accountability, transparency, ethics, law, medicine, and discrimination. Detection of biases is considered a necessary step toward their mitigation. Herein, we focus on two cognitive biases - anchoring and recency. The recognition of cognitive bias in computer science is largely in the domain of information retrieval, and bias is identified at an aggregate level with the help of annotated data. Proposing a different direction for bias detection, we offer a principled approach along with Machine Learning to detect these two cognitive biases from Web logs of users' actions. Our individual user level detection makes it truly personalized, and does not rely on annotated data. Instead, we start with two basic principles established in cognitive psychology, use modified training of an attention network, and interpret attention weights in a novel way according to those principles, to infer and distinguish between these two biases. The personalized approach allows detection for specific users who are susceptible to these biases when performing their tasks, and can help build awareness among them so as to undertake bias mitigation.
△ Less
Submitted 1 July, 2022; v1 submitted 30 June, 2022;
originally announced June 2022.
-
On the (Limited) Generalization of MasterFace Attacks and Its Relation to the Capacity of Face Representations
Authors:
Philipp Terhörst,
Florian Bierbaum,
Marco Huber,
Naser Damer,
Florian Kirchbuchner,
Kiran Raja,
Arjan Kuijper
Abstract:
A MasterFace is a face image that can successfully match against a large portion of the population. Since their generation does not require access to the information of the enrolled subjects, MasterFace attacks represent a potential security risk for widely-used face recognition systems. Previous works proposed methods for generating such images and demonstrated that these attacks can strongly com…
▽ More
A MasterFace is a face image that can successfully match against a large portion of the population. Since their generation does not require access to the information of the enrolled subjects, MasterFace attacks represent a potential security risk for widely-used face recognition systems. Previous works proposed methods for generating such images and demonstrated that these attacks can strongly compromise face recognition. However, previous works followed evaluation settings consisting of older recognition models, limited cross-dataset and cross-model evaluations, and the use of low-scale testing data. This makes it hard to state the generalizability of these attacks. In this work, we comprehensively analyse the generalizability of MasterFace attacks in empirical and theoretical investigations. The empirical investigations include the use of six state-of-the-art FR models, cross-dataset and cross-model evaluation protocols, and utilizing testing datasets of significantly higher size and variance. The results indicate a low generalizability when MasterFaces are training on a different face recognition model than the one used for testing. In these cases, the attack performance is similar to zero-effort imposter attacks. In the theoretical investigations, we define and estimate the face capacity and the maximum MasterFace coverage under the assumption that identities in the face space are well separated. The current trend of increasing the fairness and generalizability in face recognition indicates that the vulnerability of future systems might further decrease. Future works might analyse the utility of MasterFaces for understanding and enhancing the robustness of face recognition models.
△ Less
Submitted 1 April, 2022; v1 submitted 23 March, 2022;
originally announced March 2022.
-
Analyzing Human Observer Ability in Morphing Attack Detection -- Where Do We Stand?
Authors:
Sankini Rancha Godage,
Frøy Løvåsdal,
Sushma Venkatesh,
Kiran Raja,
Raghavendra Ramachandra,
Christoph Busch
Abstract:
Few studies have focused on examining how people recognize morphing attacks, even as several publications have examined the susceptibility of automated FRS and offered morphing attack detection (MAD) approaches. MAD approaches base their decisions either on a single image with no reference to compare against (S-MAD) or using a reference image (D-MAD). One prevalent misconception is that an examine…
▽ More
Few studies have focused on examining how people recognize morphing attacks, even as several publications have examined the susceptibility of automated FRS and offered morphing attack detection (MAD) approaches. MAD approaches base their decisions either on a single image with no reference to compare against (S-MAD) or using a reference image (D-MAD). One prevalent misconception is that an examiner's or observer's capacity for facial morph detection depends on their subject expertise, experience, and familiarity with the issue and that no works have reported the specific results of observers who regularly verify identity (ID) documents for their jobs. As human observers are involved in checking the ID documents having facial images, a lapse in their competence can have significant societal challenges. To assess the observers' proficiency, this work first builds a new benchmark database of realistic morphing attacks from 48 different subjects, resulting in 400 morphed images. We also capture images from Automated Border Control (ABC) gates to mimic the realistic border-crossing scenarios in the D-MAD setting with 400 probe images to study the ability of human observers to detect morphed images. A new dataset of 180 morphing images is also produced to research human capacity in the S-MAD environment. In addition to creating a new evaluation platform to conduct S-MAD and D-MAD analysis, the study employs 469 observers for D-MAD and 410 observers for S-MAD who are primarily governmental employees from more than 40 countries, along with 103 subjects who are not examiners. The analysis offers intriguing insights and highlights the lack of expertise and failure to recognize a sizable number of morphing attacks by experts. The results of this study are intended to aid in the development of training programs to prevent security failures while determining whether an image is bona fide or altered.
△ Less
Submitted 5 September, 2022; v1 submitted 24 February, 2022;
originally announced February 2022.
-
PrivPAS: A real time Privacy-Preserving AI System and applied ethics
Authors:
Harichandana B S S,
Vibhav Agarwal,
Sourav Ghosh,
Gopi Ramena,
Sumit Kumar,
Barath Raj Kandur Raja
Abstract:
With 3.78 billion social media users worldwide in 2021 (48% of the human population), almost 3 billion images are shared daily. At the same time, a consistent evolution of smartphone cameras has led to a photography explosion with 85% of all new pictures being captured using smartphones. However, lately, there has been an increased discussion of privacy concerns when a person being photographed is…
▽ More
With 3.78 billion social media users worldwide in 2021 (48% of the human population), almost 3 billion images are shared daily. At the same time, a consistent evolution of smartphone cameras has led to a photography explosion with 85% of all new pictures being captured using smartphones. However, lately, there has been an increased discussion of privacy concerns when a person being photographed is unaware of the picture being taken or has reservations about the same being shared. These privacy violations are amplified for people with disabilities, who may find it challenging to raise dissent even if they are aware. Such unauthorized image captures may also be misused to gain sympathy by third-party organizations, leading to a privacy breach. Privacy for people with disabilities has so far received comparatively less attention from the AI community. This motivates us to work towards a solution to generate privacy-conscious cues for raising awareness in smartphone users of any sensitivity in their viewfinder content. To this end, we introduce PrivPAS (A real time Privacy-Preserving AI System) a novel framework to identify sensitive content. Additionally, we curate and annotate a dataset to identify and localize accessibility markers and classify whether an image is sensitive to a featured subject with a disability. We demonstrate that the proposed lightweight architecture, with a memory footprint of a mere 8.49MB, achieves a high mAP of 89.52% on resource-constrained devices. Furthermore, our pipeline, trained on face anonymized data, achieves an F1-score of 73.1%.
△ Less
Submitted 8 February, 2022; v1 submitted 5 February, 2022;
originally announced February 2022.
-
Generation of Non-Deterministic Synthetic Face Datasets Guided by Identity Priors
Authors:
Marcel Grimmer,
Haoyu Zhang,
Raghavendra Ramachandra,
Kiran Raja,
Christoph Busch
Abstract:
Enabling highly secure applications (such as border crossing) with face recognition requires extensive biometric performance tests through large scale data. However, using real face images raises concerns about privacy as the laws do not allow the images to be used for other purposes than originally intended. Using representative and subsets of face data can also lead to unwanted demographic biase…
▽ More
Enabling highly secure applications (such as border crossing) with face recognition requires extensive biometric performance tests through large scale data. However, using real face images raises concerns about privacy as the laws do not allow the images to be used for other purposes than originally intended. Using representative and subsets of face data can also lead to unwanted demographic biases and cause an imbalance in datasets. One possible solution to overcome these issues is to replace real face images with synthetically generated samples. While generating synthetic images has benefited from recent advancements in computer vision, generating multiple samples of the same synthetic identity resembling real-world variations is still unaddressed, i.e., mated samples. This work proposes a non-deterministic method for generating mated face images by exploiting the well-structured latent space of StyleGAN. Mated samples are generated by manipulating latent vectors, and more precisely, we exploit Principal Component Analysis (PCA) to define semantically meaningful directions in the latent space and control the similarity between the original and the mated samples using a pre-trained face recognition system. We create a new dataset of synthetic face images (SymFace) consisting of 77,034 samples including 25,919 synthetic IDs. Through our analysis using well-established face image quality metrics, we demonstrate the differences in the biometric quality of synthetic samples mimicking characteristics of real biometric data. The analysis and results thereof indicate the use of synthetic samples created using the proposed approach as a viable alternative to replacing real biometric data.
△ Less
Submitted 7 December, 2021;
originally announced December 2021.
-
QMagFace: Simple and Accurate Quality-Aware Face Recognition
Authors:
Philipp Terhörst,
Malte Ihlefeld,
Marco Huber,
Naser Damer,
Florian Kirchbuchner,
Kiran Raja,
Arjan Kuijper
Abstract:
Face recognition systems have to deal with large variabilities (such as different poses, illuminations, and expressions) that might lead to incorrect matching decisions. These variabilities can be measured in terms of face image quality which is defined over the utility of a sample for recognition. Previous works on face recognition either do not employ this valuable information or make use of non…
▽ More
Face recognition systems have to deal with large variabilities (such as different poses, illuminations, and expressions) that might lead to incorrect matching decisions. These variabilities can be measured in terms of face image quality which is defined over the utility of a sample for recognition. Previous works on face recognition either do not employ this valuable information or make use of non-inherently fit quality estimates. In this work, we propose a simple and effective face recognition solution (QMagFace) that combines a quality-aware comparison score with a recognition model based on a magnitude-aware angular margin loss. The proposed approach includes model-specific face image qualities in the comparison process to enhance the recognition performance under unconstrained circumstances. Exploiting the linearity between the qualities and their comparison scores induced by the utilized loss, our quality-aware comparison function is simple and highly generalizable. The experiments conducted on several face recognition databases and benchmarks demonstrate that the introduced quality-awareness leads to consistent improvements in the recognition performance. Moreover, the proposed QMagFace approach performs especially well under challenging circumstances, such as cross-pose, cross-age, or cross-quality. Consequently, it leads to state-of-the-art performances on several face recognition benchmarks, such as 98.50% on AgeDB, 83.95% on XQLFQ, and 98.74% on CFP-FP. The code for QMagFace is publicly available
△ Less
Submitted 23 March, 2022; v1 submitted 26 November, 2021;
originally announced November 2021.
-
Algorithmic Fairness in Face Morphing Attack Detection
Authors:
Raghavendra Ramachandra,
Kiran Raja,
Christoph Busch
Abstract:
Face morphing attacks can compromise Face Recognition System (FRS) by exploiting their vulnerability. Face Morphing Attack Detection (MAD) techniques have been developed in recent past to deter such attacks and mitigate risks from morphing attacks. MAD algorithms, as any other algorithms should treat the images of subjects from different ethnic origins in an equal manner and provide non-discrimina…
▽ More
Face morphing attacks can compromise Face Recognition System (FRS) by exploiting their vulnerability. Face Morphing Attack Detection (MAD) techniques have been developed in recent past to deter such attacks and mitigate risks from morphing attacks. MAD algorithms, as any other algorithms should treat the images of subjects from different ethnic origins in an equal manner and provide non-discriminatory results. While the promising MAD algorithms are tested for robustness, there is no study comprehensively bench-marking their behaviour against various ethnicities. In this paper, we study and present a comprehensive analysis of algorithmic fairness of the existing Single image-based Morph Attack Detection (S-MAD) algorithms. We attempt to better understand the influence of ethnic bias on MAD algorithms and to this extent, we study the performance of MAD algorithms on a newly created dataset consisting of four different ethnic groups. With Extensive experiments using six different S-MAD techniques, we first present benchmark of detection performance and then measure the quantitative value of the algorithmic fairness for each of them using Fairness Discrepancy Rate (FDR). The results indicate the lack of fairness on all six different S-MAD methods when trained and tested on different ethnic groups suggesting the need for reliable MAD approaches to mitigate the algorithmic bias.
△ Less
Submitted 23 November, 2021;
originally announced November 2021.
-
Pixel-Level Face Image Quality Assessment for Explainable Face Recognition
Authors:
Philipp Terhörst,
Marco Huber,
Naser Damer,
Florian Kirchbuchner,
Kiran Raja,
Arjan Kuijper
Abstract:
An essential factor to achieve high performance in face recognition systems is the quality of its samples. Since these systems are involved in daily life there is a strong need of making face recognition processes understandable for humans. In this work, we introduce the concept of pixel-level face image quality that determines the utility of pixels in a face image for recognition. We propose a tr…
▽ More
An essential factor to achieve high performance in face recognition systems is the quality of its samples. Since these systems are involved in daily life there is a strong need of making face recognition processes understandable for humans. In this work, we introduce the concept of pixel-level face image quality that determines the utility of pixels in a face image for recognition. We propose a training-free approach to assess the pixel-level qualities of a face image given an arbitrary face recognition network. To achieve this, a model-specific quality value of the input image is estimated and used to build a sample-specific quality regression model. Based on this model, quality-based gradients are back-propagated and converted into pixel-level quality estimates. In the experiments, we qualitatively and quantitatively investigated the meaningfulness of our proposed pixel-level qualities based on real and artificial disturbances and by comparing the explanation maps on faces incompliant with the ICAO standards. In all scenarios, the results demonstrate that the proposed solution produces meaningful pixel-level qualities enhancing the interpretability of the complete face image quality. The code is publicly available
△ Less
Submitted 30 November, 2021; v1 submitted 21 October, 2021;
originally announced October 2021.
-
ReGenMorph: Visibly Realistic GAN Generated Face Morphing Attacks by Attack Re-generation
Authors:
Naser Damer,
Kiran Raja,
Marius Süßmilch,
Sushma Venkatesh,
Fadi Boutros,
Meiling Fang,
Florian Kirchbuchner,
Raghavendra Ramachandra,
Arjan Kuijper
Abstract:
Face morphing attacks aim at creating face images that are verifiable to be the face of multiple identities, which can lead to building faulty identity links in operations like border checks. While creating a morphed face detector (MFD), training on all possible attack types is essential to achieve good detection performance. Therefore, investigating new methods of creating morphing attacks drives…
▽ More
Face morphing attacks aim at creating face images that are verifiable to be the face of multiple identities, which can lead to building faulty identity links in operations like border checks. While creating a morphed face detector (MFD), training on all possible attack types is essential to achieve good detection performance. Therefore, investigating new methods of creating morphing attacks drives the generalizability of MADs. Creating morphing attacks was performed on the image level, by landmark interpolation, or on the latent-space level, by manipulating latent vectors in a generative adversarial network. The earlier results in varying blending artifacts and the latter results in synthetic-like striping artifacts. This work presents the novel morphing pipeline, ReGenMorph, to eliminate the LMA blending artifacts by using a GAN-based generation, as well as, eliminate the manipulation in the latent space, resulting in visibly realistic morphed images compared to previous works. The generated ReGenMorph appearance is compared to recent morphing approaches and evaluated for face recognition vulnerability and attack detectability, whether as known or unknown attacks.
△ Less
Submitted 24 September, 2021; v1 submitted 20 August, 2021;
originally announced August 2021.
-
MFR 2021: Masked Face Recognition Competition
Authors:
Fadi Boutros,
Naser Damer,
Jan Niklas Kolf,
Kiran Raja,
Florian Kirchbuchner,
Raghavendra Ramachandra,
Arjan Kuijper,
Pengcheng Fang,
Chao Zhang,
Fei Wang,
David Montero,
Naiara Aginako,
Basilio Sierra,
Marcos Nieto,
Mustafa Ekrem Erakin,
Ugur Demir,
Hazim Kemal,
Ekenel,
Asaki Kataoka,
Kohei Ichikawa,
Shizuma Kubo,
Jie Zhang,
Mingjie He,
Dan Han,
Shiguang Shan
, et al. (10 additional authors not shown)
Abstract:
This paper presents a summary of the Masked Face Recognition Competitions (MFR) held within the 2021 International Joint Conference on Biometrics (IJCB 2021). The competition attracted a total of 10 participating teams with valid submissions. The affiliations of these teams are diverse and associated with academia and industry in nine different countries. These teams successfully submitted 18 vali…
▽ More
This paper presents a summary of the Masked Face Recognition Competitions (MFR) held within the 2021 International Joint Conference on Biometrics (IJCB 2021). The competition attracted a total of 10 participating teams with valid submissions. The affiliations of these teams are diverse and associated with academia and industry in nine different countries. These teams successfully submitted 18 valid solutions. The competition is designed to motivate solutions aiming at enhancing the face recognition accuracy of masked faces. Moreover, the competition considered the deployability of the proposed solutions by taking the compactness of the face recognition models into account. A private dataset representing a collaborative, multi-session, real masked, capture scenario is used to evaluate the submitted solutions. In comparison to one of the top-performing academic face recognition solutions, 10 out of the 18 submitted solutions did score higher masked face verification accuracy.
△ Less
Submitted 29 June, 2021;
originally announced June 2021.
-
Learned Smartphone ISP on Mobile NPUs with Deep Learning, Mobile AI 2021 Challenge: Report
Authors:
Andrey Ignatov,
Cheng-Ming Chiang,
Hsien-Kai Kuo,
Anastasia Sycheva,
Radu Timofte,
Min-Hung Chen,
Man-Yu Lee,
Yu-Syuan Xu,
Yu Tseng,
Shusong Xu,
Jin Guo,
Chao-Hung Chen,
Ming-Chun Hsyu,
Wen-Chia Tsai,
Chao-Wei Chen,
Grigory Malivenko,
Minsu Kwon,
Myungje Lee,
Jaeyoon Yoo,
Changbeom Kang,
Shinjo Wang,
Zheng Shaolong,
Hao Dejun,
Xie Fen,
Feng Zhuang
, et al. (16 additional authors not shown)
Abstract:
As the quality of mobile cameras starts to play a crucial role in modern smartphones, more and more attention is now being paid to ISP algorithms used to improve various perceptual aspects of mobile photos. In this Mobile AI challenge, the target was to develop an end-to-end deep learning-based image signal processing (ISP) pipeline that can replace classical hand-crafted ISPs and achieve nearly r…
▽ More
As the quality of mobile cameras starts to play a crucial role in modern smartphones, more and more attention is now being paid to ISP algorithms used to improve various perceptual aspects of mobile photos. In this Mobile AI challenge, the target was to develop an end-to-end deep learning-based image signal processing (ISP) pipeline that can replace classical hand-crafted ISPs and achieve nearly real-time performance on smartphone NPUs. For this, the participants were provided with a novel learned ISP dataset consisting of RAW-RGB image pairs captured with the Sony IMX586 Quad Bayer mobile sensor and a professional 102-megapixel medium format camera. The runtime of all models was evaluated on the MediaTek Dimensity 1000+ platform with a dedicated AI processing unit capable of accelerating both floating-point and quantized neural networks. The proposed solutions are fully compatible with the above NPU and are capable of processing Full HD photos under 60-100 milliseconds while achieving high fidelity results. A detailed description of all models developed in this challenge is provided in this paper.
△ Less
Submitted 17 May, 2021;
originally announced May 2021.
-
On the Applicability of Synthetic Data for Face Recognition
Authors:
Haoyu Zhang,
Marcel Grimmer,
Raghavendra Ramachandra,
Kiran Raja,
Christoph Busch
Abstract:
Face verification has come into increasing focus in various applications including the European Entry/Exit System, which integrates face recognition mechanisms. At the same time, the rapid advancement of biometric authentication requires extensive performance tests in order to inhibit the discriminatory treatment of travellers due to their demographic background. However, the use of face images co…
▽ More
Face verification has come into increasing focus in various applications including the European Entry/Exit System, which integrates face recognition mechanisms. At the same time, the rapid advancement of biometric authentication requires extensive performance tests in order to inhibit the discriminatory treatment of travellers due to their demographic background. However, the use of face images collected as part of border controls is restricted by the European General Data Protection Law to be processed for no other reason than its original purpose. Therefore, this paper investigates the suitability of synthetic face images generated with StyleGAN and StyleGAN2 to compensate for the urgent lack of publicly available large-scale test data. Specifically, two deep learning-based (SER-FIQ, FaceQnet v1) and one standard-based (ISO/IEC TR 29794-5) face image quality assessment algorithm is utilized to compare the applicability of synthetic face images compared to real face images extracted from the FRGC dataset. Finally, based on the analysis of impostor score distributions and utility score distributions, our experiments reveal negligible differences between StyleGAN vs. StyleGAN2, and further also minor discrepancies compared to real face images.
△ Less
Submitted 6 April, 2021;
originally announced April 2021.
-
edATLAS: An Efficient Disambiguation Algorithm for Texting in Languages with Abugida Scripts
Authors:
Sourav Ghosh,
Sourabh Vasant Gothe,
Chandramouli Sanchi,
Barath Raj Kandur Raja
Abstract:
Abugida refers to a phonogram writing system where each syllable is represented using a single consonant or typographic ligature, along with a default vowel or optional diacritic(s) to denote other vowels. However, texting in these languages has some unique challenges in spite of the advent of devices with soft keyboard supporting custom key layouts. The number of characters in these languages is…
▽ More
Abugida refers to a phonogram writing system where each syllable is represented using a single consonant or typographic ligature, along with a default vowel or optional diacritic(s) to denote other vowels. However, texting in these languages has some unique challenges in spite of the advent of devices with soft keyboard supporting custom key layouts. The number of characters in these languages is large enough to require characters to be spread over multiple views in the layout. Having to switch between views many times to type a single word hinders the natural thought process. This prevents popular usage of native keyboard layouts. On the other hand, supporting romanized scripts (native words transcribed using Latin characters) with language model based suggestions is also set back by the lack of uniform romanization rules.
To this end, we propose a disambiguation algorithm and showcase its usefulness in two novel mutually non-exclusive input methods for languages natively using the abugida writing system: (a) disambiguation of ambiguous input for abugida scripts, and (b) disambiguation of word variants in romanized scripts. We benchmark these approaches using public datasets, and show an improvement in typing speed by 19.49%, 25.13%, and 14.89%, in Hindi, Bengali, and Thai, respectively, using Ambiguous Input, owing to the human ease of locating keys combined with the efficiency of our inference method. Our Word Variant Disambiguation (WDA) maps valid variants of romanized words, previously treated as Out-of-Vocab, to a vocabulary of 100k words with high accuracy, leading to an increase in Error Correction F1 score by 10.03% and Next Word Prediction (NWP) by 62.50% on average.
△ Less
Submitted 29 March, 2021; v1 submitted 4 January, 2021;
originally announced January 2021.
-
EmpLite: A Lightweight Sequence Labeling Model for Emphasis Selection of Short Texts
Authors:
Vibhav Agarwal,
Sourav Ghosh,
Kranti Chalamalasetti,
Bharath Challa,
Sonal Kumari,
Harshavardhana,
Barath Raj Kandur Raja
Abstract:
Word emphasis in textual content aims at conveying the desired intention by changing the size, color, typeface, style (bold, italic, etc.), and other typographical features. The emphasized words are extremely helpful in drawing the readers' attention to specific information that the authors wish to emphasize. However, performing such emphasis using a soft keyboard for social media interactions is…
▽ More
Word emphasis in textual content aims at conveying the desired intention by changing the size, color, typeface, style (bold, italic, etc.), and other typographical features. The emphasized words are extremely helpful in drawing the readers' attention to specific information that the authors wish to emphasize. However, performing such emphasis using a soft keyboard for social media interactions is time-consuming and has an associated learning curve. In this paper, we propose a novel approach to automate the emphasis word detection on short written texts. To the best of our knowledge, this work presents the first lightweight deep learning approach for smartphone deployment of emphasis selection. Experimental results show that our approach achieves comparable accuracy at a much lower model size than existing models. Our best lightweight model has a memory footprint of 2.82 MB with a matching score of 0.716 on SemEval-2020 public benchmark dataset.
△ Less
Submitted 15 December, 2020;
originally announced January 2021.
-
LiteMuL: A Lightweight On-Device Sequence Tagger using Multi-task Learning
Authors:
Sonal Kumari,
Vibhav Agarwal,
Bharath Challa,
Kranti Chalamalasetti,
Sourav Ghosh,
Harshavardhana,
Barath Raj Kandur Raja
Abstract:
Named entity detection and Parts-of-speech tagging are the key tasks for many NLP applications. Although the current state of the art methods achieved near perfection for long, formal, structured text there are hindrances in deploying these models on memory-constrained devices such as mobile phones. Furthermore, the performance of these models is degraded when they encounter short, informal, and c…
▽ More
Named entity detection and Parts-of-speech tagging are the key tasks for many NLP applications. Although the current state of the art methods achieved near perfection for long, formal, structured text there are hindrances in deploying these models on memory-constrained devices such as mobile phones. Furthermore, the performance of these models is degraded when they encounter short, informal, and casual conversations. To overcome these difficulties, we present LiteMuL - a lightweight on-device sequence tagger that can efficiently process the user conversations using a Multi-Task Learning (MTL) approach. To the best of our knowledge, the proposed model is the first on-device MTL neural model for sequence tagging. Our LiteMuL model is about 2.39 MB in size and achieved an accuracy of 0.9433 (for NER), 0.9090 (for POS) on the CoNLL 2003 dataset. The proposed LiteMuL not only outperforms the current state of the art results but also surpasses the results of our proposed on-device task-specific models, with accuracy gains of up to 11% and model-size reduction by 50%-56%. Our model is competitive with other MTL approaches for NER and POS tasks while outshines them with a low memory footprint. We also evaluated our model on custom-curated user conversations and observed impressive results.
△ Less
Submitted 29 March, 2021; v1 submitted 15 December, 2020;
originally announced January 2021.
-
Face Morphing Attack Generation & Detection: A Comprehensive Survey
Authors:
Sushma Venkatesh,
Raghavendra Ramachandra,
Kiran Raja,
Christoph Busch
Abstract:
The vulnerability of Face Recognition System (FRS) to various kind of attacks (both direct and in-direct attacks) and face morphing attacks has received a great interest from the biometric community. The goal of a morphing attack is to subvert the FRS at Automatic Border Control (ABC) gates by presenting the Electronic Machine Readable Travel Document (eMRTD) or e-passport that is obtained based o…
▽ More
The vulnerability of Face Recognition System (FRS) to various kind of attacks (both direct and in-direct attacks) and face morphing attacks has received a great interest from the biometric community. The goal of a morphing attack is to subvert the FRS at Automatic Border Control (ABC) gates by presenting the Electronic Machine Readable Travel Document (eMRTD) or e-passport that is obtained based on the morphed face image. Since the application process for the e-passport in the majority countries requires a passport photo to be presented by the applicant, a malicious actor and the accomplice can generate the morphed face image and to obtain the e-passport. An e-passport with a morphed face images can be used by both the malicious actor and the accomplice to cross the border as the morphed face image can be verified against both of them. This can result in a significant threat as a malicious actor can cross the border without revealing the track of his/her criminal background while the details of accomplice are recorded in the log of the access control system. This survey aims to present a systematic overview of the progress made in the area of face morphing in terms of both morph generation and morph detection. In this paper, we describe and illustrate various aspects of face morphing attacks, including different techniques for generating morphed face images but also the state-of-the-art regarding Morph Attack Detection (MAD) algorithms based on a stringent taxonomy and finally the availability of public databases, which allow to benchmark new MAD algorithms in a reproducible manner. The outcomes of competitions/benchmarking, vulnerability assessments and performance evaluation metrics are also provided in a comprehensive manner. Furthermore, we discuss the open challenges and potential future works that need to be addressed in this evolving field of biometrics.
△ Less
Submitted 3 November, 2020;
originally announced November 2020.
-
On Benchmarking Iris Recognition within a Head-mounted Display for AR/VR Application
Authors:
Fadi Boutros,
Naser Damer,
Kiran Raja,
Raghavendra Ramachandra,
Florian Kirchbuchner,
Arjan Kuijper
Abstract:
Augmented and virtual reality is being deployed in different fields of applications. Such applications might involve accessing or processing critical and sensitive information, which requires strict and continuous access control. Given that Head-Mounted Displays (HMD) developed for such applications commonly contains internal cameras for gaze tracking purposes, we evaluate the suitability of such…
▽ More
Augmented and virtual reality is being deployed in different fields of applications. Such applications might involve accessing or processing critical and sensitive information, which requires strict and continuous access control. Given that Head-Mounted Displays (HMD) developed for such applications commonly contains internal cameras for gaze tracking purposes, we evaluate the suitability of such setup for verifying the users through iris recognition. In this work, we first evaluate a set of iris recognition algorithms suitable for HMD devices by investigating three well-established handcrafted feature extraction approaches, and to complement it, we also present the analysis using four deep learning models. While taking into consideration the minimalistic hardware requirements of stand-alone HMD, we employ and adapt a recently developed miniature segmentation model (EyeMMS) for segmenting the iris. Further, to account for non-ideal and non-collaborative capture of iris, we define a new iris quality metric that we termed as Iris Mask Ratio (IMR) to quantify the iris recognition performance. Motivated by the performance of iris recognition, we also propose the continuous authentication of users in a non-collaborative capture setting in HMD. Through the experiments on a publicly available OpenEDS dataset, we show that performance with EER = 5% can be achieved using deep learning methods in a general setting, along with high accuracy for continuous user authentication.
△ Less
Submitted 20 October, 2020;
originally announced October 2020.
-
MIPGAN -- Generating Strong and High Quality Morphing Attacks Using Identity Prior Driven GAN
Authors:
Haoyu Zhang,
Sushma Venkatesh,
Raghavendra Ramachandra,
Kiran Raja,
Naser Damer,
Christoph Busch
Abstract:
Face morphing attacks target to circumvent Face Recognition Systems (FRS) by employing face images derived from multiple data subjects (e.g., accomplices and malicious actors). Morphed images can be verified against contributing data subjects with a reasonable success rate, given they have a high degree of facial resemblance. The success of morphing attacks is directly dependent on the quality of…
▽ More
Face morphing attacks target to circumvent Face Recognition Systems (FRS) by employing face images derived from multiple data subjects (e.g., accomplices and malicious actors). Morphed images can be verified against contributing data subjects with a reasonable success rate, given they have a high degree of facial resemblance. The success of morphing attacks is directly dependent on the quality of the generated morph images. We present a new approach for generating strong attacks extending our earlier framework for generating face morphs. We present a new approach using an Identity Prior Driven Generative Adversarial Network, which we refer to as MIPGAN (Morphing through Identity Prior driven GAN). The proposed MIPGAN is derived from the StyleGAN with a newly formulated loss function exploiting perceptual quality and identity factor to generate a high quality morphed facial image with minimal artefacts and with high resolution. We demonstrate the proposed approach's applicability to generate strong morphing attacks by evaluating its vulnerability against both commercial and deep learning based Face Recognition System (FRS) and demonstrate the success rate of attacks. Extensive experiments are carried out to assess the FRS's vulnerability against the proposed morphed face generation technique on three types of data such as digital images, re-digitized (printed and scanned) images, and compressed images after re-digitization from newly generated MIPGAN Face Morph Dataset. The obtained results demonstrate that the proposed approach of morph generation poses a high threat to FRS.
△ Less
Submitted 7 April, 2021; v1 submitted 3 September, 2020;
originally announced September 2020.
-
Can GAN Generated Morphs Threaten Face Recognition Systems Equally as Landmark Based Morphs? -- Vulnerability and Detection
Authors:
Sushma Venkatesh,
Haoyu Zhang,
Raghavendra Ramachandra,
Kiran Raja,
Naser Damer,
Christoph Busch
Abstract:
The primary objective of face morphing is to combine face images of different data subjects (e.g. a malicious actor and an accomplice) to generate a face image that can be equally verified for both contributing data subjects. In this paper, we propose a new framework for generating face morphs using a newer Generative Adversarial Network (GAN) - StyleGAN. In contrast to earlier works, we generate…
▽ More
The primary objective of face morphing is to combine face images of different data subjects (e.g. a malicious actor and an accomplice) to generate a face image that can be equally verified for both contributing data subjects. In this paper, we propose a new framework for generating face morphs using a newer Generative Adversarial Network (GAN) - StyleGAN. In contrast to earlier works, we generate realistic morphs of both high-quality and high resolution of 1024$\times$1024 pixels. With the newly created morphing dataset of 2500 morphed face images, we pose a critical question in this work. \textit{(i) Can GAN generated morphs threaten Face Recognition Systems (FRS) equally as Landmark based morphs?} Seeking an answer, we benchmark the vulnerability of a Commercial-Off-The-Shelf FRS (COTS) and a deep learning-based FRS (ArcFace). This work also benchmarks the detection approaches for both GAN generated morphs against the landmark based morphs using established Morphing Attack Detection (MAD) schemes.
△ Less
Submitted 7 July, 2020;
originally announced July 2020.
-
On the Influence of Ageing on Face Morph Attacks: Vulnerability and Detection
Authors:
Sushma Venkatesh,
Kiran Raja,
Raghavendra Ramachandra,
Christoph Busch
Abstract:
Face morphing attacks have raised critical concerns as they demonstrate a new vulnerability of Face Recognition Systems (FRS), which are widely deployed in border control applications. The face morphing process uses the images from multiple data subjects and performs an image blending operation to generate a morphed image of high quality. The generated morphed image exhibits similar visual charact…
▽ More
Face morphing attacks have raised critical concerns as they demonstrate a new vulnerability of Face Recognition Systems (FRS), which are widely deployed in border control applications. The face morphing process uses the images from multiple data subjects and performs an image blending operation to generate a morphed image of high quality. The generated morphed image exhibits similar visual characteristics corresponding to the biometric characteristics of the data subjects that contributed to the composite image and thus making it difficult for both humans and FRS, to detect such attacks. In this paper, we report a systematic investigation on the vulnerability of the Commercial-Off-The-Shelf (COTS) FRS when morphed images under the influence of ageing are presented. To this extent, we have introduced a new morphed face dataset with ageing derived from the publicly available MORPH II face dataset, which we refer to as MorphAge dataset. The dataset has two bins based on age intervals, the first bin - MorphAge-I dataset has 1002 unique data subjects with the age variation of 1 year to 2 years while the MorphAge-II dataset consists of 516 data subjects whose age intervals are from 2 years to 5 years. To effectively evaluate the vulnerability for morphing attacks, we also introduce a new evaluation metric, namely the Fully Mated Morphed Presentation Match Rate (FMMPMR), to quantify the vulnerability effectively in a realistic scenario. Extensive experiments are carried out by using two different COTS FRS (COTS I - Cognitec and COTS II - Neurotechnology) to quantify the vulnerability with ageing. Further, we also evaluate five different Morph Attack Detection (MAD) techniques to benchmark their detection performance with ageing.
△ Less
Submitted 19 September, 2020; v1 submitted 6 July, 2020;
originally announced July 2020.
-
Morphing Attack Detection -- Database, Evaluation Platform and Benchmarking
Authors:
Kiran Raja,
Matteo Ferrara,
Annalisa Franco,
Luuk Spreeuwers,
Illias Batskos,
Florens de Wit Marta Gomez-Barrero,
Ulrich Scherhag,
Daniel Fischer,
Sushma Venkatesh,
Jag Mohan Singh,
Guoqiang Li,
Loïc Bergeron,
Sergey Isadskiy,
Raghavendra Ramachandra,
Christian Rathgeb,
Dinusha Frings,
Uwe Seidel,
Fons Knopjes,
Raymond Veldhuis,
Davide Maltoni,
Christoph Busch
Abstract:
Morphing attacks have posed a severe threat to Face Recognition System (FRS). Despite the number of advancements reported in recent works, we note serious open issues such as independent benchmarking, generalizability challenges and considerations to age, gender, ethnicity that are inadequately addressed. Morphing Attack Detection (MAD) algorithms often are prone to generalization challenges as th…
▽ More
Morphing attacks have posed a severe threat to Face Recognition System (FRS). Despite the number of advancements reported in recent works, we note serious open issues such as independent benchmarking, generalizability challenges and considerations to age, gender, ethnicity that are inadequately addressed. Morphing Attack Detection (MAD) algorithms often are prone to generalization challenges as they are database dependent. The existing databases, mostly of semi-public nature, lack in diversity in terms of ethnicity, various morphing process and post-processing pipelines. Further, they do not reflect a realistic operational scenario for Automated Border Control (ABC) and do not provide a basis to test MAD on unseen data, in order to benchmark the robustness of algorithms. In this work, we present a new sequestered dataset for facilitating the advancements of MAD where the algorithms can be tested on unseen data in an effort to better generalize. The newly constructed dataset consists of facial images from 150 subjects from various ethnicities, age-groups and both genders. In order to challenge the existing MAD algorithms, the morphed images are with careful subject pre-selection created from the contributing images, and further post-processed to remove morphing artifacts. The images are also printed and scanned to remove all digital cues and to simulate a realistic challenge for MAD algorithms. Further, we present a new online evaluation platform to test algorithms on sequestered data. With the platform we can benchmark the morph detection performance and study the generalization ability. This work also presents a detailed analysis on various subsets of sequestered data and outlines open challenges for future directions in MAD research.
△ Less
Submitted 28 September, 2020; v1 submitted 11 June, 2020;
originally announced June 2020.
-
Morton Filters for Superior Template Protection for Iris Recognition
Authors:
Kiran B. Raja,
R. Raghavendra,
Sushma Venkatesh,
Christoph Busch
Abstract:
We address the fundamental performance issues of template protection (TP) for iris verification. We base our work on the popular Bloom-Filter templates protection & address the key challenges like sub-optimal performance and low unlinkability. Specifically, we focus on cases where Bloom-filter templates results in non-ideal performance due to presence of large degradations within iris images. Iris…
▽ More
We address the fundamental performance issues of template protection (TP) for iris verification. We base our work on the popular Bloom-Filter templates protection & address the key challenges like sub-optimal performance and low unlinkability. Specifically, we focus on cases where Bloom-filter templates results in non-ideal performance due to presence of large degradations within iris images. Iris recognition is challenged with number of occluding factors such as presence of eye-lashes within captured image, occlusion due to eyelids, low quality iris images due to motion blur. All of such degrading factors result in obtaining non-reliable iris codes & thereby provide non-ideal biometric performance. These factors directly impact the protected templates derived from iris images when classical Bloom-filters are employed. To this end, we propose and extend our earlier ideas of Morton-filters for obtaining better and reliable templates for iris. Morton filter based TP for iris codes is based on leveraging the intra and inter-class distribution by exploiting low-rank iris codes to derive the stable bits across iris images for a particular subject and also analyzing the discriminable bits across various subjects. Such low-rank non-noisy iris codes enables realizing the template protection in a superior way which not only can be used in constrained setting, but also in relaxed iris imaging. We further extend the work to analyze the applicability to VIS iris images by employing a large scale public iris image database - UBIRIS(v1 & v2), captured in a unconstrained setting. Through a set of experiments, we demonstrate the applicability of proposed approach and vet the strengths and weakness. Yet another contribution of this work stems in assessing the security of the proposed approach where factors of Unlinkability is studied to indicate the antagonistic nature to relaxed iris imaging scenarios.
△ Less
Submitted 15 January, 2020;
originally announced January 2020.
-
Smartphone Multi-modal Biometric Authentication: Database and Evaluation
Authors:
Raghavendra Ramachandra,
Martin Stokkenes,
Amir Mohammadi,
Sushma Venkatesh,
Kiran Raja,
Pankaj Wasnik,
Eric Poiret,
Sébastien Marcel,
Christoph Busch
Abstract:
Biometric-based verification is widely employed on the smartphones for various applications, including financial transactions. In this work, we present a new multimodal biometric dataset (face, voice, and periocular) acquired using a smartphone. The new dataset is comprised of 150 subjects that are captured in six different sessions reflecting real-life scenarios of smartphone assisted authenticat…
▽ More
Biometric-based verification is widely employed on the smartphones for various applications, including financial transactions. In this work, we present a new multimodal biometric dataset (face, voice, and periocular) acquired using a smartphone. The new dataset is comprised of 150 subjects that are captured in six different sessions reflecting real-life scenarios of smartphone assisted authentication. One of the unique features of this dataset is that it is collected in four different geographic locations representing a diverse population and ethnicity. Additionally, we also present a multimodal Presentation Attack (PA) or spoofing dataset using a low-cost Presentation Attack Instrument (PAI) such as print and electronic display attacks. The novel acquisition protocols and the diversity of the data subjects collected from different geographic locations will allow developing a novel algorithm for either unimodal or multimodal biometrics. Further, we also report the performance evaluation of the baseline biometric verification and Presentation Attack Detection (PAD) on the newly collected dataset.
△ Less
Submitted 5 December, 2019;
originally announced December 2019.
-
Detecting Finger-Vein Presentation Attacks Using 3D Shape & Diffuse Reflectance Decomposition
Authors:
Jag Mohan Singh,
Sushma Venkatesh,
Kiran B. Raja,
Raghavendra Ramachandra,
Christoph Busch
Abstract:
Despite the high biometric performance, finger-vein recognition systems are vulnerable to presentation attacks (aka., spoofing attacks). In this paper, we present a new and robust approach for detecting presentation attacks on finger-vein biometric systems exploiting the 3D Shape (normal-map) and material properties (diffuse-map) of the finger. Observing the normal-map and diffuse-map exhibiting e…
▽ More
Despite the high biometric performance, finger-vein recognition systems are vulnerable to presentation attacks (aka., spoofing attacks). In this paper, we present a new and robust approach for detecting presentation attacks on finger-vein biometric systems exploiting the 3D Shape (normal-map) and material properties (diffuse-map) of the finger. Observing the normal-map and diffuse-map exhibiting enhanced textural differences in comparison with the original finger-vein image, especially in the presence of varying illumination intensity, we propose to employ textural feature-descriptors on both of them independently. The features are subsequently used to compute a separating hyper-plane using Support Vector Machine (SVM) classifiers for the features computed from normal-maps and diffuse-maps independently. Given the scores from each classifier for normal-map and diffuse-map, we propose sum-rule based score level fusion to make detection of such presentation attack more robust. To this end, we construct a new database of finger-vein images acquired using a custom capture device with three inbuilt illuminations and validate the applicability of the proposed approach. The newly collected database consists of 936 images, which corresponds to 468 bona fide images and 468 artefact images. We establish the superiority of the proposed approach by benchmarking it with classical textural feature-descriptor applied directly on finger-vein images. The proposed approach outperforms the classical approaches by providing the Attack Presentation Classification Error Rate (APCER) & Bona fide Presentation Classification Error Rate (BPCER) of 0% compared to comparable traditional methods.
△ Less
Submitted 3 December, 2019;
originally announced December 2019.
-
Robust Morph-Detection at Automated Border Control Gate using Deep Decomposed 3D Shape and Diffuse Reflectance
Authors:
Jag Mohan Singh,
Raghavendra Ramachandra,
Kiran B. Raja,
Christoph Busch
Abstract:
Face recognition is widely employed in Automated Border Control (ABC) gates, which verify the face image on passport or electronic Machine Readable Travel Document (eMTRD) against the captured image to confirm the identity of the passport holder. In this paper, we present a robust morph detection algorithm that is based on differential morph detection. The proposed method decomposes the bona fide…
▽ More
Face recognition is widely employed in Automated Border Control (ABC) gates, which verify the face image on passport or electronic Machine Readable Travel Document (eMTRD) against the captured image to confirm the identity of the passport holder. In this paper, we present a robust morph detection algorithm that is based on differential morph detection. The proposed method decomposes the bona fide image captured from the ABC gate and the digital face image extracted from the eMRTD into the diffuse reconstructed image and a quantized normal map. The extracted features are further used to learn a linear classifier (SVM) to detect a morphing attack based on the assessment of differences between the bona fide image from the ABC gate and the digital face image extracted from the passport. Owing to the availability of multiple cameras within an ABC gate, we extend the proposed method to fuse the classification scores to generate the final decision on morph-attack-detection. To validate our proposed algorithm, we create a morph attack database with overall 588 images, where bona fide are captured in an indoor lighting environment with a Canon DSLR Camera with one sample per subject and correspondingly images from ABC gates. We benchmark our proposed method with the existing state-of-the-art and can state that the new approach significantly outperforms previous approaches in the ABC gate scenario.
△ Less
Submitted 3 December, 2019;
originally announced December 2019.
-
Cross-Sensor Periocular Biometrics in a Global Pandemic: Comparative Benchmark and Novel Multialgorithmic Approach
Authors:
Fernando Alonso-Fernandez,
Kiran B. Raja,
R. Raghavendra,
Cristoph Busch,
Josef Bigun,
Ruben Vera-Rodriguez,
Julian Fierrez
Abstract:
The massive availability of cameras results in a wide variability of imaging conditions, producing large intra-class variations and a significant performance drop if heterogeneous images are compared for person recognition. However, as biometrics is deployed, it is common to replace damaged or obsolete hardware, or to exchange information between heterogeneous applications. Variations in spectral…
▽ More
The massive availability of cameras results in a wide variability of imaging conditions, producing large intra-class variations and a significant performance drop if heterogeneous images are compared for person recognition. However, as biometrics is deployed, it is common to replace damaged or obsolete hardware, or to exchange information between heterogeneous applications. Variations in spectral bands can also occur. For example, surveillance face images (typically acquired in the visible spectrum, VIS) may need to be compared against a legacy iris database (typically acquired in near-infrared, NIR). Here, we propose a multialgorithmic approach to cope with periocular images from different sensors. With face masks in the front line against COVID-19, periocular recognition is regaining popularity since it is the only face region that remains visible. We integrate different comparators with a fusion scheme based on linear logistic regression, in which scores are represented by log-likelihood ratios. This allows easy interpretation of scores and the use of Bayes thresholds for optimal decision-making since scores from different comparators are in the same probabilistic range. We evaluate our approach in the context of the Cross-Eyed Competition, whose aim was to compare recognition approaches when NIR and VIS periocular images are matched. Our approach achieves EER=0.2% and FRR of just 0.47% at FAR=0.01%, representing the best overall approach of the competition. Experiments are also reported with a database of VIS images from different smartphones. We also discuss the impact of template size and computation times, with the most computationally heavy comparator playing an important role in the results. Lastly, the proposed method is shown to outperform other popular fusion approaches, such as the average of scores, SVMs or Random Forest.
△ Less
Submitted 30 March, 2022; v1 submitted 21 February, 2019;
originally announced February 2019.
-
A Hybrid Citation Retrieval Algorithm for Evidence-based Clinical Knowledge Summarization: Combining Concept Extraction, Vector Similarity and Query Expansion for High Precision
Authors:
Kalpana Raja,
Andrew J Sauer,
Ravi P Garg,
Melanie R Klerer,
Siddhartha R Jonnalagadda
Abstract:
Novel information retrieval methods to identify citations relevant to a clinical topic can overcome the knowledge gap existing between the primary literature (MEDLINE) and online clinical knowledge resources such as UpToDate. Searching the MEDLINE database directly or with query expansion methods returns a large number of citations that are not relevant to the query. The current study presents a c…
▽ More
Novel information retrieval methods to identify citations relevant to a clinical topic can overcome the knowledge gap existing between the primary literature (MEDLINE) and online clinical knowledge resources such as UpToDate. Searching the MEDLINE database directly or with query expansion methods returns a large number of citations that are not relevant to the query. The current study presents a citation retrieval system that retrieves citations for evidence-based clinical knowledge summarization. This approach combines query expansion, concept-based screening algorithm, and concept-based vector similarity. We also propose an information extraction framework for automated concept (Population, Intervention, Comparison, and Disease) extraction. We evaluated our proposed system on all topics (as queries) available from UpToDate for two diseases, heart failure (HF) and atrial fibrillation (AFib). The system achieved an overall F-score of 41.2% on HF topics and 42.4% on AFib topics on a gold standard of citations available in UpToDate. This is significantly high when compared to a query-expansion based baseline (F-score of 1.3% on HF and 2.2% on AFib) and a system that uses query expansion with disease hyponyms and journal names, concept-based screening, and term-based vector similarity system (F-score of 37.5% on HF and 39.5% on AFib). Evaluating the system with top K relevant citations, where K is the number of citations in the gold standard achieved a much higher overall F-score of 69.9% on HF topics and 75.1% on AFib topics. In addition, the system retrieved up to 18 new relevant citations per topic when tested on ten HF and six AFib clinical topics.
△ Less
Submitted 6 September, 2016;
originally announced September 2016.
-
CRTS: A type system for representing clinical recommendations
Authors:
Ravi P Garg,
Kalpana Raja,
Siddhartha R Jonnalagadda
Abstract:
Background: Clinical guidelines and recommendations are the driving wheels of the evidence-based medicine (EBM) paradigm, but these are available primarily as unstructured text and are generally highly heterogeneous in nature. This significantly reduces the dissemination and automatic application of these recommendations at the point of care. A comprehensive structured representation of these reco…
▽ More
Background: Clinical guidelines and recommendations are the driving wheels of the evidence-based medicine (EBM) paradigm, but these are available primarily as unstructured text and are generally highly heterogeneous in nature. This significantly reduces the dissemination and automatic application of these recommendations at the point of care. A comprehensive structured representation of these recommendations is highly beneficial in this regard. Objective: The objective of this paper to present Clinical Recommendation Type System (CRTS), a common type system that can effectively represent a clinical recommendation in a structured form. Methods: CRTS is built by analyzing 125 recommendations and 195 research articles corresponding to 6 different diseases available from UpToDate, a publicly available clinical knowledge system, and from the National Guideline Clearinghouse, a public resource for evidence-based clinical practice guidelines. Results: We show that CRTS not only covers the recommendations but also is flexible to be extended to represent information from primary literature. We also describe how our developed type system can be applied for clinical decision support, medical knowledge summarization, and citation retrieval. Conclusion: We showed that our proposed type system is precise and comprehensive in representing a large sample of recommendations available for various disorders. CRTS can now be used to build interoperable information extraction systems that automatically extract clinical recommendations and related data elements from clinical evidence resources, guidelines, systematic reviews and primary publications.
Keywords: guidelines and recommendations, type system, clinical decision support, evidence-based medicine, information storage and retrieval
△ Less
Submitted 6 September, 2016;
originally announced September 2016.
-
Computational Model to Generate Case-Inflected Forms of Masculine Nouns for Word Search in Sanskrit E-Text
Authors:
S V Kasmir Raja,
V Rajitha,
Lakshmanan Meenakshi
Abstract:
The problem of word search in Sanskrit is inseparable from complexities that include those caused by euphonic conjunctions and case-inflections. The case-inflectional forms of a noun normally number 24 owing to the fact that in Sanskrit there are eight cases and three numbers-singular, dual and plural. The traditional method of generating these inflectional forms is rather elaborate owing to the f…
▽ More
The problem of word search in Sanskrit is inseparable from complexities that include those caused by euphonic conjunctions and case-inflections. The case-inflectional forms of a noun normally number 24 owing to the fact that in Sanskrit there are eight cases and three numbers-singular, dual and plural. The traditional method of generating these inflectional forms is rather elaborate owing to the fact that there are differences in the forms generated between even very similar words and there are subtle nuances involved. Further, it would be a cumbersome exercise to generate and search for 24 forms of a word during a word search in a large text, using the currently available case-inflectional form generators. This study presents a new approach to generating case-inflectional forms that is simpler to compute. Further, an optimized model that is sufficient for generating only those word forms that are required in a word search and is more than 80% efficient compared to the complete case-inflectional forms generator, is presented in this study for the first time.
△ Less
Submitted 17 December, 2014;
originally announced December 2014.
-
An Ontology for Comprehensive Tutoring of Euphonic Conjunctions of Sanskrit Grammar
Authors:
S. V. Kasmir Raja,
V. Rajitha,
Meenakshi Lakshmanan
Abstract:
Euphonic conjunctions (sandhis) form a very important aspect of Sanskrit morphology and phonology. The traditional and modern methods of studying about euphonic conjunctions in Sanskrit follow different methodologies. The former involves a rigorous study of the Paninian system embodied in Panini's Ashtadhyayi, while the latter usually involves the study of a few important sandhi rules with the use…
▽ More
Euphonic conjunctions (sandhis) form a very important aspect of Sanskrit morphology and phonology. The traditional and modern methods of studying about euphonic conjunctions in Sanskrit follow different methodologies. The former involves a rigorous study of the Paninian system embodied in Panini's Ashtadhyayi, while the latter usually involves the study of a few important sandhi rules with the use of examples. The former is not suitable for beginners, and the latter, not sufficient to gain a comprehensive understanding of the operation of sandhi rules. This is so since there are not only numerous sandhi rules and exceptions, but also complex precedence rules involved. The need for a new ontology for sandhi-tutoring was hence felt. This work presents a comprehensive ontology designed to enable a student-user to learn in stages all about euphonic conjunctions and the relevant aphorisms of Sanskrit grammar and to test and evaluate the progress of the student-user. The ontology forms the basis of a multimedia sandhi tutor that was given to different categories of users including Sanskrit scholars for extensive and rigorous testing.
△ Less
Submitted 3 February, 2015; v1 submitted 10 October, 2014;
originally announced October 2014.
-
Computational Algorithms Based on the Paninian System to Process Euphonic Conjunctions for Word Searches
Authors:
S. V. Kasmir Raja,
V. Rajitha,
Meenakshi Lakshmanan
Abstract:
Searching for words in Sanskrit E-text is a problem that is accompanied by complexities introduced by features of Sanskrit such as euphonic conjunctions or sandhis. A word could occur in an E-text in a transformed form owing to the operation of rules of sandhi. Simple word search would not yield these transformed forms of the word. Further, there is no search engine in the literature that can comp…
▽ More
Searching for words in Sanskrit E-text is a problem that is accompanied by complexities introduced by features of Sanskrit such as euphonic conjunctions or sandhis. A word could occur in an E-text in a transformed form owing to the operation of rules of sandhi. Simple word search would not yield these transformed forms of the word. Further, there is no search engine in the literature that can comprehensively search for words in Sanskrit E-texts taking euphonic conjunctions into account. This work presents an optimal binary representational schema for letters of the Sanskrit alphabet along with algorithms to efficiently process the sandhi rules of Sanskrit grammar. The work further presents an algorithm that uses the sandhi processing algorithm to perform a comprehensive word search on E-text.
△ Less
Submitted 15 September, 2014;
originally announced September 2014.