-
Modality Agnostic Heterogeneous Face Recognition with Switch Style Modulators
Authors:
Anjith George,
Sebastien Marcel
Abstract:
Heterogeneous Face Recognition (HFR) systems aim to enhance the capability of face recognition in challenging cross-modal authentication scenarios. However, the significant domain gap between the source and target modalities poses a considerable challenge for cross-domain matching. Existing literature primarily focuses on developing HFR approaches for specific pairs of face modalities, necessitating the explicit training of models for each source-target combination. In this work, we introduce a novel framework designed to train a modality-agnostic HFR method capable of handling multiple modalities during inference, all without explicit knowledge of the target modality labels. We achieve this by implementing a computationally efficient automatic routing mechanism called Switch Style Modulation Blocks (SSMB) that trains various domain expert modulators which transform the feature maps adaptively reducing the domain gap. Our proposed SSMB can be trained end-to-end and seamlessly integrated into pre-trained face recognition models, transforming them into modality-agnostic HFR models. We have performed extensive evaluations on HFR benchmark datasets to demonstrate its effectiveness. The source code and protocols will be made publicly available.
Submitted 11 July, 2024;
originally announced July 2024.
-
Low Fidelity Visuo-Tactile Pretraining Improves Vision-Only Manipulation Performance
Authors:
Selam Gano,
Abraham George,
Amir Barati Farimani
Abstract:
Tactile perception is a critical component of solving real-world manipulation tasks, but tactile sensors for manipulation have barriers to use such as fragility and cost. In this work, we engage a robust, low-cost tactile sensor, BeadSight, as an alternative to precise pre-calibrated sensors for a pretraining approach to manipulation. We show that tactile pretraining, even with a low-fidelity sensor such as BeadSight, can improve an imitation learning agent's performance on complex manipulation tasks. We demonstrate this method against a baseline USB cable plugging task, previously achieved with a much higher precision GelSight sensor as the tactile input to pretraining. Our best BeadSight pretrained visuo-tactile agent completed the task with 70% accuracy compared to 85% for the best GelSight pretrained visuo-tactile agent, with vision-only inference for both.
Submitted 25 June, 2024; v1 submitted 21 June, 2024;
originally announced June 2024.
-
BeadSight: An Inexpensive Tactile Sensor Using Hydro-Gel Beads
Authors:
Abraham George,
Yibo Chen,
Atharva Dikshit,
Peter Pak,
Amir Barati Farimani
Abstract:
In robotic manipulation, tactile sensors are indispensable, especially when dealing with soft objects, objects of varying dimensions, or those out of the robot's direct line of sight. Traditional tactile sensors often grapple with challenges related to cost and durability. To address these issues, our study introduces a novel approach to visuo-tactile sensing with an emphasis on economy and replaceability. Our proposed sensor, BeadSight, uses hydrogel beads encased in a vinyl bag as an economical, easily replaceable sensing medium. When the sensor makes contact with a surface, the deformation of the hydrogel beads is observed using a rear camera. This observation is then passed through a U-Net neural network to predict the forces acting on the surface of the bead bag, in the form of a pressure map. Our results show that the sensor can accurately predict these pressure maps, detecting the location and magnitude of forces applied to the surface. These abilities make BeadSight an effective, inexpensive, and easily replaceable tactile sensor, ideal for many robotics applications.
Submitted 21 May, 2024;
originally announced May 2024.
-
Heterogeneous Face Recognition Using Domain Invariant Units
Authors:
Anjith George,
Sebastien Marcel
Abstract:
Heterogeneous Face Recognition (HFR) aims to expand the applicability of Face Recognition (FR) systems to challenging scenarios, enabling the matching of face images across different domains, such as matching thermal images to visible spectra. However, the development of HFR systems is challenging because of the significant domain gap between modalities and the lack of availability of large-scale paired multi-channel data. In this work, we leverage a pretrained face recognition model as a teacher network to learn domain-invariant network layers called Domain-Invariant Units (DIU) to reduce the domain gap. The proposed DIU can be trained effectively even with a limited amount of paired training data, in a contrastive distillation framework. This proposed approach has the potential to enhance pretrained models, making them more adaptable to a wider range of variations in data. We extensively evaluate our approach on multiple challenging benchmarks, demonstrating superior performance compared to state-of-the-art methods.
Submitted 22 April, 2024;
originally announced April 2024.
-
From Modalities to Styles: Rethinking the Domain Gap in Heterogeneous Face Recognition
Authors:
Anjith George,
Sebastien Marcel
Abstract:
Heterogeneous Face Recognition (HFR) focuses on matching faces from different domains, for instance, thermal to visible images, making Face Recognition (FR) systems more versatile for challenging scenarios. However, the domain gap between these domains and the limited large-scale datasets in the target HFR modalities make it challenging to develop robust HFR models from scratch. In our work, we view different modalities as distinct styles and propose a method to modulate feature maps of the target modality to address the domain gap. We present a new Conditional Adaptive Instance Modulation (CAIM) module that seamlessly fits into existing FR networks, turning them into HFR-ready systems. The CAIM block modulates intermediate feature maps, efficiently adapting to the style of the source modality and bridging the domain gap. Our method enables end-to-end training using a small set of paired samples. We extensively evaluate the proposed approach on various challenging HFR benchmarks, showing that it outperforms state-of-the-art methods. The source code and protocols for reproducing the findings will be made publicly available.
Submitted 22 April, 2024;
originally announced April 2024.
-
Second Edition FRCSyn Challenge at CVPR 2024: Face Recognition Challenge in the Era of Synthetic Data
Authors:
Ivan DeAndres-Tame,
Ruben Tolosana,
Pietro Melzi,
Ruben Vera-Rodriguez,
Minchul Kim,
Christian Rathgeb,
Xiaoming Liu,
Aythami Morales,
Julian Fierrez,
Javier Ortega-Garcia,
Zhizhou Zhong,
Yuge Huang,
Yuxi Mi,
Shouhong Ding,
Shuigeng Zhou,
Shuai He,
Lingzhi Fu,
Heng Cong,
Rongyu Zhang,
Zhihong Xiao,
Evgeny Smirnov,
Anton Pimenov,
Aleksei Grigorev,
Denis Timoshenko,
Kaleb Mesfin Asfaw
, et al. (33 additional authors not shown)
Abstract:
Synthetic data is gaining increasing relevance for training machine learning models. This is mainly motivated by several factors such as the lack of real data and intra-class variability, time and errors produced in manual labeling, and in some cases privacy concerns, among others. This paper presents an overview of the 2nd edition of the Face Recognition Challenge in the Era of Synthetic Data (FRCSyn) organized at CVPR 2024. FRCSyn aims to investigate the use of synthetic data in face recognition to address current technological limitations, including data privacy concerns, demographic biases, generalization to novel scenarios, and performance constraints in challenging situations such as aging, pose variations, and occlusions. Unlike the 1st edition, in which only synthetic data from the DCFace and GANDiffFace methods was allowed for training face recognition systems, in this 2nd edition we propose new sub-tasks that allow participants to explore novel face generative methods. The outcomes of the 2nd FRCSyn Challenge, along with the proposed experimental protocol and benchmarking, contribute significantly to the application of synthetic data to face recognition.
Submitted 16 April, 2024;
originally announced April 2024.
-
SDFR: Synthetic Data for Face Recognition Competition
Authors:
Hatef Otroshi Shahreza,
Christophe Ecabert,
Anjith George,
Alexander Unnervik,
Sébastien Marcel,
Nicolò Di Domenico,
Guido Borghi,
Davide Maltoni,
Fadi Boutros,
Julia Vogel,
Naser Damer,
Ángela Sánchez-Pérez,
Enrique Mas-Candela,
Jorge Calvo-Zaragoza,
Bernardo Biesseck,
Pedro Vidal,
Roger Granada,
David Menotti,
Ivan DeAndres-Tame,
Simone Maurizio La Cava,
Sara Concas,
Pietro Melzi,
Ruben Tolosana,
Ruben Vera-Rodriguez,
Gianpaolo Perelli
, et al. (3 additional authors not shown)
Abstract:
Large-scale face recognition datasets are collected by crawling the Internet and without individuals' consent, raising legal, ethical, and privacy concerns. With recent advances in generative models, several works have proposed generating synthetic face recognition datasets to mitigate concerns in web-crawled face recognition datasets. This paper presents the summary of the Synthetic Data for Face Recognition (SDFR) Competition held in conjunction with the 18th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2024) and established to investigate the use of synthetic data for training face recognition models. The SDFR competition was split into two tasks, allowing participants to train face recognition systems using new synthetic datasets and/or existing ones. In the first task, the face recognition backbone was fixed and the dataset size was limited, while the second task provided almost complete freedom on the model backbone, the dataset, and the training pipeline. The submitted models were trained on existing and also new synthetic datasets and used clever methods to improve training with synthetic data. The submissions were evaluated and ranked on a diverse set of seven benchmarking datasets. The paper gives an overview of the submitted face recognition models and reports achieved performance compared to baseline models trained on real and synthetic datasets. Furthermore, the evaluation of submissions is extended to bias assessment across different demographic groups. Lastly, an outlook on the current state of the research in training face recognition models using synthetic data is presented, and existing problems as well as potential future directions are also discussed.
Submitted 9 April, 2024; v1 submitted 6 April, 2024;
originally announced April 2024.
-
Visuo-Tactile Pretraining for Cable Plugging
Authors:
Abraham George,
Selam Gano,
Pranav Katragadda,
Amir Barati Farimani
Abstract:
Tactile information is a critical tool for fine-grain manipulation. As humans, we rely heavily on tactile information to understand objects in our environments and how to interact with them. We use touch not only to perform manipulation tasks but also to learn how to perform these tasks. Therefore, to create robotic agents that can learn to complete manipulation tasks at a human or super-human level of performance, we need to properly incorporate tactile information into both skill execution and skill learning. In this paper, we investigate how we can incorporate tactile information into imitation learning platforms to improve performance on complex tasks. To do this, we tackle the challenge of plugging in a USB cable, a dexterous manipulation task that relies on fine-grain visuo-tactile servoing. By incorporating tactile information into imitation learning frameworks, we are able to train a robotic agent to plug in a USB cable, a first for imitation learning. Additionally, we explore how tactile information can be used to train non-tactile agents through a contrastive-loss pretraining process. Our results show that by pretraining with tactile information, the performance of a non-tactile agent can be significantly improved, reaching a level on par with visuo-tactile agents.
For demonstration videos and access to our codebase, see the project website: https://sites.google.com/andrew.cmu.edu/visuo-tactile-cable-plugging/home
Submitted 18 March, 2024;
originally announced March 2024.
-
Flow-Based Visual Stream Compression for Event Cameras
Authors:
Daniel C. Stumpp,
Himanshu Akolkar,
Alan D. George,
Ryad Benosman
Abstract:
As the use of neuromorphic, event-based vision sensors expands, the need for compression of their output streams has increased. While their operational principle ensures event streams are spatially sparse, the high temporal resolution of the sensors can result in high data rates from the sensor depending on scene dynamics. For systems operating in communication-bandwidth-constrained and power-constrained environments, it is essential to compress these streams before transmitting them to a remote receiver. Therefore, we introduce a flow-based method for the real-time asynchronous compression of event streams as they are generated. This method leverages real-time optical flow estimates to predict future events without needing to transmit them, thereby drastically reducing the amount of data transmitted. The flow-based compression introduced is evaluated using a variety of methods including spatiotemporal distance between event streams. The introduced method itself is shown to achieve an average compression ratio of 2.81 on a variety of event-camera datasets with the evaluation configuration used. That compression is achieved with a median temporal error of 0.48 ms and an average spatiotemporal event-stream distance of 3.07. When combined with LZMA compression for non-real-time applications, our method can achieve state-of-the-art average compression ratios ranging from 10.45 to 17.24. Additionally, we demonstrate that the proposed prediction algorithm is capable of performing real-time, low-latency event prediction.
Submitted 12 March, 2024;
originally announced March 2024.
-
Model Pairing Using Embedding Translation for Backdoor Attack Detection on Open-Set Classification Tasks
Authors:
Alexander Unnervik,
Hatef Otroshi Shahreza,
Anjith George,
Sébastien Marcel
Abstract:
Backdoor attacks allow an attacker to embed a specific vulnerability in a machine learning algorithm, activated when an attacker-chosen pattern is presented, causing a specific misprediction. The need to identify backdoors in biometric scenarios has led us to propose a novel technique with different trade-offs. In this paper, we propose to use model pairs on open-set classification tasks for detecting backdoors. Using a simple linear operation to project embeddings from a probe model's embedding space to a reference model's embedding space, we can compare both embeddings and compute a similarity score. We show that this score can be an indicator for the presence of a backdoor despite models being of different architectures, having been trained independently and on different datasets. Additionally, we show that backdoors can be detected even when both models are backdoored. The source code is made available for reproducibility purposes.
Submitted 28 February, 2024;
originally announced February 2024.
-
Maximizing Real-Time Video QoE via Bandwidth Sharing under Markovian setting
Authors:
Sushi Anna George,
Vinay Joseph
Abstract:
We consider the problem of optimizing Quality of Experience (QoE) of clients streaming real-time video, served by networks managed by different operators that can share bandwidth with each other. The abundance of real-time video traffic is evident in the popularity of applications like video conferencing and video streaming of live events, which have increased significantly since the recent pandemic. We model the problem as a joint optimization of resource allocation for the clients and bandwidth sharing across the operators, with special attention to how the resource allocation impacts clients' perceived video quality. We propose an online policy as a solution, which involves dynamically sharing a portion of one operator's bandwidth with another operator. We provide strong theoretical optimality guarantees for the policy. We also use extensive simulations to demonstrate the policy's substantial performance improvements (of up to ninety percent), and identify insights into key system parameters (e.g., imbalance in arrival rates or channel conditions of the operators) that dictate the improvements.
Submitted 26 January, 2024; v1 submitted 19 January, 2024;
originally announced January 2024.
-
Eliciting Kemeny Rankings
Authors:
Anne-Marie George,
Christos Dimitrakakis
Abstract:
We formulate the problem of eliciting agents' preferences with the goal of finding a Kemeny ranking as a Dueling Bandits problem. Here the bandits' arms correspond to alternatives that need to be ranked and the feedback corresponds to a pairwise comparison between alternatives by a randomly sampled agent. We consider both sampling with and without replacement, i.e., the possibility to ask the same agent about some comparison multiple times or not.
We find approximation bounds for Kemeny rankings dependent on confidence intervals over estimated winning probabilities of arms. Based on these we state algorithms to find Probably Approximately Correct (PAC) solutions and elaborate on their sample complexity for sampling with or without replacement. Furthermore, if all agents' preferences are strict rankings over the alternatives, we provide means to prune confidence intervals and thereby guide a more efficient elicitation. We formulate several adaptive sampling methods that use look-aheads to estimate how much confidence intervals (and thus approximation guarantees) might be tightened. All described methods are compared on synthetic data.
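For readers unfamiliar with the target of the elicitation, a Kemeny ranking is a ranking that minimizes the total number of pairwise disagreements (Kendall tau distance) with the agents' rankings. The following is a minimal brute-force sketch of that definition only, assuming every agent reports a complete strict ranking over the same small set of alternatives. The function names are illustrative, and this is not the bandit-based elicitation procedure the abstract describes.

```python
from itertools import combinations, permutations


def kendall_tau(r1, r2):
    """Count the pairs of alternatives that r1 and r2 order differently."""
    pos1 = {a: i for i, a in enumerate(r1)}
    pos2 = {a: i for i, a in enumerate(r2)}
    return sum(
        1
        for a, b in combinations(r1, 2)
        if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
    )


def kemeny_ranking(votes):
    """Return a ranking minimizing the summed Kendall tau distance to all votes.

    Exhaustive search over all permutations, so only practical for a
    handful of alternatives. Ties are broken by enumeration order.
    """
    alternatives = sorted(votes[0])
    return min(
        permutations(alternatives),
        key=lambda r: sum(kendall_tau(r, v) for v in votes),
    )


if __name__ == "__main__":
    # Hypothetical example: three agents ranking three alternatives.
    votes = [("a", "b", "c"), ("a", "b", "c"), ("b", "a", "c")]
    print(kemeny_ranking(votes))
```

The exhaustive search grows factorially with the number of alternatives, which is one reason sample-efficient approximation schemes of the kind discussed in the abstract are of interest.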
Submitted 18 December, 2023;
originally announced December 2023.
-
Minimal Macro-Based Rewritings of Formal Languages: Theory and Applications in Ontology Engineering (and beyond)
Authors:
Christian Kindermann,
Anne-Marie George,
Bijan Parsia,
Uli Sattler
Abstract:
In this paper, we introduce the problem of rewriting finite formal languages using syntactic macros such that the rewriting is minimal in size. We present polynomial-time algorithms to solve variants of this problem and show their correctness. To demonstrate the practical relevance of the proposed problems and the feasibility and effectiveness of our algorithms in practice, we apply these to biomedical ontologies authored in OWL. We find that such rewritings can significantly reduce the size of ontologies by capturing repeated expressions with macros. In addition to offering valuable assistance in enhancing ontology quality and comprehension, the presented approach introduces a systematic way of analysing and evaluating features of rewriting systems (including syntactic macros, templates, or other forms of rewriting rules) in terms of their influence on computational problems.
Submitted 17 December, 2023;
originally announced December 2023.
-
FRCSyn Challenge at WACV 2024: Face Recognition Challenge in the Era of Synthetic Data
Authors:
Pietro Melzi,
Ruben Tolosana,
Ruben Vera-Rodriguez,
Minchul Kim,
Christian Rathgeb,
Xiaoming Liu,
Ivan DeAndres-Tame,
Aythami Morales,
Julian Fierrez,
Javier Ortega-Garcia,
Weisong Zhao,
Xiangyu Zhu,
Zheyu Yan,
Xiao-Yu Zhang,
Jinlin Wu,
Zhen Lei,
Suvidha Tripathi,
Mahak Kothari,
Md Haider Zama,
Debayan Deb,
Bernardo Biesseck,
Pedro Vidal,
Roger Granada,
Guilherme Fickel,
Gustavo Führ
, et al. (22 additional authors not shown)
Abstract:
Despite the widespread adoption of face recognition technology around the world, and its remarkable performance on current benchmarks, there are still several challenges that must be covered in more detail. This paper offers an overview of the Face Recognition Challenge in the Era of Synthetic Data (FRCSyn) organized at WACV 2024. This is the first international challenge aiming to explore the use of synthetic data in face recognition to address existing limitations in the technology. Specifically, the FRCSyn Challenge targets concerns related to data privacy issues, demographic biases, generalization to unseen scenarios, and performance limitations in challenging scenarios, including significant age disparities between enrollment and testing, pose variations, and occlusions. The results achieved in the FRCSyn Challenge, together with the proposed benchmark, contribute significantly to the application of synthetic data to improve face recognition technology.
Submitted 17 November, 2023;
originally announced November 2023.
-
One ACT Play: Single Demonstration Behavior Cloning with Action Chunking Transformers
Authors:
Abraham George,
Amir Barati Farimani
Abstract:
Learning from human demonstrations (behavior cloning) is a cornerstone of robot learning. However, most behavior cloning algorithms require a large number of demonstrations to learn a task, especially for general tasks that have a large variety of initial conditions. Humans, however, can learn to complete tasks, even complex ones, after only seeing one or two demonstrations. Our work seeks to emulate this ability, using behavior cloning to learn a task given only a single human demonstration. We achieve this goal by using linear transforms to augment the single demonstration, generating a set of trajectories for a wide range of initial conditions. With these demonstrations, we are able to train a behavior cloning agent to successfully complete three block manipulation tasks. Additionally, we developed a novel addition to the temporal ensembling method used by action chunking agents during inference. By incorporating the standard deviation of the action predictions into the ensembling method, our approach is more robust to unforeseen changes in the environment, resulting in significant performance improvements.
Submitted 18 September, 2023;
originally announced September 2023.
-
Pour me a drink: Robotic Precision Pouring Carbonated Beverages into Transparent Containers
Authors:
Feiya Zhu,
Shuo Hu,
Letian Leng,
Alison Bartsch,
Abraham George,
Amir Barati Farimani
Abstract:
With the growing emphasis on the development and integration of service robots within household environments, we will need to endow robots with the ability to reliably pour a variety of liquids. However, liquid handling and pouring is a challenging task due to the complex dynamics and varying properties of different liquids, the exacting precision required to prevent spills and ensure accurate pouring, and the necessity for robots to adapt seamlessly to a multitude of containers in real-world scenarios. In response to these challenges, we propose a novel autonomous robotics pipeline that empowers robots to execute precision pouring tasks, encompassing both carbonated and non-carbonated liquids, as well as opaque and transparent liquids, into a variety of transparent containers. Our proposed approach maximizes the potential of RGB input alone, achieving zero-shot capability by harnessing existing pre-trained vision segmentation models. This eliminates the need for additional data collection, manual image annotations, or extensive training. Furthermore, our work integrates ChatGPT, facilitating seamless interaction between individuals without prior expertise in robotics and our pouring pipeline. This integration enables users to effortlessly request and execute pouring actions. Our experiments demonstrate the pipeline's capability to successfully pour a diverse range of carbonated and non-carbonated beverages into containers of varying sizes, relying solely on visual input.
Submitted 19 September, 2023; v1 submitted 16 September, 2023;
originally announced September 2023.
-
SynthDistill: Face Recognition with Knowledge Distillation from Synthetic Data
Authors:
Hatef Otroshi Shahreza,
Anjith George,
Sébastien Marcel
Abstract:
State-of-the-art face recognition networks are often computationally expensive and cannot be used for mobile applications. Training lightweight face recognition models also requires large identity-labeled datasets. Meanwhile, there are privacy and ethical concerns with collecting and using large face recognition datasets. While generating synthetic datasets for training face recognition models is an alternative option, it is challenging to generate synthetic data with sufficient intra-class variations. In addition, there is still a considerable gap between the performance of models trained on real and synthetic data. In this paper, we propose a new framework (named SynthDistill) to train lightweight face recognition models by distilling the knowledge of a pretrained teacher face recognition model using synthetic data. We use a pretrained face generator network to generate synthetic face images and use the synthesized images to learn a lightweight student network. We use synthetic face images without identity labels, mitigating the problems in the intra-class variation generation of synthetic datasets. Instead, we propose a novel dynamic sampling strategy from the intermediate latent space of the face generator network to include new variations of the challenging images while further exploring new face images in the training batch. The results on five different face recognition datasets demonstrate the superiority of our lightweight model compared to models trained on previous synthetic datasets, achieving a verification accuracy of 99.52% on the LFW dataset with a lightweight network. The results also show that our proposed framework significantly reduces the gap between training with real and synthetic data. The source code for replicating the experiments is publicly released.
Submitted 28 August, 2023;
originally announced August 2023.
-
EFaR 2023: Efficient Face Recognition Competition
Authors:
Jan Niklas Kolf,
Fadi Boutros,
Jurek Elliesen,
Markus Theuerkauf,
Naser Damer,
Mohamad Alansari,
Oussama Abdul Hay,
Sara Alansari,
Sajid Javed,
Naoufel Werghi,
Klemen Grm,
Vitomir Štruc,
Fernando Alonso-Fernandez,
Kevin Hernandez Diaz,
Josef Bigun,
Anjith George,
Christophe Ecabert,
Hatef Otroshi Shahreza,
Ketan Kotwal,
Sébastien Marcel,
Iurii Medvedev,
Bo Jin,
Diogo Nunes,
Ahmad Hassanpour,
Pankaj Khatiwada
, et al. (2 additional authors not shown)
Abstract:
This paper presents the summary of the Efficient Face Recognition Competition (EFaR) held at the 2023 International Joint Conference on Biometrics (IJCB 2023). The competition received 17 submissions from 6 different teams. To drive further development of efficient face recognition models, the submitted solutions are ranked based on a weighted score of the achieved verification accuracies on a diverse set of benchmarks, as well as the deployability given by the number of floating-point operations and model size. The evaluation of submissions is extended to bias, cross-quality, and large-scale recognition benchmarks. Overall, the paper gives an overview of the achieved performance values of the submitted solutions as well as a diverse set of baselines. The submitted solutions use small, efficient network architectures to reduce the computational cost, and some solutions apply model quantization. An outlook on possible techniques that are underrepresented in current solutions is given as well.
Submitted 8 August, 2023;
originally announced August 2023.
-
Fluid Viscosity Prediction Leveraging Computer Vision and Robot Interaction
Authors:
Jong Hoon Park,
Gauri Pramod Dalwankar,
Alison Bartsch,
Abraham George,
Amir Barati Farimani
Abstract:
Accurately determining fluid viscosity is crucial for various industrial and scientific applications. Traditional methods of viscosity measurement, though reliable, often require manual intervention and cannot easily adapt to real-time monitoring. With advancements in machine learning and computer vision, this work explores the feasibility of predicting fluid viscosity by analyzing fluid oscillations captured in video data. The pipeline employs a 3D convolutional autoencoder pretrained in a self-supervised manner to extract and learn features from semantic segmentation masks of oscillating fluids. Then, the latent representations of the input data, produced by the pretrained autoencoder, are processed with a distinct inference head to infer either the fluid category (classification) or the fluid viscosity (regression) in a time-resolved manner. When the latent representations generated by the pretrained autoencoder are used for classification, the system achieves a 97.1% accuracy across a total of 4,140 test datapoints. Similarly, for regression tasks, employing an additional fully-connected network as a regression head allows the pipeline to achieve a mean absolute error of 0.258 over 4,416 test datapoints. This study represents an innovative contribution to both fluid characterization and the evolving landscape of Artificial Intelligence, demonstrating the potential of deep learning in achieving near real-time viscosity estimation and addressing practical challenges in fluid dynamics through the analysis of video data capturing oscillating fluid dynamics.
Submitted 2 December, 2023; v1 submitted 4 August, 2023;
originally announced August 2023.
-
RoboChop: Autonomous Framework for Fruit and Vegetable Chopping Leveraging Foundational Models
Authors:
Atharva Dikshit,
Alison Bartsch,
Abraham George,
Amir Barati Farimani
Abstract:
With the goal of developing fully autonomous cooking robots, developing robust systems that can chop a wide variety of objects is important. Existing approaches focus primarily on the low-level dynamics of the cutting action, which overlooks some of the practical real-world challenges of implementing autonomous cutting systems. In this work we propose an autonomous framework to sequence together action primitives for the purpose of chopping fruits and vegetables on a cluttered cutting board. We present a novel technique to leverage vision foundational models SAM and YOLO to accurately detect, segment, and track fruits and vegetables as they visually change through the sequences of chops, finetuning YOLO on a novel dataset of whole and chopped fruits and vegetables. In our experiments, we demonstrate that our simple pipeline is able to reliably chop a variety of fruits and vegetables ranging in size, appearance, and texture, meeting a variety of chopping specifications, including fruit type, number of slices, and types of slices.
Submitted 24 July, 2023;
originally announced July 2023.
-
Bridging the Gap: Heterogeneous Face Recognition with Conditional Adaptive Instance Modulation
Authors:
Anjith George,
Sebastien Marcel
Abstract:
Heterogeneous Face Recognition (HFR) aims to match face images across different domains, such as thermal and visible spectra, expanding the applicability of Face Recognition (FR) systems to challenging scenarios. However, the domain gap and limited availability of large-scale datasets in the target domain make training robust and invariant HFR models from scratch difficult. In this work, we treat different modalities as distinct styles and propose a framework to adapt feature maps, bridging the domain gap. We introduce a novel Conditional Adaptive Instance Modulation (CAIM) module that can be integrated into pre-trained FR networks, transforming them into HFR networks. The CAIM block modulates intermediate feature maps to adapt the style of the target modality, effectively bridging the domain gap. Our proposed method allows for end-to-end training with a minimal number of paired samples. We extensively evaluate our approach on multiple challenging benchmarks, demonstrating superior performance compared to state-of-the-art methods. The source code and protocols for reproducing the findings will be made publicly available.
Submitted 13 July, 2023;
originally announced July 2023.
-
EdgeFace: Efficient Face Recognition Model for Edge Devices
Authors:
Anjith George,
Christophe Ecabert,
Hatef Otroshi Shahreza,
Ketan Kotwal,
Sebastien Marcel
Abstract:
In this paper, we present EdgeFace, a lightweight and efficient face recognition network inspired by the hybrid architecture of EdgeNeXt. By effectively combining the strengths of both CNN and Transformer models with a low-rank linear layer, EdgeFace achieves excellent face recognition performance optimized for edge devices. The proposed EdgeFace network not only maintains low computational costs and compact storage, but also achieves high face recognition accuracy, making it suitable for deployment on edge devices. Extensive experiments on challenging benchmark face datasets demonstrate the effectiveness and efficiency of EdgeFace in comparison to state-of-the-art lightweight models and deep face recognition models. Our EdgeFace model with 1.77M parameters achieves state-of-the-art results on LFW (99.73%), IJB-B (92.67%), and IJB-C (94.85%), outperforming other efficient models with larger computational complexities. The code to replicate the experiments will be made available publicly.
Submitted 12 January, 2024; v1 submitted 4 July, 2023;
originally announced July 2023.
-
Feasible Action-Space Reduction as a Metric of Causal Responsibility in Multi-Agent Spatial Interactions
Authors:
Ashwin George,
Luciano Cavalcante Siebert,
David Abbink,
Arkady Zgonnikov
Abstract:
Modelling causal responsibility in multi-agent spatial interactions is crucial for safety and efficiency of interactions of humans with autonomous agents. However, current formal metrics and models of responsibility either lack grounding in ethical and philosophical concepts of responsibility, or cannot be applied to spatial interactions. In this work we propose a metric of causal responsibility which is tailored to multi-agent spatial interactions, for instance interactions in traffic. In such interactions, a given agent can, by reducing another agent's feasible action space, influence the latter. Therefore, we propose feasible action space reduction (FeAR) as a metric of causal responsibility among agents. Specifically, we look at ex-post causal responsibility for simultaneous actions. We propose the use of Moves de Rigueur (MdR) - a consistent set of prescribed actions for agents - to model the effect of norms on responsibility allocation. We apply the metric in a grid world simulation for spatial interactions and show how the actions, contexts, and norms affect the causal responsibility ascribed to agents. Finally, we demonstrate the application of this metric in complex multi-agent interactions. We argue that the FeAR metric is a step towards an interdisciplinary framework for quantifying responsibility that is needed to ensure safety and meaningful human control in human-AI systems.
Submitted 13 November, 2023; v1 submitted 24 May, 2023;
originally announced May 2023.
-
OpenVR: Teleoperation for Manipulation
Authors:
Abraham George,
Alison Bartsch,
Amir Barati Farimani
Abstract:
Across the robotics field, quality demonstrations are an integral part of many control pipelines. However, collecting high-quality demonstration trajectories remains time-consuming and difficult, often resulting in the number of demonstrations being the performance bottleneck. To address this issue, we present a method of Virtual Reality (VR) Teleoperation that uses an Oculus VR headset to teleoperate a Franka Emika Panda robot. Although other VR teleoperation methods exist, our code is open source, designed for readily available consumer hardware, easy to modify, agnostic to experimental setup, and simple to use.
Submitted 16 May, 2023;
originally announced May 2023.
-
Continual Mean Estimation Under User-Level Privacy
Authors:
Anand Jerry George,
Lekshmi Ramesh,
Aditya Vikram Singh,
Himanshu Tyagi
Abstract:
We consider the problem of continually releasing an estimate of the population mean of a stream of samples that is user-level differentially private (DP). At each time instant, a user contributes a sample, and the users can arrive in arbitrary order. Until now these requirements of continual release and user-level privacy were considered in isolation. But, in practice, both these requirements come together as the users often contribute data repeatedly and multiple queries are made. We provide an algorithm that outputs a mean estimate at every time instant $t$ such that the overall release is user-level $\varepsilon$-DP and has the following error guarantee: Denoting by $M_t$ the maximum number of samples contributed by a user, as long as $\tilde{\Omega}(1/\varepsilon)$ users have $M_t/2$ samples each, the error at time $t$ is $\tilde{O}\left(1/\sqrt{t}+\sqrt{M_t}/(t\varepsilon)\right)$. This is a universal error guarantee which is valid for all arrival patterns of the users. Furthermore, it (almost) matches the existing lower bounds for the single-release setting at all time instants when users have contributed an equal number of samples.
Submitted 19 December, 2022;
originally announced December 2022.
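To give a feel for the error guarantee quoted in the abstract above, the following minimal sketch evaluates its two terms numerically. It is an illustration only, not the authors' algorithm: it assumes the second term is read as sqrt(M_t)/(t·ε), it drops the constants and logarithmic factors hidden by the tilde notation, and the function name `error_bound_terms` is hypothetical.

```python
import math


def error_bound_terms(t, max_samples_per_user, epsilon):
    """Return the two terms of 1/sqrt(t) + sqrt(M_t)/(t*epsilon).

    Assumption: constants and log factors hidden by the tilde-O notation
    are ignored, so the values only indicate how each term scales.
    """
    if t <= 0 or max_samples_per_user <= 0 or epsilon <= 0:
        raise ValueError("t, max_samples_per_user and epsilon must be positive")
    # Statistical (non-private) term: shrinks with the number of samples seen.
    sampling_term = 1.0 / math.sqrt(t)
    # Privacy term: grows with the largest per-user contribution M_t and
    # shrinks with the privacy parameter epsilon.
    privacy_term = math.sqrt(max_samples_per_user) / (t * epsilon)
    return sampling_term, privacy_term


if __name__ == "__main__":
    sampling, privacy = error_bound_terms(t=10_000, max_samples_per_user=4, epsilon=1.0)
    print(f"sampling term: {sampling:.4f}, privacy term: {privacy:.4f}")
```

Under this reading, the privacy term decays like 1/t while the sampling term decays like 1/sqrt(t), so for fixed M_t and ε the privacy cost becomes negligible as the stream grows.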
-
Attacking Face Recognition with T-shirts: Database, Vulnerability Assessment and Detection
Authors:
M. Ibsen,
C. Rathgeb,
F. Brechtel,
R. Klepp,
K. Pöppelmann,
A. George,
S. Marcel,
C. Busch
Abstract:
Face recognition systems are widely deployed for biometric authentication. Despite this, it is well-known that, without any safeguards, face recognition systems are highly vulnerable to presentation attacks. In response to this security issue, several promising methods for detecting presentation attacks have been proposed which show high performance on existing benchmarks. However, an ongoing challenge is the generalization of presentation attack detection methods to unseen and new attack types. To this end, we propose a new T-shirt Face Presentation Attack (TFPA) database of 1,608 T-shirt attacks using 100 unique presentation attack instruments. In an extensive evaluation, we show that this type of attack can compromise the security of face recognition systems and that some state-of-the-art attack detection mechanisms trained on popular benchmarks fail to robustly generalize to the new attacks. Further, we propose three new methods for detecting T-shirt attack images, one which relies on the statistical differences between depth maps of bona fide images and T-shirt attacks, an anomaly detection approach trained on features only extracted from bona fide RGB images, and a fusion approach which achieves competitive detection performance.
Submitted 14 November, 2022;
originally announced November 2022.
-
Optimizing Bandwidth Sharing for Real-time Traffic in Wireless Networks
Authors:
Sushi Anna George,
Vinay Joseph
Abstract:
We consider the problem of enhancing the delivery of real-time traffic in wireless networks using bandwidth sharing between operators. A key characteristic of real-time traffic is that a packet has to be delivered within a delay deadline for it to be useful. The abundance of real-time traffic is evident in the popularity of applications like video and audio conferencing, which increased significantly during the COVID-19 period. We propose a sharing and scheduling policy which involves dynamically sharing a portion of one operator's bandwidth with another operator. We provide strong theoretical guarantees for the policy. We also evaluate its performance via extensive simulations, which show significant improvements of up to 90% in the ability to carry real-time traffic when using the policy. We also explore how the improvements from bandwidth sharing depend on the amount of sharing, and on additional traffic characteristics.
Submitted 24 November, 2022; v1 submitted 12 November, 2022;
originally announced November 2022.
-
Prepended Domain Transformer: Heterogeneous Face Recognition without Bells and Whistles
Authors:
Anjith George,
Amir Mohammadi,
Sebastien Marcel
Abstract:
Heterogeneous Face Recognition (HFR) refers to matching face images captured in different domains, such as thermal to visible images (VIS), sketches to visible images, near-infrared to visible, and so on. This is particularly useful in matching visible spectrum images to images captured from other modalities. Though highly useful, HFR is challenging because of the domain gap between the source and target domain. Often, large-scale paired heterogeneous face image datasets are absent, preventing training models specifically for the heterogeneous task. In this work, we propose a surprisingly simple yet very effective method for matching face images across different sensing modalities. The core idea of the proposed approach is to add a novel neural network block called Prepended Domain Transformer (PDT) in front of a pre-trained face recognition (FR) model to address the domain gap. Retraining this new block with few paired samples in a contrastive learning setup was enough to achieve state-of-the-art performance in many HFR benchmarks. The PDT blocks can be retrained for several source-target combinations using the proposed general framework. The proposed approach is architecture agnostic, meaning it can be added to any pre-trained FR model. Further, the approach is modular and the new block can be trained with a minimal set of paired samples, making it much easier for practical deployment. The source code and protocols will be made available publicly.
Submitted 12 October, 2022;
originally announced October 2022.
-
Minimizing Human Assistance: Augmenting a Single Demonstration for Deep Reinforcement Learning
Authors:
Abraham George,
Alison Bartsch,
Amir Barati Farimani
Abstract:
The use of human demonstrations in reinforcement learning has proven to significantly improve agent performance. However, any requirement for a human to manually 'teach' the model is somewhat antithetical to the goals of reinforcement learning. This paper attempts to minimize human involvement in the learning process while retaining the performance advantages by using a single human example collected through a simple-to-use virtual reality simulation to assist with RL training. Our method augments a single demonstration to generate numerous human-like demonstrations that, when combined with Deep Deterministic Policy Gradients and Hindsight Experience Replay (DDPG + HER), significantly improve training time on simple tasks and allow the agent to solve a complex task (block stacking) that DDPG + HER alone cannot solve. The model achieves this significant training advantage using a single human example, requiring less than a minute of human input. Moreover, despite learning from a human example, the agent is not constrained to human-level performance, often learning a policy that is significantly different from the human demonstration.
Submitted 18 March, 2023; v1 submitted 22 September, 2022;
originally announced September 2022.
-
Robust Testing in High-Dimensional Sparse Models
Authors:
Anand Jerry George,
Clément L. Canonne
Abstract:
We consider the problem of robustly testing the norm of a high-dimensional sparse signal vector under two different observation models. In the first model, we are given $n$ i.i.d. samples from the distribution $\mathcal{N}\left(θ,I_d\right)$ (with unknown $θ$), of which a small fraction has been arbitrarily corrupted. Under the promise that $\|θ\|_0\le s$, we want to correctly distinguish whether $\|θ\|_2=0$ or $\|θ\|_2>γ$, for some input parameter $γ>0$. We show that any algorithm for this task requires $n=Ω\left(s\log\frac{ed}{s}\right)$ samples, which is tight up to logarithmic factors. We also extend our results to other common notions of sparsity, namely, $\|θ\|_q\le s$ for any $0 < q < 2$. In the second observation model that we consider, the data is generated according to a sparse linear regression model, where the covariates are i.i.d. Gaussian and the regression coefficient (signal) is known to be $s$-sparse. Here too we assume that an $ε$-fraction of the data is arbitrarily corrupted. We show that any algorithm that reliably tests the norm of the regression coefficient requires at least $n=Ω\left(\min(s\log d,{1}/{γ^4})\right)$ samples. Our results show that the complexity of testing in these two settings significantly increases under robustness constraints. This is in line with the recent observations made in robust mean testing and robust covariance testing.
Submitted 4 November, 2022; v1 submitted 16 May, 2022;
originally announced May 2022.
-
Single-Peaked Opinion Updates
Authors:
Robert Bredereck,
Anne-Marie George,
Jonas Israel,
Leon Kellerhals
Abstract:
We consider opinion diffusion for undirected networks with sequential updates when the opinions of the agents are single-peaked preference rankings. Our starting point is the study of preserving single-peakedness. We identify voting rules that, when given a single-peaked profile, output at least one ranking that is single-peaked w.r.t. a single-peaked axis of the input. For such voting rules we show convergence to a stable state of the diffusion process that uses the voting rule as the agents' update rule. Further, we establish an efficient algorithm that maximises the spread of extreme opinions.
Submitted 29 April, 2022;
originally announced April 2022.
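The notion of single-peakedness used in the abstract above can be made concrete with a short check. The sketch below uses the standard definition (a ranking is single-peaked with respect to an axis if, for every k, its top k candidates occupy a contiguous interval of the axis); it does not implement the paper's voting rules or diffusion process, and the function names are hypothetical.

```python
def is_single_peaked(ranking, axis):
    """Check whether `ranking` (most preferred first) is single-peaked on `axis`.

    `axis` lists the same candidates from left to right. The ranking is
    single-peaked if each next candidate extends the interval of already
    ranked candidates by exactly one position to the left or to the right.
    """
    if len(ranking) != len(axis) or set(ranking) != set(axis):
        raise ValueError("ranking and axis must contain the same candidates")
    position = {candidate: index for index, candidate in enumerate(axis)}
    # The interval starts at the peak, i.e. the top-ranked candidate.
    left = right = position[ranking[0]]
    for candidate in ranking[1:]:
        p = position[candidate]
        if p == left - 1:
            left = p
        elif p == right + 1:
            right = p
        else:
            return False
    return True


def profile_is_single_peaked(profile, axis):
    """A profile is single-peaked on `axis` if every ranking in it is."""
    return all(is_single_peaked(ranking, axis) for ranking in profile)


if __name__ == "__main__":
    axis = ["a", "b", "c", "d"]
    print(is_single_peaked(["b", "c", "a", "d"], axis))
    print(is_single_peaked(["a", "d", "b", "c"], axis))
```

In the usage example, the first ranking peaks at "b" and spreads outward along the axis, while the second jumps from "a" to "d" and therefore leaves a gap.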
-
A Comprehensive Evaluation on Multi-channel Biometric Face Presentation Attack Detection
Authors:
Anjith George,
David Geissbuhler,
Sebastien Marcel
Abstract:
The vulnerability against presentation attacks is a crucial problem undermining the wide deployment of face recognition systems. Though presentation attack detection (PAD) systems try to address this problem, the lack of generalization and robustness continues to be a major concern. Several works have shown that using multi-channel PAD systems could alleviate this vulnerability and result in more robust systems. However, there is a wide selection of channels available for a PAD system such as RGB, Near Infrared, Shortwave Infrared, Depth, and Thermal sensors. Having a lot of sensors increases the cost of the system, and therefore an understanding of the performance of different sensors against a wide variety of attacks is necessary while selecting the modalities. In this work, we perform a comprehensive study to understand the effectiveness of various imaging modalities for PAD. The studies are performed on a multi-channel PAD dataset, collected with 14 different sensing modalities considering a wide range of 2D, 3D, and partial attacks. We used the multi-channel convolutional network-based architecture, which uses pixel-wise binary supervision. The model has been evaluated with different combinations of channels, and different image qualities on a variety of challenging known and unknown attack protocols. The results reveal interesting trends and can act as pointers for sensor selection for safety-critical presentation attack detection systems. The source code and protocols to reproduce the results are made publicly available, making it possible to extend this work to other architectures.
Submitted 21 February, 2022;
originally announced February 2022.
-
Liquid Democracy with Ranked Delegations
Authors:
Markus Brill,
Théo Delemazure,
Anne-Marie George,
Martin Lackner,
Ulrike Schmidt-Kraepelin
Abstract:
Liquid democracy is a novel paradigm for collective decision-making that gives agents the choice between casting a direct vote or delegating their vote to another agent. We consider a generalization of the standard liquid democracy setting by allowing agents to specify multiple potential delegates, together with a preference ranking among them. This generalization increases the number of possible delegation paths and enables higher participation rates because fewer votes are lost due to delegation cycles or abstaining agents. In order to implement this generalization of liquid democracy, we need to find a principled way of choosing between multiple delegation paths. In this paper, we provide a thorough axiomatic analysis of the space of delegation rules, i.e., functions assigning a feasible delegation path to each delegating agent. In particular, we prove axiomatic characterizations as well as an impossibility result for delegation rules. We also analyze requirements on delegation rules that have been suggested by practitioners, and introduce novel rules with attractive properties. By performing an extensive experimental analysis on synthetic as well as real-world data, we compare delegation rules with respect to several quantitative criteria relating to the chosen paths and the resulting distribution of voting power. Our experiments reveal that delegation rules can be aligned on a spectrum reflecting an inherent trade-off between competing objectives.
Submitted 14 December, 2021;
originally announced December 2021.
-
Performance evaluation of the QOS provisioning ability of IEEE 802.11e WLAN standard for multimedia traffic
Authors:
Venkata Sitaram. A,
Venkatesh. T. G,
Arun George,
Manivasakan. R,
Bhasker Dappuri
Abstract:
This paper presents an analytical model for the average frame transmission delay and the jitter for the different Access Categories (ACs) of the IEEE 802.11e Enhanced Distributed Channel Access (EDCA) mechanism. The salient features of our model are as follows. As defined by the standard, we consider (1) the virtual collisions among different ACs inside each EDCA station, in addition to external collisions; (2) the effect of priority parameters, such as the minimum and maximum values of the Contention Window (CW) sizes and the Arbitration Inter Frame Space (AIFS); (3) the role of the Transmission Opportunity (TXOP) of different ACs; and (4) the finite number of retrials a packet experiences before being dropped. Our model and analytical results provide an in-depth understanding of the EDCA mechanism and the effect of Quality of Service (QoS) parameters on the performance of the IEEE 802.11e protocol.
Submitted 14 December, 2021;
originally announced December 2021.
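The EDCA priority parameters named in the abstract above (AIFS, CW limits, virtual collisions) can be illustrated with a small sketch. This is not the paper's analytical model. It assumes the commonly cited default EDCA parameter set for an OFDM PHY (SIFS of 16 µs, slot time of 9 µs, aCWmin = 15, aCWmax = 1023); these values should be checked against the standard for the PHY in use, and all function names are hypothetical.

```python
# Assumed OFDM PHY timing (microseconds); verify against the standard in use.
SIFS_US = 16
SLOT_US = 9

# Assumed default EDCA parameters per access category (AC).
EDCA_DEFAULTS = {
    "AC_VO": {"aifsn": 2, "cw_min": 3, "cw_max": 7},       # voice
    "AC_VI": {"aifsn": 2, "cw_min": 7, "cw_max": 15},      # video
    "AC_BE": {"aifsn": 3, "cw_min": 15, "cw_max": 1023},   # best effort
    "AC_BK": {"aifsn": 7, "cw_min": 15, "cw_max": 1023},   # background
}

# Highest priority first; used to resolve virtual (internal) collisions.
PRIORITY = ["AC_VO", "AC_VI", "AC_BE", "AC_BK"]


def aifs_us(ac):
    """AIFS[AC] = SIFS + AIFSN[AC] * slot time."""
    return SIFS_US + EDCA_DEFAULTS[ac]["aifsn"] * SLOT_US


def resolve_virtual_collision(ready_acs):
    """Return (winner, losers) when several ACs in one station finish backoff together.

    The highest-priority AC transmits; the others back off as if they had
    experienced an external collision.
    """
    if not ready_acs:
        raise ValueError("at least one access category must be ready")
    ordered = sorted(set(ready_acs), key=PRIORITY.index)
    return ordered[0], ordered[1:]


def next_contention_window(ac, current_cw):
    """Double the contention window after a collision, capped at CWmax[AC]."""
    return min(2 * (current_cw + 1) - 1, EDCA_DEFAULTS[ac]["cw_max"])


if __name__ == "__main__":
    for ac in PRIORITY:
        print(ac, aifs_us(ac), "us")
    print(resolve_virtual_collision(["AC_BE", "AC_VI"]))
```

The sketch shows why higher-priority categories tend to win access: they wait a shorter AIFS, draw backoff counters from a smaller window, and win any tie inside the station.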
-
hARMS: A Hardware Acceleration Architecture for Real-Time Event-Based Optical Flow
Authors:
Daniel C. Stumpp,
Himanshu Akolkar,
Alan D. George,
Ryad B. Benosman
Abstract:
Event-based vision sensors produce asynchronous event streams with high temporal resolution based on changes in the visual scene. The properties of these sensors allow for accurate and fast calculation of optical flow as events are generated. Existing solutions for calculating optical flow from event data either fail to capture the true direction of motion due to the aperture problem, do not use the high temporal resolution of the sensor, or are too computationally expensive to be run in real time on embedded platforms. In this research, we first present a faster version of our previous algorithm, ARMS (Aperture Robust Multi-Scale flow). The new optimized software version (fARMS) significantly improves throughput on a traditional CPU. Further, we present hARMS, a hardware realization of the fARMS algorithm allowing for real-time computation of true flow on low-power, embedded platforms. The proposed hARMS architecture targets hybrid system-on-chip devices and was designed to maximize configurability and throughput. The hardware architecture and fARMS algorithm were developed with asynchronous neuromorphic processing in mind, abandoning the common use of an event frame and instead operating using only a small history of relevant events, allowing latency to scale independently of the sensor resolution. This change in processing paradigm improved the estimation of flow directions by up to 73% compared to the existing method and yielded a demonstrated hARMS throughput of up to 1.21 Mevent/s on the benchmark configuration selected. This throughput enables real-time performance and makes it the fastest known realization of aperture-robust, event-based optical flow to date.
Submitted 13 December, 2021;
originally announced December 2021.
-
Interactive Inverse Reinforcement Learning for Cooperative Games
Authors:
Thomas Kleine Buening,
Anne-Marie George,
Christos Dimitrakakis
Abstract:
We study the problem of designing autonomous agents that can learn to cooperate effectively with a potentially suboptimal partner while having no access to the joint reward function. This problem is modeled as a cooperative episodic two-agent Markov decision process. We assume control over only the first of the two agents in a Stackelberg formulation of the game, where the second agent is acting so as to maximise expected utility given the first agent's policy. How should the first agent act in order to learn the joint reward function as quickly as possible and so that the joint policy is as close to optimal as possible? We analyse how knowledge about the reward function can be gained in this interactive two-agent scenario. We show that when the learning agent's policies have a significant effect on the transition function, the reward function can be learned efficiently.
Submitted 13 June, 2022; v1 submitted 8 November, 2021;
originally announced November 2021.
-
Cross Modal Focal Loss for RGBD Face Anti-Spoofing
Authors:
Anjith George,
Sebastien Marcel
Abstract:
Automatic methods for detecting presentation attacks are essential to ensure the reliable use of facial recognition technology. Most of the methods available in the literature for presentation attack detection (PAD) fail to generalize to unseen attacks. In recent years, multi-channel methods have been proposed to improve the robustness of PAD systems. Often, only a limited amount of data is available for additional channels, which limits the effectiveness of these methods. In this work, we present a new framework for PAD that uses RGB and depth channels together with a novel loss function. The new architecture uses complementary information from the two modalities while reducing the impact of overfitting. Essentially, a cross-modal focal loss function is proposed to modulate the loss contribution of each channel as a function of the confidence of individual channels. Extensive evaluations on two publicly available datasets demonstrate the effectiveness of the proposed approach.
Submitted 1 March, 2021;
originally announced March 2021.
-
On Meritocracy in Optimal Set Selection
Authors:
Thomas Kleine Buening,
Meirav Segal,
Debabrota Basu,
Christos Dimitrakakis,
Anne-Marie George
Abstract:
Typically, merit is defined with respect to some intrinsic measure of worth. We instead consider a setting where an individual's worth is \emph{relative}: when a Decision Maker (DM) selects a set of individuals from a population to maximise expected utility, it is natural to consider the \emph{Expected Marginal Contribution} (EMC) of each person to the utility. We show that this notion satisfies an axiomatic definition of fairness for this setting. We also show that for certain policy structures, this notion of fairness is aligned with maximising expected utility, while for linear utility functions it is identical to the Shapley value. However, for certain natural policies, such as those that select individuals with a specific set of attributes (e.g. high enough test scores for college admissions), there is a trade-off between meritocracy and utility maximisation. We analyse the effect of constraints on the policy on both utility and fairness in extensive experiments based on college admissions and outcomes in Norwegian universities.
Submitted 9 September, 2022; v1 submitted 23 February, 2021;
originally announced February 2021.
-
An MCMC Method to Sample from Lattice Distributions
Authors:
Anand Jerry George,
Navin Kashyap
Abstract:
We introduce a Markov Chain Monte Carlo (MCMC) algorithm to generate samples from probability distributions supported on a $d$-dimensional lattice $Λ= \mathbf{B}\mathbb{Z}^d$, where $\mathbf{B}$ is a full-rank matrix. Specifically, we consider lattice distributions $P_Λ$ in which the probability at a lattice point is proportional to a given probability density function, $f$, evaluated at that point. To generate samples from $P_Λ$, it suffices to draw samples from a pull-back measure $P_{\mathbb{Z}^d}$ defined on the integer lattice. The probability of an integer lattice point under $P_{\mathbb{Z}^d}$ is proportional to the density function $π= |\det(\mathbf{B})|f\circ \mathbf{B}$. The algorithm we present in this paper for sampling from $P_{\mathbb{Z}^d}$ is based on the Metropolis-Hastings framework. In particular, we use $π$ as the proposal distribution and calculate the Metropolis-Hastings acceptance ratio for a well-chosen target distribution. We can use any method, denoted by ALG, that ideally draws samples from the probability density $π$, to generate a proposed state. The target distribution is a piecewise sigmoidal distribution, chosen such that the coordinate-wise rounding of a sample drawn from the target distribution gives a sample from $P_{\mathbb{Z}^d}$. When ALG is ideal, we show that our algorithm is uniformly ergodic if $-\log(π)$ satisfies a gradient Lipschitz condition.
Submitted 26 January, 2021; v1 submitted 16 January, 2021;
originally announced January 2021.
-
On the Effectiveness of Vision Transformers for Zero-shot Face Anti-Spoofing
Authors:
Anjith George,
Sebastien Marcel
Abstract:
The vulnerability of face recognition systems to presentation attacks has limited their application in security-critical scenarios. Automatic methods of detecting such malicious attempts are essential for the safe use of facial recognition technology. Although various methods have been suggested for detecting such attacks, most of them over-fit the training set and fail to generalize to unseen attacks and environments. In this work, we use transfer learning from the vision transformer model for the zero-shot anti-spoofing task. The effectiveness of the proposed approach is demonstrated through experiments on publicly available datasets. The proposed approach outperforms the state-of-the-art methods in the zero-shot protocols of the HQ-WMCA and SiW-M datasets by a large margin. The model also achieves a significant boost in cross-database performance.
Submitted 2 June, 2021; v1 submitted 16 November, 2020;
originally announced November 2020.
-
Mez: A Messaging System for Latency-Sensitive Multi-Camera Machine Vision at the IoT Edge
Authors:
Anjus George,
Arun Ravindran,
Mattias Mendieta,
Hamed Tabkhi
Abstract:
Mez is a publish-subscribe messaging system for latency-sensitive multi-camera machine vision at the IoT Edge. Unlike existing messaging systems, Mez allows applications to specify latency and application accuracy bounds. Mez implements a network latency controller that dynamically adjusts the video frame quality to satisfy latency and application accuracy requirements. Additionally, the design of Mez utilizes application-domain-specific features to provide low-latency operations. Experimental evaluation on an IoT Edge testbed with a pedestrian detection machine vision application indicates that Mez is able to tolerate latency variations of up to 10x with a worst-case reduction of 4.2% in the application accuracy F1 score metric.
Submitted 28 September, 2020;
originally announced September 2020.
-
The High-Quality Wide Multi-Channel Attack (HQ-WMCA) database
Authors:
Zohreh Mostaani,
Anjith George,
Guillaume Heusch,
David Geissbuhler,
Sebastien Marcel
Abstract:
The High-Quality Wide Multi-Channel Attack (HQ-WMCA) database extends the previous Wide Multi-Channel Attack (WMCA) database with more channels, including color, depth, thermal, infrared (spectra), and short-wave infrared (spectra), as well as a wide variety of attacks.
Submitted 21 September, 2020;
originally announced September 2020.
-
Deep Models and Shortwave Infrared Information to Detect Face Presentation Attacks
Authors:
Guillaume Heusch,
Anjith George,
David Geissbuhler,
Zohreh Mostaani,
Sebastien Marcel
Abstract:
This paper addresses the problem of face presentation attack detection using different image modalities. In particular, the usage of short-wave infrared (SWIR) imaging is considered. Face presentation attack detection is performed using recent models based on Convolutional Neural Networks using only carefully selected SWIR image differences as input. Conducted experiments show superior performance over similar models acting on either color images or on a combination of different modalities (visible, NIR, thermal and depth), as well as over an SVM-based classifier acting on SWIR image differences. Experiments have been carried out on a new public and freely available database containing a wide variety of attacks. Video sequences have been recorded with several sensors, resulting in 14 different streams in the visible, NIR, SWIR and thermal spectra, as well as depth data. The best proposed approach is able to almost perfectly detect all impersonation attacks while ensuring low bonafide classification errors. On the other hand, the obtained results show that obfuscation attacks are more difficult to detect. We hope that the proposed database will foster research on this challenging problem. Finally, all the code and instructions to reproduce the presented experiments are made available to the research community.
Submitted 22 July, 2020;
originally announced July 2020.
-
Learning One Class Representations for Face Presentation Attack Detection using Multi-channel Convolutional Neural Networks
Authors:
Anjith George,
Sebastien Marcel
Abstract:
Face recognition has evolved as a widely used biometric modality. However, its vulnerability to presentation attacks poses a significant security threat. Though presentation attack detection (PAD) methods try to address this issue, they often fail to generalize to unseen attacks. In this work, we propose a new framework for PAD using a one-class classifier, where the representation used is learned with a Multi-Channel Convolutional Neural Network (MCCNN). A novel loss function is introduced, which forces the network to learn a compact embedding for the bonafide class while being far from the representation of attacks. A one-class Gaussian Mixture Model is used on top of these embeddings for the PAD task. The proposed framework introduces a novel approach to learn a robust PAD system from bonafide and available (known) attack classes. This is particularly important as collecting bonafide data and simpler attacks is much easier than collecting a wide variety of expensive attacks. The proposed system is evaluated on the publicly available WMCA multi-channel face PAD database, which contains a wide variety of 2D and 3D attacks. Further, we have performed experiments with MLFP and SiW-M datasets using RGB channels only. Superior performance in unseen attack protocols shows the effectiveness of the proposed approach. Software, data, and protocols to reproduce the results are made available publicly.
Submitted 22 July, 2020;
originally announced July 2020.
-
Can Your Face Detector Do Anti-spoofing? Face Presentation Attack Detection with a Multi-Channel Face Detector
Authors:
Anjith George,
Sebastien Marcel
Abstract:
In a typical face recognition pipeline, the task of the face detector is to localize the face region. However, the face detector localizes regions that look like a face, irrespective of the liveness of the face, which makes the entire system susceptible to presentation attacks. In this work, we try to reformulate the task of the face detector to detect real faces, thus eliminating the threat of presentation attacks. While this task could be challenging with visible spectrum images alone, we leverage the multi-channel information available from off-the-shelf devices (such as color, depth, and infrared channels) to design a multi-channel face detector. The proposed system can be used as a live-face detector, obviating the need for a separate presentation attack detection module and making the system reliable in practice without any additional computational overhead. The main idea is to leverage a single-stage object detection framework, with a joint representation obtained from different channels for the PAD task. We have evaluated our approach on the multi-channel WMCA dataset, which contains a wide variety of attacks, to show the effectiveness of the proposed framework.
Submitted 29 July, 2020; v1 submitted 30 June, 2020;
originally announced June 2020.
-
Deepfake Detection using Spatiotemporal Convolutional Networks
Authors:
Oscar de Lima,
Sean Franklin,
Shreshtha Basu,
Blake Karwoski,
Annet George
Abstract:
Better generative models and larger datasets have led to more realistic fake videos that can fool the human eye but produce temporal and spatial artifacts that deep learning approaches can detect. Most current Deepfake detection methods only use individual video frames and therefore fail to learn from temporal information. We created a benchmark of the performance of spatiotemporal convolutional methods using the Celeb-DF dataset. Our methods outperformed state-of-the-art frame-based detection methods. Code for our paper is publicly available at https://github.com/oidelima/Deepfake-Detection.
Submitted 25 June, 2020;
originally announced June 2020.
-
Leveraging Multimodal Behavioral Analytics for Automated Job Interview Performance Assessment and Feedback
Authors:
Anumeha Agrawal,
Rosa Anil George,
Selvan Sunitha Ravi,
Sowmya Kamath S,
Anand Kumar M
Abstract:
Behavioral cues play a significant part in human communication and cognitive perception. In most professional domains, employee recruitment policies are framed such that both professional skills and personality traits are adequately assessed. Hiring interviews are structured to evaluate expansively a potential employee's suitability for the position - their professional qualifications, interpersonal skills, ability to perform in critical and stressful situations, in the presence of time and resource constraints, etc. Therefore, candidates need to be aware of their positive and negative attributes and be mindful of behavioral cues that might have adverse effects on their success. We propose a multimodal analytical framework that analyzes the candidate in an interview scenario and provides feedback for predefined labels such as engagement, speaking rate, eye contact, etc. We perform a comprehensive analysis that includes the interviewee's facial expressions, speech, and prosodic information, using the video, audio, and text transcripts obtained from the recorded interview. We use these multimodal data sources to construct a composite representation, which is used for training machine learning classifiers to predict the class labels. Such analysis is then used to provide constructive feedback to the interviewee for their behavioral cues and body language. Experimental validation showed that the proposed methodology achieved promising results.
Submitted 16 June, 2020; v1 submitted 14 June, 2020;
originally announced June 2020.
-
Applying the Decisiveness and Robustness Metrics to Convolutional Neural Networks
Authors:
Christopher A. George,
Eduardo A. Barrera,
Kenric P. Nelson
Abstract:
We review three recently-proposed classifier quality metrics and consider their suitability for large-scale classification challenges such as applying convolutional neural networks to the 1000-class ImageNet dataset. These metrics, referred to as the "geometric accuracy," "decisiveness," and "robustness," are based on the generalized mean ($ρ$ equals 0, 1, and -2/3, respectively) of the classifier's self-reported and measured probabilities of correct classification. We also propose some minor clarifications to standardize the metric definitions. With these updates, we show some examples of calculating the metrics using deep convolutional neural networks (AlexNet and DenseNet) acting on large datasets (the German Traffic Sign Recognition Benchmark and ImageNet).
Submitted 29 May, 2020;
originally announced June 2020.
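The abstract above ties the three metrics to generalized means with exponents 0, 1, and -2/3. The following is a minimal Python sketch of a generalized (power) mean evaluated at those three exponents. It is an illustration under stated assumptions, not the authors' reference implementation: the function name `generalized_mean` and the sample probabilities are hypothetical, and the inputs are assumed to be strictly positive probabilities assigned to the correct class.

```python
import math

def generalized_mean(probs, r):
    """Power mean of strictly positive values with exponent r.

    r = 0 is handled as the limiting case, the geometric mean.
    (Hypothetical helper for illustration only.)
    """
    if not probs or any(p <= 0 for p in probs):
        raise ValueError("probs must be non-empty and strictly positive")
    n = len(probs)
    if r == 0:
        # Limit of the power mean as r -> 0: exp of the mean log.
        return math.exp(sum(math.log(p) for p in probs) / n)
    return (sum(p ** r for p in probs) / n) ** (1.0 / r)

# Hypothetical probabilities a classifier assigned to the true class.
probs = [0.9, 0.8, 0.1]

geometric_accuracy = generalized_mean(probs, 0)   # exponent 0
decisiveness = generalized_mean(probs, 1)         # exponent 1 (arithmetic mean)
robustness = generalized_mean(probs, -2.0 / 3.0)  # exponent -2/3
```

Because a power mean is non-decreasing in its exponent, the three values are ordered robustness ≤ geometric accuracy ≤ decisiveness, so a single low-probability prediction pulls the negative-exponent value down the most.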
-
Discrimination Among Multiple Cutaneous and Proprioceptive Hand Percepts Evoked by Nerve Stimulation with Utah Slanted Electrode Arrays in Human Amputees
Authors:
David M. Page,
Suzanne M. Wendelken,
Tyler S. Davis,
David T. Kluger,
Douglas T. Hutchinson,
Jacob A. George,
Gregory A. Clark
Abstract:
Objective: This paper aims to demonstrate functional discriminability among restored hand sensations with different locations, qualities, and intensities that are evoked by microelectrode stimulation of residual afferent fibers in human amputees. Methods: We implanted a Utah Slanted Electrode Array (USEA) in the median and ulnar residual arm nerves of three transradial amputees and delivered stimulation via different electrodes and at different frequencies to produce various locations, qualities, and intensities of sensation on the missing hand. Blind discrimination trials were performed to determine how well subjects could discriminate among these restored sensations. Results: Subjects discriminated among restored sensory percepts with varying cutaneous and proprioceptive locations, qualities, and intensities in blind trials, including discrimination among up to 10 different location-intensity combinations (15/30 successes, p < 0.0005). Variations in the site of stimulation within the nerve, via electrode selection, enabled discrimination among up to 5 locations and qualities (35/35 successes, p < 0.0001). Variations in the stimulation frequency enabled discrimination among 4 different intensities at the same location (13/20 successes, p < 0.005). One subject discriminated among simultaneous, alternating, and isolated stimulation of two different USEA electrodes, as may be desired during multi-sensor closed-loop prosthesis use (20/25 successes, p < 0.001). Conclusion: USEA stimulation enables encoding of a diversity of functionally discriminable sensations with different locations, qualities, and intensities. Significance: These percepts provide a potentially rich source of sensory feedback that may enhance performance and embodiment during multi-sensor, closed-loop prosthesis use.
Submitted 7 March, 2020;
originally announced March 2020.
-
Inexpensive surface electromyography sleeve with consistent electrode placement enables dexterous and stable prosthetic control through deep learning
Authors:
Jacob A. George,
Anna Neibling,
Michael D. Paskett,
Gregory A. Clark
Abstract:
The dexterity of conventional myoelectric prostheses is limited in part by the small datasets used to train the control algorithms. Variations in surface electrode positioning make it difficult to collect consistent data and to estimate motor intent reliably over time. To address these challenges, we developed an inexpensive, easy-to-don sleeve that can record robust and repeatable surface electromyography from 32 embedded monopolar electrodes. Embedded grommets are used to consistently align the sleeve with natural skin markings (e.g., moles, freckles, scars). The sleeve can be manufactured in a few hours for less than $60. Data from seven intact participants show the sleeve provides a signal-to-noise ratio of 14, a don-time under 11 seconds, and sub-centimeter precision for electrode placement. Furthermore, in a case study with one intact participant, we use the sleeve to demonstrate that neural networks can provide simultaneous and proportional control of six degrees of freedom, even 263 days after initial algorithm training. We also highlight that consistent recordings, accumulated over time to establish a large dataset, significantly improve dexterity. These results suggest that deep learning with a 74-layer neural network can substantially improve the dexterity and stability of myoelectric prosthetic control, and that deep-learning techniques can be readily instantiated and further validated through inexpensive sleeves/sockets with consistent recording locations.
Submitted 28 February, 2020;
originally announced March 2020.