BioNeRF: Biologically Plausible Neural Radiance Fields for View Synthesis

Authors: Leandro A. Passos, Douglas Rodrigues, Danilo Jodas, Kelton A. P. Costa, Ahsan Adeel, João Paulo Papa

Abstract: This paper presents BioNeRF, a biologically plausible architecture that models scenes in a 3D representation and synthesizes new views through radiance fields. Since NeRF relies on the network weights to store the scene's 3-dimensional representation, BioNeRF implements a cognitive-inspired mechanism that fuses inputs from multiple sources into a memory-like structure, improving the storing capacity and extracting more intrinsic and correlated information. BioNeRF also mimics a behavior observed in pyramidal cells concerning contextual information, in which the memory is provided as the context and combined with the inputs of two subsequent neural models, one responsible for producing the volumetric densities and the other the colors used to render the scene. Experimental results show that BioNeRF outperforms state-of-the-art results concerning a quality measure that encodes human perception in two datasets: real-world images and synthetic data.

Submitted 25 March, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

arXiv:2310.13490 [pdf, other]

doi 10.1007/978-3-031-45389-2_22

Feature Selection and Hyperparameter Fine-tuning in Artificial Neural Networks for Wood Quality Classification

Authors: Mateus Roder, Leandro Aparecido Passos, João Paulo Papa, André Luis Debiaso Rossi

Abstract: Quality classification of wood boards is an essential task in the sawmill industry, which is still usually performed by human operators in small to medium-sized companies in developing countries. Machine learning algorithms have been successfully employed to investigate the problem, offering a more affordable alternative compared to other solutions. However, such approaches usually present some drawbacks regarding the proper selection of their hyperparameters. Moreover, the models are susceptible to the features extracted from wood board images, which influence the induction of the model and, consequently, its generalization power. Therefore, in this paper, we investigate the problem of simultaneously tuning the hyperparameters of an artificial neural network (ANN) as well as selecting a subset of characteristics that better describes the wood board quality. Experiments were conducted over a private dataset composed of images obtained from a sawmill industry and described using different feature descriptors. The predictive performance of the model was compared against five baseline methods as well as a random search, performing either ANN hyperparameter tuning and feature selection. Experimental results suggest that hyperparameters should be adjusted according to the feature set, or the features should be selected considering the hyperparameter values. In summary, the best predictive performance, i.e., a balanced accuracy of $0.80$, was achieved in two distinct scenarios: (i) performing only feature selection, and (ii) performing both tasks concomitantly. Thus, we suggest that at least one of the two approaches should be considered in the context of industrial applications.

Submitted 20 October, 2023; originally announced October 2023.

arXiv:2301.13671 [pdf, ps, other]

doi 10.1007/978-3-030-93420-0_11

Enhancing Hyper-To-Real Space Projections Through Euclidean Norm Meta-Heuristic Optimization

Authors: Luiz C. F. Ribeiro, Mateus Roder, Gustavo H. de Rosa, Leandro A. Passos, João P. Papa

Abstract: The continuous computational power growth in the last decades has made solving several optimization problems significant to humankind a tractable task; however, tackling some of them remains a challenge due to the overwhelming amount of candidate solutions to be evaluated, even by using sophisticated algorithms. In such a context, a set of nature-inspired stochastic methods, called meta-heuristic optimization, can provide robust approximate solutions to different kinds of problems with a small computational burden, such as derivative-free real function optimization. Nevertheless, these methods may converge to inadequate solutions if the function landscape is too harsh, e.g., enclosing too many local optima. Previous works addressed this issue by employing a hypercomplex representation of the search space, like quaternions, where the landscape becomes smoother and supposedly easier to optimize. Under this approach, meta-heuristic computations happen in the hypercomplex space, whereas variables are mapped back to the real domain before function evaluation. Despite this latter operation being performed by the Euclidean norm, we have found that after the optimization procedure has finished, it is usually possible to obtain even better solutions by employing the Minkowski $p$-norm instead and fine-tuning $p$ through an auxiliary sub-problem with negligible additional cost and no hyperparameters. Such behavior was observed in eight well-established benchmarking functions, thus fostering a new research direction for hypercomplex meta-heuristic optimization.

Submitted 31 January, 2023; originally announced January 2023.

arXiv:2212.10411 [pdf, other]

doi 10.1109/IGARSS47720.2021.9554277

DDIPNet and DDIPNet+: Discriminant Deep Image Prior Networks for Remote Sensing Image Classification

Authors: Daniel F. S. Santos, Rafael G. Pires, Leandro A. Passos, João P. Papa

Abstract: Research on remote sensing image classification significantly impacts essential human routine tasks such as urban planning and agriculture. Nowadays, the rapid advance in technology and the availability of many high-quality remote sensing images create a demand for reliable automation methods. The current paper proposes two novel deep learning-based architectures for image classification purposes, i.e., the Discriminant Deep Image Prior Network and the Discriminant Deep Image Prior Network+, which combine Deep Image Prior and Triplet Networks learning strategies. Experiments conducted over three well-known public remote sensing image datasets achieved state-of-the-art results, evidencing the effectiveness of using deep image priors for remote sensing image classification.

Submitted 20 December, 2022; originally announced December 2022.

Comments: Published in: 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS

arXiv:2212.02507 [pdf, other]

doi 10.1109/ICPR56361.2022.9956112

FEMa-FS: Finite Element Machines for Feature Selection

Authors: Lucas Biaggi, João P. Papa, Kelton A. P. Costa, Danillo R. Pereira, Leandro A. Passos

Abstract: Identifying anomalies has become one of the primary strategies towards security and protection procedures in computer networks. In this context, machine learning-based methods emerge as an elegant solution to identify such scenarios and learn irrelevant information so that a reduction in the identification time and possible gain in accuracy can be obtained. This paper proposes a novel feature selection approach called Finite Element Machines for Feature Selection (FEMa-FS), which uses the framework of finite elements to identify the most relevant information from a given dataset. Although FEMa-FS can be applied to any application domain, it has been evaluated in the context of anomaly detection in computer networks. The outcomes over two datasets showed promising results.

Submitted 5 December, 2022; originally announced December 2022.

arXiv:2211.17045 [pdf, other]

doi 10.1109/SSCI50451.2021.9660128

From Actions to Events: A Transfer Learning Approach Using Improved Deep Belief Networks

Authors: Mateus Roder, Jurandy Almeida, Gustavo H. de Rosa, Leandro A. Passos, André L. D. Rossi, João P. Papa

Abstract: In the last decade, exponential data growth supplied machine learning-based algorithms' capacity and enabled their usage in daily-life activities. Additionally, such an improvement is partially explained due to the advent of deep learning techniques, i.e., stacks of simple architectures that end up in more complex models. Although both factors produce outstanding results, they also pose drawbacks regarding the learning process, as training complex models over large datasets is expensive and time-consuming. Such a problem is even more evident when dealing with video analysis. Some works have considered transfer learning or domain adaptation, i.e., approaches that map the knowledge from one domain to another, to ease the training burden, yet most of them operate over individual or small blocks of frames. This paper proposes a novel approach to map the knowledge from action recognition to event recognition using an energy-based model, denoted as Spectral Deep Belief Network. Such a model can process all frames simultaneously, carrying spatial and temporal information through the learning process. The experimental results conducted over two public video datasets, the HMDB-51 and the UCF-101, depict the effectiveness of the proposed model and its reduced computational burden when compared to traditional energy-based models, such as Restricted Boltzmann Machines and Deep Belief Networks.

Submitted 30 November, 2022; originally announced November 2022.

arXiv:2209.12822 [pdf, ps, other]

doi 10.1109/IWSSIP55020.2022.9854419

ComplexWoundDB: A Database for Automatic Complex Wound Tissue Categorization

Authors: Talita A. Pereira, Regina C. Popim, Leandro A. Passos, Danillo R. Pereira, Clayton R. Pereira, João P. Papa

Abstract: Complex wounds usually face partial or total loss of skin thickness, healing by secondary intention. They can be acute or chronic, figuring infections, ischemia and tissue necrosis, and association with systemic diseases. Research institutes around the globe report countless cases, ending up in a severe public health problem, for they involve human resources (e.g., physicians and health care professionals) and negatively impact life quality. This paper presents a new database for automatically categorizing complex wounds with five categories, i.e., non-wound area, granulation, fibrinoid tissue, dry necrosis, and hematoma. The images comprise different scenarios with complex wounds caused by pressure, vascular ulcers, diabetes, burn, and complications after surgical interventions. The dataset, called ComplexWoundDB, is unique because it figures pixel-level classifications from $27$ images obtained in the wild, i.e., images are collected at the patients' homes, labeled by four health professionals. Further experiments with distinct machine learning techniques evidence the challenges in addressing the problem of computer-aided complex wound tissue categorization. The manuscript sheds light on future directions in the area, with a detailed comparison among other databases widely used in the literature.

Submitted 26 September, 2022; originally announced September 2022.

arXiv:2209.12647 [pdf, ps, other]

doi 10.1109/IWSSIP55020.2022.9854445

PL-kNN: A Parameterless Nearest Neighbors Classifier

Authors: Danilo Samuel Jodas, Leandro Aparecido Passos, Ahsan Adeel, João Paulo Papa

Abstract: Demands for minimum parameter setup in machine learning models are desirable to avoid time-consuming optimization processes. The $k$-Nearest Neighbors is one of the most effective and straightforward models employed in numerous problems. Despite its well-known performance, it requires the value of $k$ for specific data distribution, thus demanding expensive computational efforts. This paper proposes a $k$-Nearest Neighbors classifier that bypasses the need to define the value of $k$. The model computes the $k$ value adaptively considering the data distribution of the training set. We compared the proposed model against the standard $k$-Nearest Neighbors classifier and two parameterless versions from the literature. Experiments over 11 public datasets confirm the robustness of the proposed approach, for the obtained results were similar or even better than its counterpart versions.

Submitted 30 September, 2022; v1 submitted 26 September, 2022; originally announced September 2022.

arXiv:2209.03275 [pdf, other]

Multimodal Speech Enhancement Using Burst Propagation

Authors: Mohsin Raza, Leandro A. Passos, Ahmed Khubaib, Ahsan Adeel

Abstract: This paper proposes MBURST, a novel multimodal solution for audio-visual speech enhancement that considers the most recent neurological discoveries regarding pyramidal cells of the prefrontal cortex and other brain regions. The so-called burst propagation implements several criteria to address the credit assignment problem in a more biologically plausible manner: steering the sign and magnitude of plasticity through feedback, multiplexing the feedback and feedforward information across layers through different weight connections, approximating feedback and feedforward connections, and linearizing the feedback signals. MBURST benefits from such capabilities to learn correlations between the noisy signal and the visual stimuli, thus attributing meaning to the speech by amplifying relevant information and suppressing noise. Experiments conducted over a Grid Corpus and CHiME3-based dataset show that MBURST can reproduce similar mask reconstructions to the multimodal backpropagation-based baseline while demonstrating outstanding energy efficiency management, reducing the neuron firing rates to values up to $70\%$ lower. Such a feature implies more sustainable implementations, suitable and desirable for hearing aids or any other similar embedded systems.

Submitted 5 February, 2024; v1 submitted 7 September, 2022; originally announced September 2022.

arXiv:2206.02671 [pdf, ps, other]

doi 10.1016/j.neucom.2022.11.081

Canonical Cortical Graph Neural Networks and its Application for Speech Enhancement in Audio-Visual Hearing Aids

Authors: Leandro A. Passos, João Paulo Papa, Amir Hussain, Ahsan Adeel

Abstract: Despite the recent success of machine learning algorithms, most models face drawbacks when considering more complex tasks requiring interaction between different sources, such as multimodal input data and logical time sequences. On the other hand, the biological brain is highly sharpened in this sense, empowered to automatically manage and integrate such streams of information. In this context, this work draws inspiration from recent discoveries in brain cortical circuits to propose a more biologically plausible self-supervised machine learning approach. This combines multimodal information using intra-layer modulations together with Canonical Correlation Analysis, and a memory mechanism to keep track of temporal data, the overall approach termed Canonical Cortical Graph Neural Networks. This is shown to outperform recent state-of-the-art models in terms of clean audio reconstruction and energy efficiency for a benchmark audio-visual speech dataset. The enhanced performance is demonstrated through a reduced and smoother neuron firing rate distribution, suggesting that the proposed model is amenable for speech enhancement in future audio-visual hearing aid devices.

Submitted 31 January, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

arXiv:2204.06635 [pdf, other]

doi 10.1109/TFUZZ.2019.2949771

A Novel Approach for Optimum-Path Forest Classification Using Fuzzy Logic

Authors: Renato W. R. de Souza, João V. C. de Oliveira, Leandro A. Passos, Weiping Ding, João P. Papa, Victor Hugo C. de Albuquerque

Abstract: In the past decades, fuzzy logic has played an essential role in many research areas. Alongside, graph-based pattern recognition has proven to be of great importance due to its flexibility in partitioning the feature space using the background from graph theory. Some years ago, a new framework for supervised, semi-supervised, and unsupervised learning named Optimum-Path Forest (OPF) was proposed with competitive results in several applications, besides comprising a low computational burden. In this paper, we propose the Fuzzy Optimum-Path Forest, an improved version of the standard OPF classifier that learns the samples' membership in an unsupervised fashion, which are further incorporated during supervised training. Such information is used to identify the most relevant training samples, thus improving the classification step. Experiments conducted over twelve public datasets highlight the robustness of the proposed approach, which behaves similarly to standard OPF in worst-case scenarios.

Submitted 13 April, 2022; originally announced April 2022.

Journal ref: IEEE Transactions on Fuzzy Systems 28.12 (2019): 3076-3086

arXiv:2203.13856 [pdf, other]

doi 10.1016/j.bspc.2024.106263

Robust deep learning for eye fundus images: Bridging real and synthetic data for enhancing generalization

Authors: Guilherme C. Oliveira, Gustavo H. Rosa, Daniel C. G. Pedronette, João P. Papa, Himeesh Kumar, Leandro A. Passos, Dinesh Kumar

Abstract: Deep learning applications for assessing medical images are limited because the datasets are often small and imbalanced. The use of synthetic data has been proposed in the literature, but neither a robust comparison of the different methods nor generalizability has been reported. Our approach integrates a retinal image quality assessment model and StyleGAN2 architecture to enhance Age-related Macular Degeneration (AMD) detection capabilities and improve generalizability. This work compares ten different Generative Adversarial Network (GAN) architectures to generate synthetic eye-fundus images with and without AMD. We combined subsets of three public databases (iChallenge-AMD, ODIR-2019, and RIADD) to form a single training and test set. We employed the STARE dataset for external validation, ensuring a comprehensive assessment of the proposed approach. The results show that StyleGAN2 reached the lowest Frechet Inception Distance (166.17), and clinicians could not accurately differentiate between real and synthetic images. ResNet-18 architecture obtained the best performance with 85% accuracy and outperformed the two human experts (80% and 75%) in detecting AMD fundus images. The accuracy rates were 82.8% for the test set and 81.3% for the STARE dataset, demonstrating the model's generalizability. The proposed methodology for synthetic medical image generation has been validated for robustness and accuracy, with free access to its code for further research and development in this field.

Submitted 3 April, 2024; v1 submitted 25 March, 2022; originally announced March 2022.

Comments: Accepted by the Biomedical Signal Processing and Control

Journal ref: Biomedical Signal Processing and Control, 94 (2024), 106263

arXiv:2203.02740 [pdf, ps, other]

MaxDropoutV2: An Improved Method to Drop out Neurons in Convolutional Neural Networks

Authors: Claudio Filipi Goncalves do Santos, Mateus Roder, Leandro A. Passos, João P. Papa

Abstract: In the last decade, exponential data growth supplied the machine learning-based algorithms' capacity and enabled their usage in daily life activities. Additionally, such an improvement is partially explained due to the advent of deep learning techniques, i.e., stacks of simple architectures that end up in more complex models. Although both factors produce outstanding results, they also pose drawbacks regarding the learning process since training complex models denotes an expensive task and results are prone to overfit the training data. A supervised regularization technique called MaxDropout was recently proposed to tackle the latter, providing several improvements concerning traditional regularization approaches. In this paper, we present its improved version called MaxDropoutV2. Results considering two public datasets show that the model performs faster than the standard version and, in most cases, provides more accurate results.

Submitted 5 March, 2022; originally announced March 2022.

arXiv:2202.08934 [pdf, other]

doi 10.1016/j.knosys.2022.108445

Handling Imbalanced Datasets Through Optimum-Path Forest

Authors: Leandro Aparecido Passos, Danilo S. Jodas, Luiz C. F. Ribeiro, Marco Akio, Andre Nunes de Souza, João Paulo Papa

Abstract: In the last decade, machine learning-based approaches became capable of performing a wide range of complex tasks sometimes better than humans, demanding a fraction of the time. Such an advance is partially due to the exponential growth in the amount of data available, which makes it possible to extract trustworthy real-world information from them. However, such data is generally imbalanced since some phenomena are more likely than others. Such a behavior yields considerable influence on the machine learning model's performance since it becomes biased on the more frequent data it receives. Despite the considerable amount of machine learning methods, a graph-based approach has attracted considerable notoriety due to the outstanding performance over many applications, i.e., the Optimum-Path Forest (OPF). In this paper, we propose three OPF-based strategies to deal with the imbalance problem: the $\text{O}^2$PF and the OPF-US, which are novel approaches for oversampling and undersampling, respectively, as well as a hybrid strategy combining both approaches. The paper also introduces a set of variants concerning the strategies mentioned above. Results compared against several state-of-the-art techniques over public and private datasets confirm the robustness of the proposed approaches.

Submitted 17 February, 2022; originally announced February 2022.

arXiv:2202.06095 [pdf, other]

doi 10.1111/EXSY.13570

A Review of Deep Learning-based Approaches for Deepfake Content Detection

Authors: Leandro A. Passos, Danilo Jodas, Kelton A. P. da Costa, Luis A. Souza Júnior, Douglas Rodrigues, Javier Del Ser, David Camacho, João Paulo Papa

Abstract: Recent advancements in deep learning generative models have raised concerns as they can create highly convincing counterfeit images and videos. This poses a threat to people's integrity and can lead to social instability. To address this issue, there is a pressing need to develop new computational models that can efficiently detect forged content and alert users to potential image and video manipulations. This paper presents a comprehensive review of recent studies for deepfake content detection using deep learning-based approaches. We aim to broaden the state-of-the-art research by systematically reviewing the different categories of fake content detection. Furthermore, we report the advantages and drawbacks of the examined works, and prescribe several future directions towards the issues and shortcomings still unsolved on deepfake detection.

Submitted 15 February, 2024; v1 submitted 12 February, 2022; originally announced February 2022.

arXiv:2202.04528 [pdf, other]

doi 10.1016/j.inffus.2022.09.006

Multimodal Audio-Visual Information Fusion using Canonical-Correlated Graph Neural Network for Energy-Efficient Speech Enhancement

Authors: Leandro Aparecido Passos, João Paulo Papa, Javier Del Ser, Amir Hussain, Ahsan Adeel

Abstract: This paper proposes a novel multimodal self-supervised architecture for energy-efficient audio-visual (AV) speech enhancement that integrates Graph Neural Networks with canonical correlation analysis (CCA-GNN). The proposed approach lays its foundations on a state-of-the-art CCA-GNN that learns representative embeddings by maximizing the correlation between pairs of augmented views of the same input while decorrelating disconnected features. The key idea of the conventional CCA-GNN involves discarding augmentation-variant information and preserving augmentation-invariant information while preventing capturing of redundant information. Our proposed AV CCA-GNN model deals with multimodal representation learning context. Specifically, our model improves contextual AV speech processing by maximizing canonical correlation from augmented views of the same channel and canonical correlation from audio and visual embeddings. In addition, it proposes a positional node encoding that considers a prior-frame sequence distance instead of a feature-space representation when computing the node's nearest neighbors, introducing temporal information in the embeddings through the neighborhood's connectivity. Experiments conducted on the benchmark CHiME3 dataset show that our proposed prior frame-based AV CCA-GNN ensures better feature learning in the temporal context, leading to more energy-efficient speech reconstruction than state-of-the-art CCA-GNN and multilayer perceptron.

Submitted 16 September, 2022; v1 submitted 9 February, 2022; originally announced February 2022.

arXiv:2201.03323 [pdf, other]

doi 10.1145/3490235

Gait Recognition Based on Deep Learning: A Survey

Authors: Claudio Filipi Gonçalves dos Santos, Diego de Souza Oliveira, Leandro A. Passos, Rafael Gonçalves Pires, Daniel Felipe Silva Santos, Lucas Pascotti Valem, Thierry P. Moreira, Marcos Cleison S. Santana, Mateus Roder, João Paulo Papa, Danilo Colombo

Abstract: In general, biometry-based control systems may not rely on individual expected behavior or cooperation to operate appropriately. Instead, such systems should be aware of malicious procedures for unauthorized access attempts. Some works available in the literature suggest addressing the problem through gait recognition approaches. Such methods aim at identifying human beings through intrinsic perceptible features, despite dressed clothes or accessories. Although the issue denotes a relatively long-time challenge, most of the techniques developed to handle the problem present several drawbacks related to feature extraction and low classification rates, among other issues. However, deep learning-based approaches recently emerged as a robust set of tools to deal with virtually any image and computer-vision related problem, providing paramount results for gait recognition as well. Therefore, this work provides a surveyed compilation of recent works regarding biometric detection through gait recognition with a focus on deep learning approaches, emphasizing their benefits, and exposing their weaknesses. Besides, it also presents categorized and characterized descriptions of the datasets, approaches, and architectures employed to tackle associated constraints.

Submitted 10 January, 2022; originally announced January 2022.

arXiv:2101.06749 [pdf, other]

doi 10.1007/978-3-030-61401-0_22

A Layer-Wise Information Reinforcement Approach to Improve Learning in Deep Belief Networks

Authors: Mateus Roder, Leandro A. Passos, Luiz Carlos Felix Ribeiro, Clayton Pereira, João Paulo Papa

Abstract: With the advent of deep learning, the number of works proposing new methods or improving existent ones has grown exponentially in the last years. In this scenario, "very deep" models were emerging, once they were expected to extract more intrinsic and abstract features while supporting a better performance. However, such models suffer from the gradient vanishing problem, i.e., backpropagation values become too close to zero in their shallower layers, ultimately causing learning to stagnate. Such an issue was overcome in the context of convolution neural networks by creating "shortcut connections" between layers, in a so-called deep residual learning framework. Nonetheless, a very popular deep learning technique called Deep Belief Network still suffers from gradient vanishing when dealing with discriminative tasks. Therefore, this paper proposes the Residual Deep Belief Network, which considers the information reinforcement layer-by-layer to improve the feature extraction and knowledge retaining, that support better discriminative performance. Experiments conducted over three public datasets demonstrate its robustness concerning the task of binary image classification.

Submitted 17 January, 2021; originally announced January 2021.

arXiv:2101.06747 [pdf, other]

doi 10.1007/978-3-030-61401-0_23

Intestinal Parasites Classification Using Deep Belief Networks

Authors: Mateus Roder, Leandro A. Passos, Luiz Carlos Felix Ribeiro, Barbara Caroline Benato, Alexandre Xavier Falcão, João Paulo Papa

Abstract: Currently, approximately $4$ billion people are infected by intestinal parasites worldwide. Diseases caused by such infections constitute a public health problem in most tropical countries, leading to physical and mental disorders, and even death to children and immunodeficient individuals. Although subjected to high error rates, human visual inspection is still in charge of the vast majority of clinical diagnoses. In the past years, some works addressed intelligent computer-aided intestinal parasites classification, but they usually suffer from misclassification due to similarities between parasites and fecal impurities. In this paper, we introduce Deep Belief Networks to the context of automatic intestinal parasites classification. Experiments conducted over three datasets composed of eggs, larvae, and protozoa provided promising results, even considering unbalanced classes and also fecal impurities.

Submitted 17 January, 2021; originally announced January 2021.

arXiv:2101.05795 [pdf, ps, other]

doi 10.1016/j.asoc.2019.105717

A Metaheuristic-Driven Approach to Fine-Tune Deep Boltzmann Machines

Authors: Leandro Aparecido Passos, João Paulo Papa

Abstract: Deep learning techniques, such as Deep Boltzmann Machines (DBMs), have received considerable attention over the past years due to the outstanding results concerning a variable range of domains. One of the main shortcomings of these techniques involves the choice of their hyperparameters, since they have a significant impact on the final results. This work addresses the issue of fine-tuning hyperparameters of Deep Boltzmann Machines using metaheuristic optimization techniques with different backgrounds, such as swarm intelligence, memory- and evolutionary-based approaches. Experiments conducted in three public datasets for binary image reconstruction showed that metaheuristic techniques can obtain reasonable results.

Submitted 14 January, 2021; originally announced January 2021.

Comments: 30 pages, 7 figures

Journal ref: Applied Soft Computing 97 (2020): 105717

arXiv:2101.05775 [pdf, other]

doi 10.1109/CBMS49503.2020.00100

$\text{O}^2$PF: Oversampling via Optimum-Path Forest for Breast Cancer Detection

Authors: Leandro Aparecido Passos, Danilo Samuel Jodas, Luiz C. F. Ribeiro, Thierry Pinheiro, João P. Papa

Abstract: Breast cancer is among the most deadly diseases, distressing mostly women worldwide. Although traditional methods for detection have presented themselves as valid for the task, they still commonly present low accuracies and demand considerable time and effort from professionals. Therefore, a computer-aided diagnosis (CAD) system capable of providing early detection becomes hugely desirable. In the last decade, machine learning-based techniques have been of paramount importance in this context, since they are capable of extracting essential information from data and reasoning about it. However, such approaches still suffer from imbalanced data, specifically on medical issues, where the number of healthy people samples is, in general, considerably higher than the number of patients. Therefore this paper proposes the $\text{O}^2$PF, a data oversampling method based on the unsupervised Optimum-Path Forest Algorithm. Experiments conducted over the full oversampling scenario state the robustness of the model, which is compared against three well-established oversampling methods considering three breast cancer and three general-purpose tasks for medical issues datasets.

Submitted 14 January, 2021; originally announced January 2021.

Comments: 6 pages, 3 figures. 2020 IEEE 33rd International Symposium on Computer-Based Medical Systems (CBMS)

Showing 1–21 of 21 results for author: Passos, L A