arXiv:2401.06755 [pdf, other]

Solving the Discretised Multiphase Flow Equations with Interface Capturing on Structured Grids Using Machine Learning Libraries

Authors: Boyang Chen, Claire E. Heaney, Jefferson L. M. A. Gomes, Omar K. Matar, Christopher C. Pain

Abstract: This paper solves the discretised multiphase flow equations using tools and methods from machine-learning libraries. The idea comes from the observation that convolutional layers can be used to express a discretisation as a neural network whose weights are determined by the numerical method, rather than by training, and hence, we refer to this approach as Neural Networks for PDEs (NN4PDEs). To solve the discretised multiphase flow equations, a multigrid solver is implemented through a convolutional neural network with a U-Net architecture. Immiscible two-phase flow is modelled by the 3D incompressible Navier-Stokes equations with surface tension and advection of a volume fraction field, which describes the interface between the fluids. A new compressive algebraic volume-of-fluids method is introduced, based on a residual formulation using Petrov-Galerkin for accuracy and designed with NN4PDEs in mind. High-order finite-element based schemes are chosen to model a collapsing water column and a rising bubble. Results compare well with experimental data and other numerical results from the literature, demonstrating that, for the first time, finite element discretisations of multiphase flows can be solved using an approach based on (untrained) convolutional neural networks. A benefit of expressing numerical discretisations as neural networks is that the code can run, without modification, on CPUs, GPUs or the latest accelerators designed especially to run AI codes.

Submitted 3 March, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

Comments: 34 pages, 18 figures, 4 tables

arXiv:2401.02230 [pdf, ps, other]

Automated Test Production -- Complement to "Ad-hoc" Testing

Authors: José Marcos Gomes, Luis Alberto Vieira Dias

Abstract: A view on software testing, taken in a broad sense and considered an important activity, is presented. We discuss the methods and techniques for applying tests and the reasons we recognize make it difficult for industry to adopt the advances observed in academia. We discuss some advances in the area and briefly point out the approach we intend to follow in the search for a solution.

Submitted 4 January, 2024; originally announced January 2024.

arXiv:2401.02033 [pdf, other]

Automated Test Production -- Systematic Literature Review

Authors: José Marcos Gomes, Luis Alberto Vieira Dias

Abstract: Identifying the main contributions related to the Automated Test Production (ATP) of Computer Programs and providing an overview about models, methodologies and tools used for this purpose is the aim of this Systematic Literature Review (SLR). The results will enable a comprehensive analysis and insight to evaluate their applicability. A previously produced Systematic Literature Mapping (SLM) contributed to the formulation of the "Research Questions" and parameters for the definition of the qualitative analysis protocol of this review.

Submitted 3 January, 2024; originally announced January 2024.

arXiv:2401.01430 [pdf, other]

Automated Test Production -- Systematic Literature Mapping

Authors: José Marcos Gomes, Luis Alberto Vieira Dias

Abstract: The broader goal of this research, on the one hand, is to obtain the State of the Art in Automated Test Production (ATP), to find the open questions and related problems and to track the progress of researchers in the field, and on the other hand is to list and categorize the methods, techniques and tools of ATP that meet the needs of practitioners who produce computerized business applications for internal use in their corporations - eventually it can be extended to the needs of practitioners in companies that specialize in producing computer applications for generic use.

Submitted 2 January, 2024; originally announced January 2024.

arXiv:2312.11498 [pdf, other]

Recommending Influencers to Merchants using Matching Game Algorithm

Authors: José Marcos Gomes, Luis Alberto Vieira Dias

Abstract: The goal of this work was to apply the "Gale-Shapley" algorithm to a real-world problem. We analyzed the pairing of influencers with merchants, and after a detailed specification of the variables involved, we conducted experiments to observe the validity of the approach. We conducted an analysis of the problem of aligning the interests of merchants to have digital influencers promote their products and services. We propose applying the matching algorithm approach to address this issue. We demonstrate that it is possible to apply the algorithm and still achieve corporate objectives by translating performance indicators into the desired ranking of influencers and product campaigns to be advertised by merchants.

Submitted 8 December, 2023; originally announced December 2023.

arXiv:2311.05051 [pdf, other]

Deep Learning Brasil at ABSAPT 2022: Portuguese Transformer Ensemble Approaches

Authors: Juliana Resplande Santanna Gomes, Eduardo Augusto Santos Garcia, Adalberto Ferreira Barbosa Junior, Ruan Chaves Rodrigues, Diogo Fernandes Costa Silva, Dyonnatan Ferreira Maia, Nádia Félix Felipe da Silva, Arlindo Rodrigues Galvão Filho, Anderson da Silva Soares

Abstract: Aspect-based Sentiment Analysis (ABSA) is a task whose objective is to classify the individual sentiment polarity of all entities, called aspects, in a sentence. The task is composed of two subtasks: Aspect Term Extraction (ATE), identify all aspect terms in a sentence; and Sentiment Orientation Extraction (SOE), given a sentence and its aspect terms, the task is to determine the sentiment polarity of each aspect term (positive, negative or neutral). This article presents our participation in Aspect-Based Sentiment Analysis in Portuguese (ABSAPT) 2022 at IberLEF 2022. We submitted the best performing systems, achieving new state-of-the-art results on both subtasks.

Submitted 8 November, 2023; originally announced November 2023.

Comments: 11 pages, 3 figures, In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2022), Online. CEUR.org

Report number: urn:nbn:de:0074-3202-9

arXiv:2311.05047 [pdf, ps, other]

doi 10.26615/978-954-452-084-7_042

DeepLearningBrasil@LT-EDI-2023: Exploring Deep Learning Techniques for Detecting Depression in Social Media Text

Authors: Eduardo Garcia, Juliana Gomes, Adalberto Barbosa Júnior, Cardeque Borges, Nádia da Silva

Abstract: In this paper, we delineate the strategy employed by our team, DeepLearningBrasil, which secured us the first place in the shared task DepSign-LT-EDI@RANLP-2023, achieving a 47.0% Macro F1-Score and a notable 2.4% advantage. The task was to classify social media texts into three distinct levels of depression - "not depressed," "moderately depressed," and "severely depressed." Leveraging the power of the RoBERTa and DeBERTa models, we further pre-trained them on a collected Reddit dataset, specifically curated from mental health-related Reddit communities (Subreddits), leading to an enhanced understanding of nuanced mental health discourse. To address lengthy textual data, we used truncation techniques that retained the essence of the content by focusing on its beginnings and endings. Our model was robust against unbalanced data by incorporating sample weights into the loss. Cross-validation and ensemble techniques were then employed to combine our k-fold trained models, delivering an optimal solution. The accompanying code is made available for transparency and further development.

Submitted 8 November, 2023; originally announced November 2023.

Report number: 2023.ltedi-1.42

arXiv:2310.11527 [pdf, other]

Thin and Deep Gaussian Processes

Authors: Daniel Augusto de Souza, Alexander Nikitin, ST John, Magnus Ross, Mauricio A. Álvarez, Marc Peter Deisenroth, João P. P. Gomes, Diego Mesquita, César Lincoln C. Mattos

Abstract: Gaussian processes (GPs) can provide a principled approach to uncertainty quantification with easy-to-interpret kernel hyperparameters, such as the lengthscale, which controls the correlation distance of function values. However, selecting an appropriate kernel can be challenging. Deep GPs avoid manual kernel engineering by successively parameterizing kernels with GP layers, allowing them to learn low-dimensional embeddings of the inputs that explain the output data. Following the architecture of deep neural networks, the most common deep GPs warp the input space layer-by-layer but lose all the interpretability of shallow GPs. An alternative construction is to successively parameterize the lengthscale of a kernel, improving the interpretability but ultimately giving away the notion of learning lower-dimensional embeddings. Unfortunately, both methods are susceptible to particular pathologies which may hinder fitting and limit their interpretability. This work proposes a novel synthesis of both previous approaches: Thin and Deep GP (TDGP). Each TDGP layer defines locally linear transformations of the original input data maintaining the concept of latent embeddings while also retaining the interpretation of lengthscales of a kernel. Moreover, unlike the prior solutions, TDGP induces non-pathological manifolds that admit learning lower-dimensional representations. We show with theoretical and experimental results that i) TDGP is, unlike previous models, tailored to specifically discover lower-dimensional manifolds in the input data, ii) TDGP behaves well when increasing the number of layers, and iii) TDGP performs well in standard benchmark datasets.

Submitted 17 October, 2023; originally announced October 2023.

Comments: Accepted at the Conference on Neural Information Processing Systems (NeurIPS) 2023

arXiv:2306.14840 [pdf, other]

Building Flyweight FLIM-based CNNs with Adaptive Decoding for Object Detection

Authors: Leonardo de Melo Joao, Azael de Melo e Sousa, Bianca Martins dos Santos, Silvio Jamil Ferzoli Guimaraes, Jancarlo Ferreira Gomes, Ewa Kijak, Alexandre Xavier Falcao

Abstract: State-of-the-art (SOTA) object detection methods have succeeded in several applications at the price of relying on heavyweight neural networks, which makes them inefficient and inviable for many applications with computational resource constraints. This work presents a method to build a Convolutional Neural Network (CNN) layer by layer for object detection from user-drawn markers on discriminative regions of representative images. We address the detection of Schistosomiasis mansoni eggs in microscopy images of fecal samples, and the detection of ships in satellite images as application examples. We could create a flyweight CNN without backpropagation from very few input images. Our method explores a recent methodology, Feature Learning from Image Markers (FLIM), to build convolutional feature extractors (encoders) from marker pixels. We extend FLIM to include a single-layer adaptive decoder, whose weights vary with the input image -- a concept never explored in CNNs. Our CNN weighs thousands of times less than SOTA object detectors, being suitable for CPU execution and showing superior or equivalent performance to three methods in five measures.

Submitted 5 October, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

arXiv:2305.05518 [pdf, other]

Minimal Learning Machine for Multi-Label Learning

Authors: Joonas Hämäläinen, Amauri Souza, César L. C. Mattos, João P. P. Gomes, Tommi Kärkkäinen

Abstract: A distance-based supervised method, the minimal learning machine, constructs a predictive model from data by learning a mapping between input and output distance matrices. In this paper, we propose methods and evaluate how this technique and its core component, the distance mapping, can be adapted to multi-label learning. The proposed approach is based on combining the distance mapping with an inverse distance weighting. Although the proposal is one of the simplest methods in the multi-label learning literature, it achieves state-of-the-art performance for small to moderate-sized multi-label learning problems. Besides its simplicity, the proposed method is fully deterministic and its hyper-parameter can be selected via a ranking loss-based statistic which has a closed form, thus avoiding conventional cross-validation-based hyper-parameter tuning. In addition, due to its simple linear distance mapping-based construction, we demonstrate that the proposed method can assess predictions' uncertainty for multi-label classification, which is a valuable capability for data-centric machine learning pipelines.

Submitted 9 May, 2023; originally announced May 2023.

Comments: Submitted, 26 pages

arXiv:2210.09107 [pdf, other]

ISEE.U: Distributed online active target localization with unpredictable targets

Authors: Miguel Vasques, Claudia Soares, João Gomes

Abstract: This paper addresses target localization with an online active learning algorithm defined by distributed, simple and fast computations at each node, with no parameters to tune and where the estimate of the target position at each agent is asymptotically equal in expectation to the centralized maximum-likelihood estimator. ISEE.U takes noisy distances at each agent and finds a control that maximizes localization accuracy. We do not assume specific target dynamics and, thus, our method is robust when facing unpredictable targets. Each agent computes the control that maximizes overall target position accuracy via a local estimate of the Fisher Information Matrix. We compared the proposed method with a state-of-the-art algorithm, outperforming it when the target movements do not follow a prescribed trajectory, with 100x less computation time, even when our method is running on one central CPU.

Submitted 21 August, 2023; v1 submitted 17 October, 2022; originally announced October 2022.

arXiv:2112.03213 [pdf, other]

Zero-shot hashtag segmentation for multilingual sentiment analysis

Authors: Ruan Chaves Rodrigues, Marcelo Akira Inuzuka, Juliana Resplande Sant'Anna Gomes, Acquila Santos Rocha, Iacer Calixto, Hugo Alexandre Dantas do Nascimento

Abstract: Hashtag segmentation, also known as hashtag decomposition, is a common step in preprocessing pipelines for social media datasets. It usually precedes tasks such as sentiment analysis and hate speech detection. For sentiment analysis in medium to low-resourced languages, previous research has demonstrated that a multilingual approach that resorts to machine translation can be competitive or superior to previous approaches to the task. We develop a zero-shot hashtag segmentation framework and demonstrate how it can be used to improve the accuracy of multilingual sentiment analysis pipelines. Our zero-shot framework establishes a new state-of-the-art for hashtag segmentation datasets, surpassing even previous approaches that relied on feature engineering and language models trained on in-domain data.

Submitted 6 December, 2021; originally announced December 2021.

Comments: 12 pages, 5 figures, 5 tables

ACM Class: I.2.7

arXiv:2110.03523 [pdf, other]

doi 10.1109/LSP.2020.2988178

Range and Bearing Data Fusion for Precise Convex Network Localization

Authors: Claudia Soares, Filipa Valdeira, Joao Gomes

Abstract: Hybrid localization in GNSS-challenged environments using measured ranges and angles is becoming increasingly popular, in particular with the advent of multimodal communication systems. Here, we address the hybrid network localization problem using ranges and bearings to jointly determine the positions of a number of agents through a single maximum-likelihood (ML) optimization problem that seamlessly fuses all the available pairwise range and angle measurements. We propose a tight convex surrogate to the ML estimator, we examine practical measures for the accuracy of the relaxation, and we comprehensively characterize its behavior in simulation. We found that our relaxation outperforms a state of the art SDP relaxation by one order of magnitude in terms of localization error, and is amenable to much more lightweight solution algorithms.

Submitted 1 October, 2021; originally announced October 2021.

Journal ref: IEEE Signal Processing Letters 27 (2020): 670-674

arXiv:2110.00594 [pdf, other]

doi 10.1016/j.sigpro.2021.108066

STRONG: Synchronous and asynchronous RObust Network localization, under Non-Gaussian noise

Authors: Claudia Soares, João Gomes

Abstract: Real-world network applications must cope with failing nodes, malicious attacks, or nodes facing corrupted data - data classified as outliers. Our work addresses these concerns in the scope of the sensor network localization problem where, despite the abundance of technical literature, prior research seldom considered outlier data. We propose robust, fast, and distributed network localization algorithms, resilient to high-power noise, but also precise under regular Gaussian noise. We use a Huber M-estimator, thus obtaining a robust (but nonconvex) optimization problem. We convexify and change the problem representation, to allow for distributed robust localization algorithms: a synchronous distributed method that has optimal convergence rate and an asynchronous one with proven convergence guarantees. A major highlight of our contribution lies on the fact that we pay no price for provable distributed computation neither in accuracy, nor in communication cost or convergence speed. Simulations showcase the superior performance of our algorithms, both in the presence of outliers and under regular Gaussian noise: our method exceeds the accuracy of alternative approaches, distributed and centralized, even under heavy additive and multiplicative outlier noise.

Submitted 1 October, 2021; originally announced October 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:1610.09020

Journal ref: Signal Processing 185 (2021): 108066

arXiv:2109.00402 [pdf, other]

doi 10.1109/CBMS58004.2023.00306

Chronic Pain and Language: A Topic Modelling Approach to Personal Pain Descriptions

Authors: Diogo A. P. Nunes, Joana Ferreira Gomes, Fani Neto, David Martins de Matos

Abstract: Chronic pain is recognized as a major health problem, with impacts not only at the economic, but also at the social, and individual levels. Being a private and subjective experience, it is impossible to externally and impartially experience, describe, and interpret chronic pain as a purely noxious stimulus that would directly point to a causal agent and facilitate its mitigation, contrary to acute pain, the assessment of which is usually straightforward. Verbal communication is, thus, key to convey relevant information to health professionals that would otherwise not be accessible to external entities, namely, intrinsic qualities about the painful experience and the patient. We propose and discuss a topic modelling approach to recognize patterns in verbal descriptions of chronic pain, and use these patterns to quantify and qualify experiences of pain. Our approaches allow for the extraction of novel insights on chronic pain experiences from the obtained topic models and latent spaces. We argue that our results are clinically relevant for the assessment and management of chronic pain.

Submitted 17 March, 2022; v1 submitted 1 September, 2021; originally announced September 2021.

Comments: 9 pages, 5 figures, 6 tables

ACM Class: I.2.7; I.5.3; I.5.4; J.3; J.4

arXiv:2106.09883 [pdf, other]

doi 10.4204/EPTCS.357.5

On Logics of Perfect Paradefinite Algebras

Authors: Joel Gomes, Vitor Greati, Sérgio Marcelino, João Marcos, Umberto Rivieccio

Abstract: The present study shows how to enrich De Morgan algebras with a perfection operator that allows one to express the Boolean properties of negation-consistency and negation-determinedness. The variety of perfect paradefinite algebras thus obtained (PP-algebras) is shown to be term-equivalent to the variety of involutive Stone algebras, introduced by R. Cignoli and M. Sagastume, and more recently studied from a logical perspective by M. Figallo-L. Cantú and by S. Marcelino-U. Rivieccio. This equivalence plays an important role in the investigation of the 1-assertional logic and of the order-preserving logic associated to PP-algebras. The latter logic (here called PP<=) is characterized by a single 6-valued matrix and is shown to be a Logic of Formal Inconsistency and Formal Undeterminedness. We axiomatize PP<= by means of an analytic finite Hilbert-style calculus, and we present an axiomatization procedure that covers the logics corresponding to other classes of De Morgan algebras enriched by a perfection operator.

Submitted 8 April, 2022; v1 submitted 17 June, 2021; originally announced June 2021.

Comments: In Proceedings LSFA 2021, arXiv:2204.03415

Journal ref: EPTCS 357, 2022, pp. 56-76

arXiv:2101.06310 [pdf, other]

doi 10.1016/j.compbiomed.2020.103917

Automated Diagnosis of Intestinal Parasites: A new hybrid approach and its benefits

Authors: D. Osaku, C. F. Cuba, Celso T. N. Suzuki, J. F. Gomes, A. X. Falcão

Abstract: Intestinal parasites are responsible for several diseases in human beings. In order to eliminate the error-prone visual analysis of optical microscopy slides, we have investigated automated, fast, and low-cost systems for the diagnosis of human intestinal parasites. In this work, we present a hybrid approach that combines the opinion of two decision-making systems with complementary properties: ($DS_1$) a simpler system based on very fast handcrafted image feature extraction and support vector machine classification and ($DS_2$) a more complex system based on a deep neural network, Vgg-16, for image feature extraction and classification. $DS_1$ is much faster than $DS_2$, but it is less accurate than $DS_2$. Fortunately, the errors of $DS_1$ are not the same as those of $DS_2$. During training, we use a validation set to learn the probabilities of misclassification by $DS_1$ on each class based on its confidence values. When $DS_1$ quickly classifies all images from a microscopy slide, the method selects a number of images with higher chances of misclassification for characterization and reclassification by $DS_2$. Our hybrid system can improve the overall effectiveness without compromising efficiency, being suitable for the clinical routine -- a strategy that might be suitable for other real applications. As demonstrated on large datasets, the proposed system can achieve, on average, 94.9%, 87.8%, and 92.5% of Cohen's Kappa on helminth eggs, helminth larvae, and protozoa cysts, respectively.

Submitted 7 January, 2021; originally announced January 2021.

Comments: 18 pages, 11 figures

Journal ref: Computers in Biology and Medicine, Volume 123, August 2020, 103917

arXiv:2101.04699 [pdf, other]

doi 10.1016/j.patrec.2021.06.032

Convolutional Neural Network Simplification with Progressive Retraining

Authors: D. Osaku, J. F. Gomes, A. X. Falcão

Abstract: Kernel pruning methods have been proposed to speed up, simplify, and improve explanation of convolutional neural network (CNN) models. However, the effectiveness of a simplified model is often below the original one. In this letter, we present new methods based on objective and subjective relevance criteria for kernel elimination in a layer-by-layer fashion. During the process, a CNN model is retrained only when the current layer is entirely simplified, by adjusting the weights from the next layer to the first one and preserving weights of subsequent layers not involved in the process. We call this strategy progressive retraining, differently from kernel pruning methods that usually retrain the entire model after each simplification action -- e.g., the elimination of one or a few kernels. Our subjective relevance criterion exploits the ability of humans in recognizing visual patterns and improves the designer's understanding of the simplification process. The combination of suitable relevance criteria and progressive retraining shows that our methods can increase effectiveness with considerable model simplification. We also demonstrate that our methods can provide better results than two popular ones and another one from the state-of-the-art using four challenging image datasets.

Submitted 12 January, 2021; originally announced January 2021.

Comments: 7 pages, 4 figures. This paper was submitted to Pattern Recognition Letters

arXiv:2008.00558 [pdf, ps, other]

Semi-supervised deep learning based on label propagation in a 2D embedded space

Authors: Barbara Caroline Benato, Jancarlo Ferreira Gomes, Alexandru Cristian Telea, Alexandre Xavier Falcão

Abstract: While convolutional neural networks need large labeled sets for training images, expert human supervision of such datasets can be very laborious. Proposed solutions propagate labels from a small set of supervised images to a large set of unsupervised ones to obtain sufficient truly-and-artificially labeled samples to train a deep neural network model. Yet, such solutions need many supervised images for validation. We present a loop in which a deep neural network (VGG-16) is trained from a set with more correctly labeled samples along iterations, created by using t-SNE to project the features of its last max-pooling layer into a 2D embedded space in which labels are propagated using the Optimum-Path Forest semi-supervised classifier. As the labeled set improves along iterations, it improves the features of the neural network. We show that this can significantly improve classification results on test data (using only 1% to 5% of supervised samples) of three private challenging datasets and two public ones.

Submitted 15 January, 2021; v1 submitted 2 August, 2020; originally announced August 2020.

Comments: 7 pages, 5 figures

MSC Class: 68T07; 68T09; 68T10 ACM Class: I.5.1; I.5.2

arXiv:2007.13689 [pdf, other]

doi 10.1016/j.patcog.2020.107612

Semi-Automatic Data Annotation guided by Feature Space Projection

Authors: Barbara Caroline Benato, Jancarlo Ferreira Gomes, Alexandru Cristian Telea, Alexandre Xavier Falcão

Abstract: Data annotation using visual inspection (supervision) of each training sample can be laborious. Interactive solutions alleviate this by helping experts propagate labels from a few supervised samples to unlabeled ones based solely on the visual analysis of their feature space projection (with no further sample supervision). We present a semi-automatic data annotation approach based on suitable feature space projection and semi-supervised label estimation. We validate our method on the popular MNIST dataset and on images of human intestinal parasites with and without fecal impurities, a large and diverse dataset that makes classification very hard. We evaluate two approaches for semi-supervised learning from the latent and projection spaces, to choose the one that best reduces user annotation effort and also increases classification accuracy on unseen data. Our results demonstrate the added-value of visual analytics tools that combine complementary abilities of humans and machines for more effective machine learning.

Submitted 27 July, 2020; originally announced July 2020.

Comments: 28 pages, 10 figures

MSC Class: 68T07; 68T09; 68T10 ACM Class: I.5.1; I.5.2

arXiv:2007.08034 [pdf]

Evaluating and Validating Cluster Results

Authors: Anupriya Vysala, Dr. Joseph Gomes

Abstract: Clustering is the technique to partition data according to their characteristics. Data that are similar in nature belong to the same cluster [1]. There are two types of evaluation methods to evaluate clustering quality. One is an external evaluation where the truth labels in the data sets are known in advance and the other is internal evaluation in which the evaluation is done with data set itself without true labels. In this paper, both external evaluation and internal evaluation are performed on the cluster results of the IRIS dataset. In the case of external evaluation Homogeneity, Correctness and V-measure scores are calculated for the dataset. For internal performance measures, the Silhouette Index and Sum of Square Errors are used. These internal performance measures along with the dendrogram (graphical tool from hierarchical Clustering) are used first to validate the number of clusters. Finally, as a statistical tool, we used the frequency distribution method to compare and provide a visual representation of the distribution of observations within a clustering result and the original data.

Submitted 15 July, 2020; originally announced July 2020.

Comments: 47 pages, 9th International Conference on Data Mining & Knowledge Management Process (CDKP 2020)

arXiv:2003.08748 [pdf]

Reduction of Surgical Risk Through the Evaluation of Medical Imaging Diagnostics

Authors: Marco A. V. M. Grinet, Nuno M. Garcia, Ana I. R. Gouveia, Jose A. F. Moutinho, Abel J. P. Gomes

Abstract: Computer aided diagnosis (CAD) of Breast Cancer (BRCA) images has been an active area of research in recent years. The main goal of this research is to develop reliable automatic methods for detecting and diagnosing different types of BRCA from diagnostic images. In this paper, we present a review of the state of the art CAD methods applied to magnetic resonance (MRI) and mammography images of BRCA patients. The review aims to provide an extensive introduction to different features extracted from BRCA images through texture and statistical analysis and to categorize deep learning frameworks and data structures capable of using metadata to aggregate relevant information to assist oncologists and radiologists. We divide the existing literature according to the imaging modality and into radiomics, machine learning, or a combination of both. We also emphasize the differences between the modalities and the methods' strengths and weaknesses, and analyze their performance in detecting BRCA through a quantitative comparison. We compare the results of various approaches for implementing CAD systems for the detection of BRCA. Each approach's standard workflow components are reviewed and summary tables provided. We present an extensive literature review of radiomics feature extraction techniques and machine learning methods applied in BRCA diagnosis and detection, focusing on data preparation, data structures, pre-processing and post-processing strategies available in the literature.
There is a growing interest in radiomic feature extraction and machine learning methods for BRCA detection through histopathological images, MRI and mammography images. However, there is not yet a CAD method able to combine distinct data types to provide the best diagnostic results. Employing data fusion techniques to medical images and patient data could lead to improved detection and classification results.

Submitted 8 March, 2020; originally announced March 2020.

Comments: 25 pages, 7 figures, Scientific grant report

ACM Class: I.4.6

arXiv:1911.05024 [pdf, other]

Pose Guided Attention for Multi-label Fashion Image Classification

Authors: Beatriz Quintino Ferreira, João P. Costeira, Ricardo G. Sousa, Liang-Yan Gui, João P. Gomes

Abstract: We propose a compact framework with guided attention for multi-label classification in the fashion domain. Our visual semantic attention model (VSAM) is supervised by automatic pose extraction creating a discriminative feature space. VSAM outperforms the state of the art for an in-house dataset and performs on par with previous works on the DeepFashion dataset, even without using any landmark annotations. Additionally, we show that our semantic attention module brings robustness to large quantities of wrong annotations and provides more interpretable results.

Submitted 12 November, 2019; originally announced November 2019.

Comments: Published at ICCV 2019 Workshop on Computer Vision for Fashion, Art and Design

arXiv:1909.09978 [pdf, other]

Minimal Learning Machine: Theoretical Results and Clustering-Based Reference Point Selection

Authors: Joonas Hämäläinen, Alisson S. C. Alencar, Tommi Kärkkäinen, César L. C. Mattos, Amauri H. Souza Júnior, João P. P. Gomes

Abstract: The Minimal Learning Machine (MLM) is a nonlinear supervised approach based on learning a linear mapping between distance matrices computed in the input and output data spaces, where distances are calculated using a subset of points called reference points. Its simple formulation has attracted several recent works on extensions and applications. In this paper, we aim to address some open questions related to the MLM. First, we detail theoretical aspects that assure the interpolation and universal approximation capabilities of the MLM, which were previously only empirically verified. Second, we identify the task of selecting reference points as having major importance for the MLM's generalization capability. Several clustering-based methods for reference point selection in regression scenarios are then proposed and analyzed. Based on an extensive empirical evaluation, we conclude that the evaluated methods are both scalable and useful. Specifically, for a small number of reference points, the clustering-based methods outperformed the standard random selection of the original MLM formulation.

Submitted 6 October, 2020; v1 submitted 22 September, 2019; originally announced September 2019.

Comments: 29 pages, Accepted to JMLR

arXiv:1908.00361 [pdf, other]

No-PASt-BO: Normalized Portfolio Allocation Strategy for Bayesian Optimization

Authors: Thiago de P. Vasconcelos, Daniel A. R. M. A. de Souza, César L. C. Mattos, João P. P. Gomes

Abstract: Bayesian Optimization (BO) is a framework for black-box optimization that is especially suitable for expensive cost functions. Among the main parts of a BO algorithm, the acquisition function is of fundamental importance, since it guides the optimization algorithm by translating the uncertainty of the regression model into a utility measure for each point to be evaluated. Considering this aspect, the selection and design of acquisition functions is one of the most popular research topics in BO. Since no single acquisition function has been proved to have better performance in all tasks, a well-established approach consists of selecting different acquisition functions along the iterations of a BO execution. In such an approach, the GP-Hedge algorithm is a widely used option given its simplicity and good performance. Despite its success in various applications, GP-Hedge has the undesirable characteristic of accounting for all past performance measures of each acquisition function to select the next function to be used. In this case, good or bad values obtained in an initial iteration may impact the choice of the acquisition function for the rest of the algorithm. This fact may induce a dominant behavior of an acquisition function and impact the final performance of the method. Aiming to overcome this limitation, in this work we propose a variant of GP-Hedge, named No-PASt-BO, that reduces the influence of far past evaluations.
Moreover, our method presents a built-in normalization that prevents the functions in the portfolio from having similar probabilities, thus improving the exploration. The obtained results on both synthetic and real-world optimization tasks indicate that No-PASt-BO presents competitive performance and always outperforms GP-Hedge.

Submitted 1 August, 2019; originally announced August 2019.

Comments: 8 pages, currently under review

arXiv:1907.10545 [pdf, other]

CvxPnPL: A Unified Convex Solution to the Absolute Pose Estimation Problem from Point and Line Correspondences

Authors: Sérgio Agostinho, João Gomes, Alessio Del Bue

Abstract: We present a new convex method to estimate 3D pose from mixed combinations of 2D-3D point and line correspondences, the Perspective-n-Points-and-Lines problem (PnPL). We merge the contributions of each point and line into a unified Quadratic Constrained Quadratic Problem (QCQP) and then relax it into a Semi Definite Program (SDP) through Shor's relaxation. This makes it possible to gracefully handle mixed configurations of points and lines. Furthermore, the proposed relaxation allows us to recover a finite number of solutions under ambiguous configurations. In such cases, the 3D pose candidates are found by further enforcing geometric constraints on the solution space and then retrieving such poses from the intersections of multiple quadrics. Experiments provide results in line with the best performing state of the art methods while providing the flexibility of solving for an arbitrary number of points and lines.

Submitted 9 August, 2019; v1 submitted 24 July, 2019; originally announced July 2019.

Comments: Main paper and supplemental material included. References added and minor change to fig 1

ACM Class: I.4.8

arXiv:1907.01867 [pdf, other]

Learning GPLVM with arbitrary kernels using the unscented transformation

Authors: Daniel Augusto R. M. A. de Souza, Diego Mesquita, César Lincoln C. Mattos, João Paulo P. Gomes

Abstract: Gaussian Process Latent Variable Model (GPLVM) is a flexible framework to handle uncertain inputs in Gaussian Processes (GPs) and incorporate GPs as components of larger graphical models. Nonetheless, the standard GPLVM variational inference approach is tractable only for a narrow family of kernel functions. The most popular implementations of GPLVM circumvent this limitation using quadrature methods, which may become a computational bottleneck even for relatively low dimensions. For instance, the widely employed Gauss-Hermite quadrature has exponential complexity on the number of dimensions. In this work, we propose using the unscented transformation instead. Overall, this method presents comparable, if not better, performance than off-the-shelf solutions to GPLVM and its computational complexity scales only linearly on dimension. In contrast to Monte Carlo methods, our approach is deterministic and works well with quasi-Newton methods, such as the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm. We illustrate the applicability of our method with experiments on dimensionality reduction and multistep-ahead prediction with uncertainty propagation.

Submitted 10 November, 2020; v1 submitted 3 July, 2019; originally announced July 2019.

Comments: 10 pages, currently under review

arXiv:1906.06751 [pdf, other]

Pi-surfaces: products of implicit surfaces towards constructive composition of 3D objects

Authors: Adriano N. Raposo, Abel J. P. Gomes

Abstract: Implicit functions provide a fundamental basis to model 3D objects, whether they are rigid or deformable, in computer graphics and geometric modeling. This paper introduces a new constructive scheme of implicitly-defined 3D objects based on products of implicit functions. This scheme is in contrast with popular approaches like blobbies, meta balls and soft objects, which rely on the sum of specific implicit functions to fit a 3D object to a set of spheres.

Submitted 16 June, 2019; originally announced June 2019.

Journal ref: WSCG 2019 27. International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision

arXiv:1905.12265 [pdf, other]

Strategies for Pre-training Graph Neural Networks

Authors: Weihua Hu, Bowen Liu, Joseph Gomes, Marinka Zitnik, Percy Liang, Vijay Pande, Jure Leskovec

Abstract: Many applications of machine learning require a model to make accurate predictions on test examples that are distributionally different from training ones, while task-specific labels are scarce during training. An effective approach to this challenge is to pre-train a model on related tasks where data is abundant, and then fine-tune it on a downstream task of interest. While pre-training has been effective in many language and vision domains, it remains an open question how to effectively use pre-training on graph datasets. In this paper, we develop a new strategy and self-supervised methods for pre-training Graph Neural Networks (GNNs). The key to the success of our strategy is to pre-train an expressive GNN at the level of individual nodes as well as entire graphs so that the GNN can learn useful local and global representations simultaneously. We systematically study pre-training on multiple graph classification datasets. We find that naive strategies, which pre-train GNNs at the level of either entire graphs or individual nodes, give limited improvement and can even lead to negative transfer on many downstream tasks. In contrast, our strategy avoids negative transfer and improves generalization significantly across downstream tasks, leading up to 9.4% absolute improvements in ROC-AUC over non-pre-trained models and achieving state-of-the-art performance for molecular property prediction and protein function prediction.

Submitted 18 February, 2020; v1 submitted 29 May, 2019; originally announced May 2019.

Comments: Accepted as a spotlight to ICLR 2020

arXiv:1905.00332 [pdf, ps, other]

LS-SVR as a Bayesian RBF network

Authors: Diego P. P. Mesquita, Luis A. Freitas, João P. P. Gomes, César L. C. Mattos

Abstract: We show theoretical similarities between the Least Squares Support Vector Regression (LS-SVR) model with a Radial Basis Functions (RBF) kernel and maximum a posteriori (MAP) inference on Bayesian RBF networks with a specific Gaussian prior on the regression weights. Although previous works have pointed out similar expressions between those learning approaches, we explicitly and formally state the existing correspondences. We empirically demonstrate our result by performing computational experiments with standard regression benchmarks. Our findings open a range of possibilities to improve LS-SVR by borrowing strength from well-established developments in Bayesian methodology.

Submitted 2 August, 2019; v1 submitted 1 May, 2019; originally announced May 2019.

Comments: 14 pages, currently under review

arXiv:1811.03516 [pdf, other]

Learning from Demonstration in the Wild

Authors: Feryal Behbahani, Kyriacos Shiarlis, Xi Chen, Vitaly Kurin, Sudhanshu Kasewa, Ciprian Stirbu, João Gomes, Supratik Paul, Frans A. Oliehoek, João Messias, Shimon Whiteson

Abstract: Learning from demonstration (LfD) is useful in settings where hand-coding behaviour or a reward function is impractical. It has succeeded in a wide range of problems but typically relies on manually generated demonstrations or specially deployed sensors and has not generally been able to leverage the copious demonstrations available in the wild: those that capture behaviours that were occurring anyway using sensors that were already deployed for another purpose, e.g., traffic camera footage capturing demonstrations of natural behaviour of vehicles, cyclists, and pedestrians. We propose Video to Behaviour (ViBe), a new approach to learn models of behaviour from unlabelled raw video data of a traffic scene collected from a single, monocular, initially uncalibrated camera with ordinary resolution. Our approach calibrates the camera, detects relevant objects, tracks them through time, and uses the resulting trajectories to perform LfD, yielding models of naturalistic behaviour. We apply ViBe to raw videos of a traffic intersection and show that it can learn purely from videos, without additional expert knowledge.

Submitted 25 March, 2019; v1 submitted 8 November, 2018; originally announced November 2018.

Comments: Accepted to the IEEE International Conference on Robotics and Automation (ICRA) 2019; extended version with appendix

arXiv:1807.11374 [pdf, other]

Weakly-Supervised Deep Learning of Heat Transport via Physics Informed Loss

Authors: Rishi Sharma, Amir Barati Farimani, Joe Gomes, Peter Eastman, Vijay Pande

Abstract: In typical machine learning tasks and applications, it is necessary to obtain or create large labeled datasets in order to achieve high performance. Unfortunately, large labeled datasets are not always available and can be expensive to source, creating a bottleneck towards more widely applicable machine learning. The paradigm of weak supervision offers an alternative that allows for integration of domain-specific knowledge by enforcing constraints that a correct solution to the learning problem will obey over the output space. In this work, we explore the application of this paradigm to 2-D physical systems governed by non-linear differential equations. We demonstrate that knowledge of the partial differential equations governing a system can be encoded into the loss function of a neural network via an appropriately chosen convolutional kernel. We demonstrate this by showing that the steady-state solution to the 2-D heat equation can be learned directly from initial conditions by a convolutional neural network, in the absence of labeled training data. We also extend recent work in the progressive growing of fully convolutional networks to achieve high accuracy (< 1.5% error) at multiple scales of the heat-flow problem, including at the very large scale (1024x1024). Finally, we demonstrate that this method can be used to speed up exact calculation of the solution to the differential equations via finite difference.

Submitted 21 August, 2018; v1 submitted 24 July, 2018; originally announced July 2018.

arXiv:1807.11318 [pdf, other]

doi 10.1007/s10723-018-9454-2

umd-verification: Automation of Software Validation for the EGI federated e-Infrastructure

Authors: Pablo Orviz Fernandez, Joao Pina, Alvaro Lopez Garcia, Isabel Campos Plasencia, Mario David, Jorge Gomes

Abstract: Supporting e-Science in the EGI e-Infrastructure requires extensive and reliable software, for advanced computing use, deployed across approximately 300 European and worldwide data centers. The Unified Middleware Distribution (UMD) and Cloud Middleware Distribution (CMD) are the channels to deliver the software for the EGI e-Infrastructure consumption. The software is compiled, validated and distributed following the Software Provisioning Process (SWPP), where the Quality Criteria (QC) definition sets the minimum quality requirements for EGI acceptance. The growing number of software components currently existing within UMD and CMD distributions hinders the application of the traditional, manual-based validation mechanisms, thus driving the adoption of automated solutions. This paper presents umd-verification, an open-source tool that enforces the fulfillment of the QC requirements in an automated way for the continuous validation of the software products for scientific disposal. The umd-verification tool has been successfully integrated within the SWPP pipeline and is progressively supporting the full validation of the products in the UMD and CMD repositories. While the cost of supporting new products is dependent on the availability of Infrastructure as Code solutions to take over the deployment and high test coverage, the results obtained for the already integrated products are promising, as the time invested in the validation of products has been drastically reduced.
Furthermore, automation adoption has brought along benefits for the reliability of the process, such as the removal of human-associated errors or the risk of regression of previously tested functionalities.

Submitted 30 July, 2018; originally announced July 2018.

Comments: This is the author's pre-print version of this work. The final publication is available at http://dx.doi.org/10.1007/s10723-018-9454-2

Journal ref: Journal of Grid Computing (2018) 1-14

arXiv:1803.08993 [pdf, other]

Deep Learning Phase Segregation

Authors: Amir Barati Farimani, Joseph Gomes, Rishi Sharma, Franklin L. Lee, Vijay S. Pande

Abstract: Phase segregation, the process by which the components of a binary mixture spontaneously separate, is a key process in the evolution and design of many chemical, mechanical, and biological systems. In this work, we present a data-driven approach for the learning, modeling, and prediction of phase segregation. A direct mapping between an initially dispersed, immiscible binary fluid and the equilibrium concentration field is learned by conditional generative convolutional neural networks. Concentration field predictions by the deep learning model conserve phase fraction, correctly predict phase transition, and reproduce area, perimeter, and total free energy distributions up to 98% accuracy.

Submitted 23 March, 2018; originally announced March 2018.

Comments: arXiv admin note: text overlap with arXiv:1709.02432

arXiv:1711.01981 [pdf, other]

doi 10.1007/s10723-018-9453-3

INDIGO-DataCloud: A data and computing platform to facilitate seamless access to e-infrastructures

Authors: INDIGO-DataCloud Collaboration: Davide Salomoni, Isabel Campos, Luciano Gaido, Jesus Marco de Lucas, Peter Solagna, Jorge Gomes, Ludek Matyska, Patrick Fuhrman, Marcus Hardt, Giacinto Donvito, Lukasz Dutka, Marcin Plociennik, Roberto Barbera, Ignacio Blanquer, Andrea Ceccanti, Mario David, Cristina Duma, Alvaro López-García, Germán Moltó, Pablo Orviz, Zdenek Sustr, Matthew Viljoen, Fernando Aguilar, et al. (40 additional authors not shown)

Abstract: This paper describes the achievements of the H2020 project INDIGO-DataCloud. The project has provided e-infrastructures with tools, applications and cloud framework enhancements to manage the demanding requirements of scientific communities, either locally or through enhanced interfaces. The middleware developed allows to federate hybrid resources, to easily write, port and run scientific applications to the cloud. In particular, we have extended existing PaaS (Platform as a Service) solutions, allowing public and private e-infrastructures, including those provided by EGI, EUDAT, and Helix Nebula, to integrate their existing services and make them available through AAI services compliant with GEANT interfederation policies, thus guaranteeing transparency and trust in the provisioning of such services. Our middleware facilitates the execution of applications using containers on Cloud and Grid based infrastructures, as well as on HPC clusters. Our developments are freely downloadable as open source components, and are already being integrated into many scientific applications.

Submitted 5 February, 2019; v1 submitted 6 November, 2017; originally announced November 2017.

Comments: 39 pages, 15 figures. Version accepted in Journal of Grid Computing

arXiv:1711.01758 [pdf, other]

doi 10.1016/j.cpc.2018.05.021

Enabling rootless Linux Containers in multi-user environments: the udocker tool

Authors: Jorge Gomes, Isabel Campos, Emanuele Bagnaschi, Mario David, Luis Alves, Joao Martins, Joao Pina, Alvaro Lopez-Garcia, Pablo Orviz

Abstract: Containers are increasingly used as means to distribute and run Linux services and applications. In this paper we describe the architectural design and implementation of udocker, a tool which enables the user to execute Linux containers in user mode. We also present a few practical applications, using a range of scientific codes characterized by different requirements: from single core execution to MPI parallel execution and execution on GPGPUs.

Submitted 1 June, 2018; v1 submitted 6 November, 2017; originally announced November 2017.

Comments: 30 pages, 5 figures, accepted for publication in Computer Physics Communications

Journal ref: Computer Physics Communications Volume 232 p. 84-97 (2018)

arXiv:1709.08746 [pdf, other]

DIeSEL: DIstributed SElf-Localization of a network of underwater vehicles

Authors: Cláudia Soares, Pusheng Ji, João Gomes, António Pascoal

Abstract: How can teams of artificial agents localize and position themselves in GPS-denied environments? How can each agent determine its position from pairwise ranges, own velocity, and limited interaction with neighbors? This paper addresses this problem from an optimization point of view: we directly optimize the nonconvex maximum-likelihood estimator in the presence of range measurements contaminated with Gaussian noise, and we obtain a provably convergent, accurate and distributed positioning algorithm that outperforms the extended Kalman filter, a standard centralized solution for this problem.

Submitted 25 September, 2017; originally announced September 2017.

Journal ref: IEEE OCEANS 2017, Anchorage, AK, 2017

arXiv:1709.02432 [pdf, other]

Deep Learning the Physics of Transport Phenomena

Authors: Amir Barati Farimani, Joseph Gomes, Vijay S. Pande

Abstract: We have developed a new data-driven paradigm for the rapid inference, modeling and simulation of the physics of transport phenomena by deep learning. Using conditional generative adversarial networks (cGAN), we train models for the direct generation of solutions to steady state heat conduction and incompressible fluid flow purely on observation without knowledge of the underlying governing equations. Rather than using iterative numerical methods to approximate the solution of the constitutive equations, cGANs learn to directly generate the solutions to these phenomena, given arbitrary boundary conditions and domain, with high test accuracy (MAE$<$1\%) and state-of-the-art computational performance. The cGAN framework can be used to learn causal models directly from experimental observations where the underlying physical model is complex or unknown.

Submitted 7 September, 2017; originally announced September 2017.

arXiv:1708.06700 [pdf, other]

Joint User Selection and Energy Minimization for Ultra-Dense Multi-channel C-RAN with Incomplete CSI

Authors: Cunhua Pan, Huiling Zhu, Nathan J. Gomes, Jiangzhou Wang

Abstract: This paper provides a unified framework to deal with the challenges arising in dense cloud radio access networks (C-RAN), which include huge power consumption, limited fronthaul capacity, heavy computational complexity, unavailability of full channel state information (CSI), etc. Specifically, we aim to jointly optimize the remote radio head (RRH) selection, user equipment (UE)-RRH associations and beam-vectors to minimize the total network power consumption (NPC) for dense multi-channel downlink C-RAN with incomplete CSI subject to per-RRH power constraints, each UE's total rate requirement, and fronthaul link capacity constraints. This optimization problem is NP-hard. In addition, due to the incomplete CSI, the exact expression of UEs' rate expression is intractable. We first conservatively replace UEs' rate expression with its lower-bound. Then, based on the successive convex approximation (SCA) technique and the relationship between the data rate and the mean square error (MSE), we propose a single-layer iterative algorithm to solve the NPC minimization problem with convergence guarantee. In each iteration of the algorithm, the Lagrange dual decomposition method is used to derive the structure of the optimal beam-vectors, which facilitates the parallel computations at the Baseband unit (BBU) pool. Furthermore, a bisection UE selection algorithm is proposed to guarantee the feasibility of the problem. Simulation results show the benefits of the proposed algorithms and the fact that a limited amount of CSI is sufficient to achieve performance close to that obtained when perfect CSI is possessed.

Submitted 17 May, 2017; originally announced August 2017.

Comments: This paper has been accepted by IEEE JSAC with special issue on Deployment Issues and Performance Challenges for 5G

arXiv:1706.01643 [pdf]

Retrosynthetic reaction prediction using neural sequence-to-sequence models

Authors: Bowen Liu, Bharath Ramsundar, Prasad Kawthekar, Jade Shi, Joseph Gomes, Quang Luu Nguyen, Stephen Ho, Jack Sloane, Paul Wender, Vijay Pande

Abstract: We describe a fully data driven model that learns to perform a retrosynthetic reaction prediction task, which is treated as a sequence-to-sequence mapping problem. The end-to-end trained model has an encoder-decoder architecture that consists of two recurrent neural networks, which has previously shown great success in solving other sequence-to-sequence prediction tasks such as machine translation. The model is trained on 50,000 experimental reaction examples from the United States patent literature, which span 10 broad reaction types that are commonly used by medicinal chemists. We find that our model performs comparably with a rule-based expert system baseline model, and also overcomes certain limitations associated with rule-based expert systems and with any machine learning approach that contains a rule-based expert system component. Our model provides an important first step towards solving the challenging problem of computational retrosynthetic analysis.

Submitted 6 June, 2017; originally announced June 2017.

arXiv:1703.10603 [pdf, other]

Atomic Convolutional Networks for Predicting Protein-Ligand Binding Affinity

Authors: Joseph Gomes, Bharath Ramsundar, Evan N. Feinberg, Vijay S. Pande

Abstract: Empirical scoring functions based on either molecular force fields or cheminformatics descriptors are widely used, in conjunction with molecular docking, during the early stages of drug discovery to predict potency and binding affinity of a drug-like molecule to a given target. These models require expert-level knowledge of physical chemistry and biology to be encoded as hand-tuned parameters or features rather than allowing the underlying model to select features in a data-driven procedure. Here, we develop a general 3-dimensional spatial convolution operation for learning atomic-level chemical interactions directly from atomic coordinates and demonstrate its application to structure-based bioactivity prediction. The atomic convolutional neural network is trained to predict the experimentally determined binding affinity of a protein-ligand complex by direct calculation of the energy associated with the complex, protein, and ligand given the crystal structure of the binding pose. Non-covalent interactions present in the complex that are absent in the protein-ligand sub-structures are identified and the model learns the interaction strength associated with these features. We test our model by predicting the binding free energy of a subset of protein-ligand complexes found in the PDBBind dataset and compare with state-of-the-art cheminformatics and machine learning-based approaches. We find that all methods achieve experimental accuracy and that atomic convolutional networks either outperform or perform competitively with the cheminformatics based methods. Unlike all previous protein-ligand prediction systems, atomic convolutional networks are end-to-end and fully-differentiable. They represent a new data-driven, physics-based deep learning model paradigm that offers a strong foundation for future improvements in structure-based bioactivity prediction.

Submitted 30 March, 2017; originally announced March 2017.

arXiv:1703.00564 [pdf, other]

MoleculeNet: A Benchmark for Molecular Machine Learning

Authors: Zhenqin Wu, Bharath Ramsundar, Evan N. Feinberg, Joseph Gomes, Caleb Geniesse, Aneesh S. Pappu, Karl Leswing, Vijay Pande

Abstract: Molecular machine learning has been maturing rapidly over the last few years. Improved methods and the presence of larger datasets have enabled machine learning algorithms to make increasingly accurate predictions about molecular properties. However, algorithmic progress has been limited due to the lack of a standard benchmark to compare the efficacy of proposed methods; most new algorithms are benchmarked on different datasets making it challenging to gauge the quality of proposed methods. This work introduces MoleculeNet, a large scale benchmark for molecular machine learning. MoleculeNet curates multiple public datasets, establishes metrics for evaluation, and offers high quality open-source implementations of multiple previously proposed molecular featurization and learning algorithms (released as part of the DeepChem open source library). MoleculeNet benchmarks demonstrate that learnable representations are powerful tools for molecular machine learning and broadly offer the best performance. However, this result comes with caveats. Learnable representations still struggle to deal with complex tasks under data scarcity and highly imbalanced classification. For quantum mechanical and biophysical datasets, the use of physics-aware featurizations can be more important than choice of particular learning algorithm.

Submitted 25 October, 2018; v1 submitted 1 March, 2017; originally announced March 2017.

arXiv:1702.03346 [pdf, other]

Joint Precoding and RRH selection for User-centric Green MIMO C-RAN

Authors: Cunhua Pan, Huiling Zhu, Nathan J. Gomes, Jiangzhou Wang

Abstract: This paper jointly optimizes the precoding matrices and the set of active remote radio heads (RRHs) to minimize the network power consumption (NPC) for a user-centric cloud radio access network (C-RAN), where both the RRHs and users have multiple antennas and each user is served by its nearby RRHs. Both users' rate requirements and per-RRH power constraints are considered. Due to these conflicting constraints, this optimization problem may be infeasible. In this paper, we propose to solve this problem in two stages. In Stage I, a low-complexity user selection algorithm is proposed to find the largest subset of feasible users. In Stage II, a low-complexity algorithm is proposed to solve the optimization problem with the users selected from Stage I. Specifically, the re-weighted $l_1$-norm minimization method is used to transform the original problem with non-smooth objective function into a series of weighted power minimization (WPM) problems, each of which can be solved by the weighted minimum mean square error (WMMSE) method. The solution obtained by the WMMSE method is proved to satisfy the Karush-Kuhn-Tucker (KKT) conditions of the WPM problem. Moreover, a low-complexity algorithm based on Newton's method and the gradient descent method is developed to update the precoder matrices in each iteration of the WMMSE method. Simulation results demonstrate the rapid convergence of the proposed algorithms and the benefits of equipping multiple antennas at the user side. Moreover, the proposed algorithm is shown to achieve near-optimal performance in terms of NPC.

Submitted 10 February, 2017; originally announced February 2017.

Comments: This work has been accepted in IEEE TWC

arXiv:1701.08027 [pdf, other]

LocDyn: Robust Distributed Localization for Mobile Underwater Networks

Authors: Cláudia Soares, João Gomes, Beatriz Ferreira, João Paulo Costeira

Abstract: How to self-localize large teams of underwater nodes using only noisy range measurements? How to do it in a distributed way, and incorporating dynamics into the problem? How to reject outliers and produce trustworthy position estimates? The stringent acoustic communication channel and the accuracy needs of our geophysical survey application demand faster and more accurate localization methods. We approach dynamic localization as a MAP estimation problem where the prior encodes dynamics, and we devise a convex relaxation method that takes advantage of previous estimates at each measurement acquisition step; the algorithm converges at an optimal rate for first order methods. LocDyn is distributed: there is no fusion center responsible for processing acquired data and the same simple computations are performed for each node. LocDyn is accurate: experiments attest to a smaller positioning error than a comparable Kalman filter. LocDyn is robust: it rejects outlier noise, while the comparing methods succumb in terms of positioning error.

Submitted 27 January, 2017; originally announced January 2017.

arXiv:1605.00930 [pdf, other]

Efficient Execution of Irregular Wavefront Propagation Pattern on Many Integrated Core Architecture

Authors: Jeremias Gomes, George Teodoro

Abstract: The efficient execution of image processing algorithms is an active area of Bioinformatics. In image processing, one of the classes of algorithms or computing pattern that works with irregular data structures is the Irregular Wavefront Propagation Pattern (IWPP). In this class, elements propagate information to neighbors in the form of wave propagation. This propagation results in irregular access to data and expansions. Due to this irregularity, current implementations of this class of algorithms requires atomic operations, which is very costly and also restrains implementations with Single Instruction, Multiple Data (SIMD) instructions in Many Integrated Core (MIC) architectures, which are critical to attain high performance on this processor. The objective of this study is to redesign the Irregular Wavefront Propagation Pattern algorithm in order to enable the efficient execution on processors with Many Integrated Core architecture using SIMD instructions. In this work, using the Intel (R) Xeon Phi (TM) coprocessor, we have implemented a vector version of IWPP with up to 5.63x gains on non-vectored version, a parallel version using First In, First Out (FIFO) queue that attained speedup up to 55x as compared to the single core version on the coprocessor, a version using priority queue whose performance was 1.62x better than the fastest version of GPU based implementation available in the literature, and a cooperative version between heterogeneous processors that allow to process images bigger than the Intel (R) Xeon Phi (TM) memory and also provides a way to utilize all the available devices in the computation.

Submitted 3 May, 2016; originally announced May 2016.

Comments: in Portuguese

arXiv:1603.09536 [pdf, other]

INDIGO-Datacloud: foundations and architectural description of a Platform as a Service oriented to scientific computing

Authors: D. Salomoni, I. Campos, L. Gaido, G. Donvito, M. Antonacci, P. Fuhrman, J. Marco, A. Lopez-Garcia, P. Orviz, I. Blanquer, M. Caballer, G. Molto, M. Plociennik, M. Owsiak, M. Urbaniak, M. Hardt, A. Ceccanti, B. Wegh, J. Gomes, M. David, C. Aiftimiei, L. Dutka, B. Kryza, T. Szepieniec, S. Fiore, et al. (10 additional authors not shown)

Abstract: In this paper we describe the architecture of a Platform as a Service (PaaS) oriented to computing and data analysis. In order to clarify the choices we made, we explain the features using practical examples, applied to several known usage patterns in the area of HEP computing. The proposed architecture is devised to provide researchers with a unified view of distributed computing infrastructures, focusing in facilitating seamless access. In this respect the Platform is able to profit from the most recent developments for computing and processing large amounts of data, and to exploit current storage and preservation technologies, with the appropriate mechanisms to ensure security and privacy.

Submitted 22 April, 2016; v1 submitted 31 March, 2016; originally announced March 2016.

Comments: 31 pages, 12 Figures

arXiv:1511.03154 [pdf, other]

doi 10.1371/journal.pone.0151834

Evolution of Collective Behaviors for a Real Swarm of Aquatic Surface Robots

Authors: Miguel Duarte, Vasco Costa, Jorge Gomes, Tiago Rodrigues, Fernando Silva, Sancho Moura Oliveira, Anders Lyhne Christensen

Abstract: Swarm robotics is a promising approach for the coordination of large numbers of robots. While previous studies have shown that evolutionary robotics techniques can be applied to obtain robust and efficient self-organized behaviors for robot swarms, most studies have been conducted in simulation, and the few that have been conducted on real robots have been confined to laboratory environments. In this paper, we demonstrate for the first time a swarm robotics system with evolved control successfully operating in a real and uncontrolled environment. We evolve neural network-based controllers in simulation for canonical swarm robotics tasks, namely homing, dispersion, clustering, and monitoring. We then assess the performance of the controllers on a real swarm of up to ten aquatic surface robots. Our results show that the evolved controllers transfer successfully to real robots and achieve a performance similar to the performance obtained in simulation. We validate that the evolved controllers display key properties of swarm intelligence-based control, namely scalability, flexibility, and robustness on the real swarm. We conclude with a proof-of-concept experiment in which the swarm performs a complete environmental monitoring task by combining multiple evolved controllers.

Submitted 2 February, 2016; v1 submitted 10 November, 2015; originally announced November 2015.

Comments: 31 pages, 15 figures, journal

Journal ref: PLoS ONE 11(3), 2016, pp. e0151834

arXiv:1407.0577 [pdf, other]

doi 10.7551/978-0-262-32621-6-ch036

Systematic Derivation of Behaviour Characterisations in Evolutionary Robotics

Authors: Jorge Gomes, Pedro Mariano, Anders Lyhne Christensen

Abstract: Evolutionary techniques driven by behavioural diversity, such as novelty search, have shown significant potential in evolutionary robotics. These techniques rely on priorly specified behaviour characterisations to estimate the similarity between individuals. Characterisations are typically defined in an ad hoc manner based on the experimenter's intuition and knowledge about the task. Alternatively, generic characterisations based on the sensor-effector values of the agents are used. In this paper, we propose a novel approach that allows for systematic derivation of behaviour characterisations for evolutionary robotics, based on a formal description of the agents and their environment. Systematically derived behaviour characterisations (SDBCs) go beyond generic characterisations in that they can contain task-specific features related to the internal state of the agents, environmental features, and relations between them. We evaluate SDBCs with novelty search in three simulated collective robotics tasks. Our results show that SDBCs yield a performance comparable to the task-specific characterisations, in terms of both solution quality and behaviour space exploration.

Submitted 2 July, 2014; originally announced July 2014.

Comments: To appear in 14th International Conference on the Synthesis and Simulation of Living Systems (ALife 14)

Journal ref: International Conference on the Synthesis and Simulation of Living Systems (ALife). pp. 212-219. MIT Press (2014)

arXiv:1407.0576 [pdf, other]

doi 10.1007/978-3-319-10762-2_23

Novelty Search in Competitive Coevolution

Authors: Jorge Gomes, Pedro Mariano, Anders Lyhne Christensen

Abstract: One of the main motivations for the use of competitive coevolution systems is their ability to capitalise on arms races between competing species to evolve increasingly sophisticated solutions. Such arms races can, however, be hard to sustain, and it has been shown that the competing species often converge prematurely to certain classes of behaviours. In this paper, we investigate if and how novelty search, an evolutionary technique driven by behavioural novelty, can overcome convergence in coevolution. We propose three methods for applying novelty search to coevolutionary systems with two species: (i) score both populations according to behavioural novelty; (ii) score one population according to novelty, and the other according to fitness; and (iii) score both populations with a combination of novelty and fitness. We evaluate the methods in a predator-prey pursuit task. Our results show that novelty-based approaches can evolve a significantly more diverse set of solutions, when compared to traditional fitness-based coevolution.

Submitted 2 July, 2014; originally announced July 2014.

Comments: To appear in 13th International Conference on Parallel Problem Solving from Nature (PPSN 2014)

Journal ref: Parallel Problem Solving from Nature (PPSN). vol. 8672 LNCS. pp. 233-242. Springer (2014)

arXiv:1304.3393 [pdf, other]

doi 10.1145/2463372.2463398

Generic Behaviour Similarity Measures for Evolutionary Swarm Robotics

Authors: Jorge Gomes, Anders Lyhne Christensen

Abstract: Novelty search has shown to be a promising approach for the evolution of controllers for swarm robotics. In existing studies, however, the experimenter had to craft a domain dependent behaviour similarity measure to use novelty search in swarm robotics applications. The reliance on hand-crafted similarity measures places an additional burden to the experimenter and introduces a bias in the evolutionary process. In this paper, we propose and compare two task-independent, generic behaviour similarity measures: combined state count and sampled average state. The proposed measures use the values of sensors and effectors recorded for each individual robot of the swarm. The characterisation of the group-level behaviour is then obtained by combining the sensor-effector values from all the robots. We evaluate the proposed measures in an aggregation task and in a resource sharing task. We show that the generic measures match the performance of domain dependent measures in terms of solution quality. Our results indicate that the proposed generic measures operate as effective behaviour similarity measures, and that it is possible to leverage the benefits of novelty search without having to craft domain specific similarity measures.

Submitted 11 April, 2013; originally announced April 2013.

Comments: Initial submission. Final version to appear in GECCO 2013 and dl.acm.org

Journal ref: Genetic and Evolutionary Computation Conference (GECCO). pp. 199-206. ACM Press (2013)

Showing 1–50 of 55 results for author: Gomes, J