-
A Road Less Travelled and Beyond: Towards a Roadmap for Integrating Sustainability into Computing Education
Authors:
Ana Moreira,
Ola Leifler,
Stefanie Betz,
Ian Brooks,
Rafael Capilla,
Vlad Constantin Coroama,
Leticia Duboc,
Joao Paulo Fernandes,
Rogardt Heldal,
Patricia Lago,
Ngoc-Thanh Nguyen,
Shola Oyedeji,
Birgit Penzenstadler,
Anne Kathrin Peters,
Jari Porras,
Colin C. Venters
Abstract:
Education for sustainable development has evolved to include more constructive approaches and a better understanding of what is needed to align education with the cultural, societal, and pedagogical changes required to avoid the risks posed by an unsustainable society. This evolution aims to lead us toward viable, equitable, and sustainable futures. However, computing education, including software engineering, is not fully aligned with the current understanding of what is needed for transformational learning in light of our current challenges. This is partly because computing is primarily seen as a technical field, focused on industry needs. Until recently, sustainability was not a high priority for most businesses, including the digital sector, nor was it a prominent focus for higher education institutions and society.
Given these challenges, we aim to propose a research roadmap to integrate sustainability principles and essential skills into the crowded computing curriculum, nurturing future software engineering professionals with a sustainability mindset. We conducted two extensive studies: a systematic review of academic literature on sustainability in computing education and a survey of industry professionals on their interest in sustainability and desired skills for graduates. Using insights from these studies, we identified key topics for teaching sustainability, including core sustainability principles, values and ethics, systems thinking, impact measurement, soft skills, business value, legal standards, and advocacy. Based on these findings, we will develop recommendations for future computing education programs that emphasise sustainability.
The paper is accepted at the 2030 Software Engineering workshop, which is co-located with the FSE'24 conference.
Submitted 27 June, 2024;
originally announced June 2024.
-
Vehicle-to-Vehicle Charging: Model, Complexity, and Heuristics
Authors:
Cláudio Gomes,
João Paulo Fernandes,
Gabriel Falcao,
Soummya Kar,
Sridhar Tayur
Abstract:
The rapid adoption of Electric Vehicles (EVs) poses challenges for electricity grids to accommodate or mitigate peak demand. Vehicle-to-Vehicle Charging (V2VC) has been recently adopted by popular EVs, posing new opportunities and challenges to the management and operation of EVs. We present a novel V2VC model that allows decision-makers to take V2VC into account when optimizing their EV operations. We show that optimizing V2VC is NP-Complete and find that even small problem instances are computationally challenging. We propose R-V2VC, a heuristic that takes advantage of the resulting totally unimodular constraint matrix to efficiently solve problems of realistic sizes. Our results demonstrate that R-V2VC presents a linear growth in the solution time as the problem size increases, while achieving solutions of optimal or near-optimal quality. R-V2VC can be used for real-world operations and to study what-if scenarios when evaluating the costs and benefits of V2VC.
Submitted 12 April, 2024;
originally announced April 2024.
-
Auto Tuning for OpenMP Dynamic Scheduling applied to FWI
Authors:
Felipe H. S. da Silva,
João B. Fernandes,
Idalmis M. Sardina,
Tiago Barros,
Samuel Xavier-de-Souza,
Italo A. S. Assis
Abstract:
Because Full Waveform Inversion (FWI) works with a massive amount of data, its execution requires much time and computational resources, being restricted to large-scale computer systems such as supercomputers. Techniques such as FWI adapt well to parallel computing and can be parallelized in shared memory systems using the application programming interface (API) OpenMP. The management of parallel tasks can be performed through loop schedulers contained in OpenMP. The dynamic scheduler stands out for distributing predefined fixed-size chunks to idle processing cores at runtime. It can better adapt to FWI, where data processing can be irregular. However, the relationship between the chunk size and the runtime is unknown. Optimization techniques can employ meta-heuristics to explore the parameter search space, avoiding testing all possible solutions. Here, we propose a strategy to use the Parameter Auto Tuning for Shared Memory Algorithms (PATSMA), with Coupled Simulated Annealing (CSA) as its optimization method, to automatically adjust the chunk size for the dynamic scheduling of wave propagation, one of the most expensive steps in FWI. Since testing each candidate chunk size in the complete FWI is impractical, our approach consists of running PATSMA with an objective function equal to the runtime of the first time iteration of the first seismic shot of the first FWI iteration. The resulting chunk size is then employed in all wave propagations involved in an FWI. We conducted tests to measure the runtime of an FWI using the proposed autotuning, varying the problem size and running on different computational environments, such as supercomputers and cloud computing instances. The results show that applying the proposed autotuning in an FWI reduces its runtime by up to 70.46% compared to standard OpenMP schedulers.
Submitted 26 February, 2024;
originally announced February 2024.
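The trade-off behind chunk-size tuning for a dynamic scheduler can be illustrated with a small simulation. The sketch below is not the authors' method: it replaces PATSMA and CSA with a plain search over a handful of candidates, and it replaces a measured runtime with a toy cost model. The names `dynamic_schedule_makespan`, `tune_chunk`, and the `overhead` parameter are hypothetical and chosen only for illustration.

```python
import heapq


def dynamic_schedule_makespan(costs, n_workers, chunk, overhead):
    """Simulated finish time of a loop under dynamic scheduling.

    costs     -- per-iteration cost (toy stand-in for real runtime)
    n_workers -- number of processing cores
    chunk     -- fixed number of consecutive iterations handed out at once
    overhead  -- assumed fixed cost paid every time a chunk is handed out
    """
    # Min-heap holding the time at which each worker becomes idle.
    idle_at = [0.0] * n_workers
    heapq.heapify(idle_at)
    for start in range(0, len(costs), chunk):
        t = heapq.heappop(idle_at)  # the earliest idle worker takes the next chunk
        work = sum(costs[start:start + chunk])
        heapq.heappush(idle_at, t + overhead + work)
    return max(idle_at)


def tune_chunk(costs, n_workers, overhead, candidates):
    """Return the candidate chunk size with the smallest simulated makespan.

    Exhaustive search over `candidates`; a real tuner would use a
    meta-heuristic and a measured objective instead.
    """
    return min(
        candidates,
        key=lambda c: dynamic_schedule_makespan(costs, n_workers, c, overhead),
    )


if __name__ == "__main__":
    uniform = [1.0] * 8
    # Small chunks pay the hand-out overhead often; one huge chunk idles a core.
    print(tune_chunk(uniform, n_workers=2, overhead=1.0, candidates=[1, 2, 4, 8]))  # -> 4
```

In this toy model, small chunks balance load well but pay the hand-out overhead many times, while large chunks leave cores idle, so the best chunk size sits between the two extremes and depends on the workload.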
-
Are Fact-Checking Tools Reliable? An Evaluation of Google Fact Check
Authors:
Qiangeng Yang,
Tess Christensen,
Shlok Gilda,
Juliana Fernandes,
Daniela Oliveira
Abstract:
Fact-checking is an effective approach to combat misinformation on social media, especially regarding significant social events such as the COVID-19 pandemic and the U.S. presidential elections. In this study, we evaluated the performance of Google Fact Check, a fact-checking search engine. By analyzing the search results regarding 1,000 COVID-19-related false claims, we found that Google Fact Check is unlikely to provide sufficient fact-checking information for most false claims, even though the results it does return are generally reliable and helpful. We also found that the false claims corresponding to different fact-checking verdicts (i.e., "False", "Partly False", "True", and "Unratable") tend to reflect diverse emotional tones, and that fact-checking sources are likely to check the claims at different lengths and to use dictionary words to varying extents. Claims addressing the same issue yet described differently are likely to obtain disparate fact-checking results. This research aims to shed light on the best practices for performing fact-checking searches for the general public.
Submitted 22 April, 2024; v1 submitted 20 February, 2024;
originally announced February 2024.
-
PATSMA: Parameter Auto-tuning for Shared Memory Algorithms
Authors:
Joao B. Fernandes,
Felipe H. S. da Silva,
Samuel Xavier-de-Souza,
Italo A. S. Assis
Abstract:
Programs with high levels of complexity often face challenges in adjusting execution parameters, particularly when these parameters vary based on the execution context. Such dynamic parameters, for example loop granularity, significantly impact the program's performance and can vary depending on factors like the execution environment, program input, or the choice of compiler. Given the expensive nature of testing each case individually, one viable solution is to automate parameter adjustments using optimization methods. This article introduces PATSMA, a parameter auto-tuning tool that leverages Coupled Simulated Annealing (CSA) and Nelder-Mead (NM) optimization methods to fine-tune existing parameters in an application. We demonstrate how auto-tuning can contribute to the real-time optimization of parallel algorithms designed for shared memory systems. PATSMA is a C++ library readily available under the MIT license.
Submitted 14 June, 2024; v1 submitted 15 January, 2024;
originally announced January 2024.
-
Adaptive Asynchronous Work-Stealing for distributed load-balancing in heterogeneous systems
Authors:
João B. Fernandes,
Ítalo A. S. de Assis,
Idalmis M. S. Martins,
Tiago Barros,
Samuel Xavier-de-Souza
Abstract:
Supercomputers have revolutionized how industries and scientific fields process large amounts of data. These machines group hundreds or thousands of computing nodes working together to execute time-consuming programs that require a large amount of computational resources. Over the years, supercomputers have expanded to include new and different technologies, characterizing them as heterogeneous. However, executing a program in a heterogeneous environment requires attention to a specific aspect of performance degradation: load imbalance. In this research, we address the challenges associated with load imbalance when scheduling many homogeneous tasks in a heterogeneous environment. To address this issue, we introduce the concept of adaptive asynchronous work-stealing. This approach collects information about the nodes and utilizes it to improve work-stealing aspects, such as victim selection and task offloading. Additionally, the proposed approach eliminates the need for extra threads to communicate information, thereby reducing overhead when implementing a fully asynchronous approach. Our experimental results demonstrate a performance improvement of approximately 10.1% compared to other conventional and state-of-the-art implementations.
Submitted 23 January, 2024; v1 submitted 9 January, 2024;
originally announced January 2024.
-
ROBBIE: Robust Bias Evaluation of Large Generative Language Models
Authors:
David Esiobu,
Xiaoqing Tan,
Saghar Hosseini,
Megan Ung,
Yuchen Zhang,
Jude Fernandes,
Jane Dwivedi-Yu,
Eleonora Presani,
Adina Williams,
Eric Michael Smith
Abstract:
As generative large language models (LLMs) grow more performant and prevalent, we must develop sufficiently comprehensive tools to measure and improve their fairness. Different prompt-based datasets can be used to measure social bias across multiple text domains and demographic axes, meaning that testing LLMs on more datasets can potentially help us characterize their biases more fully, and better ensure equal and equitable treatment of marginalized demographic groups. In this work, our focus is two-fold:
(1) Benchmarking: a comparison of 6 different prompt-based bias and toxicity metrics across 12 demographic axes and 5 families of generative LLMs. Out of those 6 metrics, AdvPromptSet and HolisticBiasR are novel datasets proposed in the paper. The comparison of those benchmarks gives us insights about the bias and toxicity of the compared models. Therefore, we explore the frequency of demographic terms in common LLM pre-training corpora and how this may relate to model biases.
(2) Mitigation: we conduct a comprehensive study of how well 3 bias/toxicity mitigation techniques perform across our suite of measurements. ROBBIE aims to provide insights for practitioners while deploying a model, emphasizing the need to not only measure potential harms, but also understand how they arise by characterizing the data, mitigate harms once found, and balance any trade-offs. We open-source our analysis code in hopes of encouraging broader measurements of bias in future LLMs.
Submitted 29 November, 2023;
originally announced November 2023.
-
2-Cats: 2D Copula Approximating Transforms
Authors:
Flavio Figueiredo,
José Geraldo Fernandes,
Jackson Silva,
Renato M. Assunção
Abstract:
Copulas are powerful statistical tools for capturing dependencies across data dimensions. Applying Copulas involves estimating independent marginals, a straightforward task, followed by the much more challenging task of determining a single copulating function, $C$, that links these marginals. For bivariate data, a copula takes the form of a two-increasing function $C: (u,v)\in \mathbb{I}^2 \rightarrow \mathbb{I}$, where $\mathbb{I} = [0, 1]$. This paper proposes 2-Cats, a Neural Network (NN) model that learns two-dimensional Copulas without relying on specific Copula families (e.g., Archimedean). Furthermore, via both theoretical properties of the model and a Lagrangian training approach, we show that 2-Cats meets the desiderata of Copula properties. Moreover, inspired by the literature on Physics-Informed Neural Networks and Sobolev Training, we further extend our training strategy to learn not only the output of a Copula but also its derivatives. Our proposed method exhibits superior performance compared to the state-of-the-art across various datasets while respecting the properties of $C$ (provably for most of them and approximately for a single other).
Submitted 28 May, 2024; v1 submitted 28 September, 2023;
originally announced September 2023.
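For readers unfamiliar with the term, the standard textbook conditions that make a function $C: \mathbb{I}^2 \rightarrow \mathbb{I}$ a bivariate copula are summarized below. They are given as general background in the notation of the abstract above and are not quoted from the paper; which of them the model satisfies provably and which only approximately is not specified in the abstract.

```latex
% Standard defining properties of a bivariate copula C on I^2, with I = [0, 1].
\begin{align*}
  &\text{Grounded:}        && C(u, 0) = C(0, v) = 0
      && \forall\, u, v \in \mathbb{I}, \\
  &\text{Uniform margins:} && C(u, 1) = u, \quad C(1, v) = v
      && \forall\, u, v \in \mathbb{I}, \\
  &\text{Two-increasing:}  && C(u_2, v_2) - C(u_2, v_1) - C(u_1, v_2) + C(u_1, v_1) \ge 0
      && \forall\, u_1 \le u_2,\ v_1 \le v_2 .
\end{align*}
```

The two-increasing condition states that every axis-aligned rectangle in $\mathbb{I}^2$ receives non-negative probability mass.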
-
SusTrainable: Promoting Sustainability as a Fundamental Driver in Software Development Training and Education. 2nd Teacher Training, January 23-27, 2023, Pula, Croatia. Revised lecture notes
Authors:
Tihana Galinac Grbac,
Csaba Szabó,
João Paulo Fernandes
Abstract:
This volume presents the revised lecture notes of the 2nd teacher training organized as part of the project Promoting Sustainability as a Fundamental Driver in Software Development Training and Education, held at the Juraj Dobrila University of Pula, Croatia, in the week of January 23-27, 2023. The project is the Erasmus+ project No. 2020-1-PT01-KA203-078646 - Sustrainable. More details can be found at the project web site https://sustrainable.github.io/
Among the most important contributions of the project are two summer schools. The 2nd SusTrainable Summer School (SusTrainable - 23) will be organized at the University of Coimbra, Portugal, in the week of July 10-14, 2023. The summer school will consist of lectures and practical work for master and PhD students in computing science and closely related fields. There will be contributions from Babeş-Bolyai University, Eötvös Loránd University, Juraj Dobrila University of Pula, Radboud University Nijmegen, Roskilde University, Technical University of Košice, University of Amsterdam, University of Coimbra, University of Minho, University of Plovdiv, University of Porto, and University of Rijeka.
To prepare and streamline the summer school, the consortium organized a teacher training in Pula, Croatia. This was an event of five full days, organized by Tihana Galinac Grbac and Neven Grbac. The Juraj Dobrila University of Pula is very concerned with sustainability issues. Its education, research and management are conducted with sustainability goals in mind.
The contributions in the proceedings were reviewed and provide a good overview of the range of topics that will be covered at the summer school. The papers in the proceedings, as well as the very constructive and cooperative teacher training, guarantee a summer school of the highest quality that is beneficial for all participants.
Submitted 24 July, 2023;
originally announced July 2023.
-
Llama 2: Open Foundation and Fine-Tuned Chat Models
Authors:
Hugo Touvron,
Louis Martin,
Kevin Stone,
Peter Albert,
Amjad Almahairi,
Yasmine Babaei,
Nikolay Bashlykov,
Soumya Batra,
Prajjwal Bhargava,
Shruti Bhosale,
Dan Bikel,
Lukas Blecher,
Cristian Canton Ferrer,
Moya Chen,
Guillem Cucurull,
David Esiobu,
Jude Fernandes,
Jeremy Fu,
Wenyin Fu,
Brian Fuller,
Cynthia Gao,
Vedanuj Goswami,
Naman Goyal,
Anthony Hartshorn,
Saghar Hosseini
, et al. (43 additional authors not shown)
Abstract:
In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models. We provide a detailed description of our approach to fine-tuning and safety improvements of Llama 2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs.
Submitted 19 July, 2023; v1 submitted 18 July, 2023;
originally announced July 2023.
-
Sustainability in Computing Education: A Systematic Literature Review
Authors:
A. -K. Peters,
R. Capilla,
V. C. Coroamă,
R. Heldal,
P. Lago,
O. Leifler,
A. Moreira,
J. P. Fernandes,
B. Penzenstadler,
J. Porras,
C. C. Venters
Abstract:
Research shows that the global society as organized today, with our current technological and economic system, is impossible to sustain. We are living in the Anthropocene, an era in which human activities in highly industrialized countries are responsible for overshooting several planetary boundaries, with poorer communities contributing least to the problems but being impacted the most. At the same time, technical and economic gains fail to provide society at large with equal opportunities and improved quality of life. This paper describes approaches taken in computing education to address the issue of sustainability. It presents results of a systematic review of literature on sustainability in computing education. From a set of 572 publications extracted from six large digital libraries plus snowballing, we distilled and analyzed the 90 relevant primary studies. Using an inductive and deductive thematic analysis, we study 1) conceptions of sustainability, computing, and education, 2) implementations of sustainability in computing education, and 3) research on sustainability in computing education. We present a framework capturing learning objectives and outcomes as well as pedagogical methods for sustainability in computing education. These results can be mapped to existing standards and curricula in future work. We find that only a few of the articles engage with the challenges as calling for drastic systemic change, along with radically new understandings of computing and education. We suggest that future research should connect to the substantial body of critical theory such as feminist theory of science and technology. Existing research on sustainability in computing education may be considered as rather immature as the majority of articles are experience reports with limited empirical research.
Submitted 17 May, 2023;
originally announced May 2023.
-
Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages
Authors:
Anusha Prakash,
Arun Kumar,
Ashish Seth,
Bhagyashree Mukherjee,
Ishika Gupta,
Jom Kuriakose,
Jordan Fernandes,
K V Vikram,
Mano Ranjith Kumar M,
Metilda Sagaya Mary,
Mohammad Wajahat,
Mohana N,
Mudit Batra,
Navina K,
Nihal John George,
Nithya Ravi,
Pruthwik Mishra,
Sudhanshu Srivastava,
Vasista Sai Lodagala,
Vandan Mujadia,
Kada Sai Venkata Vineeth,
Vrunda Sukhadia,
Dipti Sharma,
Hema Murthy,
Pushpak Bhattacharya
, et al. (2 additional authors not shown)
Abstract:
Cross-lingual dubbing of lecture videos requires the transcription of the original audio, correction and removal of disfluencies, domain term discovery, text-to-text translation into the target language, chunking of text using target language rhythm, and text-to-speech synthesis followed by isochronous lip-syncing to the original video. This task becomes challenging when the source and target languages belong to different language families, resulting in differences in generated audio duration. This is further compounded by the original speaker's rhythm, especially for extempore speech. This paper describes the challenges in regenerating English lecture videos in Indian languages semi-automatically. A prototype is developed for dubbing lectures into 9 Indian languages. A mean-opinion-score (MOS) is obtained for two languages, Hindi and Tamil, on two different courses. The output video is compared with the original video in terms of MOS (1-5) and lip synchronisation with scores of 4.09 and 3.74, respectively. Human effort is also reduced by 75%.
Submitted 1 November, 2022;
originally announced November 2022.
-
Identifying and Extracting Football Features from Real-World Media Sources using Only Synthetic Training Data
Authors:
Jose Cerqueira Fernandes,
Benjamin Kenwright
Abstract:
Real-world images used for training machine learning algorithms are often unstructured and inconsistent. The process of analysing and tagging these images can be costly and error prone, and it is further hampered by limited availability, gaps and legal conundrums. However, as we demonstrate in this article, the potential to generate accurate graphical images that are indistinguishable from real-world sources has a multitude of benefits in machine learning paradigms. One such example is football data from broadcast services (television and other streaming media sources). Football games are usually recorded from multiple sources (cameras and phones) and at multiple resolutions, not to mention the occlusion of visual details and other artefacts (like blurring, weathering and lighting conditions), which make it difficult to accurately identify features. We demonstrate an approach which is able to overcome these limitations using generated tagged and structured images. The generated images are able to simulate a variety of views and conditions (including noise and blurring) which may only occur sporadically in real-world data and which make it difficult for machine learning algorithms to 'cope' with these unforeseen problems in real data. This approach enables us to rapidly train and prepare a robust solution that accurately extracts features (e.g., spatial locations, markers on the pitch, player positions, ball location and camera FOV) from real-world football match sources for analytical purposes.
Submitted 27 September, 2022;
originally announced September 2022.
-
Automatically Assessing Students Performance with Smartphone Data
Authors:
J. Fernandes,
J. Sá Silva,
A. Rodrigues,
S. Sinche,
F. Boavida
Abstract:
As the number of smart devices that surround us increases, so do the opportunities to create smart socially-aware systems. In this context, mobile devices can be used to collect data about students and to better understand how their day-to-day routines can influence their academic performance. Moreover, the Covid-19 pandemic led to new challenges and difficulties, also for students, with considerable impact on their lifestyle. In this paper we present a dataset collected using a smartphone application (ISABELA), which includes passive data (e.g., activity and location) as well as self-reported data from questionnaires. We present several tests with different machine learning models, in order to classify students' performance. These tests were carried out using different time windows, showing that weekly time windows lead to better prediction and classification results than monthly time windows. Furthermore, it is shown that the created models can predict student performance even with data collected from different contexts, namely before and during the Covid-19 pandemic. SVMs, XGBoost and AdaBoost-SAMME with Random Forest were found to be the best algorithms, showing an accuracy greater than 78%. Additionally, we propose a pipeline that uses a decision-level median voting algorithm, drawing on historic data from the students, to further improve the prediction. Using this pipeline, it is possible to increase the performance of the models, with some of them obtaining an accuracy greater than 90%.
Submitted 6 July, 2022;
originally announced September 2022.
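One way to picture decision-level median voting over historic data is as a sliding median applied to a student's sequence of weekly predictions. The sketch below is only an illustration under assumptions not stated in the abstract: predictions are integer class labels with a meaningful order, the vote covers the current week plus the preceding ones, and the name `median_vote` and the `window` parameter are hypothetical.

```python
from statistics import median_low


def median_vote(weekly_predictions, window=3):
    """Smooth a sequence of per-week class labels with a sliding median.

    For each week, the final decision is the median of the model's own
    predictions over the last `window` weeks (fewer at the start of the
    sequence). `median_low` always returns one of the observed labels,
    so the output stays a valid class even for an even-sized window.
    """
    smoothed = []
    for i in range(len(weekly_predictions)):
        recent = weekly_predictions[max(0, i - window + 1): i + 1]
        smoothed.append(median_low(recent))
    return smoothed


if __name__ == "__main__":
    # A single outlying week is overruled by the neighbouring weeks.
    print(median_vote([1, 1, 0, 1, 1]))  # -> [1, 1, 1, 1, 1]
```

The vote operates on the models' decisions rather than on their input features, which is what "decision level" is taken to mean here.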
-
Social Sensing and Human in the Loop Profiling during Pandemics: the Vitoria application
Authors:
J. Fernandes,
J. Sá Silva,
A. Rodrigues,
F. Boavida,
R. Gaspar,
C. Godinho,
R. Francisco
Abstract:
As the number of smart devices that surround us increases, so do the opportunities to leverage them to create socially- and context-aware systems. Smart devices can be used for better understanding human behaviour and its societal implications. As an example of a scenario in which the role of socially aware systems is crucial, consider the SARS-CoV-2 pandemic. In this paper we present an innovative Human-in-the-Loop Cyber Physical system that can collect passive data from people, such as physical activity, sleep information, and discrete location, as well as collect self-reported data, and provide individualised user feedback. We also present a three-and-a-half-month field trial implemented in Portugal. This trial was part of a larger scope project that was supported by the Portuguese National Health System, to evaluate the indicators and effects of the pandemic. Results concerning various application usage statistics are presented, comparing the most used applications, their objective and their usage pattern in work/non-work periods. Additionally, the time-lagged cross correlation between some of the collected metrics, Covid events, and media news is explored. This type of application can be used not only in the context of Covid but also in future pandemics, to assist individuals in self-regulation of their contagion risk, based on personalized information, while also functioning as a means for raising self-awareness of risks related to psychological wellbeing.
Submitted 5 July, 2022;
originally announced July 2022.
-
A neural network based controller for underwater robotic vehicles
Authors:
Josiane Maria Macedo Fernandes,
Marcelo Costa Tanaka,
Raimundo Carlos Silvério Freire Júnior,
Wallace Moreira Bessa
Abstract:
Due to the enormous technological improvements obtained in the last decades, it is possible to use robotic vehicles for underwater exploration. This work describes the development of a dynamic positioning system for remotely operated underwater vehicles. The adopted approach is developed using Lyapunov Stability Theory and enhanced by a neural network based algorithm for uncertainty and disturbance compensation. The performance of the proposed control scheme is evaluated by means of numerical simulations.
Submitted 17 June, 2022; v1 submitted 23 May, 2022;
originally announced May 2022.
-
Perturbation Augmentation for Fairer NLP
Authors:
Rebecca Qian,
Candace Ross,
Jude Fernandes,
Eric Smith,
Douwe Kiela,
Adina Williams
Abstract:
Unwanted and often harmful social biases are becoming ever more salient in NLP research, affecting both models and datasets. In this work, we ask whether training on demographically perturbed data leads to fairer language models. We collect a large dataset of human annotated text perturbations and train a neural perturbation model, which we show outperforms heuristic alternatives. We find that (i) language models (LMs) pre-trained on demographically perturbed corpora are typically more fair, (ii) LMs finetuned on perturbed GLUE datasets exhibit less demographic bias on downstream tasks, and (iii) fairness improvements do not come at the expense of performance on downstream tasks. Lastly, we discuss outstanding questions about how best to evaluate the (un)fairness of large language models. We hope that this exploration of neural demographic perturbation will help drive more improvement towards fairer NLP.
Submitted 12 October, 2022; v1 submitted 25 May, 2022;
originally announced May 2022.
-
SusTrainable: Promoting Sustainability as a Fundamental Driver in Software Development Training and Education. Teacher Training, November 1-5, Nijmegen, The Netherlands. Revised lecture notes
Authors:
Pieter Koopman,
Mart Lubbers,
João Paulo Fernandes
Abstract:
These are the proceedings of the first teacher training of the Erasmus+ project No. 2020-1-PT01-KA203-078646 -- Sustrainable. The full title of this project is Promoting Sustainability as a Fundamental Driver in Software Development Training and Education and the interested reader may know more about it at:
https://sustrainable.github.io
The flagship contribution is the organization of two summer schools on sustainable software production. The first summer school is moved due to the Corona pandemic. It is now scheduled for July 2022 in Rijeka, Croatia. This summer school will consist of lectures and practical work for master and PhD students in the area of computing science and closely related fields. There will be contributions from Plovdiv University, University of Košice, University of Coimbra, University of Minho, Eötvös Loránd University, University of Rijeka, Radboud University Nijmegen, University of Pula, University of Amsterdam and Babeş-Bolyai University.
To prepare and streamline this summer school, the consortium organized a teacher training in Nijmegen, the Netherlands. This was an event of five full days on November 1-5, 2021, organized by Ingrid Berenbroek, Mart Lubbers and Pieter Koopman on the campus of the Radboud University Nijmegen, The Netherlands. Sustainability is one of the themes within the strategy of the Radboud University called `A Significant Impact', see https://www.ru.nl/sustainability/organisation/radboud-sustainable/. Sustainability plays an important role in education, research, and management.
The contributions were reviewed and give a good overview of what can be expected from the summer school. Based on these papers and the very positive and constructive atmosphere, we expect a high quality and pleasant summer school. We are looking forward to contributing to a sustainable future.
Submitted 4 April, 2022;
originally announced April 2022.
-
Accelerating GAN training using highly parallel hardware on public cloud
Authors:
Renato Cardoso,
Dejan Golubovic,
Ignacio Peluaga Lozada,
Ricardo Rocha,
João Fernandes,
Sofia Vallecorsa
Abstract:
With the increasing number of Machine and Deep Learning applications in High Energy Physics, easy access to dedicated infrastructure represents a requirement for fast and efficient R&D. This work explores different types of cloud services to train a Generative Adversarial Network (GAN) in a parallel environment, using the TensorFlow data-parallel strategy. More specifically, we parallelize the training process on multiple GPUs and Google Tensor Processing Units (TPU) and we compare two algorithms: the TensorFlow built-in logic and a custom loop, optimised to have higher control of the elements assigned to each GPU worker or TPU core. The quality of the generated data is compared to Monte Carlo simulation. Linear speed-up of the training process is obtained, while retaining most of the performance in terms of physics results. Additionally, we benchmark the aforementioned approaches, at scale, over multiple GPU nodes, deploying the training process on different public cloud providers, seeking overall efficiency and cost-effectiveness. The combination of data science, cloud deployment options and associated economics allows one to burst out heterogeneously, exploring the full potential of cloud-based services.
Submitted 8 November, 2021;
originally announced November 2021.
-
Automated Cardiac Resting Phase Detection Targeted on the Right Coronary Artery
Authors:
Seung Su Yoon,
Elisabeth Preuhs,
Michaela Schmidt,
Christoph Forman,
Teodora Chitiboi,
Puneet Sharma,
Juliano Lara Fernandes,
Christoph Tillmanns,
Jens Wetzl,
Andreas Maier
Abstract:
Static cardiac imaging such as late gadolinium enhancement, mapping, or 3-D coronary angiography requires prior information, e.g., the phase during a cardiac cycle with least motion, called resting phase (RP). The purpose of this work is to propose a fully automated framework that allows the detection of the right coronary artery (RCA) RP within CINE series. The proposed prototype system consists of three main steps. First, the localization of the regions of interest (ROI) is performed. Second, the cropped ROI series are taken for tracking motions over all time points. Third, the output motion values are used to classify RPs. In this work, we focused on the detection of the area with the outer edge of the cross-section of the RCA as our target. The proposed framework was evaluated on 102 clinically acquired datasets at 1.5T and 3T. The automatically classified RPs were compared with the reference RPs annotated manually by an expert for testing the robustness and feasibility of the framework. The predicted RCA RPs showed high agreement with the expert-annotated RPs with 92.7% accuracy, 90.5% sensitivity and 95.0% specificity for the unseen study dataset. The mean absolute difference of the start and end RP was 13.6 $\pm$ 18.6 ms for the validation study dataset (n=102). In this work, automated RP detection has been introduced by the proposed framework and demonstrated feasibility, robustness, and applicability for static imaging acquisitions.
Submitted 31 January, 2023; v1 submitted 6 September, 2021;
originally announced September 2021.
-
Green Software Lab: Towards an Engineering Discipline for Green Software
Authors:
Rui Abreu,
Marco Couto,
Luís Cruz,
Jácome Cunha,
João Paulo Fernandes,
Rui Pereira,
Alexandre Perez,
João Saraiva
Abstract:
This report describes the research goals and results of the Green Software Lab (GSL) research project. This was a project funded by Fundação para a Ciência e a Tecnologia (FCT) -- the Portuguese research foundation -- under reference POCI-01-0145-FEDER-016718, that ran from January 2016 till July 2020.
This report includes the complete document reporting the results achieved during the project execution, which was submitted to FCT for evaluation in July 2020. It describes the goals of the project, and the different research tasks, presenting the deliverables of each of them. It also presents the management and result dissemination work performed during the project's execution. The document also includes a self-assessment of the achieved results, and a complete list of scientific publications describing the contributions of the project. Finally, this document includes the FCT evaluation report.
Submitted 6 August, 2021;
originally announced August 2021.
-
Lumen: A Machine Learning Framework to Expose Influence Cues in Text
Authors:
Hanyu Shi,
Mirela Silva,
Daniel Capecci,
Luiz Giovanini,
Lauren Czech,
Juliana Fernandes,
Daniela Oliveira
Abstract:
Phishing and disinformation are popular social engineering attacks with attackers invariably applying influence cues in texts to make them more appealing to users. We introduce Lumen, a learning-based framework that exposes influence cues in text: (i) persuasion, (ii) framing, (iii) emotion, (iv) objectivity/subjectivity, (v) guilt/blame, and (vi) use of emphasis. Lumen was trained with a newly developed dataset of 3K texts comprised of disinformation, phishing, hyperpartisan news, and mainstream news. Evaluation of Lumen in comparison to other learning models showed that Lumen and LSTM presented the best F1-micro score, but Lumen yielded better interpretability. Our results highlight the promise of ML to expose influence cues in text, towards the goal of application in automatic labeling tools to improve the accuracy of human-based detection and reduce the likelihood of users falling for deceptive online content.
Submitted 12 July, 2021;
originally announced July 2021.
-
Expanding Frontiers: Settling an Understanding of Systems-of-Information Systems
Authors:
Valdemar Vicente Graciano Neto,
Bruno Gabriel Araújo Lebtag,
Paulo Gabriel Teixeira,
Priscilla Batista,
Vinícius Carvalho Lopes,
Jamal El-Hachem,
Jérémy Buisson,
Flavio Oquendo,
Juliana Fernandes,
Francisco Ferreira,
Rodrigo Peireira dos Santos,
Davi Viana,
Everton Cavalcante,
Mohamad Kassab,
Ahmad Mohsin,
Roberto Oliveira,
Vânia Neves,
Maria Istela Cagnin,
Elisa Yumi Nakagawa
Abstract:
System-of-Systems (SoS) has consolidated itself as a special type of software-intensive systems. As such, subtypes of SoS have also emerged, such as Cyber-Physical SoS (CPSoS) that are formed essentially of cyber-physical constituent systems and Systems-of-Information Systems (SoIS) that contain information systems as their constituents. In contrast to CPSoS that have been investigated and covered in the specialized literature, SoIS still lack critical discussion about their fundamentals. The main contribution of this paper is to present those fundamentals to set an understanding of SoIS. By offering a discussion and examining literature cases, we draw an essential settlement on SoIS definition, basics, and practical implications. The discussion herein presented results from research conducted on SoIS over the past years in interinstitutional and multinational research collaborations. The knowledge gathered in this paper arises from several scientific discussion meetings among the authors. As a result, we aim to contribute to the state of the art of SoIS besides paving the research avenues for the forthcoming years.
Submitted 25 March, 2021;
originally announced March 2021.
-
A Deep Learning Approach Based on Graphs to Detect Plantation Lines
Authors:
Diogo Nunes Gonçalves,
Mauro dos Santos de Arruda,
Hemerson Pistori,
Vanessa Jordão Marcato Fernandes,
Ana Paula Marques Ramos,
Danielle Elis Garcia Furuya,
Lucas Prado Osco,
Hongjie He,
Jonathan Li,
José Marcato Junior,
Wesley Nunes Gonçalves
Abstract:
Deep learning-based networks are among the most prominent methods to learn linear patterns and extract this type of information from diverse imagery conditions. Here, we propose a deep learning approach based on graphs to detect plantation lines in UAV-based RGB imagery presenting a challenging scenario containing spaced plants. The first module of our method extracts a feature map throughout the backbone, which consists of the initial layers of the VGG16. This feature map is used as an input to the Knowledge Estimation Module (KEM), organized in three concatenated branches for detecting 1) the plant positions, 2) the plantation lines, and 3) the displacement vectors between the plants. A graph modeling is applied considering each plant position on the image as vertices, and edges are formed between two vertices (i.e. plants). Finally, the edge is classified as pertaining to a certain plantation line based on three probabilities (higher than 0.5): i) visual features obtained from the backbone; ii) a chance that the edge pixels belong to a line, from the KEM step; and iii) an alignment of the displacement vectors with the edge, also from KEM. Experiments were conducted in corn plantations with different growth stages and patterns with aerial RGB imagery. A total of 564 patches with 256 x 256 pixels were used and randomly divided into training, validation, and testing sets in a proportion of 60%, 20%, and 20%, respectively. The proposed method was compared against state-of-the-art deep learning methods, and achieved superior performance with a significant margin, returning precision, recall, and F1-score of 98.7%, 91.9%, and 95.1%, respectively. This approach is useful in extracting lines with spaced plantation patterns and could be implemented in scenarios where plantation gaps occur, generating lines with few to no interruptions.
Submitted 5 February, 2021;
originally announced February 2021.
-
Facebook Ad Engagement in the Russian Active Measures Campaign of 2016
Authors:
Mirela Silva,
Luiz Giovanini,
Juliana Fernandes,
Daniela Oliveira,
Catia S. Silva
Abstract:
This paper examines 3,517 Facebook ads created by Russia's Internet Research Agency (IRA) between June 2015 and August 2017 in its active measures disinformation campaign targeting the 2016 U.S. general election. We aimed to unearth the relationship between ad engagement (as measured by ad clicks) and 41 features related to ads' metadata, sociolinguistic structures, and sentiment. Our analysis was three-fold: (i) understand the relationship between engagement and features via correlation analysis; (ii) find the most relevant feature subsets to predict engagement via feature selection; and (iii) find the semantic topics that best characterize the dataset via topic modeling. We found that ad expenditure, text size, ad lifetime, and sentiment were the top features predicting users' engagement to the ads. Additionally, positive sentiment ads were more engaging than negative ads, and sociolinguistic features (e.g., use of religion-relevant words) were identified as highly important in the makeup of an engaging ad. Linear SVM and Logistic Regression classifiers achieved the highest mean F-scores (93.6% for both models), determining that the optimal feature subset contains 12 and 6 features, respectively. Finally, we corroborate the findings of related works that the IRA specifically targeted Americans on divisive ad topics (e.g., LGBT rights, African American reparations).
Submitted 23 December, 2020; v1 submitted 21 December, 2020;
originally announced December 2020.
-
Small Changes, Big Impacts: Leveraging Diversity to Improve Energy Efficiency
Authors:
Wellington Oliveira,
Hugo Matalonga,
Gustavo Pinto,
Fernando Castor,
João Paulo Fernandes
Abstract:
In the last few years, a growing body of research has proposed methods, techniques, and tools to support developers in the construction of software that consumes less energy. These solutions leverage diverse approaches such as version history mining, analytical models, identifying energy-efficient color schemes, and optimizing the packaging of HTTP requests.
In this chapter, we present a complementary approach. We advocate that developers should leverage software diversity to make software systems more energy-efficient. Our main insight is that non-specialists can build software that consumes less energy by alternating at development time between readily available, diversely-designed pieces of software implemented by third-parties. These pieces of software can vary in nature, granularity, and quality attributes. Examples include data structures and constructs for thread management and synchronization.
Submitted 7 December, 2020;
originally announced December 2020.
-
People Still Care About Facts: Twitter Users Engage More with Factual Discourse than Misinformation--A Comparison Between COVID and General Narratives on Twitter
Authors:
Mirela Silva,
Fabrício Ceschin,
Prakash Shrestha,
Christopher Brant,
Shlok Gilda,
Juliana Fernandes,
Catia S. Silva,
André Grégio,
Daniela Oliveira,
Luiz Giovanini
Abstract:
Misinformation entails the dissemination of falsehoods that leads to the slow fracturing of society via decreased trust in democratic processes, institutions, and science. The public has grown aware of the role of social media as a superspreader of untrustworthy information, where even pandemics have not been immune. In this paper, we focus on COVID-19 misinformation and examine a subset of 2.1M tweets to understand misinformation as a function of engagement, tweet content (COVID-19- vs. non-COVID-19-related), and veracity (misleading or factual). Using correlation analysis, we show the most relevant feature subsets among over 126 features that most heavily correlate with misinformation or facts. We found that (i) factual tweets, regardless of whether COVID-related, were more engaging than misinformation tweets; and (ii) features that most heavily correlated with engagement varied depending on the veracity and content of the tweet.
Submitted 9 September, 2021; v1 submitted 3 December, 2020;
originally announced December 2020.
-
Analysis of Social Robotic Navigation approaches: CNN Encoder and Incremental Learning as an alternative to Deep Reinforcement Learning
Authors:
Janderson Ferreira,
Agostinho A. F. Júnior,
Letícia Castro,
Yves M. Galvão,
Pablo Barros,
Bruno J. T. Fernandes
Abstract:
Dealing with social tasks in robotic scenarios is difficult, as having humans in the learning loop is incompatible with most of the state-of-the-art machine learning algorithms. This is the case when exploring Incremental learning models, in particular the ones involving reinforcement learning. In this work, we discuss this problem and possible solutions by analysing a previous study on adaptive convolutional encoders for a social navigation task.
Submitted 5 September, 2020; v1 submitted 18 August, 2020;
originally announced August 2020.
-
Performance Improvement of Path Planning algorithms with Deep Learning Encoder Model
Authors:
Janderson Ferreira,
Agostinho A. F. Júnior,
Yves M. Galvão,
Pablo Barros,
Sergio Murilo Maciel Fernandes,
Bruno J. T. Fernandes
Abstract:
Currently, path planning algorithms are used in many daily tasks. They are relevant to find the best route in traffic and make autonomous robots able to navigate. The use of path planning presents some issues in large and dynamic environments. Large environments make these algorithms spend much time finding the shortest path. On the other hand, dynamic environments require a new execution of the algorithm each time a change occurs in the environment, which increases the execution time. Dimensionality reduction appears as a solution to this problem, which in this context means removing useless paths present in those environments. Most of the algorithms that reduce dimensionality are limited to the linear correlation of the input data. Recently, a Convolutional Neural Network (CNN) Encoder was used to overcome this situation since it can use both linear and non-linear information for data reduction. This paper analyzes in depth the performance of this CNN Encoder model in eliminating useless paths. To measure the efficiency of this model, we combined it with different path planning algorithms. Next, the final algorithms (combined and not combined) are checked in a database that is composed of five scenarios. Each scenario contains fixed and dynamic obstacles. The proposed model, the CNN Encoder, combined with other existing path planning algorithms in the literature, was able to reduce the time needed to find the shortest path in comparison to all path planning algorithms analyzed. The average time reduction was 54.43%.
Submitted 5 August, 2020;
originally announced August 2020.
-
Bot Development for Social Engineering Attacks on Twitter
Authors:
Jefferson Viana Fonseca Abreu,
Jorge Henrique Cabral Fernandes,
João José Costa Gondim,
Célia Ghedini Ralha
Abstract:
A series of bots performing simulated social engineering attacks using phishing in the Twitter platform was developed to identify potentially unsafe user behavior. In this work, different bot versions were developed to collect feedback data after stimuli directed to 1,287 Twitter accounts for 38 consecutive days. The results were not conclusive about the existence of preceptors for unsafe behavior, but we conclude that despite Twitter's security this kind of attack is still feasible.
Submitted 23 July, 2020;
originally announced July 2020.
-
FCN+RL: A Fully Convolutional Network followed by Refinement Layers to Offline Handwritten Signature Segmentation
Authors:
Celso A. M. Lopes Junior,
Matheus Henrique M. da Silva,
Byron Leite Dantas Bezerra,
Bruno Jose Torres Fernandes,
Donato Impedovo
Abstract:
Although secular, handwritten signature is one of the most reliable biometric methods used by most countries. In the last ten years, the application of technology for verification of handwritten signatures has evolved strongly, including forensic aspects. Some factors, such as the complexity of the background and the small size of the region of interest - signature pixels - increase the difficulty of the targeting task. Other factors that make it challenging are the many variations present in handwritten signatures, such as location, type of ink, color and type of pen, and the type of stroke. In this work, we propose an approach to locate and extract the pixels of handwritten signatures on identification documents, without any prior information on the location of the signatures. The technique used is based on a fully convolutional encoder-decoder network combined with a block of refinement layers for the alpha channel of the predicted image. The experimental results demonstrate that the technique outputs a clean signature with higher fidelity in the lines than the traditional approaches and preserves the characteristics pertinent to the signer's spelling. To evaluate the quality of our proposal, we use the following image similarity metrics: SSIM, SIFT, and Dice Coefficient. The qualitative and quantitative results show a significant improvement in comparison with the baseline system.
Submitted 28 May, 2020;
originally announced May 2020.
-
CNN Encoder to Reduce the Dimensionality of Data Image for Motion Planning
Authors:
Janderson Ferreira,
Agostinho A. F. Júnior,
Yves M. Galvão,
Bruno J. T. Fernandes,
Pablo Barros
Abstract:
Many real-world applications need path planning algorithms to solve tasks in different areas, such as social applications, autonomous cars, tracking activities, and, most importantly, motion planning. Although the use of path planning is sufficient in most motion planning scenarios, these algorithms represent potential bottlenecks in large environments with dynamic changes. To tackle this problem, the number of possible routes could be reduced to make it easier for path planning algorithms to find the shortest path with less effort. A traditional algorithm for path planning is A*, which uses a heuristic to work faster than other solutions. In this work, we propose a CNN encoder capable of eliminating useless routes for motion planning problems, then we combine the proposed neural network output with A*. To measure the efficiency of our solution, we propose a database with different scenarios of motion planning problems. The evaluated metric is the number of iterations needed to find the shortest path. A* alone was compared with the proposed CNN Encoder combined with A*. In all evaluated scenarios, our solution reduced the number of iterations by more than 60%.
Submitted 10 April, 2020;
originally announced April 2020.
-
Auto-tuning of dynamic scheduling applied to 3D reverse time migration on multicore systems
Authors:
Ítalo A. S. Assis,
João B. Fernandes,
Tiago Barros,
Samuel Xavier-de-Souza
Abstract:
Reverse time migration (RTM) is an algorithm widely used in the oil and gas industry to process seismic data. It is a computationally intensive task that suits well in parallel computers. Methods such as RTM can be parallelized in shared memory systems through scheduling iterations of parallel loops to threads. However, several aspects, such as memory size and hierarchy, number of cores, and input size, make optimal scheduling very challenging. In this paper, we introduce a run-time strategy to automatically tune the dynamic scheduling of parallel loops iterations in iterative applications, such as the RTM, in multicore systems. The proposed method aims to reduce the execution time of such applications. To find the optimal granularity, we propose a coupled simulated annealing (CSA) based auto-tuning strategy that adjusts the chunk size of work that OpenMP parallel loops assign dynamically to worker threads during the initialization of a 3D RTM application. Experiments performed with different computational systems and input sizes show that the proposed method is consistently better than the default OpenMP schedulers, static, auto, and guided, causing the application to be up to 33% faster. We show that the possible reason for this performance is the reduction of cache misses, mainly level L3, and low overhead, inferior to 2%. Having shown to be robust and scalable for the 3D RTM, the proposed method could also improve the performance of similar wave-based algorithms, such as full-waveform inversion (FWI) and other iterative applications.
Submitted 5 July, 2020; v1 submitted 16 May, 2019;
originally announced May 2019.
-
Typed Linear Algebra for Efficient Analytical Querying
Authors:
João M. Afonso,
Gabriel D. Fernandes,
João P. Fernandes,
Filipe Oliveira,
Bruno M. Ribeiro,
Rogério Pontes,
José N. Oliveira,
Alberto J. Proença
Abstract:
This paper uses typed linear algebra (LA) to represent data and perform analytical querying in a single, unified framework. The typed approach offers strong type checking (as in modern programming languages) and a diagrammatic way of expressing queries (paths in LA diagrams). A kernel of LA operators has been implemented so that paths extracted from LA diagrams can be executed. The approach is validated and evaluated using TPC-H benchmark queries as a reference. The performance of the LA-based approach is compared with that of popular database competitors (PostgreSQL and MySQL).
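To make the idea of type-checked matrix queries concrete, here is a minimal sketch. It is not the kernel described in the abstract: the `TypedMatrix` class, its `compose` method, and the toy `Status`/`Order` data are all hypothetical and chosen only for illustration. Each matrix is treated as an arrow from its column type to its row type, and composition (matrix product) is rejected unless the types line up, in the way a path in a diagram must connect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedMatrix:
    """Hypothetical typed matrix: an arrow rows <- cols, labels are plain strings."""
    rows: str
    cols: str
    data: tuple  # tuple of row tuples

    def compose(self, other: "TypedMatrix") -> "TypedMatrix":
        """Matrix product self . other, allowed only when the inner types agree."""
        if self.cols != other.rows:
            raise TypeError(
                f"cannot compose {self.rows}<-{self.cols} "
                f"with {other.rows}<-{other.cols}"
            )
        product = tuple(
            tuple(sum(a * b for a, b in zip(row, col)) for col in zip(*other.data))
            for row in self.data
        )
        return TypedMatrix(self.rows, other.cols, product)


# Toy data (assumed): four orders, each with a status and a quantity.
# status[i][j] == 1 when order j has status i (row 0 = open, row 1 = closed).
status = TypedMatrix("Status", "Order", ((1, 0, 1, 0), (0, 1, 0, 1)))
# Quantities as a column vector Order <- One.
qty = TypedMatrix("Order", "One", ((5,), (3,), (2,), (7,)))

# "Total quantity per status" is the path Status <- Order <- One.
total = status.compose(qty)
print(total.rows, total.cols, total.data)  # Status One ((7,), (10,))
```

Composing in the wrong order, `qty.compose(status)`, raises `TypeError` because `One` and `Status` do not match, which is the kind of error a typed approach can catch before any data is touched.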
Submitted 3 September, 2018;
originally announced September 2018.
-
Fusarium Damaged Kernels Detection Using Transfer Learning on Deep Neural Network Architecture
Authors:
Márcio Nicolau,
Márcia Barrocas Moreira Pimentel,
Casiane Salete Tibola,
José Mauricio Cunha Fernandes,
Willingthon Pavan
Abstract:
The present work shows the application of transfer learning to a pre-trained deep neural network (DNN), using a small image dataset ($\approx$ 12,000) on a single workstation with an NVIDIA GPU card. Training takes up to 1 hour and achieves an overall average accuracy of $94.7\%$. The DNN presents a misclassification rate of $20\%$ on an external test dataset. The accuracy of the proposed methodology is comparable to that of approaches using the HSI methodology $(81\%-91\%)$ for the same task, with the advantage of not depending on special equipment to classify wheat kernels for FHB symptoms.
Submitted 31 January, 2018;
originally announced February 2018.
-
The Influence of the Java Collection Framework on Overall Energy Consumption
Authors:
Rui Pereira,
Marco Couto,
Jácome Cunha,
João Paulo Fernandes,
João Saraiva
Abstract:
This paper presents a detailed study of the energy consumption of the different Java Collections Framework (JCF) implementations. For each method of an implementation in this framework, we present its energy consumption when handling different amounts of data. Knowing the greenest methods for each implementation, we present an energy optimization approach for Java programs: based on the calls to JCF methods in the source code of a program, we select the greenest implementation. Finally, we present preliminary results of optimizing a set of Java programs, where we obtained 6.2% energy savings.
Submitted 2 February, 2016;
originally announced February 2016.
-
Querying Spreadsheets: An Empirical Study
Authors:
Jácome Cunha,
João Paulo Fernandes,
Rui Pereira,
João Saraiva
Abstract:
One of the most important assets of any company is being able to easily access information on itself and on its business. Along these lines, it has been observed that this important information is often stored in one of the millions of spreadsheets created every year, due to the simplicity of using and manipulating such an artifact. Unfortunately, in many cases it is quite difficult to retrieve the intended information from a spreadsheet: information is often stored in a huge unstructured matrix, with no care for readability or comprehensibility. In an attempt to aid users in the task of extracting information from a spreadsheet, researchers have been working on models, languages and tools to query them. In this paper we evaluate such proposals, assessing their use to query spreadsheets. We investigate the use of the Google Query Function, textual model-driven querying, and visual model-driven querying. To compare these querying approaches we present an empirical study whose results show that end-users' productivity increases when using model-driven queries, especially in their visual representation.
Submitted 27 February, 2015;
originally announced February 2015.
-
An Empirical Study on End-users Productivity Using Model-based Spreadsheets
Authors:
Laura Beckwith,
Jácome Cunha,
João Paulo Fernandes,
João Saraiva
Abstract:
Spreadsheets are widely used, and studies have shown that most end-user spreadsheets contain nontrivial errors. To improve end-users' productivity, recent research proposes the use of a model-driven engineering approach to spreadsheets. In this paper we conduct the first systematic empirical study to assess the effectiveness and efficiency of this approach. A set of spreadsheet end users worked with two different model-based spreadsheets, and we present and analyze the results here.
Submitted 18 December, 2011;
originally announced December 2011.