-
Naming the Pain in Machine Learning-Enabled Systems Engineering
Authors:
Marcos Kalinowski,
Daniel Mendez,
Görkem Giray,
Antonio Pedro Santos Alves,
Kelly Azevedo,
Tatiana Escovedo,
Hugo Villamizar,
Helio Lopes,
Teresa Baldassarre,
Stefan Wagner,
Stefan Biffl,
Jürgen Musil,
Michael Felderer,
Niklas Lavesson,
Tony Gorschek
Abstract:
Context: Machine learning (ML)-enabled systems are being increasingly adopted by companies aiming to enhance their products and operational processes. Objective: This paper aims to deliver a comprehensive overview of the current status quo of engineering ML-enabled systems and lay the foundation to steer practically relevant and problem-driven academic research. Method: We conducted an internation…
▽ More
Context: Machine learning (ML)-enabled systems are being increasingly adopted by companies aiming to enhance their products and operational processes. Objective: This paper aims to deliver a comprehensive overview of the current status quo of engineering ML-enabled systems and lay the foundation to steer practically relevant and problem-driven academic research. Method: We conducted an international survey to collect insights from practitioners on the current practices and problems in engineering ML-enabled systems. We received 188 complete responses from 25 countries. We conducted quantitative statistical analyses on contemporary practices using bootstrapping with confidence intervals and qualitative analyses on the reported problems using open and axial coding procedures. Results: Our survey results reinforce and extend existing empirical evidence on engineering ML-enabled systems, providing additional insights into typical ML-enabled systems project contexts, the perceived relevance and complexity of ML life cycle phases, and current practices related to problem understanding, model deployment, and model monitoring. Furthermore, the qualitative analysis provides a detailed map of the problems practitioners face within each ML life cycle phase and the problems causing overall project failure. Conclusions: The results contribute to a better understanding of the status quo and problems in practical environments. We advocate for the further adaptation and dissemination of software engineering practices to enhance the engineering of ML-enabled systems.
△ Less
Submitted 20 May, 2024;
originally announced June 2024.
-
Quantum Software Ecosystem Design
Authors:
Achim Basermann,
Michael Epping,
Benedikt Fauseweh,
Michael Felderer,
Elisabeth Lobe,
Melven Röhrig-Zöllner,
Gary Schmiedinghoff,
Peter K. Schuhmacher,
Yoshinta Setyawati,
Alexander Weinert
Abstract:
The rapid advancements in quantum computing necessitate a scientific and rigorous approach to the construction of a corresponding software ecosystem, a topic underexplored and primed for systematic investigation. This chapter takes an important step in this direction: It presents scientific considerations essential for building a quantum software ecosystem that makes quantum computing available fo…
▽ More
The rapid advancements in quantum computing necessitate a scientific and rigorous approach to the construction of a corresponding software ecosystem, a topic underexplored and primed for systematic investigation. This chapter takes an important step in this direction: It presents scientific considerations essential for building a quantum software ecosystem that makes quantum computing available for scientific and industrial problem solving. Central to this discourse is the concept of hardware-software co-design, which fosters a bidirectional feedback loop from the application layer at the top of the software stack down to the hardware. This approach begins with compilers and low-level software that are specifically designed to align with the unique specifications and constraints of the quantum processor, proceeds with algorithms developed with a clear understanding of underlying hardware and computational model features, and extends to applications that effectively leverage the capabilities to achieve a quantum advantage. We analyze the ecosystem from two critical perspectives: the conceptual view, focusing on theoretical foundations, and the technical infrastructure, addressing practical implementations around real quantum devices necessary for a functional ecosystem. This approach ensures that the focus is towards promising applications with optimized algorithm-circuit synergy, while ensuring a user-friendly design, an effective data management and an overall orchestration. Our chapter thus offers a guide to the essential concepts and practical strategies necessary for developing a scientifically grounded quantum software ecosystem.
△ Less
Submitted 21 May, 2024;
originally announced May 2024.
-
Toward Research Software Categories
Authors:
Wilhelm Hasselbring,
Stephan Druskat,
Jan Bernoth,
Philine Betker,
Michael Felderer,
Stephan Ferenz,
Anna-Lena Lamprecht,
Jan Linxweiler,
Bernhard Rumpe
Abstract:
Research software has been categorized in different contexts to serve different goals. We start with a look at what research software is, before we discuss the purpose of research software categories. We propose a multi-dimensional categorization of research software. We present a template for characterizing such categories. As selected dimensions, we present our proposed role-based, developer-bas…
▽ More
Research software has been categorized in different contexts to serve different goals. We start with a look at what research software is, before we discuss the purpose of research software categories. We propose a multi-dimensional categorization of research software. We present a template for characterizing such categories. As selected dimensions, we present our proposed role-based, developer-based, and maturity-based categories. Since our work has been inspired by various previous efforts to categorize research software, we discuss them as related works. We characterize all these categories via the previously introduced template, to enable a systematic comparison.
△ Less
Submitted 22 April, 2024;
originally announced April 2024.
-
ML-Enabled Systems Model Deployment and Monitoring: Status Quo and Problems
Authors:
Eduardo Zimelewicz,
Marcos Kalinowski,
Daniel Mendez,
Görkem Giray,
Antonio Pedro Santos Alves,
Niklas Lavesson,
Kelly Azevedo,
Hugo Villamizar,
Tatiana Escovedo,
Helio Lopes,
Stefan Biffl,
Juergen Musil,
Michael Felderer,
Stefan Wagner,
Teresa Baldassarre,
Tony Gorschek
Abstract:
[Context] Systems incorporating Machine Learning (ML) models, often called ML-enabled systems, have become commonplace. However, empirical evidence on how ML-enabled systems are engineered in practice is still limited, especially for activities surrounding ML model dissemination. [Goal] We investigate contemporary industrial practices and problems related to ML model dissemination, focusing on the…
▽ More
[Context] Systems incorporating Machine Learning (ML) models, often called ML-enabled systems, have become commonplace. However, empirical evidence on how ML-enabled systems are engineered in practice is still limited, especially for activities surrounding ML model dissemination. [Goal] We investigate contemporary industrial practices and problems related to ML model dissemination, focusing on the model deployment and the monitoring of ML life cycle phases. [Method] We conducted an international survey to gather practitioner insights on how ML-enabled systems are engineered. We gathered a total of 188 complete responses from 25 countries. We analyze the status quo and problems reported for the model deployment and monitoring phases. We analyzed contemporary practices using bootstrapping with confidence intervals and conducted qualitative analyses on the reported problems applying open and axial coding procedures. [Results] Practitioners perceive the model deployment and monitoring phases as relevant and difficult. With respect to model deployment, models are typically deployed as separate services, with limited adoption of MLOps principles. Reported problems include difficulties in designing the architecture of the infrastructure for production deployment and legacy application integration. Concerning model monitoring, many models in production are not monitored. The main monitored aspects are inputs, outputs, and decisions. Reported problems involve the absence of monitoring practices, the need to create custom monitoring tools, and the selection of suitable metrics. [Conclusion] Our results help provide a better understanding of the adopted practices and problems in practice and support guiding ML deployment and monitoring research in a problem-driven manner.
△ Less
Submitted 7 February, 2024;
originally announced February 2024.
-
On the Role of Font Formats in Building Efficient Web Applications
Authors:
Benedikt Dornauer,
Wolfgang Vigl,
Michael Felderer
Abstract:
The success of a web application is closely linked to its performance, which positively impacts user satisfaction and contributes to energy-saving efforts. Among the various optimization techniques, one specific subject focuses on improving the utilization of web fonts. This study investigates the impact of different font formats on client-side resource consumption, such as CPU, memory, load time,…
▽ More
The success of a web application is closely linked to its performance, which positively impacts user satisfaction and contributes to energy-saving efforts. Among the various optimization techniques, one specific subject focuses on improving the utilization of web fonts. This study investigates the impact of different font formats on client-side resource consumption, such as CPU, memory, load time, and energy. In a controlled experiment, we evaluate performance metrics using the four font formats: OTF, TTF, WOFF, and WOFF2. The results of the study show that there are significant differences between all pair-wise format comparisons regarding all performance metrics. Overall, WOFF2 performs best, except in terms of memory allocation. Through the study and examination of literature, this research contributes (1) an overview of methodologies to enhance web performance through font utilization, (2) a specific exploration of the four prevalent font formats in an experimental setup, and (3) practical recommendations for scientific professionals and practitioners.
△ Less
Submitted 10 October, 2023;
originally announced October 2023.
-
Status Quo and Problems of Requirements Engineering for Machine Learning: Results from an International Survey
Authors:
Antonio Pedro Santos Alves,
Marcos Kalinowski,
Görkem Giray,
Daniel Mendez,
Niklas Lavesson,
Kelly Azevedo,
Hugo Villamizar,
Tatiana Escovedo,
Helio Lopes,
Stefan Biffl,
Jürgen Musil,
Michael Felderer,
Stefan Wagner,
Teresa Baldassarre,
Tony Gorschek
Abstract:
Systems that use Machine Learning (ML) have become commonplace for companies that want to improve their products and processes. Literature suggests that Requirements Engineering (RE) can help address many problems when engineering ML-enabled systems. However, the state of empirical evidence on how RE is applied in practice in the context of ML-enabled systems is mainly dominated by isolated case s…
▽ More
Systems that use Machine Learning (ML) have become commonplace for companies that want to improve their products and processes. Literature suggests that Requirements Engineering (RE) can help address many problems when engineering ML-enabled systems. However, the state of empirical evidence on how RE is applied in practice in the context of ML-enabled systems is mainly dominated by isolated case studies with limited generalizability. We conducted an international survey to gather practitioner insights into the status quo and problems of RE in ML-enabled systems. We gathered 188 complete responses from 25 countries. We conducted quantitative statistical analyses on contemporary practices using bootstrapping with confidence intervals and qualitative analyses on the reported problems involving open and axial coding procedures. We found significant differences in RE practices within ML projects. For instance, (i) RE-related activities are mostly conducted by project leaders and data scientists, (ii) the prevalent requirements documentation format concerns interactive Notebooks, (iii) the main focus of non-functional requirements includes data quality, model reliability, and model explainability, and (iv) main challenges include managing customer expectations and aligning requirements with data. The qualitative analyses revealed that practitioners face problems related to lack of business domain understanding, unclear goals and requirements, low customer engagement, and communication issues. These results help to provide a better understanding of the adopted practices and of which problems exist in practical environments. We put forward the need to adapt further and disseminate RE-related practices for engineering ML-enabled systems.
△ Less
Submitted 10 October, 2023;
originally announced October 2023.
-
Web Image Formats: Assessment of Their Real-World-Usage and Performance across Popular Web Browsers
Authors:
Benedikt Dornauer,
Michael Felderer
Abstract:
In 2023, images on the web make up 41% of transmitted data, significantly impacting the performance of web apps. Fortunately, image formats like WEBP and AVIF could offer advanced compression and faster page loading, but may face performance disparities across browsers. Therefore, we conducted performance evaluations on five major browsers - Chrome, Edge, Safari, Opera, and Firefox - while compari…
▽ More
In 2023, images on the web make up 41% of transmitted data, significantly impacting the performance of web apps. Fortunately, image formats like WEBP and AVIF could offer advanced compression and faster page loading, but may face performance disparities across browsers. Therefore, we conducted performance evaluations on five major browsers - Chrome, Edge, Safari, Opera, and Firefox - while comparing four image formats. The results indicate that the newer formats exhibited notable performance enhancements across all browsers, leading to shorter loading times. Compared to the compressed JPEG format, WEBP and AVIF improved the Page Load Time by 21% and 15%, respectively. However, web scraping revealed that JPEG and PNG still dominate web image choices, with WEBP at 4% as the most used new format. Through the web scraping and web performance evaluation, this research serves to (1) explore image format preferences in web applications and analyze distribution and characteristics across frequently-visited sites in 2023 and (2) assess the performance impact of distinct web image formats on application load times across popular web browsers.
△ Less
Submitted 1 October, 2023;
originally announced October 2023.
-
Streamlining Attack Tree Generation: A Fragment-Based Approach
Authors:
Irdin Pekaric,
Markus Frick,
Jubril Gbolahan Adigun,
Raffaela Groner,
Thomas Witte,
Alexander Raschke,
Michael Felderer,
Matthias Tichy
Abstract:
Attack graphs are a tool for analyzing security vulnerabilities that capture different and prospective attacks on a system. As a threat modeling tool, it shows possible paths that an attacker can exploit to achieve a particular goal. However, due to the large number of vulnerabilities that are published on a daily basis, they have the potential to rapidly expand in size. Consequently, this necessi…
▽ More
Attack graphs are a tool for analyzing security vulnerabilities that capture different and prospective attacks on a system. As a threat modeling tool, it shows possible paths that an attacker can exploit to achieve a particular goal. However, due to the large number of vulnerabilities that are published on a daily basis, they have the potential to rapidly expand in size. Consequently, this necessitates a significant amount of resources to generate attack graphs. In addition, generating composited attack models for complex systems such as self-adaptive or AI is very difficult due to their nature to continuously change. In this paper, we present a novel fragment-based attack graph generation approach that utilizes information from publicly available information security databases. Furthermore, we also propose a domain-specific language for attack modeling, which we employ in the proposed attack graph generation approach. Finally, we present a demonstrator example showcasing the attack generator's capability to replicate a verified attack chain, as previously confirmed by security experts.
△ Less
Submitted 1 October, 2023;
originally announced October 2023.
-
Model-Based Generation of Attack-Fault Trees
Authors:
Raffaela Groner,
Thomas Witte,
Alexander Raschke,
Sophie Hirn,
Irdin Pekaric,
Markus Frick,
Matthias Tichy,
Michael Felderer
Abstract:
Joint safety and security analysis of cyber-physical systems is a necessary step to correctly capture inter-dependencies between these properties. Attack-Fault Trees represent a combination of dynamic Fault Trees and Attack Trees and can be used to model and model-check a holistic view on both safety and security. Manually creating a complete AFT for the whole system is, however, a daunting task.…
▽ More
Joint safety and security analysis of cyber-physical systems is a necessary step to correctly capture inter-dependencies between these properties. Attack-Fault Trees represent a combination of dynamic Fault Trees and Attack Trees and can be used to model and model-check a holistic view on both safety and security. Manually creating a complete AFT for the whole system is, however, a daunting task. It needs to span multiple abstraction layers, e.g., abstract application architecture and data flow as well as system and library dependencies that are affected by various vulnerabilities. We present an AFT generation tool-chain that facilitates this task using partial Fault and Attack Trees that are either manually created or mined from vulnerability databases. We semi-automatically create two system models that provide the necessary information to automatically combine these partial Fault and Attack Trees into complete AFTs using graph transformation rules.
△ Less
Submitted 18 September, 2023;
originally announced September 2023.
-
Towards Model Co-evolution Across Self-Adaptation Steps for Combined Safety and Security Analysis
Authors:
Thomas Witte,
Raffaela Groner,
Alexander Raschke,
Matthias Tichy,
Irdin Pekaric,
Michael Felderer
Abstract:
Self-adaptive systems offer several attack surfaces due to the communication via different channels and the different sensors required to observe the environment. Often, attacks cause safety to be compromised as well, making it necessary to consider these two aspects together. Furthermore, the approaches currently used for safety and security analysis do not sufficiently take into account the inte…
▽ More
Self-adaptive systems offer several attack surfaces due to the communication via different channels and the different sensors required to observe the environment. Often, attacks cause safety to be compromised as well, making it necessary to consider these two aspects together. Furthermore, the approaches currently used for safety and security analysis do not sufficiently take into account the intermediate steps of an adaptation. Current work in this area ignores the fact that a self-adaptive system also reveals possible vulnerabilities (even if only temporarily) during the adaptation. To address this issue, we propose a modeling approach that takes into account the different relevant aspects of a system, its adaptation process, as well as safety hazards and security attacks. We present several models that describe different aspects of a self-adaptive system and we outline our idea of how these models can then be combined into an Attack-Fault Tree. This allows modeling aspects of the system on different levels of abstraction and co-evolve the models using transformations according to the adaptation of the system. Finally, analyses can then be performed as usual on the resulting Attack-Fault Tree.
△ Less
Submitted 18 September, 2023;
originally announced September 2023.
-
VULNERLIZER: Cross-analysis Between Vulnerabilities and Software Libraries
Authors:
Irdin Pekaric,
Michael Felderer,
Philipp Steinmüller
Abstract:
The identification of vulnerabilities is a continuous challenge in software projects. This is due to the evolution of methods that attackers employ as well as the constant updates to the software, which reveal additional issues. As a result, new and innovative approaches for the identification of vulnerable software are needed. In this paper, we present VULNERLIZER, which is a novel framework for…
▽ More
The identification of vulnerabilities is a continuous challenge in software projects. This is due to the evolution of methods that attackers employ as well as the constant updates to the software, which reveal additional issues. As a result, new and innovative approaches for the identification of vulnerable software are needed. In this paper, we present VULNERLIZER, which is a novel framework for cross-analysis between vulnerabilities and software libraries. It uses CVE and software library data together with clustering algorithms to generate links between vulnerabilities and libraries. In addition, the training of the model is conducted in order to reevaluate the generated associations. This is achieved by updating the assigned weights. Finally, the approach is then evaluated by making the predictions using the CVE data from the test set. The results show that the VULNERLIZER has a great potential in being able to predict future vulnerable libraries based on an initial input CVE entry or a software library. The trained model reaches a prediction accuracy of 75% or higher.
△ Less
Submitted 18 September, 2023;
originally announced September 2023.
-
Simulation of Sensor Spoofing Attacks on Unmanned Aerial Vehicles Using the Gazebo Simulator
Authors:
Irdin Pekaric,
David Arnold,
Michael Felderer
Abstract:
Conducting safety simulations in various simulators, such as the Gazebo simulator, became a very popular means of testing vehicles against potential safety risks (i.e. crashes). However, this was not the case with security testing. Performing security testing in a simulator is very difficult because security attacks are performed on a different abstraction level. In addition, the attacks themselve…
▽ More
Conducting safety simulations in various simulators, such as the Gazebo simulator, became a very popular means of testing vehicles against potential safety risks (i.e. crashes). However, this was not the case with security testing. Performing security testing in a simulator is very difficult because security attacks are performed on a different abstraction level. In addition, the attacks themselves are becoming more sophisticated, which directly contributes to the difficulty of executing them in a simulator. In this paper, we attempt to tackle the aforementioned gap by investigating possible attacks that can be simulated, and then performing their simulations. The presented approach shows that attacks targeting the LiDAR and GPS components of unmanned aerial vehicles can be simulated. This is achieved by exploiting vulnerabilities of the ROS and MAVLink protocol and injecting malicious processes into an application. As a result, messages with arbitrary values can be spoofed to the corresponding topics, which allows attackers to update relevant parameters and cause a potential crash of a vehicle. This was tested in multiple scenarios, thereby proving that it is indeed possible to simulate certain attack types, such as spoofing and jamming.
△ Less
Submitted 18 September, 2023;
originally announced September 2023.
-
Applying Security Testing Techniques to Automotive Engineering
Authors:
Irdin Pekaric,
Clemens Sauerwein,
Michael Felderer
Abstract:
The openness of modern IT systems and their permanent change make it challenging to keep these systems secure. A combination of regression and security testing called security regression testing, which ensures that changes made to a system do not harm its security, are therefore of high significance and the interest in such approaches has steadily increased. In this article we present a systematic…
▽ More
The openness of modern IT systems and their permanent change make it challenging to keep these systems secure. A combination of regression and security testing called security regression testing, which ensures that changes made to a system do not harm its security, are therefore of high significance and the interest in such approaches has steadily increased. In this article we present a systematic classification of available security regression testing approaches based on a solid study of background and related work to sketch which parts of the research area seem to be well understood and evaluated, and which ones require further research. For this purpose we extract approaches relevant to security regression testing from computer science digital libraries based on a rigorous search and selection strategy. Then, we provide a classification of these according to security regression approach criteria: abstraction level, security issue, regression testing techniques, and tool support, as well as evaluation criteria, for instance evaluated system, maturity of the system, and evaluation measures. From the resulting classification we derive observations with regard to the abstraction level, regression testing techniques, tool support as well as evaluation, and finally identify several potential directions of future research.
△ Less
Submitted 18 September, 2023;
originally announced September 2023.
-
Data Pipeline Quality: Influencing Factors, Root Causes of Data-related Issues, and Processing Problem Areas for Developers
Authors:
Harald Foidl,
Valentina Golendukhina,
Rudolf Ramler,
Michael Felderer
Abstract:
Data pipelines are an integral part of various modern data-driven systems. However, despite their importance, they are often unreliable and deliver poor-quality data. A critical step toward improving this situation is a solid understanding of the aspects contributing to the quality of data pipelines. Therefore, this article first introduces a taxonomy of 41 factors that influence the ability of da…
▽ More
Data pipelines are an integral part of various modern data-driven systems. However, despite their importance, they are often unreliable and deliver poor-quality data. A critical step toward improving this situation is a solid understanding of the aspects contributing to the quality of data pipelines. Therefore, this article first introduces a taxonomy of 41 factors that influence the ability of data pipelines to provide quality data. The taxonomy is based on a multivocal literature review and validated by eight interviews with experts from the data engineering domain. Data, infrastructure, life cycle management, development & deployment, and processing were found to be the main influencing themes. Second, we investigate the root causes of data-related issues, their location in data pipelines, and the main topics of data pipeline processing issues for developers by mining GitHub projects and Stack Overflow posts. We found data-related issues to be primarily caused by incorrect data types (33%), mainly occurring in the data cleaning stage of pipelines (35%). Data integration and ingestion tasks were found to be the most asked topics of developers, accounting for nearly half (47%) of all questions. Compatibility issues were found to be a separate problem area in addition to issues corresponding to the usual data pipeline processing areas (i.e., data loading, ingestion, integration, cleaning, and transformation). These findings suggest that future research efforts should focus on analyzing compatibility and data type issues in more depth and assisting developers in data integration and ingestion tasks. The proposed taxonomy is valuable to practitioners in the context of quality assurance activities and fosters future research into data pipeline quality.
△ Less
Submitted 13 September, 2023;
originally announced September 2023.
-
Summary of the 4th International Workshop on Requirements Engineering and Testing (RET 2017)
Authors:
Markus Borg,
Elizabeth Bjarnason,
Michael Unterkalmsteiner,
Tingting Yu,
Gregory Gay,
Michael Felderer
Abstract:
The RET (Requirements Engineering and Testing) workshop series provides a meeting point for researchers and practitioners from the two separate fields of Requirements Engineering (RE) and Testing. The long term aim is to build a community and a body of knowledge within the intersection of RE and Testing, i.e., RET. The 4th workshop was co-located with the 25th International Requirements Engineerin…
▽ More
The RET (Requirements Engineering and Testing) workshop series provides a meeting point for researchers and practitioners from the two separate fields of Requirements Engineering (RE) and Testing. The long term aim is to build a community and a body of knowledge within the intersection of RE and Testing, i.e., RET. The 4th workshop was co-located with the 25th International Requirements Engineering Conference (RE'17) in Lisbon, Portugal and attracted about 20 participants. In line with the previous workshop instances, RET 2017 o ered an interactive setting with a keynote, an invited talk, paper presentations, and a concluding hands-on exercise.
△ Less
Submitted 29 August, 2023;
originally announced August 2023.
-
Summary of the 3rd International Workshop on Requirements Engineering and Testing
Authors:
Michael Unterkalmsteiner,
Gregory Gay,
Michael Felderer,
Elizabeth Bjarnason,
Markus Borg,
Mirko Morandini
Abstract:
The RET (Requirements Engineering and Testing) workshop series provides a meeting point for researchers and practitioners from the two separate fields of Requirements Engineering (RE) and Testing. The goal is to improve the connection and alignment of these two areas through an exchange of ideas, challenges, practices, experiences and results. The long term aim is to build a community and a body o…
▽ More
The RET (Requirements Engineering and Testing) workshop series provides a meeting point for researchers and practitioners from the two separate fields of Requirements Engineering (RE) and Testing. The goal is to improve the connection and alignment of these two areas through an exchange of ideas, challenges, practices, experiences and results. The long term aim is to build a community and a body of knowledge within the intersection of RE and Testing, i.e. RET. The 3rd workshop was held in co-location with REFSQ 2016 in Gothenburg, Sweden. The workshop continued in the same interactive vein as the predecessors and included a keynote, paper presentations with ample time for discussions, and panels. In order to create an RET knowledge base, this crosscutting area elicits contributions from both RE and Testing, and from both researchers and practitioners. A range of papers were presented from short positions papers to full research papers that cover connections between the two fields.
△ Less
Submitted 18 August, 2023;
originally announced August 2023.
-
Techniques for Improving the Energy Efficiency of Mobile Apps: A Taxonomy and Systematic Literature Review
Authors:
Stefan Huber,
Tobias Lorey,
Michael Felderer
Abstract:
Building energy efficient software is an increasingly important task for mobile developers. However, a cumulative body of knowledge of techniques that support this goal does not exist. We conduct a systematic literature review to gather information on existing techniques that allow developers to increase energy efficiency in mobile apps. Based on a synthesis of the 91 included primary studies, we…
▽ More
Building energy efficient software is an increasingly important task for mobile developers. However, a cumulative body of knowledge of techniques that support this goal does not exist. We conduct a systematic literature review to gather information on existing techniques that allow developers to increase energy efficiency in mobile apps. Based on a synthesis of the 91 included primary studies, we propose a taxonomy of techniques for improving the energy efficiency in mobile apps. The taxonomy includes seven main categories of techniques and serves as a collection of available methods for developers and as a reference guide for software testers when performing energy efficiency testing by the means of benchmark tests.
△ Less
Submitted 16 August, 2023;
originally announced August 2023.
-
Summary of 2nd International Workshop on Requirements Engineering and Testing (RET)
Authors:
Elizabeth Bjarnason,
Mirko Morandini,
Markus Borg,
Michael Unterkalmsteiner,
Michael Felderer,
Matthew Staats
Abstract:
The RET (Requirements Engineering and Testing) workshop series provides a meeting point for researchers and practitioners from the two separate fields of Requirements Engineering (RE) and Testing. The goal is to improve the connection and alignment of these two areas through an exchange of ideas, challenges, practices, experiences and results. The long term aim is to build a community and a body o…
▽ More
The RET (Requirements Engineering and Testing) workshop series provides a meeting point for researchers and practitioners from the two separate fields of Requirements Engineering (RE) and Testing. The goal is to improve the connection and alignment of these two areas through an exchange of ideas, challenges, practices, experiences and results. The long term aim is to build a community and a body of knowledge within the intersection of RE and Testing, i.e. RET. The 2nd workshop was held in co-location with ICSE 2015 in Florence, Italy. The workshop continued in the same interactive vein as the 1st one and included a keynote, paper presentations with ample time for discussions, and a group exercise. For true impact and relevance this cross-cutting area requires contribution from both RE and Testing, and from both researchers and practitioners. A range of papers were presented from short experience papers to full research papers that cover connections between the two fields. One of the main outputs of the 2nd workshop was a categorization of the presented workshop papers according to an initial definition of the area of RET which identifies the aspects RE, Testing and coordination effect.
△ Less
Submitted 2 August, 2023;
originally announced August 2023.
-
SoHist: A Tool for Managing Technical Debt through Retro Perspective Code Analysis
Authors:
Benedikt Dornauer,
Michael Felderer,
Johannes Weinzerl,
Mircea-Cristian Racasan,
Martin Hess
Abstract:
Technical debt is often the result of Short Run decisions made during code development, which can lead to long-term maintenance costs and risks. Hence, evaluating the progression of a project and understanding related code quality aspects is essential.
Fortunately, the prioritization process for addressing technical debt can be expedited with code analysis tools like the established SonarQube. U…
▽ More
Technical debt is often the result of Short Run decisions made during code development, which can lead to long-term maintenance costs and risks. Hence, evaluating the progression of a project and understanding related code quality aspects is essential.
Fortunately, the prioritization process for addressing technical debt can be expedited with code analysis tools like the established SonarQube. Unfortunately, we experienced some limitations with this tool and have had some requirements from the industry that were not yet addressed.
Through this experience report and the analysis of scientific papers, this work contributes: (1) a reassessment of technical debt within the industry, (2) considers the benefits of employing SonarQube as well as its limitations when evaluating and prioritizing technical debt, (3) introduces a novel tool named SoHist which addresses these limitations and offers additional features for the assessment and prioritization of technical debt, and (4) exemplifies the usage of this tool in two industrial settings in the ITEA3 SmartDelta project.
△ Less
Submitted 27 April, 2023;
originally announced April 2023.
-
Energy-Saving Strategies for Mobile Web Apps and their Measurement: Results from a Decade of Research
Authors:
Benedikt Dornauer,
Michael Felderer
Abstract:
In 2022, over half of the web traffic was accessed through mobile devices. By reducing the energy consumption of mobile web apps, we can not only extend the battery life of our devices, but also make a significant contribution to energy conservation efforts. For example, if we could save only 5% of the energy used by web apps, we estimate that it would be enough to shut down one of the nuclear rea…
▽ More
In 2022, over half of the web traffic was accessed through mobile devices. By reducing the energy consumption of mobile web apps, we can not only extend the battery life of our devices, but also make a significant contribution to energy conservation efforts. For example, if we could save only 5% of the energy used by web apps, we estimate that it would be enough to shut down one of the nuclear reactors in Fukushima. This paper presents a comprehensive overview of energy-saving experiments and related approaches for mobile web apps, relevant for researchers and practitioners. To achieve this objective, we conducted a systematic literature review and identified 44 primary studies for inclusion. Through the mapping and analysis of scientific papers, this work contributes: (1) an overview of the energy-draining aspects of mobile web apps, (2) a comprehensive description of the methodology used for the energy-saving experiments, and (3) a categorization and synthesis of various energy-saving approaches.
△ Less
Submitted 27 April, 2023; v1 submitted 4 April, 2023;
originally announced April 2023.
-
The Pipeline for the Continuous Development of Artificial Intelligence Models -- Current State of Research and Practice
Authors:
Monika Steidl,
Michael Felderer,
Rudolf Ramler
Abstract:
Companies struggle to continuously develop and deploy AI models to complex production systems due to AI characteristics while assuring quality. To ease the development process, continuous pipelines for AI have become an active research area where consolidated and in-depth analysis regarding the terminology, triggers, tasks, and challenges is required. This paper includes a Multivocal Literature Re…
▽ More
Companies struggle to continuously develop and deploy AI models to complex production systems due to AI characteristics while assuring quality. To ease the development process, continuous pipelines for AI have become an active research area where consolidated and in-depth analysis regarding the terminology, triggers, tasks, and challenges is required. This paper includes a Multivocal Literature Review where we consolidated 151 relevant formal and informal sources. In addition, nine-semi structured interviews with participants from academia and industry verified and extended the obtained information. Based on these sources, this paper provides and compares terminologies for DevOps and CI/CD for AI, MLOps, (end-to-end) lifecycle management, and CD4ML. Furthermore, the paper provides an aggregated list of potential triggers for reiterating the pipeline, such as alert systems or schedules. In addition, this work uses a taxonomy creation strategy to present a consolidated pipeline comprising tasks regarding the continuous development of AI. This pipeline consists of four stages: Data Handling, Model Learning, Software Development and System Operations. Moreover, we map challenges regarding pipeline implementation, adaption, and usage for the continuous development of AI to these four stages.
△ Less
Submitted 21 January, 2023;
originally announced January 2023.
-
Metamorphic Testing in Autonomous System Simulations
Authors:
Jubril Gbolahan Adigun,
Linus Eisele,
Michael Felderer
Abstract:
Metamorphic testing has proven to be effective for test case generation and fault detection in many domains. It is a software testing strategy that uses certain relations between input-output pairs of a program, referred to as metamorphic relations. This approach is relevant in the autonomous systems domain since it helps in cases where the outcome of a given test input may be difficult to determi…
▽ More
Metamorphic testing has proven to be effective for test case generation and fault detection in many domains. It is a software testing strategy that uses certain relations between input-output pairs of a program, referred to as metamorphic relations. This approach is relevant in the autonomous systems domain since it helps in cases where the outcome of a given test input may be difficult to determine. In this paper therefore, we provide an overview of metamorphic testing as well as an implementation in the autonomous systems domain. We implement an obstacle detection and avoidance task in autonomous drones utilising the GNC API alongside a simulation in Gazebo. Particularly, we describe properties and best practices that are crucial for the development of effective metamorphic relations. We also demonstrate two metamorphic relations for metamorphic testing of single and more than one drones, respectively. Our relations reveal several properties and some weak spots of both the implementation and the avoidance algorithm in the light of metamorphic testing. The results indicate that metamorphic testing has great potential in the autonomous systems domain and should be considered for quality assurance in this field.
△ Less
Submitted 22 September, 2022;
originally announced September 2022.
-
Guiding the retraining of convolutional neural networks against adversarial inputs
Authors:
Francisco Durán López,
Silverio Martínez-Fernández,
Michael Felderer,
Xavier Franch
Abstract:
Background: When using deep learning models, there are many possible vulnerabilities and some of the most worrying are the adversarial inputs, which can cause wrong decisions with minor perturbations. Therefore, it becomes necessary to retrain these models against adversarial inputs, as part of the software testing process addressing the vulnerability to these inputs. Furthermore, for an energy ef…
▽ More
Background: When using deep learning models, there are many possible vulnerabilities and some of the most worrying are the adversarial inputs, which can cause wrong decisions with minor perturbations. Therefore, it becomes necessary to retrain these models against adversarial inputs, as part of the software testing process addressing the vulnerability to these inputs. Furthermore, for an energy efficient testing and retraining, data scientists need support on which are the best guidance metrics and optimal dataset configurations.
Aims: We examined four guidance metrics for retraining convolutional neural networks and three retraining configurations. Our goal is to improve the models against adversarial inputs regarding accuracy, resource utilization and time from the point of view of a data scientist in the context of image classification.
Method: We conducted an empirical study in two datasets for image classification. We explore: (a) the accuracy, resource utilization and time of retraining convolutional neural networks by ordering new training set by four different guidance metrics (neuron coverage, likelihood-based surprise adequacy, distance-based surprise adequacy and random), (b) the accuracy and resource utilization of retraining convolutional neural networks with three different configurations (from scratch and augmented dataset, using weights and augmented dataset, and using weights and only adversarial inputs).
Results: We reveal that retraining with adversarial inputs from original weights and by ordering with surprise adequacy metrics gives the best model w.r.t. the used metrics.
Conclusions: Although more studies are necessary, we recommend data scientists to use the above configuration and metrics to deal with the vulnerability to adversarial inputs of deep learning models, as they can improve their models against adversarial inputs without using many inputs.
△ Less
Submitted 12 July, 2022; v1 submitted 8 July, 2022;
originally announced July 2022.
-
STORM: A Model for Sustainably Onboarding Software Testers
Authors:
Tobias Lorey,
Stefan Mohacsi,
Armin Beer,
Michael Felderer
Abstract:
Recruiting and onboarding software testing professionals are complex and cost intensive activities. Whether onboarding is successful and sustainable depends on both the employee as well as the organization and is influenced by a number of often highly individual factors. Therefore, we propose the Software Testing Onboarding Model (STORM) for sustainably onboarding software testing professionals ba…
▽ More
Recruiting and onboarding software testing professionals are complex and cost intensive activities. Whether onboarding is successful and sustainable depends on both the employee as well as the organization and is influenced by a number of often highly individual factors. Therefore, we propose the Software Testing Onboarding Model (STORM) for sustainably onboarding software testing professionals based on existing frameworks and models taking into account onboarding processes, sustainability, and test processes. We provide detailed instructions on how to use the model and apply it to real-world onboarding processes in two industrial case studies.
△ Less
Submitted 2 June, 2022;
originally announced June 2022.
-
Automatic Error Classification and Root Cause Determination while Replaying Recorded Workload Data at SAP HANA
Authors:
Neetha Jambigi,
Thomas Bach,
Felix Schabernack,
Michael Felderer
Abstract:
Capturing customer workloads of database systems to replay these workloads during internal testing can be beneficial for software quality assurance. However, we experienced that such replays can produce a large amount of false positive alerts that make the results unreliable or time consuming to analyze. Therefore, we design a machine learning based approach that attributes root causes to the aler…
▽ More
Capturing customer workloads of database systems to replay these workloads during internal testing can be beneficial for software quality assurance. However, we experienced that such replays can produce a large amount of false positive alerts that make the results unreliable or time consuming to analyze. Therefore, we design a machine learning based approach that attributes root causes to the alerts. This provides several benefits for quality assurance and allows for example to classify whether an alert is true positive or false positive. Our approach considerably reduces manual effort and improves the overall quality assurance for the database system SAP HANA. We discuss the problem, the design and result of our approach, and we present practical limitations that may require further research.
△ Less
Submitted 16 May, 2022;
originally announced May 2022.
-
Towards Understanding the Skill Gap in Cybersecurity
Authors:
Francois Goupil,
Pavel Laskov,
Irdin Pekaric,
Michael Felderer,
Alexander Dürr,
Frederic Thiesse
Abstract:
Given the ongoing "arms race" in cybersecurity, the shortage of skilled professionals in this field is one of the strongest in computer science. The currently unmet staffing demand in cybersecurity is estimated at over 3 million jobs worldwide. Furthermore, the qualifications of the existing workforce are largely believed to be insufficient. We attempt to gain deeper insights into the nature of th…
▽ More
Given the ongoing "arms race" in cybersecurity, the shortage of skilled professionals in this field is one of the strongest in computer science. The currently unmet staffing demand in cybersecurity is estimated at over 3 million jobs worldwide. Furthermore, the qualifications of the existing workforce are largely believed to be insufficient. We attempt to gain deeper insights into the nature of the current skill gap in cybersecurity. To this end, we correlate data from job ads and academic curricula using two kinds of skill characterizations: manual definitions from established skill frameworks as well as "skill topics" automatically derived by text mining tools. Our analysis shows a strong agreement between these two analysis techniques and reveals a substantial undersupply in several crucial skill categories, e.g., software and application security, security management, requirements engineering, compliance, and certification. Based on the results of our analysis, we provide recommendations for future curricula development in cybersecurity so as to decrease the identified skill gaps.
△ Less
Submitted 28 April, 2022;
originally announced April 2022.
-
Software Testing, AI and Robotics (STAIR) Learning Lab
Authors:
Simon Haller-Seeber,
Thomas Gatterer,
Patrick Hofmann,
Christopher Kelter,
Thomas Auer,
Michael Felderer
Abstract:
In this paper we presented the Software Testing, AI and Robotics (STAIR) Learning Lab. STAIR is an initiative started at the University of Innsbruck to bring robotics, Artificial Intelligence (AI) and software testing into schools. In the lab physical and virtual learning units are developed in parallel and in sync with each other. Its core learning approach is based the develop of both a physical…
▽ More
In this paper we presented the Software Testing, AI and Robotics (STAIR) Learning Lab. STAIR is an initiative started at the University of Innsbruck to bring robotics, Artificial Intelligence (AI) and software testing into schools. In the lab physical and virtual learning units are developed in parallel and in sync with each other. Its core learning approach is based the develop of both a physical and simulated robotics environment. In both environments AI scenarios (like traffic sign recognition) are deployed and tested. We present and focus on our newly designed MiniBot that are both built on hardware which was designed for educational and research purposes as well as the simulation environment. Additionally, we describe first learning design concepts and a showcase scenario (i.e., AI-based traffic sign recognition) with different exercises which can easily be extended.
△ Less
Submitted 6 April, 2022;
originally announced April 2022.
-
What is Software Quality for AI Engineers? Towards a Thinning of the Fog
Authors:
Valentina Golendukhina,
Valentina Lenarduzzi,
Michael Felderer
Abstract:
It is often overseen that AI-enabled systems are also software systems and therefore rely on software quality assurance (SQA). Thus, the goal of this study is to investigate the software quality assurance strategies adopted during the development, integration, and maintenance of AI/ML components and code. We conducted semi-structured interviews with representatives of ten Austrian SMEs that develo…
▽ More
It is often overseen that AI-enabled systems are also software systems and therefore rely on software quality assurance (SQA). Thus, the goal of this study is to investigate the software quality assurance strategies adopted during the development, integration, and maintenance of AI/ML components and code. We conducted semi-structured interviews with representatives of ten Austrian SMEs that develop AI-enabled systems. A qualitative analysis of the interview data identified 12 issues in the development of AI/ML components. Furthermore, we identified when quality issues arise in AI/ML components and how they are detected. The results of this study should guide future work on software quality assurance processes and techniques for AI/ML components.
△ Less
Submitted 23 March, 2022;
originally announced March 2022.
-
Data Smells: Categories, Causes and Consequences, and Detection of Suspicious Data in AI-based Systems
Authors:
Harald Foidl,
Michael Felderer,
Rudolf Ramler
Abstract:
High data quality is fundamental for today's AI-based systems. However, although data quality has been an object of research for decades, there is a clear lack of research on potential data quality issues (e.g., ambiguous, extraneous values). These kinds of issues are latent in nature and thus often not obvious. Nevertheless, they can be associated with an increased risk of future problems in AI-b…
▽ More
High data quality is fundamental for today's AI-based systems. However, although data quality has been an object of research for decades, there is a clear lack of research on potential data quality issues (e.g., ambiguous, extraneous values). These kinds of issues are latent in nature and thus often not obvious. Nevertheless, they can be associated with an increased risk of future problems in AI-based systems (e.g., technical debt, data-induced faults). As a counterpart to code smells in software engineering, we refer to such issues as Data Smells. This article conceptualizes data smells and elaborates on their causes, consequences, detection, and use in the context of AI-based systems. In addition, a catalogue of 36 data smells divided into three categories (i.e., Believability Smells, Understandability Smells, Consistency Smells) is presented. Moreover, the article outlines tool support for detecting data smells and presents the result of an initial smell detection on more than 240 real-world datasets.
△ Less
Submitted 16 June, 2022; v1 submitted 19 March, 2022;
originally announced March 2022.
-
Social Science Theories in Software Engineering Research
Authors:
Tobias Lorey,
Paul Ralph,
Michael Felderer
Abstract:
As software engineering research becomes more concerned with the psychological, sociological and managerial aspects of software development, relevant theories from reference disciplines are increasingly important for understanding the field's core phenomena of interest. However, the degree to which software engineering research draws on relevant social sciences remains unclear. This study therefor…
▽ More
As software engineering research becomes more concerned with the psychological, sociological and managerial aspects of software development, relevant theories from reference disciplines are increasingly important for understanding the field's core phenomena of interest. However, the degree to which software engineering research draws on relevant social sciences remains unclear. This study therefore investigates the use of social science theories in five influential software engineering journals over 13 years. It analyzes not only the extent of theory use but also what, how and where these theories are used. While 87 different theories are used, less than two percent of papers use a social science theory, most theories are used in only one paper, most social sciences are ignored, and the theories are rarely tested for applicability to software engineering contexts. Ignoring relevant social science theories may (1) undermine the community's ability to generate, elaborate and maintain a cumulative body of knowledge; and (2) lead to oversimplified models of software engineering phenomena. More attention to theory is needed for software engineering to mature as a scientific discipline.
△ Less
Submitted 15 February, 2022;
originally announced February 2022.
-
Cognition in Software Engineering: A Taxonomy and Survey of a Half-Century of Research
Authors:
Fabian Fagerholm,
Michael Felderer,
Davide Fucci,
Michael Unterkalmsteiner,
Bogdan Marculescu,
Markus Martini,
Lars Göran Wallgren Tengberg,
Robert Feldt,
Bettina Lehtelä,
Balázs Nagyváradi,
Jehan Khattak
Abstract:
Cognition plays a fundamental role in most software engineering activities. This article provides a taxonomy of cognitive concepts and a survey of the literature since the beginning of the Software Engineering discipline. The taxonomy comprises the top-level concepts of perception, attention, memory, cognitive load, reasoning, cognitive biases, knowledge, social cognition, cognitive control, and e…
▽ More
Cognition plays a fundamental role in most software engineering activities. This article provides a taxonomy of cognitive concepts and a survey of the literature since the beginning of the Software Engineering discipline. The taxonomy comprises the top-level concepts of perception, attention, memory, cognitive load, reasoning, cognitive biases, knowledge, social cognition, cognitive control, and errors, and procedures to assess them both qualitatively and quantitatively. The taxonomy provides a useful tool to filter existing studies, classify new studies, and support researchers in getting familiar with a (sub) area. In the literature survey, we systematically collected and analysed 311 scientific papers spanning five decades and classified them using the cognitive concepts from the taxonomy. Our analysis shows that the most developed areas of research correspond to the four life-cycle stages, software requirements, design, construction, and maintenance. Most research is quantitative and focuses on knowledge, cognitive load, memory, and reasoning. Overall, the state of the art appears fragmented when viewed from the perspective of cognition. There is a lack of use of cognitive concepts that would represent a coherent picture of the cognitive processes active in specific tasks. Accordingly, we discuss the research gap in each cognitive concept and provide recommendations for future research.
△ Less
Submitted 14 January, 2022;
originally announced January 2022.
-
Collaborative Artificial Intelligence Needs Stronger Assurances Driven by Risks
Authors:
Jubril Gbolahan Adigun,
Matteo Camilli,
Michael Felderer,
Andrea Giusti,
Dominik T Matt,
Anna Perini,
Barbara Russo,
Angelo Susi
Abstract:
Collaborative AI systems (CAISs) aim at working together with humans in a shared space to achieve a common goal. This critical setting yields hazardous circumstances that could harm human beings. Thus, building such systems with strong assurances of compliance with requirements, domain-specific standards and regulations is of greatest importance. Only few scale impact has been reported so far for…
▽ More
Collaborative AI systems (CAISs) aim at working together with humans in a shared space to achieve a common goal. This critical setting yields hazardous circumstances that could harm human beings. Thus, building such systems with strong assurances of compliance with requirements, domain-specific standards and regulations is of greatest importance. Only few scale impact has been reported so far for such systems since much work remains to manage possible risks. We identify emerging problems in this context and then we report our vision, as well as the progress of our multidisciplinary research team composed of software/systems, and mechatronics engineers to develop a risk-driven assurance process for CAISs.
△ Less
Submitted 22 September, 2022; v1 submitted 1 December, 2021;
originally announced December 2021.
-
What Makes Agile Software Development Agile?
Authors:
Marco Kuhrmann,
Paolo Tell,
Regina Hebig,
Jil Klünder,
Jürgen Münch,
Oliver Linssen,
Dietmar Pfahl,
Michael Felderer,
Christian R. Prause,
Stephen G. MacDonell,
Joyce Nakatumba-Nabende,
David Raffo,
Sarah Beecham,
Eray Tüzün,
Gustavo López,
Nicolas Paez,
Diego Fontdevila,
Sherlock A. Licorish,
Steffen Küpper,
Günther Ruhe,
Eric Knauss,
Özden Özcan-Top,
Paul Clarke,
Fergal McCaffery,
Marcela Genero
, et al. (22 additional authors not shown)
Abstract:
Together with many success stories, promises such as the increase in production speed and the improvement in stakeholders' collaboration have contributed to making agile a transformation in the software industry in which many companies want to take part. However, driven either by a natural and expected evolution or by contextual factors that challenge the adoption of agile methods as prescribed by…
▽ More
Together with many success stories, promises such as the increase in production speed and the improvement in stakeholders' collaboration have contributed to making agile a transformation in the software industry in which many companies want to take part. However, driven either by a natural and expected evolution or by contextual factors that challenge the adoption of agile methods as prescribed by their creator(s), software processes in practice mutate into hybrids over time. Are these still agile? In this article, we investigate the question: what makes a software development method agile? We present an empirical study grounded in a large-scale international survey that aims to identify software development methods and practices that improve or tame agility. Based on 556 data points, we analyze the perceived degree of agility in the implementation of standard project disciplines and its relation to used development methods and practices. Our findings suggest that only a small number of participants operate their projects in a purely traditional or agile manner (under 15%). That said, most project disciplines and most practices show a clear trend towards increasing degrees of agility. Compared to the methods used to develop software, the selection of practices has a stronger effect on the degree of agility of a given discipline. Finally, there are no methods or practices that explicitly guarantee or prevent agility. We conclude that agility cannot be defined solely at the process level. Additional factors need to be taken into account when trying to implement or improve agility in a software company. Finally, we discuss the field of software process-related research in the light of our findings and present a roadmap for future research.
△ Less
Submitted 23 September, 2021;
originally announced September 2021.
-
Aspects of Sustainable Test Processes
Authors:
Armin Beer,
Michael Felderer,
Tobias Lorey,
Stefan Mohacsi
Abstract:
Testing is a core software development activity that has huge potential to make software development more sustainable. In this paper, we discuss how environmental, social, economic, and technical sustainability map onto the activities of test planning, design, execution, and evaluation.
Testing is a core software development activity that has huge potential to make software development more sustainable. In this paper, we discuss how environmental, social, economic, and technical sustainability map onto the activities of test planning, design, execution, and evaluation.
△ Less
Submitted 3 April, 2021;
originally announced April 2021.
-
Towards Risk Modeling for Collaborative AI
Authors:
Matteo Camilli,
Michael Felderer,
Andrea Giusti,
Dominik T. Matt,
Anna Perini,
Barbara Russo,
Angelo Susi
Abstract:
Collaborative AI systems aim at working together with humans in a shared space to achieve a common goal. This setting imposes potentially hazardous circumstances due to contacts that could harm human beings. Thus, building such systems with strong assurances of compliance with requirements domain specific standards and regulations is of greatest importance. Challenges associated with the achieveme…
▽ More
Collaborative AI systems aim at working together with humans in a shared space to achieve a common goal. This setting imposes potentially hazardous circumstances due to contacts that could harm human beings. Thus, building such systems with strong assurances of compliance with requirements domain specific standards and regulations is of greatest importance. Challenges associated with the achievement of this goal become even more severe when such systems rely on machine learning components rather than such as top-down rule-based AI. In this paper, we introduce a risk modeling approach tailored to Collaborative AI systems. The risk model includes goals, risk events and domain specific indicators that potentially expose humans to hazards. The risk model is then leveraged to drive assurance methods that feed in turn the risk model through insights extracted from run-time evidence. Our envisioned approach is described by means of a running example in the domain of Industry 4.0, where a robotic arm endowed with a visual perception component, implemented with machine learning, collaborates with a human operator for a production-relevant task.
△ Less
Submitted 12 March, 2021;
originally announced March 2021.
-
Compliance Requirements in Large-Scale Software Development: An Industrial Case Study
Authors:
Muhammad Usman,
Michael Felderer,
Michael Unterkalmsteiner,
Eriks Klotins,
Daniel Mendez,
Emil Alegroth
Abstract:
Regulatory compliance is a well-studied area, including research on how to model, check, analyse, enact, and verify compliance of software. However, while the theoretical body of knowledge is vast, empirical evidence on challenges with regulatory compliance, as faced by industrial practitioners particularly in the Software Engineering domain, is still lacking. In this paper, we report on an indust…
▽ More
Regulatory compliance is a well-studied area, including research on how to model, check, analyse, enact, and verify compliance of software. However, while the theoretical body of knowledge is vast, empirical evidence on challenges with regulatory compliance, as faced by industrial practitioners particularly in the Software Engineering domain, is still lacking. In this paper, we report on an industrial case study which aims at providing insights into common practices and challenges with checking and analysing regulatory compliance, and we discuss our insights in direct relation to the state of reported evidence. Our study is performed at Ericsson AB, a large telecommunications company, which must comply to both locally and internationally governing regulatory entities and standards such as GDPR. The main contributions of this work are empirical evidence on challenges experienced by Ericsson that complement the existing body of knowledge on regulatory compliance.
△ Less
Submitted 2 March, 2021;
originally announced March 2021.
-
Quality Assurance for AI-based Systems: Overview and Challenges
Authors:
Michael Felderer,
Rudolf Ramler
Abstract:
The number and importance of AI-based systems in all domains is growing. With the pervasive use and the dependence on AI-based systems, the quality of these systems becomes essential for their practical usage. However, quality assurance for AI-based systems is an emerging area that has not been well explored and requires collaboration between the SE and AI research communities. This paper discusse…
▽ More
The number and importance of AI-based systems in all domains is growing. With the pervasive use and the dependence on AI-based systems, the quality of these systems becomes essential for their practical usage. However, quality assurance for AI-based systems is an emerging area that has not been well explored and requires collaboration between the SE and AI research communities. This paper discusses terminology and challenges on quality assurance for AI-based systems to set a baseline for that purpose. Therefore, we define basic concepts and characterize AI-based systems along the three dimensions of artifact type, process, and quality characteristics. Furthermore, we elaborate on the key challenges of (1) understandability and interpretability of AI models, (2) lack of specifications and defined requirements, (3) need for validation data and test input generation, (4) defining expected outcomes as test oracles, (5) accuracy and correctness measures, (6) non-functional properties of AI-based systems, (7) self-adaptive and self-learning characteristics, and (8) dynamic and frequently changing environments.
△ Less
Submitted 10 February, 2021;
originally announced February 2021.
-
Controlled Experimentation in Continuous Experimentation: Knowledge and Challenges
Authors:
Florian Auer,
Rasmus Ros,
Lukas Kaltenbrunner,
Per Runeson,
Michael Felderer
Abstract:
Context: Continuous experimentation and A/B testing is an established industry practice that has been researched for more than 10 years. Our aim is to synthesize the conducted research.
Objective: We wanted to find the core constituents of a framework for continuous experimentation and the solutions that are applied within the field. Finally, we were interested in the challenges and benefits rep…
▽ More
Context: Continuous experimentation and A/B testing is an established industry practice that has been researched for more than 10 years. Our aim is to synthesize the conducted research.
Objective: We wanted to find the core constituents of a framework for continuous experimentation and the solutions that are applied within the field. Finally, we were interested in the challenges and benefits reported of continuous experimentation.
Method: We applied forward snowballing on a known set of papers and identified a total of 128 relevant papers. Based on this set of papers we performed two qualitative narrative syntheses and a thematic synthesis to answer the research questions.
Results: The framework constituents for continuous experimentation include experimentation processes as well as supportive technical and organizational infrastructure. The solutions found in the literature were synthesized to nine themes, e.g. experiment design, automated experiments, or metric specification. Concerning the challenges of continuous experimentation, the analysis identified cultural, organizational, business, technical, statistical, ethical, and domain-specific challenges. Further, the study concludes that the benefits of experimentation are mostly implicit in the studies.
Conclusions: The research on continuous experimentation has yielded a large body of knowledge on experimentation. The synthesis of published research presented within include recommended infrastructure and experimentation process models, guidelines to mitigate the identified challenges, and what problems the various published solutions solve.
△ Less
Submitted 17 February, 2021; v1 submitted 10 February, 2021;
originally announced February 2021.
-
Catching up with Method and Process Practice: An Industry-Informed Baseline for Researchers
Authors:
Jil Klünder,
Regina Hebig,
Paolo Tell,
Marco Kuhrmann,
Joyce Nakatumba-Nabende,
Rogardt Heldal,
Stephan Krusche,
Masud Fazal-Baqaie,
Michael Felderer,
Marcela Fabiana Genero Bocco,
Steffen Küpper,
Sherlock A. Licorish,
Gustavo Lòpez,
Fergal McCaffery,
Özden Özcan Top,
Christian R. Prause,
Rafael Prikladnicki,
Eray Tüzün,
Dietmar Pfahl,
Kurt Schneider,
Stephen G. MacDonell
Abstract:
Software development methods are usually not applied by the book. Companies are under pressure to continuously deploy software products that meet market needs and stakeholders' requests. To implement efficient and effective development processes, companies utilize multiple frameworks, methods and practices, and combine these into hybrid methods. A common combination contains a rich management fram…
▽ More
Software development methods are usually not applied by the book. Companies are under pressure to continuously deploy software products that meet market needs and stakeholders' requests. To implement efficient and effective development processes, companies utilize multiple frameworks, methods and practices, and combine these into hybrid methods. A common combination contains a rich management framework to organize and steer projects complemented with a number of smaller practices providing the development teams with tools to complete their tasks. In this paper, based on 732 data points collected through an international survey, we study the software development process use in practice. Our results show that 76.8% of the companies implement hybrid methods. Company size as well as the strategy in devising and evolving hybrid methods affect the suitability of the chosen process to reach company or project goals. Our findings show that companies that combine planned improvement programs with process evolution can increase their process' suitability by up to 5%.
△ Less
Submitted 28 January, 2021;
originally announced January 2021.
-
Mining user reviews of COVID contact-tracing apps: An exploratory analysis of nine European apps
Authors:
Vahid Garousi,
David Cutting,
Michael Felderer
Abstract:
Context: More than 50 countries have developed COVID contact-tracing apps to limit the spread of coronavirus. However, many experts and scientists cast doubt on the effectiveness of those apps. For each app, a large number of reviews have been entered by end-users in app stores. Objective: Our goal is to gain insights into the user reviews of those apps, and to find out the main problems that user…
▽ More
Context: More than 50 countries have developed COVID contact-tracing apps to limit the spread of coronavirus. However, many experts and scientists cast doubt on the effectiveness of those apps. For each app, a large number of reviews have been entered by end-users in app stores. Objective: Our goal is to gain insights into the user reviews of those apps, and to find out the main problems that users have reported. Our focus is to assess the "software in society" aspects of the apps, based on user reviews. Method: We selected nine European national apps for our analysis and used a commercial app-review analytics tool to extract and mine the user reviews. For all the apps combined, our dataset includes 39,425 user reviews. Results: Results show that users are generally dissatisfied with the nine apps under study, except the Scottish ("Protect Scotland") app. Some of the major issues that users have complained about are high battery drainage and doubts on whether apps are really working. Conclusion: Our results show that more work is needed by the stakeholders behind the apps (e.g., app developers, decision-makers, public health experts) to improve the public adoption, software quality and public perception of these apps.
△ Less
Submitted 25 December, 2020;
originally announced December 2020.
-
Empirical Standards for Software Engineering Research
Authors:
Paul Ralph,
Nauman bin Ali,
Sebastian Baltes,
Domenico Bianculli,
Jessica Diaz,
Yvonne Dittrich,
Neil Ernst,
Michael Felderer,
Robert Feldt,
Antonio Filieri,
Breno Bernard Nicolau de França,
Carlo Alberto Furia,
Greg Gay,
Nicolas Gold,
Daniel Graziotin,
Pinjia He,
Rashina Hoda,
Natalia Juristo,
Barbara Kitchenham,
Valentina Lenarduzzi,
Jorge Martínez,
Jorge Melegati,
Daniel Mendez,
Tim Menzies,
Jefferson Molleri
, et al. (18 additional authors not shown)
Abstract:
Empirical Standards are natural-language models of a scientific community's expectations for a specific kind of study (e.g. a questionnaire survey). The ACM SIGSOFT Paper and Peer Review Quality Initiative generated empirical standards for research methods commonly used in software engineering. These living documents, which should be continuously revised to reflect evolving consensus around resear…
▽ More
Empirical Standards are natural-language models of a scientific community's expectations for a specific kind of study (e.g. a questionnaire survey). The ACM SIGSOFT Paper and Peer Review Quality Initiative generated empirical standards for research methods commonly used in software engineering. These living documents, which should be continuously revised to reflect evolving consensus around research best practices, will improve research quality and make peer review more effective, reliable, transparent and fair.
△ Less
Submitted 4 March, 2021; v1 submitted 7 October, 2020;
originally announced October 2020.
-
Retrieving and mining professional experience of software practice from grey literature: an exploratory review
Authors:
Austen Rainer,
Ashley Williams,
Vahid Garousi,
Michael Felderer
Abstract:
Background: Retrieving and mining practitioners' self--reports of their professional experience of software practice could provide valuable evidence for research. We are, however, unaware of any existing reviews of research conducted in this area. Objective: To review and classify previous research, and to identify insights into the challenges research confronts when retrieving and mining practiti…
▽ More
Background: Retrieving and mining practitioners' self--reports of their professional experience of software practice could provide valuable evidence for research. We are, however, unaware of any existing reviews of research conducted in this area. Objective: To review and classify previous research, and to identify insights into the challenges research confronts when retrieving and mining practitioners' self-reports of their experience of software practice. Method: We conduct an exploratory review to identify and classify 42 articles. We analyse a selection of those articles for insights on challenges to mining professional experience. Results: We identify only one directly relevant article. Even then this article concerns the software professional's emotional experiences rather than the professional's reporting of behaviour and events occurring during software practice. We discuss challenges concerning: the prevalence of professional experience; definitions, models and theories; the sparseness of data; units of discourse analysis; annotator agreement; evaluation of the performance of algorithms; and the lack of replications. Conclusion: No directly relevant prior research appears to have been conducted in this area. We discuss the value of reporting negative results in secondary studies. There are a range of research opportunities but also considerable challenges. We formulate a set of guiding questions for further research in this area.
△ Less
Submitted 30 September, 2020;
originally announced September 2020.
-
Why Research on Test-Driven Development is Inconclusive?
Authors:
Mohammad Ghafari,
Timm Gross,
Davide Fucci,
Michael Felderer
Abstract:
[Background] Recent investigations into the effects of Test-Driven Development (TDD) have been contradictory and inconclusive. This hinders development teams to use research results as the basis for deciding whether and how to apply TDD. [Aim] To support researchers when designing a new study and to increase the applicability of TDD research in the decision-making process in the industrial context…
▽ More
[Background] Recent investigations into the effects of Test-Driven Development (TDD) have been contradictory and inconclusive. This hinders development teams to use research results as the basis for deciding whether and how to apply TDD. [Aim] To support researchers when designing a new study and to increase the applicability of TDD research in the decision-making process in the industrial context, we aim at identifying the reasons behind the inconclusive research results in TDD. [Method] We studied the state of the art in TDD research published in top venues in the past decade, and analyzed the way these studies were set up. [Results] We identified five categories of factors that directly impact the outcome of studies on TDD. [Conclusions] This work can help researchers to conduct more reliable studies, and inform practitioners of risks they need to consider when consulting research on TDD.
△ Less
Submitted 19 July, 2020;
originally announced July 2020.
-
Assessing the maturity of software testing services using CMMI-SVC: An industrial case study
Authors:
Vahid Garousi,
Seyfettin Arkan,
Gökhan Urul,
Çağrı Murat Karapıçak,
Michael Felderer
Abstract:
Context: While many companies conduct their software testing activities in-house, many other companies outsource their software testing needs to other firms who act as software testing service providers. As a result, Testing as a Service (TaaS) has emerged as a strong service industry in the last several decades. In the context of software testing services, there could be various challenges (e.g.,…
▽ More
Context: While many companies conduct their software testing activities in-house, many other companies outsource their software testing needs to other firms who act as software testing service providers. As a result, Testing as a Service (TaaS) has emerged as a strong service industry in the last several decades. In the context of software testing services, there could be various challenges (e.g., during the planning and service delivery phases) and, as a result, the quality of testing services is not always as expected. Objective: It is important, for both providers and also customers of testing services, to assess the quality and maturity of test services and subsequently improve them. Method: Motivated by a real industrial need in the context of several testing service providers, to assess the maturity of their software testing services, we chose the existing CMMI for Services maturity model (CMMI-SVC), and conducted a case study using it in the context of two Turkish testing service providers. Results: The case-study results show that maturity appraisal of testing services using CMMI-SVC was helpful for both companies and their test management teams by enabling them objectively assess the maturity of their testing services and also by pinpointing potential improvement areas. Conclusion: We empirically observed that, after some minor customization, CMMI-SVC is indeed a suitable model for maturity appraisal of testing services.
△ Less
Submitted 30 May, 2020; v1 submitted 26 May, 2020;
originally announced May 2020.
-
Risk Management Practices in Information Security: Exploring the Status Quo in the DACH Region
Authors:
Michael Brunner,
Clemens Sauerwein,
Michael Felderer,
Ruth Breu
Abstract:
Information security management aims at ensuring proper protection of information values and information processing systems (i.e. assets). Information security risk management techniques are incorporated to deal with threats and vulnerabilities that impose risks to information security properties of these assets. This paper investigates the current state of risk management practices being used in…
▽ More
Information security management aims at ensuring proper protection of information values and information processing systems (i.e. assets). Information security risk management techniques are incorporated to deal with threats and vulnerabilities that impose risks to information security properties of these assets. This paper investigates the current state of risk management practices being used in information security management in the DACH region (Germany, Austria, Switzerland). We used an anonymous online survey targeting strategic and operative information security and risk managers and collected data from 26 organizations. We analyzed general practices, documentation artifacts, patterns of stakeholder collaboration as well as tool types and data sources used by enterprises to conduct information security management activities. Our findings show that the state of practice of information security risk management is in need of improvement. Current industrial practice heavily relies on manual data collection and complex potentially subjective decision processes with multiple stakeholders involved. Dedicated risk management tools and methods are used selectively and neglected in favor of general-purpose documentation tools and direct communication between stakeholders. In light of our results we propose guidelines for the development of risk management practices that are better aligned with the current operational situation in information security management.
△ Less
Submitted 4 March, 2020;
originally announced March 2020.
-
Software Engineering und Software Engineering Forschung im Zeitalter der Digitalisierung
Authors:
Michael Felderer,
Ralf Reussner,
Bernhard Rumpe
Abstract:
Digitization not only affects society, it also requires a redefinition of the location of computer science and computer scientists, as the science journalist Yogeshwar suggests. Since all official aspects of digitalization are based on software, this article is intended to attempt to redefine the role of software engineering and its research. Software-based products, systems or services are influe…
▽ More
Digitization not only affects society, it also requires a redefinition of the location of computer science and computer scientists, as the science journalist Yogeshwar suggests. Since all official aspects of digitalization are based on software, this article is intended to attempt to redefine the role of software engineering and its research. Software-based products, systems or services are influencing all areas of life and are a critical component and central innovation driver of digitization in all areas of life. Scientifically, there are new opportunities and challenges for software engineering as a driving discipline in the development of any technical innovation. However, the chances must not be sacrificed to the competition for bibliometric numbers as an end in themselves.
△ Less
Submitted 25 February, 2020;
originally announced February 2020.
-
A taxonomy of risk-based testing
Authors:
Michael Felderer,
Ina Schieferdecker
Abstract:
Software testing has often to be done under severe pressure due to limited resources and a challenging time schedule facing the demand to assure the fulfillment of the software requirements. In addition, testing should unveil those software defects that harm the mission-critical functions of the software. Risk-based testing uses risk (re-)assessments to steer all phases of the test process in orde…
▽ More
Software testing has often to be done under severe pressure due to limited resources and a challenging time schedule facing the demand to assure the fulfillment of the software requirements. In addition, testing should unveil those software defects that harm the mission-critical functions of the software. Risk-based testing uses risk (re-)assessments to steer all phases of the test process in order to optimize testing efforts and limit risks of the software-based system. Due to its importance and high practical relevance several risk-based testing approaches were proposed in academia and industry. This paper presents a taxonomy of risk-based testing providing a framework to understand, categorize, assess, and compare risk-based testing approaches to support their selection and tailoring for specific purposes. The taxonomy is aligned with the consideration of risks in all phases of the test process and consists of the top-level classes risk drivers, risk assessment, and risk-based test process. The taxonomy of risk-based testing has been developed by analyzing the work presented in available publications on risk-based testing. Afterwards, it has been applied to the work on risk-based testing presented in this special section of the International Journal on Software Tools for Technology Transfer.
△ Less
Submitted 24 December, 2019;
originally announced December 2019.
-
The Evolution of Empirical Methods in Software Engineering
Authors:
Michael Felderer,
Guilherme Horta Travassos
Abstract:
Empirical methods like experimentation have become a powerful means to drive the field of software engineering by creating scientific evidence on software development, operation, and maintenance, but also by supporting practitioners in their decision making and learning. Today empirical methods are fully applied in software engineering. However, they have developed in several iterations since the…
▽ More
Empirical methods like experimentation have become a powerful means to drive the field of software engineering by creating scientific evidence on software development, operation, and maintenance, but also by supporting practitioners in their decision making and learning. Today empirical methods are fully applied in software engineering. However, they have developed in several iterations since the 1960s. In this chapter we tell the history of empirical software engineering and present the evolution of empirical methods in software engineering in five iterations, i.e., (1) mid-1960s to mid-1970s, (2) mid-1970s to mid-1980s, (3) mid-1980s to end of the 1990s, (4) the 2000s, and (5) the 2010s. We present the five iterations of the development of empirical software engineering mainly from a methodological perspective and additionally take key papers, venues, and books, which are covered in chronological order in a separate section on recommended further readings, into account. We complement our presentation of the evolution of empirical software engineering by presenting the current situation and an outlook in Sect. 4 and the available books on empirical software engineering. Furthermore, based on the chapters covered in this book we discuss trends on contemporary empirical methods in software engineering related to the plurality of research methods, human factors, data collection and processing, aggregation and synthesis of evidence, and impact of software engineering research.
△ Less
Submitted 8 May, 2020; v1 submitted 24 December, 2019;
originally announced December 2019.
-
Characteristics of an Online Controlled Experiment: Preliminary Results of a Literature Review
Authors:
Florian Auer,
Michael Felderer
Abstract:
In this paper the preliminary results of a literature review on characteristics used to define continuous experiments are presented. In total 14 papers were selected. The results were synthesized into a model that gives an overview of all characteristics.
In this paper the preliminary results of a literature review on characteristics used to define continuous experiments are presented. In total 14 papers were selected. The results were synthesized into a model that gives an overview of all characteristics.
△ Less
Submitted 10 December, 2019; v1 submitted 2 December, 2019;
originally announced December 2019.
-
Benefitting from the Grey Literature in Software Engineering Research
Authors:
Vahid Garousi,
Michael Felderer,
Mika V. Mäntylä,
Austen Rainer
Abstract:
Researchers generally place the most trust in peer-reviewed, published information, such as journals and conference papers. By contrast, software engineering (SE) practitioners typically do not have the time, access or expertise to review and benefit from such publications. As a result, practitioners are more likely to turn to other sources of information that they trust, e.g., trade magazines, on…
▽ More
Researchers generally place the most trust in peer-reviewed, published information, such as journals and conference papers. By contrast, software engineering (SE) practitioners typically do not have the time, access or expertise to review and benefit from such publications. As a result, practitioners are more likely to turn to other sources of information that they trust, e.g., trade magazines, online blog-posts, survey results or technical reports, collectively referred to as Grey Literature (GL). Furthermore, practitioners also share their ideas and experiences as GL, which can serve as a valuable data source for research. While GL itself is not a new topic in SE, using, benefitting and synthesizing knowledge from the GL in SE is a contemporary topic in empirical SE research and we are seeing that researchers are increasingly benefitting from the knowledge available within GL. The goal of this chapter is to provide an overview to GL in SE, together with insights on how SE researchers can effectively use and benefit from the knowledge and evidence available in the vast amount of GL.
△ Less
Submitted 27 November, 2019;
originally announced November 2019.