SZZ in the time of Pull Requests

Authors: Fernando Petrulio, David Ackermann, Enrico Fregnan, Gül Calikli, Marco Castelluccio, Sylvestre Ledru, Calixte Denizet, Emma Humphries, Alberto Bacchelli

Abstract: In the multi-commit development model, programmers complete tasks (e.g., implementing a feature) by organizing their work in several commits and packaging them into a commit-set. Analyzing data from developers using this model can be useful to tackle challenging developers' needs, such as knowing which features introduce a bug as well as assessing the risk of integrating certain features in a release. However, to do so one first needs to identify fix-inducing commit-sets. For such an identification, the SZZ algorithm is the most natural candidate, but its performance has not been evaluated in the multi-commit context yet. In this study, we conduct an in-depth investigation on the reliability and performance of SZZ in the multi-commit model. To obtain a reliable ground truth, we consider an already existing SZZ dataset and adapt it to the multi-commit context. Moreover, we devise a second dataset that is more extensive and directly created by developers as well as Quality Assurance (QA) engineers of Mozilla. Based on these datasets, we (1) test the performance of B-SZZ and its non-language-specific SZZ variations in the context of the multi-commit model, (2) investigate the reasons behind their specific behavior, and (3) analyze the impact of non-relevant commits in a commit-set and automatically detect them before using SZZ.

Submitted 7 September, 2022; originally announced September 2022.

arXiv:2208.04259 [pdf, other]

doi 10.1145/3540250.3549177

First Come First Served: The Impact of File Position on Code Review

Authors: Enrico Fregnan, Larissa Braz, Marco D'Ambros, Gül Çalikli, Alberto Bacchelli

Abstract: The most popular code review tools (e.g., Gerrit and GitHub) present the files to review sorted in alphabetical order. Could this choice or, more generally, the relative position in which a file is presented bias the outcome of code reviews? We investigate this hypothesis by triangulating complementary evidence in a two-step study. First, we observe developers' code review activity. We analyze the review comments pertaining to 219,476 Pull Requests (PRs) from 138 popular Java projects on GitHub. We found files shown earlier in a PR to receive more comments than files shown later, also when controlling for possible confounding factors: e.g., the presence of discussion threads or the lines added in a file. Second, we measure the impact of file position on defect finding in code review. Recruiting 106 participants, we conduct an online controlled experiment in which we measure participants' performance in detecting two unrelated defects seeded into two different files. Participants are assigned to one of two treatments in which the position of the defective files is switched. For one type of defect, participants are not affected by its file's position; for the other, they have 64% lower odds to identify it when its file is last as opposed to first. Overall, our findings provide evidence that the relative position in which files are presented has an impact on code reviews' outcome; we discuss these results and implications for tool design and code review. Data and materials: https://doi.org/10.5281/zenodo.6901285

Submitted 8 August, 2022; originally announced August 2022.

Comments: This paper has been accepted at ESEC/FSE '22 (30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering)

arXiv:2202.10985 [pdf, other]

A Laboratory Experiment on Using Different Financial-Incentivization Schemes in Software-Engineering Experimentation

Authors: Dmitri Bershadskyy, Jacob Krüger, Gül Çalıklı, Siegmar Otto, Sarah Zabel, Jannik Greif, Robert Heyer

Abstract: In software-engineering research, many empirical studies are conducted with open-source or industry developers. However, in contrast to other research communities like economics or psychology, only few experiments use financial incentives (i.e., paying money) as a strategy to motivate participants' behavior and reward their performance. The most recent version of the SIGSOFT Empirical Standards mentions payouts only for increasing participation in surveys, but not for mimicking real-world motivations and behavior in experiments. Within this article, we report a controlled experiment in which we tackled this gap by studying how different financial incentivization schemes impact developers. For this purpose, we first conducted a survey on financial incentives used in the real-world, based on which we designed three incentivization schemes: (1) a performance-dependent scheme that employees prefer, (2) a scheme that is performance-independent, and (3) a scheme that mimics open-source development. Then, using a between-subject experimental design, we explored how these three schemes impact participants' performance. Our findings indicate that the different schemes can impact participants' performance in software-engineering experiments. Due to the small sample sizes, our results are not statistically significant, but we can still observe clear tendencies. Our contributions help understand the impact of financial incentives on participants in experiments as well as real-world scenarios, guiding researchers in designing experiments and organizations in compensating developers.

Submitted 19 March, 2024; v1 submitted 22 February, 2022; originally announced February 2022.

Comments: Laboratory experiment for our registered report (previous preprints) with tracked changes, submitted for peer review

arXiv:2202.04586 [pdf, other]

doi 10.1145/3510003.3511560

Less is More: Supporting Developers in Vulnerability Detection during Code Review

Authors: Larissa Braz, Christian Aeberhard, Gül Çalikli, Alberto Bacchelli

Abstract: Reviewing source code from a security perspective has proven to be a difficult task. Indeed, previous research has shown that developers often miss even popular and easy-to-detect vulnerabilities during code review. Initial evidence suggests that a significant cause may lie in the reviewers' mental attitude and common practices. In this study, we investigate whether and how explicitly asking developers to focus on security during a code review affects the detection of vulnerabilities. Furthermore, we evaluate the effect of providing a security checklist to guide the security review. To this aim, we conduct an online experiment with 150 participants, of which 71% report to have three or more years of professional development experience. Our results show that simply asking reviewers to focus on security during the code review increases eight times the probability of vulnerability detection. The presence of a security checklist does not significantly improve the outcome further, even when the checklist is tailored to the change under review and the existing vulnerabilities in the change. These results provide evidence supporting the mental attitude hypothesis and call for further work on security checklists' effectiveness and design. Data and materials: https://doi.org/10.5281/zenodo.6026291

Submitted 12 February, 2022; v1 submitted 9 February, 2022; originally announced February 2022.

Comments: This paper has been accepted at ICSE 2022 (44th ACM/IEEE International Conference on Software Engineering). This version of the paper uses a different title to match the one used for ICSE 2022

arXiv:2201.05425 [pdf, other]

Interpersonal Conflicts During Code Review

Authors: Pavlína Wurzel Gonçalves, Gül Çalıklı, Alberto Bacchelli

Abstract: Code review consists of manual inspection, discussion, and judgment of source code by developers other than the code's author. Due to discussions around competing ideas and group decision-making processes, interpersonal conflicts during code reviews are expected. This study systematically investigates how developers perceive code review conflicts and addresses interpersonal conflicts during code reviews as a theoretical construct. Through the thematic analysis of interviews conducted with 22 developers, we confirm that conflicts during code reviews are commonplace, anticipated and seen as normal by developers. Even though conflicts do happen and carry a negative impact for the review, conflicts, if resolved constructively, can also create value and bring improvement. Moreover, the analysis provided insights on how strongly conflicts during code review and its context (i.e., code, developer, team, organization) are intertwined. Finally, there are aspects specific to code review conflicts that call for the research and application of customized conflict resolution and management techniques, some of which are discussed in this paper. Data and material: https://doi.org/10.5281/zenodo.5848794

Submitted 14 January, 2022; originally announced January 2022.

Comments: Paper also published in the Proceedings of the ACM on Human Computer Interaction (HCI) for the 25th ACM Conference On Computer-Supported Cooperative Work And Social Computing (CSCW 2022)

arXiv:2102.06251 [pdf, other]

Why Don't Developers Detect Improper Input Validation?'; DROP TABLE Papers; --

Authors: Larissa Braz, Enrico Fregnan, Gül Çalikli, Alberto Bacchelli

Abstract: Improper Input Validation (IIV) is a software vulnerability that occurs when a system does not safely handle input data. Even though IIV is easy to detect and fix, it still commonly happens in practice. In this paper, we study to what extent developers can detect IIV and investigate underlying reasons. This knowledge is essential to better understand how to support developers in creating secure software systems. We conduct an online experiment with 146 participants, of which 105 report at least three years of professional software development experience. Our results show that the existence of a visible attack scenario facilitates the detection of IIV vulnerabilities and that a significant portion of developers who did not find the vulnerability initially could identify it when warned about its existence. Yet, a total of 60 participants could not detect the vulnerability even after the warning. Other factors, such as the frequency with which the participants perform code reviews, influence the detection of IIV. Data and materials: https://doi.org/10.5281/zenodo.3996696

Submitted 11 February, 2021; originally announced February 2021.

arXiv:1807.07800 [pdf, other]

Safety-Critical Systems and Agile Development: A Mapping Study

Authors: Rashidah Kasauli, Eric Knauss, Benjamin Kanagwa, Agneta Nilsson, Gul Calikli

Abstract: In the last decades, agile methods had a huge impact on how software is developed. In many cases, this has led to significant benefits, such as quality and speed of software deliveries to customers. However, safety-critical systems have widely been dismissed from benefiting from agile methods. Products that include safety critical aspects are therefore faced with a situation in which the development of safety-critical parts can significantly limit the potential speed-up through agile methods, for the full product, but also in the non-safety critical parts. For such products, the ability to develop safety-critical software in an agile way will generate a competitive advantage. In order to enable future research in this important area, we present in this paper a mapping of the current state of practice based on a mixed method approach. Starting from a workshop with experts from six large Swedish product development companies we develop a lens for our analysis. We then present a systematic mapping study on safety-critical systems and agile development through this lens in order to map potential benefits, challenges, and solution candidates for guiding future research.

Submitted 3 August, 2018; v1 submitted 20 July, 2018; originally announced July 2018.

Comments: Accepted at Euromicro Conf. on Software Engineering and Advanced Applications 2018, Prague, Czech Republic

arXiv:1805.01151 [pdf, other]

Involving External Stakeholders in Project Courses

Authors: Jan-Philipp Steghöfer, Håkan Burden, Regina Hebig, Gul Calikli, Robert Feldt, Imed Hammouda, Jennifer Horkoff, Eric Knauss, Grischa Liebel

Abstract: Problem: The involvement of external stakeholders in capstone projects and project courses is desirable due to its potential positive effects on the students. Capstone projects particularly profit from the inclusion of an industrial partner to make the project relevant and help students acquire professional skills. In addition, an increasing push towards education that is aligned with industry and incorporates industrial partners can be observed. However, the involvement of external stakeholders in teaching moments can create friction and could, in the worst case, lead to frustration of all involved parties. Contribution: We developed a model that allows analysing the involvement of external stakeholders in university courses both in a retrospective fashion, to gain insights from past course instances, and in a constructive fashion, to plan the involvement of external stakeholders. Key Concepts: The conceptual model and the accompanying guideline guide the teachers in their analysis of stakeholder involvement. The model is comprised of several activities (define, execute, and evaluate the collaboration). The guideline provides questions that the teachers should answer for each of these activities. In the constructive use, the model allows teachers to define an action plan based on an analysis of potential stakeholders and the pedagogical objectives. In the retrospective use, the model allows teachers to identify issues that appeared during the project and their underlying causes. Drawing from ideas of the reflective practitioner, the model contains an emphasis on reflection and interpretation of the observations made by the teacher and other groups involved in the courses. Key Lessons: Applying the model retrospectively to a total of eight courses shows that it is possible to reveal hitherto implicit risks and assumptions and to gain a better insight into the interaction...

Submitted 4 May, 2018; v1 submitted 3 May, 2018; originally announced May 2018.

Comments: Abstract shortened since arxiv.org limits length of abstracts. See paper/pdf for full abstract. Paper is forthcoming, accepted August 2017. Arxiv version 2 corrects misspelled author name

Journal ref: ACM Transactions on Computing Education (TOCE), acc. August 2017

Showing 1–8 of 8 results for author: Calikli, G