Showing 1–7 of 7 results for author: Majdinasab, V

  1. arXiv:2407.08890  [pdf, other]

    cs.SE cs.AI cs.LG

    DeepCodeProbe: Towards Understanding What Models Trained on Code Learn

    Authors: Vahid Majdinasab, Amin Nikanjam, Foutse Khomh

    Abstract: Machine learning models trained on code and related artifacts offer valuable support for software maintenance but suffer from interpretability issues due to their complex internal variables. These concerns are particularly significant in safety-critical applications where the models' decision-making processes must be reliable. The specific features and representations learned by these models remai…

    Submitted 11 July, 2024; originally announced July 2024.

    ACM Class: I.2.5; D.2.3

  2. arXiv:2402.09299  [pdf, other]

    cs.SE cs.LG

    Trained Without My Consent: Detecting Code Inclusion In Language Models Trained on Code

    Authors: Vahid Majdinasab, Amin Nikanjam, Foutse Khomh

    Abstract: Code auditing ensures that the developed code adheres to standards, regulations, and copyright protection by verifying that it does not contain code from protected sources. The recent advent of Large Language Models (LLMs) as coding assistants in the software development process poses new challenges for code auditing. The dataset for training these models is mainly collected from publicly availabl…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: Submitted to TOSEM (ACM Transactions on Software Engineering and Methodology)

  3. arXiv:2311.11177  [pdf, other]

    cs.SE

    Assessing the Security of GitHub Copilot Generated Code -- A Targeted Replication Study

    Authors: Vahid Majdinasab, Michael Joshua Bishop, Shawn Rasheed, Arghavan Moradidakhel, Amjed Tahir, Foutse Khomh

    Abstract: AI-powered code generation models have been developing rapidly, allowing developers to expedite code generation and thus improve their productivity. These models are trained on large corpora of code (primarily sourced from public repositories), which may contain bugs and vulnerabilities. Several concerns have been raised about the security of the code generated by these models. Recent studies have…

    Submitted 18 November, 2023; originally announced November 2023.

  4. arXiv:2308.16557  [pdf, other]

    cs.SE

    Effective Test Generation Using Pre-trained Large Language Models and Mutation Testing

    Authors: Arghavan Moradi Dakhel, Amin Nikanjam, Vahid Majdinasab, Foutse Khomh, Michel C. Desmarais

    Abstract: One of the critical phases in software development is software testing. Testing helps with identifying potential bugs and reducing maintenance costs. The goal of automated test generation tools is to ease the development of tests by suggesting efficient bug-revealing tests. Recently, researchers have leveraged Large Language Models (LLMs) of code to generate unit tests. While the code coverage of…

    Submitted 31 August, 2023; originally announced August 2023.

    Comments: 16 pages, 3 figures

  5. arXiv:2307.13777  [pdf, other]

    cs.SE cs.AI

    An Empirical Study on Bugs Inside PyTorch: A Replication Study

    Authors: Sharon Chee Yin Ho, Vahid Majdinasab, Mohayeminul Islam, Diego Elias Costa, Emad Shihab, Foutse Khomh, Sarah Nadi, Muhammad Raza

    Abstract: Software systems are increasingly relying on deep learning components, due to their remarkable capability of identifying complex data patterns and powering intelligent behaviour. A core enabler of this change in software development is the availability of easy-to-use deep learning libraries. Libraries like PyTorch and TensorFlow empower a large variety of intelligent systems, offering a multitude…

    Submitted 1 August, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

  6. arXiv:2301.05651  [pdf, other]

    cs.LG cs.SE

    Mutation Testing of Deep Reinforcement Learning Based on Real Faults

    Authors: Florian Tambon, Vahid Majdinasab, Amin Nikanjam, Foutse Khomh, Giuliano Antoniol

    Abstract: Testing Deep Learning (DL) systems is a complex task as they do not behave like traditional systems would, notably because of their stochastic nature. Nonetheless, being able to adapt existing testing techniques such as Mutation Testing (MT) to DL settings would greatly improve their potential verifiability. While some efforts have been made to extend MT to the Supervised Learning paradigm, little…

    Submitted 13 January, 2023; originally announced January 2023.

    Comments: Accepted to the International Conference of Software Testing (ICST2023)

  7. arXiv:2206.15331  [pdf, other]

    cs.SE cs.LG

    GitHub Copilot AI pair programmer: Asset or Liability?

    Authors: Arghavan Moradi Dakhel, Vahid Majdinasab, Amin Nikanjam, Foutse Khomh, Michel C. Desmarais, Zhen Ming (Jack) Jiang

    Abstract: Automatic program synthesis is a long-lasting dream in software engineering. Recently, a promising Deep Learning (DL) based solution, called Copilot, has been proposed by OpenAI and Microsoft as an industrial product. Although some studies evaluate the correctness of Copilot solutions and report its issues, more empirical evaluations are necessary to understand how developers can benefit from it e…

    Submitted 14 April, 2023; v1 submitted 30 June, 2022; originally announced June 2022.

    Comments: 27 pages, 8 figures