Showing 1–35 of 35 results for author: Devanbu, P

  1. arXiv:2405.02828  [pdf, other]

    cs.SE cs.LG

    Trojans in Large Language Models of Code: A Critical Review through a Trigger-Based Taxonomy

    Authors: Aftab Hussain, Md Rafiqul Islam Rabin, Toufique Ahmed, Bowen Xu, Premkumar Devanbu, Mohammad Amin Alipour

    Abstract: Large language models (LLMs) have provided a lot of exciting new capabilities in software development. However, the opaque nature of these models makes them difficult to reason about and inspect. Their opacity gives rise to potential security risks, as adversaries can train and deploy compromised models to disrupt the software development process in the victims' organization. This work presents…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2305.03803

  2. arXiv:2404.19318  [pdf, other]

    cs.SE cs.CL

    Enhancing Trust in LLM-Generated Code Summaries with Calibrated Confidence Scores

    Authors: Yuvraj Virk, Premkumar Devanbu, Toufique Ahmed

    Abstract: A good summary can often be very useful during program comprehension. While a brief, fluent, and relevant summary can be helpful, it does require significant human effort to produce. Often, good summaries are unavailable in software projects, thus making maintenance more difficult. There has been a considerable body of research into automated AI-based methods, using Large Language models (LLMs), t…

    Submitted 30 April, 2024; originally announced April 2024.

  3. arXiv:2403.17134  [pdf, other]

    cs.SE cs.AI

    RepairAgent: An Autonomous, LLM-Based Agent for Program Repair

    Authors: Islem Bouzenia, Premkumar Devanbu, Michael Pradel

    Abstract: Automated program repair has emerged as a powerful technique to mitigate the impact of software bugs on system reliability and user experience. This paper introduces RepairAgent, the first work to address the program repair challenge through an autonomous agent based on a large language model (LLM). Unlike existing deep learning-based approaches, which prompt a model with a fixed prompt or in a fi…

    Submitted 25 March, 2024; originally announced March 2024.

  4. arXiv:2403.07506  [pdf, other]

    cs.SE

    Robustness, Security, Privacy, Explainability, Efficiency, and Usability of Large Language Models for Code

    Authors: Zhou Yang, Zhensu Sun, Terry Zhuo Yue, Premkumar Devanbu, David Lo

    Abstract: Large language models for code (LLM4Code), which demonstrate strong performance (e.g., high accuracy) in processing source code, have significantly transformed software engineering. Many studies separately investigate the non-functional properties of LLM4Code, but there is no systematic review of how these properties are evaluated and enhanced. This paper fills this gap by thoroughly examining 146…

    Submitted 12 March, 2024; originally announced March 2024.

  5. arXiv:2402.15100  [pdf, other]

    cs.SE cs.LG

    Studying LLM Performance on Closed- and Open-source Data

    Authors: Toufique Ahmed, Christian Bird, Premkumar Devanbu, Saikat Chakraborty

    Abstract: Large Language models (LLMs) are finding wide use in software engineering practice. These models are extremely data-hungry, and are largely trained on open-source (OSS) code distributed with permissive licenses. In terms of actual use however, a great deal of software development still occurs in the for-profit/proprietary sphere, where the code under development is not, and never has been, in the…

    Submitted 23 February, 2024; originally announced February 2024.

  6. arXiv:2402.02047  [pdf, other]

    cs.SE cs.LG

    Calibration and Correctness of Language Models for Code

    Authors: Claudio Spiess, David Gros, Kunal Suresh Pai, Michael Pradel, Md Rafiqul Islam Rabin, Amin Alipour, Susmit Jha, Prem Devanbu, Toufique Ahmed

    Abstract: Machine learning models are widely used but can also often be wrong. Users would benefit from a reliable indication of whether a given output from a given model should be trusted, so a rational decision can be made whether to use the output or not. For example, outputs can be associated with a confidence measure; if this confidence measure is strongly associated with likelihood of correctness, the…

    Submitted 16 February, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

  7. arXiv:2306.11943  [pdf, other]

    cs.SE cs.CL cs.LG

    Towards Understanding What Code Language Models Learned

    Authors: Toufique Ahmed, Dian Yu, Chengxuan Huang, Cathy Wang, Prem Devanbu, Kenji Sagae

    Abstract: Pre-trained language models are effective in a variety of natural language tasks, but it has been argued their capabilities fall short of fully learning meaning or understanding language. To understand the extent to which language models can learn some form of meaning, we investigate their ability to capture semantics of code beyond superficial frequency and co-occurrence. In contrast to previous…

    Submitted 27 February, 2024; v1 submitted 20 June, 2023; originally announced June 2023.

  8. arXiv:2306.00108  [pdf, other]

    cs.SE cs.LG

    Better patching using LLM prompting, via Self-Consistency

    Authors: Toufique Ahmed, Premkumar Devanbu

    Abstract: Large Language models (LLMs) can be induced to solve non-trivial problems with "few-shot" prompts including illustrative problem-solution examples. Now if the few-shots also include "chain of thought" (CoT) explanations, which are of the form problem-explanation-solution, LLMs will generate an "explained" solution, and perform even better. Recently an exciting, substantially better technique, self-…

    Submitted 16 August, 2023; v1 submitted 31 May, 2023; originally announced June 2023.

    Comments: Accepted at ASE-NIER (2023) track

  9. arXiv:2305.03803  [pdf, other]

    cs.SE

    A Survey of Trojans in Neural Models of Source Code: Taxonomy and Techniques

    Authors: Aftab Hussain, Md Rafiqul Islam Rabin, Toufique Ahmed, Navid Ayoobi, Bowen Xu, Prem Devanbu, Mohammad Amin Alipour

    Abstract: In this work, we study literature in Explainable AI and Safe AI to understand poisoning of neural models of code. In order to do so, we first establish a novel taxonomy for Trojan AI for code, and present a new aspect-based classification of triggers in neural models of code. Next, we highlight recent works that help us deepen our conception of how these models understand software code. Then we pi…

    Submitted 18 April, 2024; v1 submitted 5 May, 2023; originally announced May 2023.

  10. arXiv:2304.14597  [pdf, other]

    cs.SE

    AI Safety Subproblems for Software Engineering Researchers

    Authors: David Gros, Prem Devanbu, Zhou Yu

    Abstract: In this 4-page manuscript we discuss the problem of long-term AI Safety from a Software Engineering (SE) research viewpoint. We briefly summarize long-term AI Safety, and the challenge of avoiding harms from AI as systems meet or exceed human capabilities, including software engineering capabilities (and approach AGI / "HLMI"). We perform a quantified literature review suggesting that AI Safety di…

    Submitted 31 August, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: Arxived Apr 2023. Update June 2023 to correct some typos and small text changes. Update Sept 2023, small typos/adjustment, adjust intro to clarify citation analysis focus on HLMI / advanced AI, rerun scripts and tweak handling of unknown venues, add TOSEM, de-anon github and acknowledgements

  11. arXiv:2304.06815  [pdf, other]

    cs.SE cs.LG

    Automatic Semantic Augmentation of Language Model Prompts (for Code Summarization)

    Authors: Toufique Ahmed, Kunal Suresh Pai, Premkumar Devanbu, Earl T. Barr

    Abstract: Large Language Models (LLM) are a new class of computation engines, "programmed" via prompt engineering. We are still learning how to best "program" these LLMs to help developers. We start with the intuition that developers tend to consciously and unconsciously have a collection of semantic facts in mind when working on coding tasks. Mostly these are shallow, simple facts arising from a quick rea…

    Submitted 11 January, 2024; v1 submitted 13 April, 2023; originally announced April 2023.

    Comments: Accepted at International Conference on Software Engineering (ICSE-2024)

  12. arXiv:2303.11455  [pdf, other]

    cs.SE cs.CL cs.LG

    Large Language Models and Simple, Stupid Bugs

    Authors: Kevin Jesse, Toufique Ahmed, Premkumar T. Devanbu, Emily Morgan

    Abstract: With the advent of powerful neural language models, AI-based systems to assist developers in coding tasks are becoming widely available; Copilot is one such system. Copilot uses Codex, a large language model (LLM), to complete code conditioned on a preceding "prompt". Codex, however, is trained on public GitHub repositories, viz., on code that may include bugs and vulnerabilities. Previous studies…

    Submitted 20 March, 2023; originally announced March 2023.

    Comments: Accepted at International Conference on Mining Software Repositories (MSR-2023)

  13. arXiv:2301.01701  [pdf, other]

    cs.CR cs.AI cs.LG cs.SE

    Extending Source Code Pre-Trained Language Models to Summarise Decompiled Binaries

    Authors: Ali Al-Kaswan, Toufique Ahmed, Maliheh Izadi, Anand Ashok Sawant, Premkumar Devanbu, Arie van Deursen

    Abstract: Reverse engineering binaries is required to understand and analyse programs for which the source code is unavailable. Decompilers can transform the largely unreadable binaries into a more readable source code-like representation. However, reverse engineering is time-consuming, much of which is taken up by labelling the functions with semantic information. While the automated summarisation of dec…

    Submitted 13 January, 2023; v1 submitted 4 January, 2023; originally announced January 2023.

    Comments: SANER 2023 Technical Track Camera Ready

  14. arXiv:2207.04237  [pdf, other]

    cs.SE cs.LG

    Few-shot training LLMs for project-specific code-summarization

    Authors: Toufique Ahmed, Premkumar Devanbu

    Abstract: Very large language models (LLMs), such as GPT-3 and Codex, have achieved state-of-the-art performance on several natural-language tasks, and show great promise also for code. A particularly exciting aspect of LLMs is their knack for few-shot and zero-shot learning: they can learn to perform a task with very few examples. Few-shotting has particular synergies in software engineering, where there ar…

    Submitted 8 September, 2022; v1 submitted 9 July, 2022; originally announced July 2022.

    Comments: Accepted at ASE-NIER (2022) track

  15. arXiv:2206.07585  [pdf, other]

    cs.PL cs.AI cs.LG cs.SE

    NatGen: Generative pre-training by "Naturalizing" source code

    Authors: Saikat Chakraborty, Toufique Ahmed, Yangruibo Ding, Premkumar Devanbu, Baishakhi Ray

    Abstract: Pre-trained Generative Language models (e.g. PLBART, CodeT5, SPT-Code) for source code yielded strong results on several tasks in the past few years, including code generation and translation. These models have adopted varying pre-training objectives to learn statistics of code construction from very large-scale corpora in a self-supervised fashion; the success of pre-trained models largely hinges…

    Submitted 5 July, 2022; v1 submitted 15 June, 2022; originally announced June 2022.

    Comments: Accepted to be published in ESEC/FSE 2022

  16. arXiv:2206.00804  [pdf, other]

    cs.SE cs.LG

    Learning code summarization from a small and local dataset

    Authors: Toufique Ahmed, Premkumar Devanbu

    Abstract: Foundation models (e.g., CodeBERT, GraphCodeBERT, CodeT5) work well for many software engineering tasks. These models are pre-trained (using self-supervision) with billions of code tokens, and then fine-tuned with hundreds of thousands of labeled examples, typically drawn from many projects. However, software phenomena can be very project-specific. Vocabulary, and other phenomena vary substantiall…

    Submitted 1 June, 2022; originally announced June 2022.

  17. Multilingual training for Software Engineering

    Authors: Toufique Ahmed, Premkumar Devanbu

    Abstract: Well-trained machine-learning models, which leverage large amounts of open-source software data, have now become an interesting approach to automating many software engineering tasks. Several SE tasks have all been subject to this approach, with performance gradually improving over the past several years with better models and training methods. More, and more diverse, clean, labeled data is better…

    Submitted 2 February, 2022; v1 submitted 3 December, 2021; originally announced December 2021.

    Comments: Accepted at International Conference on Software Engineering (ICSE-2022)

  18. arXiv:2104.14671  [pdf, other]

    cs.SE cs.LG

    SYNFIX: Automatically Fixing Syntax Errors using Compiler Diagnostics

    Authors: Toufique Ahmed, Noah Rose Ledesma, Premkumar Devanbu

    Abstract: Beginning programmers struggle with the complex grammar of modern programming languages like Java, and make a lot of syntax errors. The diagnostic syntax error messages from compilers and IDEs are sometimes useful, but often the messages are cryptic and puzzling. Students could be helped, and instructors' time saved, by automated repair suggestions when dealing with syntax errors. Large samples of s…

    Submitted 11 October, 2022; v1 submitted 29 April, 2021; originally announced April 2021.

  19. Learning to Find Usages of Library Functions in Optimized Binaries

    Authors: Toufique Ahmed, Premkumar Devanbu, Anand Ashok Sawant

    Abstract: Much software, whether beneficent or malevolent, is distributed only as binaries, sans source code. Absent source code, understanding binaries' behavior can be quite challenging, especially when compiled under higher levels of compiler optimization. These optimizations can transform comprehensible, "natural" source constructions into something entirely unrecognizable. Reverse engineering binaries,…

    Submitted 16 September, 2021; v1 submitted 8 March, 2021; originally announced March 2021.

    Journal ref: Transactions on Software Engineering (2021)

  20. arXiv:2010.01410  [pdf, other]

    cs.SE cs.CL

    Code to Comment "Translation": Data, Metrics, Baselining & Evaluation

    Authors: David Gros, Hariharan Sezhiyan, Prem Devanbu, Zhou Yu

    Abstract: The relationship of comments to code, and in particular, the task of generating useful comments given the code, has long been of interest. The earliest approaches have been based on strong syntactic theories of comment-structures, and relied on textual templates. More recently, researchers have applied deep learning methods to this task, and specifically, trainable generative translation models wh…

    Submitted 3 October, 2020; originally announced October 2020.

  21. arXiv:2009.08525  [pdf, other]

    cs.SE cs.AI cs.LG

    Deep Learning & Software Engineering: State of Research and Future Directions

    Authors: Prem Devanbu, Matthew Dwyer, Sebastian Elbaum, Michael Lowry, Kevin Moran, Denys Poshyvanyk, Baishakhi Ray, Rishabh Singh, Xiangyu Zhang

    Abstract: Given the current transformative potential of research that sits at the intersection of Deep Learning (DL) and Software Engineering (SE), an NSF-sponsored community workshop was conducted in co-location with the 34th IEEE/ACM International Conference on Automated Software Engineering (ASE'19) in San Diego, California. The goal of this workshop was to outline high priority areas for cross-cutting r…

    Submitted 17 September, 2020; originally announced September 2020.

    Comments: Community Report from the 2019 NSF Workshop on Deep Learning & Software Engineering, 37 pages

  22. arXiv:2008.10707  [pdf, other]

    cs.SE cs.LG cs.PL

    Patching as Translation: the Data and the Metaphor

    Authors: Yangruibo Ding, Baishakhi Ray, Premkumar Devanbu, Vincent J. Hellendoorn

    Abstract: Machine Learning models from other fields, like Computational Linguistics, have been transplanted to Software Engineering tasks, often quite successfully. Yet a transplanted model's initial success at a given task does not necessarily mean it is well-suited for the task. In this work, we examine a common example of this phenomenon: the conceit that "software patching is like language translation".…

    Submitted 31 August, 2020; v1 submitted 24 August, 2020; originally announced August 2020.

  23. arXiv:1911.07393  [pdf]

    cs.SE cs.PL

    Rebuttal to Berger et al., TOPLAS 2019

    Authors: Baishakhi Ray, Prem Devanbu, Vladimir Filkov

    Abstract: Berger et al., published in TOPLAS 2019, is a critique of our 2014 FSE conference abstract and its archival version, the 2017 CACM paper: A Large-Scale Study of Programming Languages and Code Quality in Github. In their paper Berger et al. make academic claims about the veracity of our work. Here, we respond to their technical and scientific critiques aimed at our work, attempting to stick with sc…

    Submitted 17 November, 2019; originally announced November 2019.

    Comments: 12 pages

  24. Learning Lenient Parsing & Typing via Indirect Supervision

    Authors: Toufique Ahmed, Premkumar Devanbu, Vincent Hellendoorn

    Abstract: Both professional coders and teachers frequently deal with imperfect (fragmentary, incomplete, ill-formed) code. Such fragments are common in STACKOVERFLOW; students also frequently produce ill-formed code, for which instructors, TAs (or students themselves) must find repairs. In either case, the developer experience could be greatly improved if such code could somehow be parsed & typed; this make…

    Submitted 9 February, 2021; v1 submitted 13 October, 2019; originally announced October 2019.

    Comments: Accepted at EMSE (Empirical Software Engineering Journal)

    Report number: 29

    Journal ref: Empirical Software Engineering volume 26 (2021)

  25. arXiv:1910.03704  [pdf, other]

    cs.CL cs.IT cs.PL

    Do People Prefer "Natural" code?

    Authors: Casey Casalnuovo, Kevin Lee, Hulin Wang, Prem Devanbu, Emily Morgan

    Abstract: Natural code is known to be very repetitive (much more so than natural language corpora); furthermore, this repetitiveness persists, even after accounting for the simpler syntax of code. However, programming languages are very expressive, allowing a great many different ways (all clear and unambiguous) to express even very simple computations. So why is natural code repetitive? We hypothesize that…

    Submitted 8 October, 2019; originally announced October 2019.

  26. arXiv:1903.06725  [pdf, other]

    cs.SE

    BugSwarm: Mining and Continuously Growing a Dataset of Reproducible Failures and Fixes

    Authors: David A. Tomassi, Naji Dmeiri, Yichen Wang, Antara Bhowmick, Yen-Chuan Liu, Premkumar Devanbu, Bogdan Vasilescu, Cindy Rubio-González

    Abstract: Fault-detection, localization, and repair methods are vital to software quality; but it is difficult to evaluate their generality, applicability, and current effectiveness. Large, diverse, realistic datasets of durably-reproducible faults and fixes are vital to good experimental evaluation of approaches to software quality, but they are difficult and expensive to assemble and keep current. Modern…

    Submitted 22 July, 2019; v1 submitted 15 March, 2019; originally announced March 2019.

    Comments: In Proceedings of the 41st ACM/IEEE International Conference on Software Engineering (ICSE'19)

  27. arXiv:1903.06089  [pdf, ps, other]

    cs.SE

    Are My Invariants Valid? A Learning Approach

    Authors: Vincent J. Hellendoorn, Premkumar T. Devanbu, Oleksandr Polozov, Mark Marron

    Abstract: Ensuring that a program operates correctly is a difficult task in large, complex systems. Enshrining invariants -- desired properties of correct execution -- in code or comments can support maintainability and help sustain correctness. Tools that can automatically infer and recommend invariants can thus be very beneficial. However, current invariant-suggesting tools, such as Daikon, suffer from hi…

    Submitted 15 March, 2019; v1 submitted 14 March, 2019; originally announced March 2019.

    Comments: 10 pages

  28. arXiv:1806.08457  [pdf, other]

    cs.SE

    Whom Are You Going to Call?: Determinants of @-Mentions in GitHub Discussions

    Authors: David Kavaler, Premkumar Devanbu, Vladimir Filkov

    Abstract: Open Source Software (OSS) project success relies on crowd contributions. When an issue arises in pull-request based systems, @-mentions are used to call on people to task; previous studies have shown that @-mentions in discussions are associated with faster issue resolution. In most projects there may be many developers who could technically handle a variety of tasks. But OSS supports dynamic tea…

    Submitted 21 June, 2018; originally announced June 2018.

    Comments: 12 pages, 5 figures, 2 tables

    ACM Class: D.2.2

  29. arXiv:1806.02437  [pdf, other]

    cs.CL

    Studying the Difference Between Natural and Programming Language Corpora

    Authors: Casey Casalnuovo, Kenji Sagae, Prem Devanbu

    Abstract: Code corpora, as observed in large software systems, are now known to be far more repetitive and predictable than natural language corpora. But why? Does the difference simply arise from the syntactic limitations of programming languages? Or does it arise from the differences in authoring decisions made by the writers of these natural and programming language texts? We conjecture that the differen…

    Submitted 6 June, 2018; originally announced June 2018.

    Comments: Preprint

    MSC Class: 68N15; 68T50

  30. arXiv:1709.06182  [pdf, ps, other]

    cs.SE cs.LG cs.PL

    A Survey of Machine Learning for Big Code and Naturalness

    Authors: Miltiadis Allamanis, Earl T. Barr, Premkumar Devanbu, Charles Sutton

    Abstract: Research at the intersection of machine learning, programming languages, and software engineering has recently taken important steps in proposing learnable probabilistic models of source code that exploit code's abundance of patterns. In this article, we survey this work. We contrast programming languages against natural languages and discuss how these similarities and differences drive the design…

    Submitted 4 May, 2018; v1 submitted 18 September, 2017; originally announced September 2017.

    Comments: Website accompanying this survey paper can be found at https://ml4code.github.io

  31. arXiv:1607.07602  [pdf]

    cs.SE cs.AI cs.CL

    OntoCat: Automatically categorizing knowledge in API Documentation

    Authors: Niraj Kumar, Premkumar Devanbu

    Abstract: Most application development happens in the context of complex APIs; reference documentation for APIs has grown tremendously in variety, complexity, and volume, and can be difficult to navigate. There is a growing need to develop well-organized ways to access the knowledge latent in the documentation; several research efforts deal with the organization (ontology) of API-related knowledge. Extensiv…

    Submitted 26 July, 2016; originally announced July 2016.

    Comments: To be submitted for journal publication

  32. arXiv:1606.00521  [pdf, other]

    cs.SE

    Initial and Eventual Software Quality Relating to Continuous Integration in GitHub

    Authors: Yue Yu, Bogdan Vasilescu, Huaimin Wang, Vladimir Filkov, Premkumar Devanbu

    Abstract: The constant demand for new features and bug fixes is forcing software projects to shorten cycles and deliver updates ever faster, while sustaining software quality. The availability of inexpensive, virtualized, cloud-computing has helped shorten schedules, by enabling continuous integration (CI) on demand. Platforms like GitHub support CI in-the-cloud. In projects using CI, a user submitting a p…

    Submitted 1 June, 2016; originally announced June 2016.

  33. arXiv:1506.01159  [pdf, other]

    cs.SE

    On the "Naturalness" of Buggy Code

    Authors: Baishakhi Ray, Vincent Hellendoorn, Saheel Godhane, Zhaopeng Tu, Alberto Bacchelli, Premkumar Devanbu

    Abstract: Real software, the kind working programmers produce by the kLOC to solve real-world problems, tends to be "natural", like speech or natural language; it tends to be highly repetitive and predictable. Researchers have captured this naturalness of software through statistical models and used them to good effect in suggestion engines, porting tools, coding standards checkers, and idiom miners. This s…

    Submitted 10 September, 2015; v1 submitted 3 June, 2015; originally announced June 2015.

    Comments: 12 pages

    MSC Class: 68N30

  34. arXiv:1404.5708  [pdf, other]

    cs.SE cs.HC cs.SI physics.data-an

    Converging Work-Talk Patterns in Online Task-Oriented Communities

    Authors: Qi Xuan, Premkumar T Devanbu, Vladimir Filkov

    Abstract: Much of what we do is accomplished by working collaboratively with others, and a large portion of our lives is spent working and talking; the patterns embodied in the alternation of working and talking can provide much useful insight into task-oriented social behaviors. The available electronic traces of the different kinds of human activities in online communities are an empirical goldmine that…

    Submitted 23 April, 2014; originally announced April 2014.

    ACM Class: H.2.8; D.2.8; D.2.9

  35. arXiv:0805.1489  [pdf, other]

    cond-mat.stat-mech cs.SE q-bio.QM stat.AP

    Modeling and verifying a broad array of network properties

    Authors: Vladimir Filkov, Zachary M. Saul, Soumen Roy, Raissa M. D'Souza, Premkumar T. Devanbu

    Abstract: Motivated by widely observed examples in nature, society and software, where groups of already related nodes arrive together and attach to an existing network, we consider network growth via sequential attachment of linked node groups, or graphlets. We analyze the simplest case, attachment of the three node V-graphlet, where, with probability alpha, we attach a peripheral node of the graphlet, a…

    Submitted 26 March, 2009; v1 submitted 12 May, 2008; originally announced May 2008.

    Comments: To appear in Europhysics Letters

    Journal ref: Europhysics Letters, 86 (2009) 28003