Class-Level Code Generation from Natural Language Using Iterative, Tool-Enhanced Reasoning over Repository

Authors: Ajinkya Deshpande, Anmol Agarwal, Shashank Shet, Arun Iyer, Aditya Kanade, Ramakrishna Bairi, Suresh Parthasarathy

Abstract: LLMs have demonstrated significant potential in code generation tasks, achieving promising results at the function or statement level across various benchmarks. However, the complexities associated with creating code artifacts like classes, particularly within the context of real-world software repositories, remain underexplored. Prior research treats class-level generation as an isolated task, neglecting the intricate dependencies and interactions that characterize real-world software environments. To address this gap, we introduce RepoClassBench, a comprehensive benchmark designed to rigorously evaluate LLMs in generating complex, class-level code within real-world repositories. RepoClassBench includes "Natural Language to Class generation" tasks across Java, Python and C# from a selection of repositories. We ensure that each class in our dataset not only has cross-file dependencies within the repository but also includes corresponding test cases to verify its functionality. We find that current models struggle with the realistic challenges posed by our benchmark, primarily due to their limited exposure to relevant repository contexts. To address this shortcoming, we introduce Retrieve-Repotools-Reflect (RRR), a novel approach that equips LLMs with static analysis tools to iteratively navigate and reason about repository-level context in an agent-based framework. Our experiments demonstrate that RRR significantly outperforms existing baselines on RepoClassBench, showcasing its effectiveness across programming languages and under various settings. Our findings emphasize the critical need for code-generation benchmarks to incorporate repo-level dependencies to more accurately reflect the complexities of software development. Our work shows the benefits of leveraging specialized tools to enhance LLMs' understanding of repository context. We plan to make our dataset and evaluation harness public.

Submitted 5 June, 2024; v1 submitted 21 April, 2024; originally announced May 2024.

Comments: Preprint with additional experiments

arXiv:2309.12499 [pdf, other]

CodePlan: Repository-level Coding using LLMs and Planning

Authors: Ramakrishna Bairi, Atharv Sonwane, Aditya Kanade, Vageesh D C, Arun Iyer, Suresh Parthasarathy, Sriram Rajamani, B. Ashok, Shashank Shet

Abstract: Software engineering activities such as package migration, fixing error reports from static analysis or testing, and adding type annotations or other specifications to a codebase, involve pervasively editing the entire repository of code. We formulate these activities as repository-level coding tasks. Recent tools like GitHub Copilot, which are powered by Large Language Models (LLMs), have succeeded in offering high-quality solutions to localized coding problems. Repository-level coding tasks are more involved and cannot be solved directly using LLMs, since code within a repository is inter-dependent and the entire repository may be too large to fit into the prompt. We frame repository-level coding as a planning problem and present a task-agnostic framework, called CodePlan, to solve it. CodePlan synthesizes a multi-step chain of edits (plan), where each step results in a call to an LLM on a code location with context derived from the entire repository, previous code changes and task-specific instructions. CodePlan is based on a novel combination of an incremental dependency analysis, a change may-impact analysis and an adaptive planning algorithm. We evaluate the effectiveness of CodePlan on two repository-level tasks: package migration (C#) and temporal code edits (Python). Each task is evaluated on multiple code repositories, each of which requires inter-dependent changes to many files (between 2 and 97 files). Coding tasks of this level of complexity have not been automated using LLMs before. Our results show that CodePlan has better match with the ground truth compared to baselines. CodePlan is able to get 5/6 repositories to pass the validity checks (e.g., to build without errors and make correct code edits) whereas the baselines (without planning but with the same type of contextual information as CodePlan) cannot get any of the repositories to pass them.

Submitted 21 September, 2023; originally announced September 2023.
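
Illustrative note: the abstract above describes a plan as a chain of edits driven by dependency and may-impact analysis. The Python sketch below shows one generic way such a loop could be organised as a worklist over a dependency map. It is not the authors' implementation: the names `plan_edits`, `dependents`, `seeds` and `edit_fn` are hypothetical, the dependency map is a plain dictionary rather than a real static analysis, and `edit_fn` stands in for the LLM call.

```python
from collections import deque


def plan_edits(dependents, seeds, edit_fn):
    """Worklist sketch: edit a location, then enqueue locations it may impact.

    dependents: dict mapping a code location to the locations that depend on
                it (a stand-in for a dependency analysis; assumption).
    seeds:      locations the task instructions point at directly.
    edit_fn:    hypothetical stand-in for the LLM call; returns True if the
                edit may affect the dependents of that location.
    """
    pending = deque(seeds)
    visited = set()
    plan = []  # ordered chain of edited locations
    while pending:
        loc = pending.popleft()
        if loc in visited:
            continue
        visited.add(loc)
        changed = edit_fn(loc)
        plan.append(loc)
        if changed:
            # May-impact step: only propagate when the edit changed something
            # that dependents could observe.
            for dep in dependents.get(loc, []):
                if dep not in visited:
                    pending.append(dep)
    return plan


# Toy example with made-up file names.
deps = {"api.py": ["service.py", "cli.py"], "service.py": ["tests.py"]}
plan = plan_edits(deps, ["api.py"], lambda loc: loc != "cli.py")
print(plan)
```

In this toy run the plan grows as edits reveal new impacted locations, which is the sense in which the planning is adaptive; a location whose edit reports no impact does not propagate further.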

arXiv:1006.0860 [pdf]

Implementation of Handoff through wireless access point Techniques

Authors: N. S. V. Shet, K. Chandrasekaran, K. C. Shet

Abstract: Handoff has become an inevitable part of wireless cellular communication. Soon users will carry small portable handheld devices which will incorporate the computer, phone, camera, GPS, personal control module, etc. This paper proposes a new scheme to deal with seamless roaming and reduce failed handoffs. The simulation is done using Qualnet, a software package for wireless communication. The results clearly indicate the advantages of this new scheme.

Submitted 4 June, 2010; originally announced June 2010.

Comments: Submitted to Journal of Telecommunications, see http://sites.google.com/site/journaloftelecommunications/volume-2-issue-2-may-2010

Journal ref: Journal of Telecommunications, Volume 2, Issue 2, p143-146, May 2010

arXiv:1001.5339 [pdf, other]

Implementation of Connectivity and Handover through Wireless Sensor Node based Techniques

Authors: N. S. V. Shet, K. Chandrasekaran, K. C. Shet

Abstract: In this paper, a scheme for handoff and connectivity based on wireless sensor node techniques is proposed. Scenes are created in Qualnet and simulated for a simple case. Results are discussed.

Submitted 29 January, 2010; originally announced January 2010.

Comments: 5 pages, 8 figures

Journal ref: InterJRI Computer Science and Networking, Volume 1, pp 13-17, 2009

Showing 1–4 of 4 results for author: Shet, S