subscribe to arXiv mailings

ESBMC v7.6: Enhanced Model Checking of C++ Programs with Clang AST

Authors: Xianzhiyu Li, Kunjian Song, Mikhail R. Gadelha, Franz Brauße, Rafael S. Menezes, Konstantin Korovin, Lucas C. Cordeiro

Abstract: This paper presents Efficient SMT-Based Context-Bounded Model Checker (ESBMC) v7.6, an extended version based on previous work on ESBMC v7.3 by K. Song et al. The v7.3 introduced a new Clang-based C++ front-end to address the challenges posed by modern C++ programs. Although the new front-end has demonstrated significant potential in previous studies, it remains in the developmental stage and lack… ▽ More This paper presents Efficient SMT-Based Context-Bounded Model Checker (ESBMC) v7.6, an extended version based on previous work on ESBMC v7.3 by K. Song et al. The v7.3 introduced a new Clang-based C++ front-end to address the challenges posed by modern C++ programs. Although the new front-end has demonstrated significant potential in previous studies, it remains in the developmental stage and lacks several essential features. ESBMC v7.6 further enhanced this foundation by adding and extending features based on the Clang AST, such as 1) exception handling, 2) extended memory management and memory safety verification, including dangling pointers, duplicate deallocation, memory leaks and rvalue references and 3) new operational models for STL updating the outdated C++ operational models. Our extensive experiments demonstrate that ESBMC v7.6 can handle a significantly broader range of C++ features introduced in recent versions of the C++ standard. △ Less

Submitted 25 June, 2024; originally announced June 2024.

Comments: 27 pages, 2 figures. arXiv admin note: substantial text overlap with arXiv:2308.05649

arXiv:2406.15281 [pdf, ps, other]

Interval Analysis in Industrial-Scale BMC Software Verifiers: A Case Study

Authors: Rafael Sá Menezes, Edoardo Manino, Fedor Shmarov, Mohannad Aldughaim, Rosiane de Freitas, Lucas C. Cordeiro

Abstract: Bounded Model Checking (BMC) is a widely used software verification technique. Despite its successes, the technique has several limiting factors, from state-space explosion to lack of completeness. Over the years, interval analysis has repeatedly been proposed as a partial solution to these limitations. In this work, we evaluate whether the computational cost of interval analysis yields significan… ▽ More Bounded Model Checking (BMC) is a widely used software verification technique. Despite its successes, the technique has several limiting factors, from state-space explosion to lack of completeness. Over the years, interval analysis has repeatedly been proposed as a partial solution to these limitations. In this work, we evaluate whether the computational cost of interval analysis yields significant enough improvements in BMC's performance to justify its use. In more detail, we quantify the benefits of interval analysis on two benchmarks: the Intel Core Power Management firmware and 9537 programs in the ReachSafety category of the International Competition on Software Verification. Our results show that interval analysis is essential in solving 203 unique benchmarks. △ Less

Submitted 21 June, 2024; originally announced June 2024.

Comments: Submitted to IFM

arXiv:2311.09180 [pdf, other]

PEARL: Personalizing Large Language Model Writing Assistants with Generation-Calibrated Retrievers

Authors: Sheshera Mysore, Zhuoran Lu, Mengting Wan, Longqi Yang, Steve Menezes, Tina Baghaee, Emmanuel Barajas Gonzalez, Jennifer Neville, Tara Safavi

Abstract: Powerful large language models have facilitated the development of writing assistants that promise to significantly improve the quality and efficiency of composition and communication. However, a barrier to effective assistance is the lack of personalization in LLM outputs to the author's communication style and specialized knowledge. In this paper, we address this challenge by proposing PEARL, a… ▽ More Powerful large language models have facilitated the development of writing assistants that promise to significantly improve the quality and efficiency of composition and communication. However, a barrier to effective assistance is the lack of personalization in LLM outputs to the author's communication style and specialized knowledge. In this paper, we address this challenge by proposing PEARL, a retrieval-augmented LLM writing assistant personalized with a generation-calibrated retriever. Our retriever is trained to select historic user-authored documents for prompt augmentation, such that they are likely to best personalize LLM generations for a user request. We propose two key novelties for training our retriever: 1) A training data selection method that identifies user requests likely to benefit from personalization and documents that provide that benefit; and 2) A scale-calibrating KL-divergence objective that ensures that our retriever closely tracks the benefit of a document for personalized generation. We demonstrate the effectiveness of PEARL in generating personalized workplace social media posts and Reddit comments. Finally, we showcase the potential of a generation-calibrated retriever to double as a performance predictor and further improve low-quality generations via LLM chaining. △ Less

Submitted 15 November, 2023; originally announced November 2023.

Comments: Pre-print, work in progress

arXiv:2309.03617 [pdf, other]

NeuroCodeBench: a plain C neural network benchmark for software verification

Authors: Edoardo Manino, Rafael Sá Menezes, Fedor Shmarov, Lucas C. Cordeiro

Abstract: Safety-critical systems with neural network components require strong guarantees. While existing neural network verification techniques have shown great progress towards this goal, they cannot prove the absence of software faults in the network implementation. This paper presents NeuroCodeBench - a verification benchmark for neural network code written in plain C. It contains 32 neural networks wi… ▽ More Safety-critical systems with neural network components require strong guarantees. While existing neural network verification techniques have shown great progress towards this goal, they cannot prove the absence of software faults in the network implementation. This paper presents NeuroCodeBench - a verification benchmark for neural network code written in plain C. It contains 32 neural networks with 607 safety properties divided into 6 categories: maths library, activation functions, error-correcting networks, transfer function approximation, probability density estimation and reinforcement learning. Our preliminary evaluation shows that state-of-the-art software verifiers struggle to provide correct verdicts, due to their incomplete support of the standard C mathematical library and the complexity of larger neural networks. △ Less

Submitted 7 September, 2023; originally announced September 2023.

Comments: Submitted to the 2023 AFRiTS workshop

arXiv:2308.05649 [pdf, other]

ESBMC v7.3: Model Checking C++ Programs using Clang AST

Authors: Kunjian Song, Mikhail R. Gadelha, Franz Brauße, Rafael S. Menezes, Lucas C. Cordeiro

Abstract: This paper introduces ESBMC v7.3, the latest Efficient SMT-Based Context-Bounded Model Checker version, which now incorporates a new clang-based C++ front-end. While the previous CPROVER-based front-end served well for handling C++03 programs, it encountered challenges keeping up with the evolving C++ language. As new language and library features were added in each C++ version, the limitations of… ▽ More This paper introduces ESBMC v7.3, the latest Efficient SMT-Based Context-Bounded Model Checker version, which now incorporates a new clang-based C++ front-end. While the previous CPROVER-based front-end served well for handling C++03 programs, it encountered challenges keeping up with the evolving C++ language. As new language and library features were added in each C++ version, the limitations of the old front-end became apparent, leading to difficult-to-maintain code. Consequently, modern C++ programs were challenging to verify. To overcome this obstacle, we redeveloped the front-end, opting for a more robust approach using clang. The new front-end efficiently traverses the Abstract Syntax Tree (AST) in-memory using clang APIs and transforms each AST node into ESBMC's Intermediate Representation. Through extensive experimentation, our results demonstrate that ESBMC v7.3 with the new front-end significantly reduces parse and conversion errors, enabling successful verification of a wide range of C++ programs, thereby outperforming previous ESBMC versions. △ Less

Submitted 10 August, 2023; originally announced August 2023.

arXiv:2012.11223 [pdf, other]

FuSeBMC: A White-Box Fuzzer for Finding Security Vulnerabilities in C Programs

Authors: Kaled M. Alshmrany, Rafael S. Menezes, Mikhail R. Gadelha, Lucas C. Cordeiro

Abstract: We describe and evaluate a novel white-box fuzzer for C programs named FuSeBMC, which combines fuzzing and symbolic execution, and applies Bounded Model Checking (BMC) to find security vulnerabilities in C programs. FuSeBMC explores and analyzes C programs (1) to find execution paths that lead to property violations and (2) to incrementally inject labels to guide the fuzzer and the BMC engine to p… ▽ More We describe and evaluate a novel white-box fuzzer for C programs named FuSeBMC, which combines fuzzing and symbolic execution, and applies Bounded Model Checking (BMC) to find security vulnerabilities in C programs. FuSeBMC explores and analyzes C programs (1) to find execution paths that lead to property violations and (2) to incrementally inject labels to guide the fuzzer and the BMC engine to produce test-cases for code coverage. FuSeBMC successfully participates in Test-Comp'21 and achieves first place in the Cover-Error category and second place in the Overall category. △ Less

Submitted 21 December, 2020; originally announced December 2020.

Comments: 4 pages

arXiv:2005.06989 [pdf, other]

doi 10.1088/1748-0221/16/05/T05006

A continuous integration and web framework in support of the ATLAS Publication Process

Authors: Juan Pedro Araque Espinosa, Gabriel Baldi Levcovitz, Riccardo-Maria Bianchi, Ian Brock, Tancredi Carli, Nuno Filipe Castro, Alessandra Ciocio, Maurizio Colautti, Ana Carolina Da Silva Menezes, Gabriel De Oliveira da Fonseca, Leandro Domingues Macedo Alves, Andreas Hoecker, Bruno Lange Ramos, Gabriela Lemos Lúcidi Pinhão, Carmen Maidantchik, Fairouz Malek, Robert McPherson, Gianluca Picco, Marcelo Teixeira Dos Santos

Abstract: The ATLAS collaboration defines methods, establishes procedures, and organises advisory groups to manage the publication processes of scientific papers, conference papers, and public notes. All stages are managed through web systems, computing programs, and tools that are designed and developed by the collaboration. A framework called FENCE is integrated into the CERN GitLab software repository, t… ▽ More The ATLAS collaboration defines methods, establishes procedures, and organises advisory groups to manage the publication processes of scientific papers, conference papers, and public notes. All stages are managed through web systems, computing programs, and tools that are designed and developed by the collaboration. A framework called FENCE is integrated into the CERN GitLab software repository, to automatically configure workspaces where each analysis can be documented by the analysis team and managed by the relevant coordinators. Continuous integration is used to guide the writers in applying consistent and correct formatting when preparing papers to be submitted to scientific journals. Additional software assures the correctness of other aspects of each paper, such as the lists of collaboration authors, funding agencies, and foundations. The framework and the workflow therein provide automatic and easy support to the researchers and facilitates each phase of the publication process, allowing authors to focus on the article contents. The framework and its integration with the most up to date and efficient tools has consequently provided a more professional and efficient automatized work environment to the whole collaboration. △ Less

Submitted 28 January, 2021; v1 submitted 14 May, 2020; originally announced May 2020.

Comments: 22 pages in total,11 figures, submitted to JINST. All figures including auxiliary figures are available at https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PAPERS/GENR-2018-01/

Report number: CERN-OPEN-2020-007

arXiv:1908.05758 [pdf, other]

Building a Massive Corpus for Named Entity Recognition using Free Open Data Sources

Authors: Daniel Specht Menezes, Pedro Savarese, Ruy Luiz Milidiú

Abstract: With the recent progress in machine learning, boosted by techniques such as deep learning, many tasks can be successfully solved once a large enough dataset is available for training. Nonetheless, human-annotated datasets are often expensive to produce, especially when labels are fine-grained, as is the case of Named Entity Recognition (NER), a task that operates with labels on a word-level. In… ▽ More With the recent progress in machine learning, boosted by techniques such as deep learning, many tasks can be successfully solved once a large enough dataset is available for training. Nonetheless, human-annotated datasets are often expensive to produce, especially when labels are fine-grained, as is the case of Named Entity Recognition (NER), a task that operates with labels on a word-level. In this paper, we propose a method to automatically generate labeled datasets for NER from public data sources by exploiting links and structured data from DBpedia and Wikipedia. Due to the massive size of these data sources, the resulting dataset -- SESAME Available at https://sesame-pt.github.io -- is composed of millions of labeled sentences. We detail the method to generate the dataset, report relevant statistics, and design a baseline using a neural network, showing that our dataset helps building better NER predictors. △ Less

Submitted 12 August, 2019; originally announced August 2019.

Showing 1–8 of 8 results for author: Menezes, S