AutoMESC: Automatic framework for mining and classifying Ethereum smart contract vulnerabilities and their fixes

M Soud, I Qasse, G Liebel… - 2023 49th Euromicro Conference on Software Engineering and…, 2023 - ieeexplore.ieee.org
Due to the risks associated with vulnerabilities in smart contracts, their security has gained significant attention in recent years. However, there is a lack of open datasets on smart contract vulnerabilities and their fixes that allow for data-driven research. To this end, we propose an automated framework for mining and classifying Ethereum’s smart contract vulnerabilities and their corresponding fixes from GitHub and from the Common Vulnerabilities and Exposures (CVE) records in the National Vulnerability Database. We implemented the proposed method in a fully automated framework, which we call AutoMESC. AutoMESC uses seven of the most well-known smart contract security tools to classify and label the collected vulnerabilities by vulnerability type. Furthermore, it collects metadata that can be used in data-intensive smart contract security research (e.g., vulnerability detection, vulnerability classification, severity prediction, and automated repair). We used AutoMESC to construct a sample dataset and made it publicly available. Currently, the dataset contains 6.7K smart contract vulnerability-fix pairs written in Solidity. We assess the quality of the constructed dataset in terms of accuracy, provenance, and relevance, and compare it with existing datasets. AutoMESC is designed to collect data continuously and keep the corresponding dataset up to date with newly discovered smart contract vulnerabilities and their fixes from GitHub and CVE records.