subscribe to arXiv mailings

On the Abuse and Detection of Polyglot Files

Authors: Luke Koch, Sean Oesch, Amul Chaulagain, Jared Dixon, Matthew Dixon, Mike Huettal, Amir Sadovnik, Cory Watson, Brian Weber, Jacob Hartman, Richard Patulski

Abstract: A polyglot is a file that is valid in two or more formats. Polyglot files pose a problem for malware detection systems that route files to format-specific detectors/signatures, as well as file upload and sanitization tools. In this work we found that existing file-format and embedded-file detection tools, even those developed specifically for polyglot files, fail to reliably detect polyglot files… ▽ More A polyglot is a file that is valid in two or more formats. Polyglot files pose a problem for malware detection systems that route files to format-specific detectors/signatures, as well as file upload and sanitization tools. In this work we found that existing file-format and embedded-file detection tools, even those developed specifically for polyglot files, fail to reliably detect polyglot files used in the wild, leaving organizations vulnerable to attack. To address this issue, we studied the use of polyglot files by malicious actors in the wild, finding $30$ polyglot samples and $15$ attack chains that leveraged polyglot files. In this report, we highlight two well-known APTs whose cyber attack chains relied on polyglot files to bypass detection mechanisms. Using knowledge from our survey of polyglot usage in the wild -- the first of its kind -- we created a novel data set based on adversary techniques. We then trained a machine learning detection solution, PolyConv, using this data set. PolyConv achieves a precision-recall area-under-curve score of $0.999$ with an F1 score of $99.20$% for polyglot detection and $99.47$% for file-format identification, significantly outperforming all other tools tested. We developed a content disarmament and reconstruction tool, ImSan, that successfully sanitized $100$% of the tested image-based polyglots, which were the most common type found via the survey. Our work provides concrete tools and suggestions to enable defenders to better defend themselves against polyglot files, as well as directions for future work to create more robust file specifications and methods of disarmament. △ Less

Submitted 1 July, 2024; originally announced July 2024.

Comments: 18 pages, 11 figures

arXiv:2404.10788 [pdf, other]

The Path To Autonomous Cyber Defense

Authors: Sean Oesch, Phillipe Austria, Amul Chaulagain, Brian Weber, Cory Watson, Matthew Dixson, Amir Sadovnik

Abstract: Defenders are overwhelmed by the number and scale of attacks against their networks.This problem will only be exacerbated as attackers leverage artificial intelligence to automate their workflows. We propose a path to autonomous cyber agents able to augment defenders by automating critical steps in the cyber defense life cycle. Defenders are overwhelmed by the number and scale of attacks against their networks.This problem will only be exacerbated as attackers leverage artificial intelligence to automate their workflows. We propose a path to autonomous cyber agents able to augment defenders by automating critical steps in the cyber defense life cycle. △ Less

Submitted 12 April, 2024; originally announced April 2024.

Comments: 9 pages, 3 figures

arXiv:2308.14835 [pdf, other]

AI ATAC 1: An Evaluation of Prominent Commercial Malware Detectors

Authors: Robert A. Bridges, Brian Weber, Justin M. Beaver, Jared M. Smith, Miki E. Verma, Savannah Norem, Kevin Spakes, Cory Watson, Jeff A. Nichols, Brian Jewell, Michael. D. Iannacone, Chelsey Dunivan Stahl, Kelly M. T. Huffer, T. Sean Oesch

Abstract: This work presents an evaluation of six prominent commercial endpoint malware detectors, a network malware detector, and a file-conviction algorithm from a cyber technology vendor. The evaluation was administered as the first of the Artificial Intelligence Applications to Autonomous Cybersecurity (AI ATAC) prize challenges, funded by / completed in service of the US Navy. The experiment employed 1… ▽ More This work presents an evaluation of six prominent commercial endpoint malware detectors, a network malware detector, and a file-conviction algorithm from a cyber technology vendor. The evaluation was administered as the first of the Artificial Intelligence Applications to Autonomous Cybersecurity (AI ATAC) prize challenges, funded by / completed in service of the US Navy. The experiment employed 100K files (50/50% benign/malicious) with a stratified distribution of file types, including ~1K zero-day program executables (increasing experiment size two orders of magnitude over previous work). We present an evaluation process of delivering a file to a fresh virtual machine donning the detection technology, waiting 90s to allow static detection, then executing the file and waiting another period for dynamic detection; this allows greater fidelity in the observational data than previous experiments, in particular, resource and time-to-detection statistics. To execute all 800K trials (100K files $\times$ 8 tools), a software framework is designed to choreographed the experiment into a completely automated, time-synced, and reproducible workflow with substantial parallelization. A cost-benefit model was configured to integrate the tools' recall, precision, time to detection, and resource requirements into a single comparable quantity by simulating costs of use. This provides a ranking methodology for cyber competitions and a lens through which to reason about the varied statistical viewpoints of the results. These statistical and cost-model results provide insights on state of commercial malware detection. △ Less

Submitted 28 August, 2023; originally announced August 2023.

arXiv:2208.06075 [pdf, other]

doi 10.1016/j.cose.2023.103201

Testing SOAR Tools in Use

Authors: Robert A. Bridges, Ashley E. Rice, Sean Oesch, Jeff A. Nichols, Cory Watson, Kevin Spakes, Savannah Norem, Mike Huettel, Brian Jewell, Brian Weber, Connor Gannon, Olivia Bizovi, Samuel C Hollifield, Samantha Erwin

Abstract: Modern security operation centers (SOCs) rely on operators and a tapestry of logging and alerting tools with large scale collection and query abilities. SOC investigations are tedious as they rely on manual efforts to query diverse data sources, overlay related logs, and correlate the data into information and then document results in a ticketing system. Security orchestration, automation, and res… ▽ More Modern security operation centers (SOCs) rely on operators and a tapestry of logging and alerting tools with large scale collection and query abilities. SOC investigations are tedious as they rely on manual efforts to query diverse data sources, overlay related logs, and correlate the data into information and then document results in a ticketing system. Security orchestration, automation, and response (SOAR) tools are a new technology that promise to collect, filter, and display needed data; automate common tasks that require SOC analysts' time; facilitate SOC collaboration; and, improve both efficiency and consistency of SOCs. SOAR tools have never been tested in practice to evaluate their effect and understand them in use. In this paper, we design and administer the first hands-on user study of SOAR tools, involving 24 participants and 6 commercial SOAR tools. Our contributions include the experimental design, itemizing six characteristics of SOAR tools and a methodology for testing them. We describe configuration of the test environment in a cyber range, including network, user, and threat emulation; a full SOC tool suite; and creation of artifacts allowing multiple representative investigation scenarios to permit testing. We present the first research results on SOAR tools. We found that SOAR configuration is critical, as it involves creative design for data display and automation. We found that SOAR tools increased efficiency and reduced context switching during investigations, although ticket accuracy and completeness (indicating investigation quality) decreased with SOAR use. Our findings indicated that user preferences are slightly negatively correlated with their performance with the tool; overautomation was a concern of senior analysts, and SOAR tools that balanced automation with assisting a user to make decisions were preferred. △ Less

Submitted 14 February, 2023; v1 submitted 11 August, 2022; originally announced August 2022.

Journal ref: Computers & Security 2023

arXiv:2203.07561 [pdf, other]

Toward the Detection of Polyglot Files

Authors: Luke Koch, Sean Oesch, Mary Adkisson, Sam Erwin, Brian Weber, Amul Chaulagain

Abstract: Standardized file formats play a key role in the development and use of computer software. However, it is possible to abuse standardized file formats by creating a file that is valid in multiple file formats. The resulting polyglot (many languages) file can confound file format identification, allowing elements of the file to evade analysis.This is especially problematic for malware detection syst… ▽ More Standardized file formats play a key role in the development and use of computer software. However, it is possible to abuse standardized file formats by creating a file that is valid in multiple file formats. The resulting polyglot (many languages) file can confound file format identification, allowing elements of the file to evade analysis.This is especially problematic for malware detection systems that rely on file format identification for feature extraction. File format identification processes that depend on file signatures can be easily evaded thanks to flexibility in the format specifications of certain file formats. Although work has been done to identify file formats using more comprehensive methods than file signatures, accurate identification of polyglot files remains an open problem. Since malware detection systems routinely perform file format-specific feature extraction, polyglot files need to be filtered out prior to ingestion by these systems. Otherwise, malicious content could pass through undetected. To address the problem of polyglot detection we assembled a data set using the mitra tool. We then evaluated the performance of the most commonly used file identification tool, file. Finally, we demonstrated the accuracy, precision, recall and F1 score of a range of machine and deep learning models. Malconv2 and Catboost demonstrated the highest recall on our data set with 95.16% and 95.45%, respectively. These models can be incorporated into a malware detector's file processing pipeline to filter out potentially malicious polyglots before file format-dependent feature extraction takes place. △ Less

Submitted 12 April, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

arXiv:2112.00100 [pdf, other]

A Mathematical Framework for Evaluation of SOAR Tools with Limited Survey Data

Authors: Savannah Norem, Ashley E Rice, Samantha Erwin, Robert A Bridges, Sean Oesch, Brian Weber

Abstract: Security operation centers (SOCs) all over the world are tasked with reacting to cybersecurity alerts ranging in severity. Security Orchestration, Automation, and Response (SOAR) tools streamline cybersecurity alert responses by SOC operators. SOAR tool adoption is expensive both in effort and finances. Hence, it is crucial to limit adoption to those most worthwhile; yet no research evaluating or… ▽ More Security operation centers (SOCs) all over the world are tasked with reacting to cybersecurity alerts ranging in severity. Security Orchestration, Automation, and Response (SOAR) tools streamline cybersecurity alert responses by SOC operators. SOAR tool adoption is expensive both in effort and finances. Hence, it is crucial to limit adoption to those most worthwhile; yet no research evaluating or comparing SOAR tools exists. The goal of this work is to evaluate several SOAR tools using specific criteria pertaining to their usability. SOC operators were asked to first complete a survey about what SOAR tool aspects are most important. Operators were then assigned a set of SOAR tools for which they viewed demonstration and overview videos, and then operators completed a second survey wherein they were tasked with evaluating each of the tools on the aspects from the first survey. In addition, operators provided an overall rating to each of their assigned tools, and provided a ranking of their tools in order of preference. Due to time constraints on SOC operators for thorough testing, we provide a systematic method of downselecting a large pool of SOAR tools to a select few that merit next-step hands-on evaluation by SOC operators. Furthermore, the analyses conducted in this survey help to inform future development of SOAR tools to ensure that the appropriate functions are available for use in a SOC. △ Less

Submitted 30 November, 2021; originally announced December 2021.

arXiv:2105.06545 [pdf]

What Clinical Trials Can Teach Us about the Development of More Resilient AI for Cybersecurity

Authors: Edmon Begoli, Robert A. Bridges, Sean Oesch, Kathryn E. Knight

Abstract: Policy-mandated, rigorously administered scientific testing is needed to provide transparency into the efficacy of artificial intelligence-based (AI-based) cyber defense tools for consumers and to prioritize future research and development. In this article, we propose a model that is informed by our experience, urged forward by massive scale cyberattacks, and inspired by parallel developments in t… ▽ More Policy-mandated, rigorously administered scientific testing is needed to provide transparency into the efficacy of artificial intelligence-based (AI-based) cyber defense tools for consumers and to prioritize future research and development. In this article, we propose a model that is informed by our experience, urged forward by massive scale cyberattacks, and inspired by parallel developments in the biomedical field and the unprecedentedly fast development of new vaccines to combat global pathogens. △ Less

Submitted 13 May, 2021; originally announced May 2021.

arXiv:2104.10017 [pdf, other]

doi 10.1145/3485832.3485884

The Emperor's New Autofill Framework: A Security Analysis of Autofill on iOS and Android

Authors: Sean Oesch, Anuj Gautam, Scott Ruoti

Abstract: Password managers help users more effectively manage their passwords, encouraging them to adopt stronger passwords across their many accounts. In contrast to desktop systems where password managers receive no system-level support, mobile operating systems provide autofill frameworks designed to integrate with password managers to provide secure and usable autofill for browsers and other apps insta… ▽ More Password managers help users more effectively manage their passwords, encouraging them to adopt stronger passwords across their many accounts. In contrast to desktop systems where password managers receive no system-level support, mobile operating systems provide autofill frameworks designed to integrate with password managers to provide secure and usable autofill for browsers and other apps installed on mobile devices. In this paper, we evaluate mobile autofill frameworks on iOS and Android, examining whether they achieve substantive benefits over the ad-hoc desktop environment or become a problematic single point of failure. Our results find that while the frameworks address several common issues, they also enforce insecure behavior and fail to provide password managers sufficient information to override the frameworks' insecure behavior, resulting in mobile managers being less secure than their desktop counterparts overall. We also demonstrate how these frameworks act as a confused deputy in manager-assisted credential phishing attacks. Our results demonstrate the need for significant improvements to mobile autofill frameworks. We conclude the paper with recommendations for the design and implementation of secure autofill frameworks. △ Less

Submitted 28 September, 2021; v1 submitted 20 April, 2021; originally announced April 2021.

Comments: 12 pages, 3 pages appendix, published at ACSAC 2021

arXiv:2012.09244 [pdf, other]

An Integrated Platform for Collaborative Data Analytics

Authors: Sean Oesch, Rob Gillen, Tom Karnowski

Abstract: While collaboration among data scientists is a key to organizational productivity, data analysts face significant barriers to achieving this end, including data sharing, accessing and configuring the required computational environment, and a unified method of sharing knowledge. Each of these barriers to collaboration is related to the fundamental question of knowledge management "how can organizat… ▽ More While collaboration among data scientists is a key to organizational productivity, data analysts face significant barriers to achieving this end, including data sharing, accessing and configuring the required computational environment, and a unified method of sharing knowledge. Each of these barriers to collaboration is related to the fundamental question of knowledge management "how can organizations use knowledge more effectively?". In this paper, we consider the problem of knowledge management in collaborative data analytics and present ShareAL, an integrated knowledge management platform, as a solution to that problem. The ShareAL platform consists of three core components: a full stack web application, a dashboard for analyzing streaming data and a High Performance Computing (HPC) cluster for performing real time analysis. Prior research has not applied knowledge management to collaborative analytics or developed a platform with the same capabilities as ShareAL. ShareAL overcomes the barriers data scientists face to collaboration by providing intuitive sharing of data and analytics via the web application, a shared computing environment via the HPC cluster and knowledge sharing and collaboration via a real time messaging application. △ Less

Submitted 16 December, 2020; originally announced December 2020.

arXiv:2012.09214 [pdf, other]

doi 10.1145/3567432

Beyond the Hype: A Real-World Evaluation of the Impact and Cost of Machine Learning-Based Malware Detection

Authors: Robert A. Bridges, Sean Oesch, Miki E. Verma, Michael D. Iannacone, Kelly M. T. Huffer, Brian Jewell, Jeff A. Nichols, Brian Weber, Justin M. Beaver, Jared M. Smith, Daniel Scofield, Craig Miles, Thomas Plummer, Mark Daniell, Anne M. Tall

Abstract: In this paper, we present a scientific evaluation of four prominent malware detection tools to assist an organization with two primary questions: To what extent do ML-based tools accurately classify previously- and never-before-seen files? Is it worth purchasing a network-level malware detector? To identify weaknesses, we tested each tool against 3,536 total files (2,554 or 72\% malicious, 982 or… ▽ More In this paper, we present a scientific evaluation of four prominent malware detection tools to assist an organization with two primary questions: To what extent do ML-based tools accurately classify previously- and never-before-seen files? Is it worth purchasing a network-level malware detector? To identify weaknesses, we tested each tool against 3,536 total files (2,554 or 72\% malicious, 982 or 28\% benign) of a variety of file types, including hundreds of malicious zero-days, polyglots, and APT-style files, delivered on multiple protocols. We present statistical results on detection time and accuracy, consider complementary analysis (using multiple tools together), and provide two novel applications of the recent cost-benefit evaluation procedure of Iannacone \& Bridges. While the ML-based tools are more effective at detecting zero-day files and executables, the signature-based tool may still be an overall better option. Both network-based tools provide substantial (simulated) savings when paired with either host tool, yet both show poor detection rates on protocols other than HTTP or SMTP. Our results show that all four tools have near-perfect precision but alarmingly low recall, especially on file types other than executables and office files -- 37% of malware tested, including all polyglot files, were undetected. Priorities for researchers and takeaways for end users are given. △ Less

Submitted 17 August, 2022; v1 submitted 16 December, 2020; originally announced December 2020.

Comments: Includes Actionable Takeaways for SOCs

Journal ref: Digital Threats: Research and Practice 2023

arXiv:2012.09013 [pdf, other]

An Assessment of the Usability of Machine Learning Based Tools for the Security Operations Center

Authors: Sean Oesch, Robert Bridges, Jared Smith, Justin Beaver, John Goodall, Kelly Huffer, Craig Miles, Dan Scofield

Abstract: Gartner, a large research and advisory company, anticipates that by 2024 80% of security operation centers (SOCs) will use machine learning (ML) based solutions to enhance their operations. In light of such widespread adoption, it is vital for the research community to identify and address usability concerns. This work presents the results of the first in situ usability assessment of ML-based tool… ▽ More Gartner, a large research and advisory company, anticipates that by 2024 80% of security operation centers (SOCs) will use machine learning (ML) based solutions to enhance their operations. In light of such widespread adoption, it is vital for the research community to identify and address usability concerns. This work presents the results of the first in situ usability assessment of ML-based tools. With the support of the US Navy, we leveraged the national cyber range, a large, air-gapped cyber testbed equipped with state-of-the-art network and user emulation capabilities, to study six US Naval SOC analysts' usage of two tools. Our analysis identified several serious usability issues, including multiple violations of established usability heuristics form user interface design. We also discovered that analysts lacked a clear mental model of how these tools generate scores, resulting in mistrust and/or misuse of the tools themselves. Surprisingly, we found no correlation between analysts' level of education or years of experience and their performance with either tool, suggesting that other factors such as prior background knowledge or personality play a significant role in ML-based tool usage. Our findings demonstrate that ML-based security tool vendors must put a renewed focus on working with analysts, both experienced and inexperienced, to ensure that their systems are usable and useful in real-world security operations settings. △ Less

Submitted 16 December, 2020; originally announced December 2020.

arXiv:1908.03296 [pdf, other]

That Was Then, This Is Now: A Security Evaluation of Password Generation, Storage, and Autofill in Thirteen Password Managers

Authors: Sean Oesch, Scott Ruoti

Abstract: Password managers have the potential to help users more effectively manage their passwords and address many of the concerns surrounding password-based authentication, however prior research has identified significant vulnerabilities in existing password managers. Since that time, five years has passed, leaving it unclear whether password managers remain vulnerable or whether they are now ready for… ▽ More Password managers have the potential to help users more effectively manage their passwords and address many of the concerns surrounding password-based authentication, however prior research has identified significant vulnerabilities in existing password managers. Since that time, five years has passed, leaving it unclear whether password managers remain vulnerable or whether they are now ready for broad adoption. To answer this question, we evaluate thirteen popular password managers and consider all three stages of the password manager lifecycle--password generation, storage, and autofill. Our evaluation is the first analysis of password generation in password managers, finding several non-random character distributions and identifying instances where generated passwords were vulnerable to online and offline guessing attacks. For password storage and autofill, we replicate past evaluations, demonstrating that while password managers have improved in the half-decade since those prior evaluations, there are still significant issues, particularly with browser-based password managers; these problems include unencrypted metadata, unsafe defaults, and vulnerabilities to clickjacking attacks. Based on our results, we identify password managers to avoid, provide recommendations on how to improve existing password managers, and identify areas of future research. △ Less

Submitted 10 December, 2019; v1 submitted 8 August, 2019; originally announced August 2019.

Comments: Appearing at USENIX Security 2020

Showing 1–12 of 12 results for author: Oesch, S