subscribe to arXiv mailings

ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates

Authors: Fengqing Jiang, Zhangchen Xu, Luyao Niu, Bill Yuchen Lin, Radha Poovendran

Abstract: Large language models (LLMs) are expected to follow instructions from users and engage in conversations. Techniques to enhance LLMs' instruction-following capabilities typically fine-tune them using data structured according to a predefined chat template. Although chat templates are shown to be effective in optimizing LLM performance, their impact on safety alignment of LLMs has been less understo… ▽ More Large language models (LLMs) are expected to follow instructions from users and engage in conversations. Techniques to enhance LLMs' instruction-following capabilities typically fine-tune them using data structured according to a predefined chat template. Although chat templates are shown to be effective in optimizing LLM performance, their impact on safety alignment of LLMs has been less understood, which is crucial for deploying LLMs safely at scale. In this paper, we investigate how chat templates affect safety alignment of LLMs. We identify a common vulnerability, named ChatBug, that is introduced by chat templates. Our key insight to identify ChatBug is that the chat templates provide a rigid format that need to be followed by LLMs, but not by users. Hence, a malicious user may not necessarily follow the chat template when prompting LLMs. Instead, malicious users could leverage their knowledge of the chat template and accordingly craft their prompts to bypass safety alignments of LLMs. We develop two attacks to exploit the ChatBug vulnerability. We demonstrate that a malicious user can exploit the ChatBug vulnerability of eight state-of-the-art (SOTA) LLMs and effectively elicit unintended responses from these models. Moreover, we show that ChatBug can be exploited by existing jailbreak attacks to enhance their attack success rates. We investigate potential countermeasures to ChatBug. Our results show that while adversarial training effectively mitigates the ChatBug vulnerability, the victim model incurs significant performance degradation. These results highlight the trade-off between safety alignment and helpfulness. Developing new methods for instruction tuning to balance this trade-off is an open and critical direction for future research △ Less

Submitted 16 June, 2024; originally announced June 2024.

arXiv:2406.12257 [pdf, other]

CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models

Authors: Yuetai Li, Zhangchen Xu, Fengqing Jiang, Luyao Niu, Dinuka Sahabandu, Bhaskar Ramasubramanian, Radha Poovendran

Abstract: The remarkable performance of large language models (LLMs) in generation tasks has enabled practitioners to leverage publicly available models to power custom applications, such as chatbots and virtual assistants. However, the data used to train or fine-tune these LLMs is often undisclosed, allowing an attacker to compromise the data and inject backdoors into the models. In this paper, we develop… ▽ More The remarkable performance of large language models (LLMs) in generation tasks has enabled practitioners to leverage publicly available models to power custom applications, such as chatbots and virtual assistants. However, the data used to train or fine-tune these LLMs is often undisclosed, allowing an attacker to compromise the data and inject backdoors into the models. In this paper, we develop a novel inference time defense, named CleanGen, to mitigate backdoor attacks for generation tasks in LLMs. CleanGenis a lightweight and effective decoding strategy that is compatible with the state-of-the-art (SOTA) LLMs. Our insight behind CleanGen is that compared to other LLMs, backdoored LLMs assign significantly higher probabilities to tokens representing the attacker-desired contents. These discrepancies in token probabilities enable CleanGen to identify suspicious tokens favored by the attacker and replace them with tokens generated by another LLM that is not compromised by the same attacker, thereby avoiding generation of attacker-desired content. We evaluate CleanGen against five SOTA backdoor attacks. Our results show that CleanGen achieves lower attack success rates (ASR) compared to five SOTA baseline defenses for all five backdoor attacks. Moreover, LLMs deploying CleanGen maintain helpfulness in their responses when serving benign user queries with minimal added computational overhead. △ Less

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.08464 [pdf, other]

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

Authors: Zhangchen Xu, Fengqing Jiang, Luyao Niu, Yuntian Deng, Radha Poovendran, Yejin Choi, Bill Yuchen Lin

Abstract: High-quality instruction data is critical for aligning large language models (LLMs). Although some models, such as Llama-3-Instruct, have open weights, their alignment data remain private, which hinders the democratization of AI. High human labor costs and a limited, predefined scope for prompting prevent existing open-source data creation methods from scaling effectively, potentially limiting the… ▽ More High-quality instruction data is critical for aligning large language models (LLMs). Although some models, such as Llama-3-Instruct, have open weights, their alignment data remain private, which hinders the democratization of AI. High human labor costs and a limited, predefined scope for prompting prevent existing open-source data creation methods from scaling effectively, potentially limiting the diversity and quality of public alignment datasets. Is it possible to synthesize high-quality instruction data at scale by extracting it directly from an aligned LLM? We present a self-synthesis method for generating large-scale alignment data named Magpie. Our key observation is that aligned LLMs like Llama-3-Instruct can generate a user query when we input only the left-side templates up to the position reserved for user messages, thanks to their auto-regressive nature. We use this method to prompt Llama-3-Instruct and generate 4 million instructions along with their corresponding responses. We perform a comprehensive analysis of the extracted data and select 300K high-quality instances. To compare Magpie data with other public instruction datasets, we fine-tune Llama-3-8B-Base with each dataset and evaluate the performance of the fine-tuned models. Our results indicate that in some tasks, models fine-tuned with Magpie perform comparably to the official Llama-3-8B-Instruct, despite the latter being enhanced with 10 million data points through supervised fine-tuning (SFT) and subsequent feedback learning. We also show that using Magpie solely for SFT can surpass the performance of previous public datasets utilized for both SFT and preference optimization, such as direct preference optimization with UltraFeedback. This advantage is evident on alignment benchmarks such as AlpacaEval, ArenaHard, and WildBench. △ Less

Submitted 12 June, 2024; originally announced June 2024.

Comments: Link: https://magpie-align.github.io/

arXiv:2405.20975 [pdf, other]

ACE: A Model Poisoning Attack on Contribution Evaluation Methods in Federated Learning

Authors: Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bo Li, Radha Poovendran

Abstract: In Federated Learning (FL), a set of clients collaboratively train a machine learning model (called global model) without sharing their local training data. The local training data of clients is typically non-i.i.d. and heterogeneous, resulting in varying contributions from individual clients to the final performance of the global model. In response, many contribution evaluation methods were propo… ▽ More In Federated Learning (FL), a set of clients collaboratively train a machine learning model (called global model) without sharing their local training data. The local training data of clients is typically non-i.i.d. and heterogeneous, resulting in varying contributions from individual clients to the final performance of the global model. In response, many contribution evaluation methods were proposed, where the server could evaluate the contribution made by each client and incentivize the high-contributing clients to sustain their long-term participation in FL. Existing studies mainly focus on developing new metrics or algorithms to better measure the contribution of each client. However, the security of contribution evaluation methods of FL operating in adversarial environments is largely unexplored. In this paper, we propose the first model poisoning attack on contribution evaluation methods in FL, termed ACE. Specifically, we show that any malicious client utilizing ACE could manipulate the parameters of its local model such that it is evaluated to have a high contribution by the server, even when its local training data is indeed of low quality. We perform both theoretical analysis and empirical evaluations of ACE. Theoretically, we show our design of ACE can effectively boost the malicious client's perceived contribution when the server employs the widely-used cosine distance metric to measure contribution. Empirically, our results show ACE effectively and efficiently deceive five state-of-the-art contribution evaluation methods. In addition, ACE preserves the accuracy of the final global models on testing inputs. We also explore six countermeasures to defend ACE. Our results show they are inadequate to thwart ACE, highlighting the urgent need for new defenses to safeguard the contribution evaluation methods in FL. △ Less

Submitted 5 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

Comments: To appear in the 33rd USENIX Security Symposium, 2024

arXiv:2402.18677 [pdf, other]

Fault Tolerant Neural Control Barrier Functions for Robotic Systems under Sensor Faults and Attacks

Authors: Hongchao Zhang, Luyao Niu, Andrew Clark, Radha Poovendran

Abstract: Safety is a fundamental requirement of many robotic systems. Control barrier function (CBF)-based approaches have been proposed to guarantee the safety of robotic systems. However, the effectiveness of these approaches highly relies on the choice of CBFs. Inspired by the universal approximation power of neural networks, there is a growing trend toward representing CBFs using neural networks, leadi… ▽ More Safety is a fundamental requirement of many robotic systems. Control barrier function (CBF)-based approaches have been proposed to guarantee the safety of robotic systems. However, the effectiveness of these approaches highly relies on the choice of CBFs. Inspired by the universal approximation power of neural networks, there is a growing trend toward representing CBFs using neural networks, leading to the notion of neural CBFs (NCBFs). Current NCBFs, however, are trained and deployed in benign environments, making them ineffective for scenarios where robotic systems experience sensor faults and attacks. In this paper, we study safety-critical control synthesis for robotic systems under sensor faults and attacks. Our main contribution is the development and synthesis of a new class of CBFs that we term fault tolerant neural control barrier function (FT-NCBF). We derive the necessary and sufficient conditions for FT-NCBFs to guarantee safety, and develop a data-driven method to learn FT-NCBFs by minimizing a loss function constructed using the derived conditions. Using the learned FT-NCBF, we synthesize a control input and formally prove the safety guarantee provided by our approach. We demonstrate our proposed approach using two case studies: obstacle avoidance problem for an autonomous mobile robot and spacecraft rendezvous problem, with code available via https://github.com/HongchaoZhang-HZ/FTNCBF. △ Less

Submitted 28 February, 2024; originally announced February 2024.

arXiv:2402.11753 [pdf, other]

ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs

Authors: Fengqing Jiang, Zhangchen Xu, Luyao Niu, Zhen Xiang, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran

Abstract: Safety is critical to the usage of large language models (LLMs). Multiple techniques such as data filtering and supervised fine-tuning have been developed to strengthen LLM safety. However, currently known techniques presume that corpora used for safety alignment of LLMs are solely interpreted by semantics. This assumption, however, does not hold in real-world applications, which leads to severe v… ▽ More Safety is critical to the usage of large language models (LLMs). Multiple techniques such as data filtering and supervised fine-tuning have been developed to strengthen LLM safety. However, currently known techniques presume that corpora used for safety alignment of LLMs are solely interpreted by semantics. This assumption, however, does not hold in real-world applications, which leads to severe vulnerabilities in LLMs. For example, users of forums often use ASCII art, a form of text-based art, to convey image information. In this paper, we propose a novel ASCII art-based jailbreak attack and introduce a comprehensive benchmark Vision-in-Text Challenge (ViTC) to evaluate the capabilities of LLMs in recognizing prompts that cannot be solely interpreted by semantics. We show that five SOTA LLMs (GPT-3.5, GPT-4, Gemini, Claude, and Llama2) struggle to recognize prompts provided in the form of ASCII art. Based on this observation, we develop the jailbreak attack ArtPrompt, which leverages the poor performance of LLMs in recognizing ASCII art to bypass safety measures and elicit undesired behaviors from LLMs. ArtPrompt only requires black-box access to the victim LLMs, making it a practical attack. We evaluate ArtPrompt on five SOTA LLMs, and show that ArtPrompt can effectively and efficiently induce undesired behaviors from all five LLMs. Our code is available at https://github.com/uw-nsl/ArtPrompt. △ Less

Submitted 7 June, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

Comments: To appear in ACL 2024

arXiv:2402.08983 [pdf, other]

SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding

Authors: Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bill Yuchen Lin, Radha Poovendran

Abstract: As large language models (LLMs) become increasingly integrated into real-world applications such as code generation and chatbot assistance, extensive efforts have been made to align LLM behavior with human values, including safety. Jailbreak attacks, aiming to provoke unintended and unsafe behaviors from LLMs, remain a significant/leading LLM safety threat. In this paper, we aim to defend LLMs aga… ▽ More As large language models (LLMs) become increasingly integrated into real-world applications such as code generation and chatbot assistance, extensive efforts have been made to align LLM behavior with human values, including safety. Jailbreak attacks, aiming to provoke unintended and unsafe behaviors from LLMs, remain a significant/leading LLM safety threat. In this paper, we aim to defend LLMs against jailbreak attacks by introducing SafeDecoding, a safety-aware decoding strategy for LLMs to generate helpful and harmless responses to user queries. Our insight in developing SafeDecoding is based on the observation that, even though probabilities of tokens representing harmful contents outweigh those representing harmless responses, safety disclaimers still appear among the top tokens after sorting tokens by probability in descending order. This allows us to mitigate jailbreak attacks by identifying safety disclaimers and amplifying their token probabilities, while simultaneously attenuating the probabilities of token sequences that are aligned with the objectives of jailbreak attacks. We perform extensive experiments on five LLMs using six state-of-the-art jailbreak attacks and four benchmark datasets. Our results show that SafeDecoding significantly reduces the attack success rate and harmfulness of jailbreak attacks without compromising the helpfulness of responses to benign user queries. SafeDecoding outperforms six defense methods. △ Less

Submitted 7 June, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

Comments: To appear in ACL 2024

arXiv:2402.08695 [pdf, other]

Game of Trojans: Adaptive Adversaries Against Output-based Trojaned-Model Detectors

Authors: Dinuka Sahabandu, Xiaojun Xu, Arezoo Rajabi, Luyao Niu, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran

Abstract: We propose and analyze an adaptive adversary that can retrain a Trojaned DNN and is also aware of SOTA output-based Trojaned model detectors. We show that such an adversary can ensure (1) high accuracy on both trigger-embedded and clean samples and (2) bypass detection. Our approach is based on an observation that the high dimensionality of the DNN parameters provides sufficient degrees of freedom… ▽ More We propose and analyze an adaptive adversary that can retrain a Trojaned DNN and is also aware of SOTA output-based Trojaned model detectors. We show that such an adversary can ensure (1) high accuracy on both trigger-embedded and clean samples and (2) bypass detection. Our approach is based on an observation that the high dimensionality of the DNN parameters provides sufficient degrees of freedom to simultaneously achieve these objectives. We also enable SOTA detectors to be adaptive by allowing retraining to recalibrate their parameters, thus modeling a co-evolution of parameters of a Trojaned model and detectors. We then show that this co-evolution can be modeled as an iterative game, and prove that the resulting (optimal) solution of this interactive game leads to the adversary successfully achieving the above objectives. In addition, we provide a greedy algorithm for the adversary to select a minimum number of input samples for embedding triggers. We show that for cross-entropy or log-likelihood loss functions used by the DNNs, the greedy algorithm provides provable guarantees on the needed number of trigger-embedded input samples. Extensive experiments on four diverse datasets -- MNIST, CIFAR-10, CIFAR-100, and SpeechCommand -- reveal that the adversary effectively evades four SOTA output-based Trojaned model detectors: MNTD, NeuralCleanse, STRIP, and TABOR. △ Less

Submitted 12 February, 2024; originally announced February 2024.

arXiv:2402.01114 [pdf, other]

Double-Dip: Thwarting Label-Only Membership Inference Attacks with Transfer Learning and Randomization

Authors: Arezoo Rajabi, Reeya Pimple, Aiswarya Janardhanan, Surudhi Asokraj, Bhaskar Ramasubramanian, Radha Poovendran

Abstract: Transfer learning (TL) has been demonstrated to improve DNN model performance when faced with a scarcity of training samples. However, the suitability of TL as a solution to reduce vulnerability of overfitted DNNs to privacy attacks is unexplored. A class of privacy attacks called membership inference attacks (MIAs) aim to determine whether a given sample belongs to the training dataset (member) o… ▽ More Transfer learning (TL) has been demonstrated to improve DNN model performance when faced with a scarcity of training samples. However, the suitability of TL as a solution to reduce vulnerability of overfitted DNNs to privacy attacks is unexplored. A class of privacy attacks called membership inference attacks (MIAs) aim to determine whether a given sample belongs to the training dataset (member) or not (nonmember). We introduce Double-Dip, a systematic empirical study investigating the use of TL (Stage-1) combined with randomization (Stage-2) to thwart MIAs on overfitted DNNs without degrading classification accuracy. Our study examines the roles of shared feature space and parameter values between source and target models, number of frozen layers, and complexity of pretrained models. We evaluate Double-Dip on three (Target, Source) dataset paris: (i) (CIFAR-10, ImageNet), (ii) (GTSRB, ImageNet), (iii) (CelebA, VGGFace2). We consider four publicly available pretrained DNNs: (a) VGG-19, (b) ResNet-18, (c) Swin-T, and (d) FaceNet. Our experiments demonstrate that Stage-1 reduces adversary success while also significantly increasing classification accuracy of nonmembers against an adversary with either white-box or black-box DNN model access, attempting to carry out SOTA label-only MIAs. After Stage-2, success of an adversary carrying out a label-only MIA is further reduced to near 50%, bringing it closer to a random guess and showing the effectiveness of Double-Dip. Stage-2 of Double-Dip also achieves lower ASR and higher classification accuracy than regularization and differential privacy-based methods. △ Less

Submitted 1 February, 2024; originally announced February 2024.

arXiv:2401.12242 [pdf, other]

BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models

Authors: Zhen Xiang, Fengqing Jiang, Zidi Xiong, Bhaskar Ramasubramanian, Radha Poovendran, Bo Li

Abstract: Large language models (LLMs) are shown to benefit from chain-of-thought (COT) prompting, particularly when tackling tasks that require systematic reasoning processes. On the other hand, COT prompting also poses new vulnerabilities in the form of backdoor attacks, wherein the model will output unintended malicious content under specific backdoor-triggered conditions during inference. Traditional me… ▽ More Large language models (LLMs) are shown to benefit from chain-of-thought (COT) prompting, particularly when tackling tasks that require systematic reasoning processes. On the other hand, COT prompting also poses new vulnerabilities in the form of backdoor attacks, wherein the model will output unintended malicious content under specific backdoor-triggered conditions during inference. Traditional methods for launching backdoor attacks involve either contaminating the training dataset with backdoored instances or directly manipulating the model parameters during deployment. However, these approaches are not practical for commercial LLMs that typically operate via API access. In this paper, we propose BadChain, the first backdoor attack against LLMs employing COT prompting, which does not require access to the training dataset or model parameters and imposes low computational overhead. BadChain leverages the inherent reasoning capabilities of LLMs by inserting a backdoor reasoning step into the sequence of reasoning steps of the model output, thereby altering the final response when a backdoor trigger exists in the query prompt. Empirically, we show the effectiveness of BadChain for two COT strategies across four LLMs (Llama2, GPT-3.5, PaLM2, and GPT-4) and six complex benchmark tasks encompassing arithmetic, commonsense, and symbolic reasoning. Moreover, we show that LLMs endowed with stronger reasoning capabilities exhibit higher susceptibility to BadChain, exemplified by a high average attack success rate of 97.0% across the six benchmark tasks on GPT-4. Finally, we propose two defenses based on shuffling and demonstrate their overall ineffectiveness against BadChain. Therefore, BadChain remains a severe threat to LLMs, underscoring the urgency for the development of robust and effective future defenses. △ Less

Submitted 19 January, 2024; originally announced January 2024.

Comments: Accepted to ICLR2024

arXiv:2401.05562 [pdf, ps, other]

Brave: Byzantine-Resilient and Privacy-Preserving Peer-to-Peer Federated Learning

Authors: Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Radha Poovendran

Abstract: Federated learning (FL) enables multiple participants to train a global machine learning model without sharing their private training data. Peer-to-peer (P2P) FL advances existing centralized FL paradigms by eliminating the server that aggregates local models from participants and then updates the global model. However, P2P FL is vulnerable to (i) honest-but-curious participants whose objective is… ▽ More Federated learning (FL) enables multiple participants to train a global machine learning model without sharing their private training data. Peer-to-peer (P2P) FL advances existing centralized FL paradigms by eliminating the server that aggregates local models from participants and then updates the global model. However, P2P FL is vulnerable to (i) honest-but-curious participants whose objective is to infer private training data of other participants, and (ii) Byzantine participants who can transmit arbitrarily manipulated local models to corrupt the learning process. P2P FL schemes that simultaneously guarantee Byzantine resilience and preserve privacy have been less studied. In this paper, we develop Brave, a protocol that ensures Byzantine Resilience And privacy-preserving property for P2P FL in the presence of both types of adversaries. We show that Brave preserves privacy by establishing that any honest-but-curious adversary cannot infer other participants' private data by observing their models. We further prove that Brave is Byzantine-resilient, which guarantees that all benign participants converge to an identical model that deviates from a global model trained without Byzantine adversaries by a bounded distance. We evaluate Brave against three state-of-the-art adversaries on a P2P FL for image classification tasks on benchmark datasets CIFAR10 and MNIST. Our results show that the global model learned with Brave in the presence of adversaries achieves comparable classification accuracy to a global model trained in the absence of any adversary. △ Less

Submitted 10 January, 2024; originally announced January 2024.

arXiv:2311.16153 [pdf, other]

Identifying and Mitigating Vulnerabilities in LLM-Integrated Applications

Authors: Fengqing Jiang, Zhangchen Xu, Luyao Niu, Boxin Wang, Jinyuan Jia, Bo Li, Radha Poovendran

Abstract: Large language models (LLMs) are increasingly deployed as the service backend for LLM-integrated applications such as code completion and AI-powered search. LLM-integrated applications serve as middleware to refine users' queries with domain-specific knowledge to better inform LLMs and enhance the responses. Despite numerous opportunities and benefits, LLM-integrated applications also introduce ne… ▽ More Large language models (LLMs) are increasingly deployed as the service backend for LLM-integrated applications such as code completion and AI-powered search. LLM-integrated applications serve as middleware to refine users' queries with domain-specific knowledge to better inform LLMs and enhance the responses. Despite numerous opportunities and benefits, LLM-integrated applications also introduce new attack surfaces. Understanding, minimizing, and eliminating these emerging attack surfaces is a new area of research. In this work, we consider a setup where the user and LLM interact via an LLM-integrated application in the middle. We focus on the communication rounds that begin with user's queries and end with LLM-integrated application returning responses to the queries, powered by LLMs at the service backend. For this query-response protocol, we identify potential vulnerabilities that can originate from the malicious application developer or from an outsider threat initiator that is able to control the database access, manipulate and poison data that are high-risk for the user. Successful exploits of the identified vulnerabilities result in the users receiving responses tailored to the intent of a threat initiator. We assess such threats against LLM-integrated applications empowered by OpenAI GPT-3.5 and GPT-4. Our empirical results show that the threats can effectively bypass the restrictions and moderation policies of OpenAI, resulting in users receiving responses that contain bias, toxic content, privacy risk, and disinformation. To mitigate those threats, we identify and define four key properties, namely integrity, source identification, attack detectability, and utility preservation, that need to be satisfied by a safe LLM-integrated application. Based on these properties, we develop a lightweight, threat-agnostic defense that mitigates both insider and outsider threats. △ Less

Submitted 28 November, 2023; v1 submitted 7 November, 2023; originally announced November 2023.

arXiv:2308.15673 [pdf, other]

MDTD: A Multi Domain Trojan Detector for Deep Neural Networks

Authors: Arezoo Rajabi, Surudhi Asokraj, Fengqing Jiang, Luyao Niu, Bhaskar Ramasubramanian, Jim Ritcey, Radha Poovendran

Abstract: Machine learning models that use deep neural networks (DNNs) are vulnerable to backdoor attacks. An adversary carrying out a backdoor attack embeds a predefined perturbation called a trigger into a small subset of input samples and trains the DNN such that the presence of the trigger in the input results in an adversary-desired output class. Such adversarial retraining however needs to ensure that… ▽ More Machine learning models that use deep neural networks (DNNs) are vulnerable to backdoor attacks. An adversary carrying out a backdoor attack embeds a predefined perturbation called a trigger into a small subset of input samples and trains the DNN such that the presence of the trigger in the input results in an adversary-desired output class. Such adversarial retraining however needs to ensure that outputs for inputs without the trigger remain unaffected and provide high classification accuracy on clean samples. In this paper, we propose MDTD, a Multi-Domain Trojan Detector for DNNs, which detects inputs containing a Trojan trigger at testing time. MDTD does not require knowledge of trigger-embedding strategy of the attacker and can be applied to a pre-trained DNN model with image, audio, or graph-based inputs. MDTD leverages an insight that input samples containing a Trojan trigger are located relatively farther away from a decision boundary than clean samples. MDTD estimates the distance to a decision boundary using adversarial learning methods and uses this distance to infer whether a test-time input sample is Trojaned or not. We evaluate MDTD against state-of-the-art Trojan detection methods across five widely used image-based datasets: CIFAR100, CIFAR10, GTSRB, SVHN, and Flowers102; four graph-based datasets: AIDS, WinMal, Toxicant, and COLLAB; and the SpeechCommand audio dataset. MDTD effectively identifies samples that contain different types of Trojan triggers. We evaluate MDTD against adaptive attacks where an adversary trains a robust DNN to increase (decrease) distance of benign (Trojan) inputs from a decision boundary. △ Less

Submitted 2 September, 2023; v1 submitted 29 August, 2023; originally announced August 2023.

Comments: Accepted to ACM Conference on Computer and Communications Security (ACM CCS) 2023

arXiv:2304.02058 [pdf, ps, other]

A Compositional Resilience Index for Computationally Efficient Safety Analysis of Interconnected Systems

Authors: Luyao Niu, Abdullah Al Maruf, Andrew Clark, J. Sukarno Mertoguno, Radha Poovendran

Abstract: Interconnected systems such as power systems and chemical processes are often required to satisfy safety properties in the presence of faults and attacks. Verifying safety of these systems, however, is computationally challenging due to nonlinear dynamics, high dimensionality, and combinatorial number of possible faults and attacks that can be incurred by the subsystems interconnected within the n… ▽ More Interconnected systems such as power systems and chemical processes are often required to satisfy safety properties in the presence of faults and attacks. Verifying safety of these systems, however, is computationally challenging due to nonlinear dynamics, high dimensionality, and combinatorial number of possible faults and attacks that can be incurred by the subsystems interconnected within the network. In this paper, we develop a compositional resilience index to verify safety properties of interconnected systems under faults and attacks. The resilience index is a tuple serving the following two purposes. First, it quantifies how a safety property is impacted when a subsystem is compromised by faults and attacks. Second, the resilience index characterizes the needed behavior of a subsystem during normal operations to ensure safety violations will not occur when future adverse events occur. We develop a set of sufficient conditions on the dynamics of each subsystem to satisfy its safety constraint, and leverage these conditions to formulate an optimization program to compute the resilience index. When multiple subsystems are interconnected and their resilience indices are given, we show that the safety constraints of the interconnected system can be efficiently verified by solving a system of linear inequalities. We demonstrate our developed resilience index using a numerical case study on chemical reactors connected in series. △ Less

Submitted 4 April, 2023; originally announced April 2023.

arXiv:2304.02005 [pdf, other]

Risk-Aware Distributed Multi-Agent Reinforcement Learning

Authors: Abdullah Al Maruf, Luyao Niu, Bhaskar Ramasubramanian, Andrew Clark, Radha Poovendran

Abstract: Autonomous cyber and cyber-physical systems need to perform decision-making, learning, and control in unknown environments. Such decision-making can be sensitive to multiple factors, including modeling errors, changes in costs, and impacts of events in the tails of probability distributions. Although multi-agent reinforcement learning (MARL) provides a framework for learning behaviors through repe… ▽ More Autonomous cyber and cyber-physical systems need to perform decision-making, learning, and control in unknown environments. Such decision-making can be sensitive to multiple factors, including modeling errors, changes in costs, and impacts of events in the tails of probability distributions. Although multi-agent reinforcement learning (MARL) provides a framework for learning behaviors through repeated interactions with the environment by minimizing an average cost, it will not be adequate to overcome the above challenges. In this paper, we develop a distributed MARL approach to solve decision-making problems in unknown environments by learning risk-aware actions. We use the conditional value-at-risk (CVaR) to characterize the cost function that is being minimized, and define a Bellman operator to characterize the value function associated to a given state-action pair. We prove that this operator satisfies a contraction property, and that it converges to the optimal value function. We then propose a distributed MARL algorithm called the CVaR QD-Learning algorithm, and establish that value functions of individual agents reaches consensus. We identify several challenges that arise in the implementation of the CVaR QD-Learning algorithm, and present solutions to overcome these. We evaluate the CVaR QD-Learning algorithm through simulations, and demonstrate the effect of a risk parameter on value functions at consensus. △ Less

Submitted 4 April, 2023; originally announced April 2023.

arXiv:2302.10308 [pdf, other]

doi 10.1109/TPWRS.2023.3299280

A Hybrid Submodular Optimization Approach to Controlled Islanding with Post-Disturbance Stability Guarantees

Authors: Luyao Niu, Dinuka Sahanbandu, Andrew Clark, Radha Poovendran

Abstract: Disturbances may create cascading failures in power systems and lead to widespread blackouts. Controlled islanding is an effective approach to mitigate cascading failures by partitioning the power system into a set of disjoint islands. To retain the stability of the power system following disturbances, the islanding strategy should not only be minimally disruptive, but also guarantee post-disturba… ▽ More Disturbances may create cascading failures in power systems and lead to widespread blackouts. Controlled islanding is an effective approach to mitigate cascading failures by partitioning the power system into a set of disjoint islands. To retain the stability of the power system following disturbances, the islanding strategy should not only be minimally disruptive, but also guarantee post-disturbance stability. In this paper, we study the problem of synthesizing post-disturbance stability-aware controlled islanding strategies. To ensure post-disturbance stability, our computation of islanding strategies takes load-generation balance and transmission line capacity constraints into consideration, leading to a hybrid optimization problem with both discrete and continuous variables. To mitigate the computational challenge incurred when solving the hybrid optimization program, we propose the concepts of hybrid submodularity and hybrid matroid. We show that the islanding problem is equivalent to a hybrid matroid optimization program, whose objective function is hybrid supermodular. Leveraging the supermodularity property, we develop an efficient local search algorithm and show that the proposed algorithm achieves 1/2-optimality guarantee. We compare our approach with a baseline using mixed-integer linear program on IEEE 118-bus, IEEE 300-bus, ActivSg 500-bus, and Polish 2383-bus systems. Our results show that our approach outperforms the baseline in terms of the total cost incurred during islanding across all test cases. Furthermore, our proposed approach can find an islanding strategy for large-scale test cases such as Polish 2383-bus system, whereas the baseline approach becomes intractable. △ Less

Submitted 9 August, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

arXiv:2301.04587 [pdf, other]

Electric Vehicles Security and Privacy: Challenges, Solutions, and Future Needs

Authors: Alessandro Brighente, Mauro Conti, Denis Donadel, Radha Poovendran, Federico Turrin, Jianying Zhou

Abstract: Electric Vehicles (EVs) share common technologies with classical fossil-fueled cars, but they also employ novel technologies and components (e.g., Charging System and Battery Management System) that create an unexplored attack surface for malicious users. Although multiple contributions in the literature explored cybersecurity aspects of particular components of the EV ecosystem (e.g., charging in… ▽ More Electric Vehicles (EVs) share common technologies with classical fossil-fueled cars, but they also employ novel technologies and components (e.g., Charging System and Battery Management System) that create an unexplored attack surface for malicious users. Although multiple contributions in the literature explored cybersecurity aspects of particular components of the EV ecosystem (e.g., charging infrastructure), there is still no contribution to the holistic cybersecurity of EVs and their related technologies from a cyber-physical system perspective. In this paper, we provide the first in-depth study of the security and privacy threats associated with the EVs ecosystem. We analyze the threats associated with both the EV and the different charging solutions. Focusing on the Cyber-Physical Systems (CPS) paradigm, we provide a detailed analysis of all the processes that an attacker might exploit to affect the security and privacy of both drivers and the infrastructure. To address the highlighted threats, we present possible solutions that might be implemented. We also provide an overview of possible future directions to guarantee the security and privacy of the EVs ecosystem. Based on our analysis, we stress the need for EV-specific cybersecurity solutions. △ Less

Submitted 11 January, 2023; originally announced January 2023.

arXiv:2212.01688 [pdf, other]

LDL: A Defense for Label-Based Membership Inference Attacks

Authors: Arezoo Rajabi, Dinuka Sahabandu, Luyao Niu, Bhaskar Ramasubramanian, Radha Poovendran

Abstract: The data used to train deep neural network (DNN) models in applications such as healthcare and finance typically contain sensitive information. A DNN model may suffer from overfitting. Overfitted models have been shown to be susceptible to query-based attacks such as membership inference attacks (MIAs). MIAs aim to determine whether a sample belongs to the dataset used to train a classifier (membe… ▽ More The data used to train deep neural network (DNN) models in applications such as healthcare and finance typically contain sensitive information. A DNN model may suffer from overfitting. Overfitted models have been shown to be susceptible to query-based attacks such as membership inference attacks (MIAs). MIAs aim to determine whether a sample belongs to the dataset used to train a classifier (members) or not (nonmembers). Recently, a new class of label based MIAs (LAB MIAs) was proposed, where an adversary was only required to have knowledge of predicted labels of samples. Developing a defense against an adversary carrying out a LAB MIA on DNN models that cannot be retrained remains an open problem. We present LDL, a light weight defense against LAB MIAs. LDL works by constructing a high-dimensional sphere around queried samples such that the model decision is unchanged for (noisy) variants of the sample within the sphere. This sphere of label-invariance creates ambiguity and prevents a querying adversary from correctly determining whether a sample is a member or a nonmember. We analytically characterize the success rate of an adversary carrying out a LAB MIA when LDL is deployed, and show that the formulation is consistent with experimental observations. We evaluate LDL on seven datasets -- CIFAR-10, CIFAR-100, GTSRB, Face, Purchase, Location, and Texas -- with varying sizes of training data. All of these datasets have been used by SOTA LAB MIAs. Our experiments demonstrate that LDL reduces the success rate of an adversary carrying out a LAB MIA in each case. We empirically compare LDL with defenses against LAB MIAs that require retraining of DNN models, and show that LDL performs favorably despite not needing to retrain the DNNs. △ Less

Submitted 16 December, 2022; v1 submitted 3 December, 2022; originally announced December 2022.

Comments: to appear in ACM ASIA Conference on Computer and Communications Security (ACM ASIACCS 2023)

arXiv:2208.14282 [pdf, ps, other]

A Timing-Based Framework for Designing Resilient Cyber-Physical Systems under Safety Constraint

Authors: Abdullah Al Maruf, Luyao Niu, Andrew Clark, J. Sukarno Mertoguno, Radha Poovendran

Abstract: Cyber-physical systems (CPS) are required to satisfy safety constraints in various application domains such as robotics, industrial manufacturing systems, and power systems. Faults and cyber attacks have been shown to cause safety violations, which can damage the system and endanger human lives. Resilient architectures have been proposed to ensure safety of CPS under such faults and attacks via me… ▽ More Cyber-physical systems (CPS) are required to satisfy safety constraints in various application domains such as robotics, industrial manufacturing systems, and power systems. Faults and cyber attacks have been shown to cause safety violations, which can damage the system and endanger human lives. Resilient architectures have been proposed to ensure safety of CPS under such faults and attacks via methodologies including redundancy and restarting from safe operating conditions. The existing resilient architectures for CPS utilize different mechanisms to guarantee safety, and currently there is no approach to compare them. Moreover, the analysis and design undertaken for CPS employing one architecture is not readily extendable to another. In this paper, we propose a timing-based framework for CPS employing various resilient architectures and develop a common methodology for safety analysis and computation of control policies and design parameters. Using the insight that the cyber subsystem operates in one out of a finite number of statuses, we first develop a hybrid system model that captures CPS adopting any of these architectures. Based on the hybrid system, we formulate the problem of joint computation of control policies and associated timing parameters for CPS to satisfy a given safety constraint and derive sufficient conditions for the solution. Utilizing the derived conditions, we provide an algorithm to compute control policies and timing parameters relevant to the employed architecture. We also note that our solution can be applied to a wide class of CPS with polynomial dynamics and also allows incorporation of new architectures. We verify our proposed framework by performing a case study on adaptive cruise control of vehicles. △ Less

Submitted 1 September, 2022; v1 submitted 30 August, 2022; originally announced August 2022.

arXiv:2208.02857 [pdf, other]

Identity-Based Authentication for On-Demand Charging of Electric Vehicles

Authors: Surudhi Asokraj, Tommaso Bianchi, Alessandro Brighente, Mauro Conti, Radha Poovendran

Abstract: Dynamic wireless power transfer provides means for charging Electric Vehicles (EVs) while driving, avoiding stopping for charging and hence fostering their widespread adoption. Researchers devoted much effort over the last decade to provide a reliable infrastructure for potential users to improve comfort and time management. Due to the severe security and performance system requirements, the diffe… ▽ More Dynamic wireless power transfer provides means for charging Electric Vehicles (EVs) while driving, avoiding stopping for charging and hence fostering their widespread adoption. Researchers devoted much effort over the last decade to provide a reliable infrastructure for potential users to improve comfort and time management. Due to the severe security and performance system requirements, the different scheme proposed in last years lack of a unified protocol involving the modern architecture model with merged authentication and billing processes. Furthermore, they require the continuous interaction of the trusted entity during the process, increasing the delay for the communication and reducing security due to the large number of message exchanges. In this paper, we propose a secure, computationally lightweight, unified protocol for fast authentication and billing that provides on-demand dynamic charging to comprehensively deal with all the computational and security constraints. The protocol employs an ID-based public encryption scheme to manage mutual authentication and pseudonyms to preserve the user's identity across multiple charging processes. Compared to state-of-the-art authentication protocols, our proposal overcomes the problem of overwhelming interactions and provides public scheme security against the use of simple operations in wide open communications without impacting on performance. △ Less

Submitted 4 August, 2022; originally announced August 2022.

Comments: 13 pages

arXiv:2207.05937 [pdf, other]

Game of Trojans: A Submodular Byzantine Approach

Authors: Dinuka Sahabandu, Arezoo Rajabi, Luyao Niu, Bo Li, Bhaskar Ramasubramanian, Radha Poovendran

Abstract: Machine learning models in the wild have been shown to be vulnerable to Trojan attacks during training. Although many detection mechanisms have been proposed, strong adaptive attackers have been shown to be effective against them. In this paper, we aim to answer the questions considering an intelligent and adaptive adversary: (i) What is the minimal amount of instances required to be Trojaned by a… ▽ More Machine learning models in the wild have been shown to be vulnerable to Trojan attacks during training. Although many detection mechanisms have been proposed, strong adaptive attackers have been shown to be effective against them. In this paper, we aim to answer the questions considering an intelligent and adaptive adversary: (i) What is the minimal amount of instances required to be Trojaned by a strong attacker? and (ii) Is it possible for such an attacker to bypass strong detection mechanisms? We provide an analytical characterization of adversarial capability and strategic interactions between the adversary and detection mechanism that take place in such models. We characterize adversary capability in terms of the fraction of the input dataset that can be embedded with a Trojan trigger. We show that the loss function has a submodular structure, which leads to the design of computationally efficient algorithms to determine this fraction with provable bounds on optimality. We propose a Submodular Trojan algorithm to determine the minimal fraction of samples to inject a Trojan trigger. To evade detection of the Trojaned model, we model strategic interactions between the adversary and Trojan detection mechanism as a two-player game. We show that the adversary wins the game with probability one, thus bypassing detection. We establish this by proving that output probability distributions of a Trojan model and a clean model are identical when following the Min-Max (MM) Trojan algorithm. We perform extensive evaluations of our algorithms on MNIST, CIFAR-10, and EuroSAT datasets. The results show that (i) with Submodular Trojan algorithm, the adversary needs to embed a Trojan trigger into a very small fraction of samples to achieve high accuracy on both Trojan and clean samples, and (ii) the MM Trojan algorithm yields a trained Trojan model that evades detection with probability 1. △ Less

Submitted 12 July, 2022; originally announced July 2022.

Comments: Submitted to GameSec 2022

arXiv:2205.10292 [pdf, ps, other]

doi 10.1109/VTC2023-Spring57618.2023.10199651

QEVSEC: Quick Electric Vehicle SEcure Charging via Dynamic Wireless Power Transfer

Authors: Tommaso Bianchi, Surudhi Asokraj, Alessandro Brighente, Mauro Conti, Radha Poovendran

Abstract: Dynamic Wireless Power Transfer (DWPT) can be used for on-demand recharging of Electric Vehicles (EV) while driving. However, DWPT raises numerous security and privacy concerns. Recently, researchers demonstrated that DWPT systems are vulnerable to adversarial attacks. In an EV charging scenario, an attacker can prevent the authorized customer from charging, obtain a free charge by billing a victi… ▽ More Dynamic Wireless Power Transfer (DWPT) can be used for on-demand recharging of Electric Vehicles (EV) while driving. However, DWPT raises numerous security and privacy concerns. Recently, researchers demonstrated that DWPT systems are vulnerable to adversarial attacks. In an EV charging scenario, an attacker can prevent the authorized customer from charging, obtain a free charge by billing a victim user and track a target vehicle. State-of-the-art authentication schemes relying on centralized solutions are either vulnerable to various attacks or have high computational complexity, making them unsuitable for a dynamic scenario. In this paper, we propose Quick Electric Vehicle SEcure Charging (QEVSEC), a novel, secure, and efficient authentication protocol for the dynamic charging of EVs. Our idea for QEVSEC originates from multiple vulnerabilities we found in the state-of-the-art protocol that allows tracking of user activity and is susceptible to replay attacks. Based on these observations, the proposed protocol solves these issues and achieves lower computational complexity by using only primitive cryptographic operations in a very short message exchange. QEVSEC provides scalability and a reduced cost in each iteration, thus lowering the impact on the power needed from the grid. △ Less

Submitted 28 August, 2023; v1 submitted 20 May, 2022; originally announced May 2022.

Comments: 6 pages, conference

Journal ref: 2023 IEEE 97th Vehicular Technology Conference (VTC2023-Spring), Florence, Italy, 2023, pp. 1-6

arXiv:2204.06624 [pdf, other]

A Natural Language Processing Approach for Instruction Set Architecture Identification

Authors: Dinuka Sahabandu, Sukarno Mertoguno, Radha Poovendran

Abstract: Binary analysis of software is a critical step in cyber forensics applications such as program vulnerability assessment and malware detection. This involves interpreting instructions executed by software and often necessitates converting the software's binary file data to assembly language. The conversion process requires information about the binary file's target instruction set architecture (ISA… ▽ More Binary analysis of software is a critical step in cyber forensics applications such as program vulnerability assessment and malware detection. This involves interpreting instructions executed by software and often necessitates converting the software's binary file data to assembly language. The conversion process requires information about the binary file's target instruction set architecture (ISA). However, ISA information might not be included in binary files due to compilation errors, partial downloads, or adversarial corruption of file metadata. Machine learning (ML) is a promising methodology that can be used to identify the target ISA using binary data in the object code section of binary files. In this paper we propose a binary code feature extraction model to improve the accuracy and scalability of ML-based ISA identification methods. Our feature extraction model can be used in the absence of domain knowledge about the ISAs. Specifically, we adapt models from natural language processing (NLP) to i) identify successive byte patterns commonly observed in binary codes, ii) estimate the significance of each byte pattern to a binary file, and iii) estimate the relevance of each byte pattern in distinguishing between ISAs. We introduce character-level features of encoded binaries to identify fine-grained bit patterns inherent to each ISA. We use a dataset with binaries from 12 different ISAs to evaluate our approach. Empirical evaluations show that using our byte-level features in ML-based ISA identification results in an 8% higher accuracy than the state-of-the-art features based on byte-histograms and byte pattern signatures. We observe that character-level features allow reducing the size of the feature set by up to 16x while maintaining accuracy above 97%. △ Less

Submitted 13 April, 2022; originally announced April 2022.

Comments: 13 pages, 9 figures, submitted to IEEE TIFS

arXiv:2204.00514 [pdf, other]

An Analytical Framework for Control Synthesis of Cyber-Physical Systems with Safety Guarantee

Authors: Luyao Niu, Abdullah Al Maruf, Andrew Clark, J. Sukarno Mertoguno, Radha Poovendran

Abstract: Cyber-physical systems (CPS) are required to operate safely under fault and malicious attacks. The simplex architecture and the recently proposed cyber resilient architectures, e.g., Byzantine fault tolerant++ (BFT++), provide safety for CPS under faults and malicious cyber attacks, respectively. However, these existing architectures make use of different timing parameters and implementations to p… ▽ More Cyber-physical systems (CPS) are required to operate safely under fault and malicious attacks. The simplex architecture and the recently proposed cyber resilient architectures, e.g., Byzantine fault tolerant++ (BFT++), provide safety for CPS under faults and malicious cyber attacks, respectively. However, these existing architectures make use of different timing parameters and implementations to provide safety, and are seemingly unrelated. In this paper, we propose an analytical framework to represent the simplex, BFT++ and other practical cyber resilient architectures (CRAs). We construct a hybrid system that models CPS adopting any of these architectures. We derive sufficient conditions via our proposed framework under which a control policy is guaranteed to be safe. We present an algorithm to synthesize the control policy. We validate the proposed framework using a case study on lateral control of a Boeing 747, and demonstrate that our proposed approach ensures safety of the system. △ Less

Submitted 1 April, 2022; originally announced April 2022.

arXiv:2204.00512 [pdf, ps, other]

A Compositional Approach to Safety-Critical Resilient Control for Systems with Coupled Dynamics

Authors: Abdullah Al Maruf, Luyao Niu, Andrew Clark, J. Sukarno Mertoguno, Radha Poovendran

Abstract: Complex, interconnected Cyber-physical Systems (CPS) are increasingly common in applications including smart grids and transportation. Ensuring safety of interconnected systems whose dynamics are coupled is challenging because the effects of faults and attacks in one sub-system can propagate to other sub-systems and lead to safety violations. In this paper, we study the problem of safety-critical… ▽ More Complex, interconnected Cyber-physical Systems (CPS) are increasingly common in applications including smart grids and transportation. Ensuring safety of interconnected systems whose dynamics are coupled is challenging because the effects of faults and attacks in one sub-system can propagate to other sub-systems and lead to safety violations. In this paper, we study the problem of safety-critical control for CPS with coupled dynamics when some sub-systems are subject to failure or attack. We first propose resilient-safety indices (RSIs) for the faulty or compromised sub-systems that bound the worst-case impacts of faulty or compromised sub-systems on a set of specified safety constraints. By incorporating the RSIs, we provide a sufficient condition for the synthesis of control policies in each failure- and attack- free sub-systems. The synthesized control policies compensate for the impacts of the faulty or compromised sub-systems to guarantee safety. We formulate sum-of-square optimization programs to compute the RSIs and the safety-ensuring control policies. We present a case study that applies our proposed approach on the temperature regulation of three coupled rooms. The case study demonstrates that control policies obtained using our algorithm guarantee system's safety constraints. △ Less

Submitted 1 April, 2022; originally announced April 2022.

arXiv:2203.15506 [pdf, other]

Trojan Horse Training for Breaking Defenses against Backdoor Attacks in Deep Learning

Authors: Arezoo Rajabi, Bhaskar Ramasubramanian, Radha Poovendran

Abstract: Machine learning (ML) models that use deep neural networks are vulnerable to backdoor attacks. Such attacks involve the insertion of a (hidden) trigger by an adversary. As a consequence, any input that contains the trigger will cause the neural network to misclassify the input to a (single) target class, while classifying other inputs without a trigger correctly. ML models that contain a backdoor… ▽ More Machine learning (ML) models that use deep neural networks are vulnerable to backdoor attacks. Such attacks involve the insertion of a (hidden) trigger by an adversary. As a consequence, any input that contains the trigger will cause the neural network to misclassify the input to a (single) target class, while classifying other inputs without a trigger correctly. ML models that contain a backdoor are called Trojan models. Backdoors can have severe consequences in safety-critical cyber and cyber physical systems when only the outputs of the model are available. Defense mechanisms have been developed and illustrated to be able to distinguish between outputs from a Trojan model and a non-Trojan model in the case of a single-target backdoor attack with accuracy > 96 percent. Understanding the limitations of a defense mechanism requires the construction of examples where the mechanism fails. Current single-target backdoor attacks require one trigger per target class. We introduce a new, more general attack that will enable a single trigger to result in misclassification to more than one target class. Such a misclassification will depend on the true (actual) class that the input belongs to. We term this category of attacks multi-target backdoor attacks. We demonstrate that a Trojan model with either a single-target or multi-target trigger can be trained so that the accuracy of a defense mechanism that seeks to distinguish between outputs coming from a Trojan and a non-Trojan model will be reduced. Our approach uses the non-Trojan model as a teacher for the Trojan model and solves a min-max optimization problem between the Trojan model and defense mechanism. Empirical evaluations demonstrate that our training procedure reduces the accuracy of a state-of-the-art defense mechanism from >96 to 0 percent. △ Less

Submitted 24 March, 2022; originally announced March 2022.

Comments: Submitted to conference

arXiv:2203.10165 [pdf, other]

Privacy-Preserving Reinforcement Learning Beyond Expectation

Authors: Arezoo Rajabi, Bhaskar Ramasubramanian, Abdullah Al Maruf, Radha Poovendran

Abstract: Cyber and cyber-physical systems equipped with machine learning algorithms such as autonomous cars share environments with humans. In such a setting, it is important to align system (or agent) behaviors with the preferences of one or more human users. We consider the case when an agent has to learn behaviors in an unknown environment. Our goal is to capture two defining characteristics of humans:… ▽ More Cyber and cyber-physical systems equipped with machine learning algorithms such as autonomous cars share environments with humans. In such a setting, it is important to align system (or agent) behaviors with the preferences of one or more human users. We consider the case when an agent has to learn behaviors in an unknown environment. Our goal is to capture two defining characteristics of humans: i) a tendency to assess and quantify risk, and ii) a desire to keep decision making hidden from external parties. We incorporate cumulative prospect theory (CPT) into the objective of a reinforcement learning (RL) problem for the former. For the latter, we use differential privacy. We design an algorithm to enable an RL agent to learn policies to maximize a CPT-based objective in a privacy-preserving manner and establish guarantees on the privacy of value functions learned by the algorithm when rewards are sufficiently close. This is accomplished through adding a calibrated noise using a Gaussian process mechanism at each step. Through empirical evaluations, we highlight a privacy-utility tradeoff and demonstrate that the RL agent is able to learn behaviors that are aligned with that of a human user in the same environment in a privacy-preserving manner △ Less

Submitted 18 March, 2022; originally announced March 2022.

Comments: Submitted to conference. arXiv admin note: text overlap with arXiv:2104.00540

arXiv:2203.05266 [pdf, other]

EVExchange: A Relay Attack on Electric Vehicle Charging System

Authors: Mauro Conti, Denis Donadel, Radha Poovendran, Federico Turrin

Abstract: To support the increasing spread of Electric Vehicles (EVs), Charging Stations (CSs) are being installed worldwide. The new generation of CSs employs the Vehicle-To-Grid (V2G) paradigm by implementing novel standards such as the ISO 15118. This standard enables high-level communication between the vehicle and the charging column, helps manage the charge smartly, and simplifies the payment phase. T… ▽ More To support the increasing spread of Electric Vehicles (EVs), Charging Stations (CSs) are being installed worldwide. The new generation of CSs employs the Vehicle-To-Grid (V2G) paradigm by implementing novel standards such as the ISO 15118. This standard enables high-level communication between the vehicle and the charging column, helps manage the charge smartly, and simplifies the payment phase. This novel charging paradigm, which connects the Smart Grid to external networks (e.g., EVs and CSs), has not been thoroughly examined yet. Therefore, it may lead to dangerous vulnerability surfaces and new research challenges. In this paper, we present EVExchange, the first attack to steal energy during a charging session in a V2G communication: i.e., charging the attacker's car while letting the victim pay for it. Furthermore, if reverse charging flow is enabled, the attacker can even sell the energy available on the victim's car! Thus, getting the economic profit of this selling, and leaving the victim with a completely discharged battery. We developed a virtual and a physical testbed in which we validate the attack and prove its effectiveness in stealing the energy. To prevent the attack, we propose a lightweight modification of the ISO 15118 protocol to include a distance bounding algorithm. Finally, we validated the countermeasure on our testbeds. Our results show that the proposed countermeasure can identify all the relay attack attempts while being transparent to the user. △ Less

Submitted 7 July, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

Comments: 20 pages, 6 figures

arXiv:2202.09489 [pdf, other]

Shaping Advice in Deep Reinforcement Learning

Authors: Baicen Xiao, Bhaskar Ramasubramanian, Radha Poovendran

Abstract: Reinforcement learning involves agents interacting with an environment to complete tasks. When rewards provided by the environment are sparse, agents may not receive immediate feedback on the quality of actions that they take, thereby affecting learning of policies. In this paper, we propose to methods to augment the reward signal from the environment with an additional reward termed shaping advic… ▽ More Reinforcement learning involves agents interacting with an environment to complete tasks. When rewards provided by the environment are sparse, agents may not receive immediate feedback on the quality of actions that they take, thereby affecting learning of policies. In this paper, we propose to methods to augment the reward signal from the environment with an additional reward termed shaping advice in both single and multi-agent reinforcement learning. The shaping advice is specified as a difference of potential functions at consecutive time-steps. Each potential function is a function of observations and actions of the agents. The use of potential functions is underpinned by an insight that the total potential when starting from any state and returning to the same state is always equal to zero. We show through theoretical analyses and experimental validation that the shaping advice does not distract agents from completing tasks specified by the environment reward. Theoretically, we prove that the convergence of policy gradients and value functions when using shaping advice implies the convergence of these quantities in the absence of shaping advice. We design two algorithms- Shaping Advice in Single-agent reinforcement learning (SAS) and Shaping Advice in Multi-agent reinforcement learning (SAM). Shaping advice in SAS and SAM needs to be specified only once at the start of training, and can easily be provided by non-experts. Experimentally, we evaluate SAS and SAM on two tasks in single-agent environments and three tasks in multi-agent environments that have sparse rewards. We observe that using shaping advice results in agents learning policies to complete tasks faster, and obtain higher rewards than algorithms that do not use shaping advice. △ Less

Submitted 18 February, 2022; originally announced February 2022.

Comments: Submitted to journal special issue. arXiv admin note: substantial text overlap with arXiv:2103.15941, arXiv:1907.08823

arXiv:2201.04612 [pdf, other]

Agent-Temporal Attention for Reward Redistribution in Episodic Multi-Agent Reinforcement Learning

Authors: Baicen Xiao, Bhaskar Ramasubramanian, Radha Poovendran

Abstract: This paper considers multi-agent reinforcement learning (MARL) tasks where agents receive a shared global reward at the end of an episode. The delayed nature of this reward affects the ability of the agents to assess the quality of their actions at intermediate time-steps. This paper focuses on developing methods to learn a temporal redistribution of the episodic reward to obtain a dense reward si… ▽ More This paper considers multi-agent reinforcement learning (MARL) tasks where agents receive a shared global reward at the end of an episode. The delayed nature of this reward affects the ability of the agents to assess the quality of their actions at intermediate time-steps. This paper focuses on developing methods to learn a temporal redistribution of the episodic reward to obtain a dense reward signal. Solving such MARL problems requires addressing two challenges: identifying (1) relative importance of states along the length of an episode (along time), and (2) relative importance of individual agents' states at any single time-step (among agents). In this paper, we introduce Agent-Temporal Attention for Reward Redistribution in Episodic Multi-Agent Reinforcement Learning (AREL) to address these two challenges. AREL uses attention mechanisms to characterize the influence of actions on state transitions along trajectories (temporal attention), and how each agent is affected by other agents at each time-step (agent attention). The redistributed rewards predicted by AREL are dense, and can be integrated with any given MARL algorithm. We evaluate AREL on challenging tasks from the Particle World environment and the StarCraft Multi-Agent Challenge. AREL results in higher rewards in Particle World, and improved win rates in StarCraft compared to three state-of-the-art reward redistribution methods. Our code is available at https://github.com/baicenxiao/AREL. △ Less

Submitted 12 January, 2022; originally announced January 2022.

Comments: Extended version of paper accepted for Oral Presentation at the International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2022

arXiv:2108.01628 [pdf, other]

A Game-Theoretic Framework for Controlled Islanding in the Presence of Adversaries

Authors: Luyao Niu, Dinuka Sahabandu, Andrew Clark, Radha Poovendran

Abstract: Controlled islanding effectively mitigates cascading failures by partitioning the power system into a set of disjoint islands. In this paper, we study the controlled islanding problem of a power system under disturbances introduced by a malicious adversary. We formulate the interaction between the grid operator and adversary using a game-theoretic framework. The grid operator first computes a cont… ▽ More Controlled islanding effectively mitigates cascading failures by partitioning the power system into a set of disjoint islands. In this paper, we study the controlled islanding problem of a power system under disturbances introduced by a malicious adversary. We formulate the interaction between the grid operator and adversary using a game-theoretic framework. The grid operator first computes a controlled islanding strategy, along with the power generation for the post-islanding system to guarantee stability. The adversary observes the strategies of the grid operator. The adversary then identifies critical substations of the power system to compromise and trips the transmission lines that are connected with compromised substations. For our game formulation, we propose a double oracle algorithm based approach that solves the best response for each player. We show that the best responses for the grid operator and adversary can be formulated as mixed integer linear programs. In addition, the best response of the adversary is equivalent to a submodular maximization problem under a cardinality constraint, which can be approximated up to a $(1-\frac{1}{e})$ optimality bound in polynomial time. We compare the proposed approach with a baseline where the grid operator computes an islanding strategy by minimizing the power flow disruption without considering the possible response from the adversary. We evaluate both approaches using IEEE 9-bus, 14-bus, 30-bus, 39-bus, 57-bus, and 118-bus power system case study data. Our proposed approach achieves better performance than the baseline in about $44\%$ of test cases, and on average it incurs about 12.27 MW less power flow disruption. △ Less

Submitted 27 September, 2021; v1 submitted 3 August, 2021; originally announced August 2021.

arXiv:2104.00540 [pdf, other]

Reinforcement Learning Beyond Expectation

Authors: Bhaskar Ramasubramanian, Luyao Niu, Andrew Clark, Radha Poovendran

Abstract: The inputs and preferences of human users are important considerations in situations where these users interact with autonomous cyber or cyber-physical systems. In these scenarios, one is often interested in aligning behaviors of the system with the preferences of one or more human users. Cumulative prospect theory (CPT) is a paradigm that has been empirically shown to model a tendency of humans t… ▽ More The inputs and preferences of human users are important considerations in situations where these users interact with autonomous cyber or cyber-physical systems. In these scenarios, one is often interested in aligning behaviors of the system with the preferences of one or more human users. Cumulative prospect theory (CPT) is a paradigm that has been empirically shown to model a tendency of humans to view gains and losses differently. In this paper, we consider a setting where an autonomous agent has to learn behaviors in an unknown environment. In traditional reinforcement learning, these behaviors are learned through repeated interactions with the environment by optimizing an expected utility. In order to endow the agent with the ability to closely mimic the behavior of human users, we optimize a CPT-based cost. We introduce the notion of the CPT-value of an action taken in a state, and establish the convergence of an iterative dynamic programming-based approach to estimate this quantity. We develop two algorithms to enable agents to learn policies to optimize the CPT-vale, and evaluate these algorithms in environments where a target state has to be reached while avoiding obstacles. We demonstrate that behaviors of the agent learned using these algorithms are better aligned with that of a human user who might be placed in the same environment, and is significantly improved over a baseline that optimizes an expected utility. △ Less

Submitted 29 March, 2021; originally announced April 2021.

arXiv:2103.15941 [pdf, other]

Shaping Advice in Deep Multi-Agent Reinforcement Learning

Authors: Baicen Xiao, Bhaskar Ramasubramanian, Radha Poovendran

Abstract: Multi-agent reinforcement learning involves multiple agents interacting with each other and a shared environment to complete tasks. When rewards provided by the environment are sparse, agents may not receive immediate feedback on the quality of actions that they take, thereby affecting learning of policies. In this paper, we propose a method called Shaping Advice in deep Multi-agent reinforcement… ▽ More Multi-agent reinforcement learning involves multiple agents interacting with each other and a shared environment to complete tasks. When rewards provided by the environment are sparse, agents may not receive immediate feedback on the quality of actions that they take, thereby affecting learning of policies. In this paper, we propose a method called Shaping Advice in deep Multi-agent reinforcement learning (SAM) to augment the reward signal from the environment with an additional reward termed shaping advice. The shaping advice is given by a difference of potential functions at consecutive time-steps. Each potential function is a function of observations and actions of the agents. The shaping advice needs to be specified only once at the start of training, and can be easily provided by non-experts. We show through theoretical analyses and experimental validation that shaping advice provided by SAM does not distract agents from completing tasks specified by the environment reward. Theoretically, we prove that convergence of policy gradients and value functions when using SAM implies convergence of these quantities in the absence of SAM. Experimentally, we evaluate SAM on three tasks in the multi-agent Particle World environment that have sparse rewards. We observe that using SAM results in agents learning policies to complete tasks faster, and obtain higher rewards than: i) using sparse rewards alone; ii) a state-of-the-art reward redistribution method. △ Less

Submitted 29 March, 2021; originally announced March 2021.

arXiv:2103.15894 [pdf, ps, other]

Scalable Planning in Multi-Agent MDPs

Authors: Dinuka Sahabandu, Luyao Niu, Andrew Clark, Radha Poovendran

Abstract: Multi-agent Markov Decision Processes (MMDPs) arise in a variety of applications including target tracking, control of multi-robot swarms, and multiplayer games. A key challenge in MMDPs occurs when the state and action spaces grow exponentially in the number of agents, making computation of an optimal policy computationally intractable for medium- to large-scale problems. One property that has be… ▽ More Multi-agent Markov Decision Processes (MMDPs) arise in a variety of applications including target tracking, control of multi-robot swarms, and multiplayer games. A key challenge in MMDPs occurs when the state and action spaces grow exponentially in the number of agents, making computation of an optimal policy computationally intractable for medium- to large-scale problems. One property that has been exploited to mitigate this complexity is transition independence, in which each agent's transition probabilities are independent of the states and actions of other agents. Transition independence enables factorization of the MMDP and computation of local agent policies but does not hold for arbitrary MMDPs. In this paper, we propose an approximate transition dependence property, called $δ$-transition dependence and develop a metric for quantifying how far an MMDP deviates from transition independence. Our definition of $δ$-transition dependence recovers transition independence as a special case when $δ$ is zero. We develop a polynomial time algorithm in the number of agents that achieves a provable bound on the global optimum when the reward functions are monotone increasing and submodular in the agent actions. We evaluate our approach on two case studies, namely, multi-robot control and multi-agent patrolling example. △ Less

Submitted 29 March, 2021; originally announced March 2021.

arXiv:2101.03253 [pdf, other]

Adaptive Learning in Two-Player Stackelberg Games with Application to Network Security

Authors: Guosong Yang, Radha Poovendran, João P. Hespanha

Abstract: We study a two-player Stackelberg game with incomplete information such that the follower's strategy belongs to a known family of parameterized functions with an unknown parameter vector. We design an adaptive learning approach to simultaneously estimate the unknown parameter and minimize the leader's cost, based on adaptive control techniques and hysteresis switching. Our approach guarantees that… ▽ More We study a two-player Stackelberg game with incomplete information such that the follower's strategy belongs to a known family of parameterized functions with an unknown parameter vector. We design an adaptive learning approach to simultaneously estimate the unknown parameter and minimize the leader's cost, based on adaptive control techniques and hysteresis switching. Our approach guarantees that the leader's cost predicted using the parameter estimate becomes indistinguishable from its actual cost in finite time, up to a preselected, arbitrarily small error threshold. Also, the first-order necessary condition for optimality holds asymptotically for the predicted cost. Additionally, if a persistent excitation condition holds, then the parameter estimation error becomes bounded by a preselected, arbitrarily small threshold in finite time as well. For the case where there is a mismatch between the follower's strategy and the parameterized function that is known to the leader, our approach is able to guarantee the same convergence results for error thresholds larger than the size of the mismatch. The algorithms and the convergence results are illustrated via a simulation example in the domain of network security. △ Less

Submitted 8 January, 2021; originally announced January 2021.

MSC Class: 91A26; 91A65 (Primary) 37N40; 93C40; 65L20 (Secondary)

arXiv:2009.09511 [pdf, other]

Safety-Critical Online Control with Adversarial Disturbances

Authors: Bhaskar Ramasubramanian, Baicen Xiao, Linda Bushnell, Radha Poovendran

Abstract: This paper studies the control of safety-critical dynamical systems in the presence of adversarial disturbances. We seek to synthesize state-feedback controllers to minimize a cost incurred due to the disturbance, while respecting a safety constraint. The safety constraint is given by a bound on an H-inf norm, while the cost is specified as an upper bound on the H-2 norm of the system. We consider… ▽ More This paper studies the control of safety-critical dynamical systems in the presence of adversarial disturbances. We seek to synthesize state-feedback controllers to minimize a cost incurred due to the disturbance, while respecting a safety constraint. The safety constraint is given by a bound on an H-inf norm, while the cost is specified as an upper bound on the H-2 norm of the system. We consider an online setting where costs at each time are revealed only after the controller at that time is chosen. We propose an iterative approach to the synthesis of the controller by solving a modified discrete-time Riccati equation. Solutions of this equation enforce the safety constraint. We compare the cost of this controller with that of the optimal controller when one has complete knowledge of disturbances and costs in hindsight. We show that the regret function, which is defined as the difference between these costs, varies logarithmically with the time horizon. We validate our approach on a process control setup that is subject to two kinds of adversarial attacks. △ Less

Submitted 20 September, 2020; originally announced September 2020.

Comments: Paper accepted to the Conference on Decision and Control (CDC), 2020

arXiv:2007.13272 [pdf, other]

Privacy-Preserving Resilience of Cyber-Physical Systems to Adversaries

Authors: Bhaskar Ramasubramanian, Luyao Niu, Andrew Clark, Linda Bushnell, Radha Poovendran

Abstract: A cyber-physical system (CPS) is expected to be resilient to more than one type of adversary. In this paper, we consider a CPS that has to satisfy a linear temporal logic (LTL) objective in the presence of two kinds of adversaries. The first adversary has the ability to tamper with inputs to the CPS to influence satisfaction of the LTL objective. The interaction of the CPS with this adversary is m… ▽ More A cyber-physical system (CPS) is expected to be resilient to more than one type of adversary. In this paper, we consider a CPS that has to satisfy a linear temporal logic (LTL) objective in the presence of two kinds of adversaries. The first adversary has the ability to tamper with inputs to the CPS to influence satisfaction of the LTL objective. The interaction of the CPS with this adversary is modeled as a stochastic game. We synthesize a controller for the CPS to maximize the probability of satisfying the LTL objective under any policy of this adversary. The second adversary is an eavesdropper who can observe labeled trajectories of the CPS generated from the previous step. It could then use this information to launch other kinds of attacks. A labeled trajectory is a sequence of labels, where a label is associated to a state and is linked to the satisfaction of the LTL objective at that state. We use differential privacy to quantify the indistinguishability between states that are related to each other when the eavesdropper sees a labeled trajectory. Two trajectories of equal length will be differentially private if they are differentially private at each state along the respective trajectories. We use a skewed Kantorovich metric to compute distances between probability distributions over states resulting from actions chosen according to policies from related states in order to quantify differential privacy. Moreover, we do this in a manner that does not affect the satisfaction probability of the LTL objective. We validate our approach on a simulation of a UAV that has to satisfy an LTL objective in an adversarial environment. △ Less

Submitted 26 July, 2020; originally announced July 2020.

Comments: Accepted to the IEEE Conference on Decision and Control (CDC), 2020

arXiv:2007.12501 [pdf, other]

Secure Control in Partially Observable Environments to Satisfy LTL Specifications

Authors: Bhaskar Ramasubramanian, Luyao Niu, Andrew Clark, Linda Bushnell, Radha Poovendran

Abstract: This paper studies the synthesis of control policies for an agent that has to satisfy a temporal logic specification in a partially observable environment, in the presence of an adversary. The interaction of the agent (defender) with the adversary is modeled as a partially observable stochastic game. The goal is to generate a defender policy to maximize satisfaction of a given temporal logic speci… ▽ More This paper studies the synthesis of control policies for an agent that has to satisfy a temporal logic specification in a partially observable environment, in the presence of an adversary. The interaction of the agent (defender) with the adversary is modeled as a partially observable stochastic game. The goal is to generate a defender policy to maximize satisfaction of a given temporal logic specification under any adversary policy. The search for policies is limited to the space of finite state controllers, which leads to a tractable approach to determine policies. We relate the satisfaction of the specification to reaching (a subset of) recurrent states of a Markov chain. We present an algorithm to determine a set of defender and adversary finite state controllers of fixed sizes that will satisfy the temporal logic specification, and prove that it is sound. We then propose a value-iteration algorithm to maximize the probability of satisfying the temporal logic specification under finite state controllers of fixed sizes. Lastly, we extend this setting to the scenario where the size of the finite state controller of the defender can be increased to improve the satisfaction probability. We illustrate our approach with an example. △ Less

Submitted 4 November, 2020; v1 submitted 22 July, 2020; originally announced July 2020.

Comments: Provisionally accepted to the IEEE Transactions on Automatic Control. arXiv admin note: text overlap with arXiv:1903.06873

arXiv:2007.12327 [pdf, other]

Stochastic Dynamic Information Flow Tracking Game using Supervised Learning for Detecting Advanced Persistent Threats

Authors: Shana Moothedath, Dinuka Sahabandu, Joey Allen, Linda Bushnell, Wenke Lee, Radha Poovendran

Abstract: Advanced persistent threats (APTs) are organized prolonged cyberattacks by sophisticated attackers. Although APT activities are stealthy, they interact with the system components and these interactions lead to information flows. Dynamic Information Flow Tracking (DIFT) has been proposed as one of the effective ways to detect APTs using the information flows. However, wide range security analysis u… ▽ More Advanced persistent threats (APTs) are organized prolonged cyberattacks by sophisticated attackers. Although APT activities are stealthy, they interact with the system components and these interactions lead to information flows. Dynamic Information Flow Tracking (DIFT) has been proposed as one of the effective ways to detect APTs using the information flows. However, wide range security analysis using DIFT results in a significant increase in performance overhead and high rates of false-positives and false-negatives generated by DIFT. In this paper, we model the strategic interaction between APT and DIFT as a non-cooperative stochastic game. The game unfolds on a state space constructed from an information flow graph (IFG) that is extracted from the system log. The objective of the APT in the game is to choose transitions in the IFG to find an optimal path in the IFG from an entry point of the attack to an attack target. On the other hand, the objective of DIFT is to dynamically select nodes in the IFG to perform security analysis for detecting APT. Our game model has imperfect information as the players do not have information about the actions of the opponent. We consider two scenarios of the game (i) when the false-positive and false-negative rates are known to both players and (ii) when the false-positive and false-negative rates are unknown to both players. Case (i) translates to a game model with complete information and we propose a value iteration-based algorithm and prove the convergence. Case (ii) translates to a game with unknown transition probabilities. In this case, we propose Hierarchical Supervised Learning (HSL) algorithm that integrates a neural network, to predict the value vector of the game, with a policy iteration algorithm to compute an approximate equilibrium. We implemented our algorithms on real attack datasets and validated the performance of our approach. △ Less

Submitted 25 June, 2021; v1 submitted 23 July, 2020; originally announced July 2020.

arXiv:2007.00076 [pdf, other]

A Reinforcement Learning Approach for Dynamic Information Flow Tracking Games for Detecting Advanced Persistent Threats

Authors: Dinuka Sahabandu, Shana Moothedath, Joey Allen, Linda Bushnell, Wenke Lee, Radha Poovendran

Abstract: Advanced Persistent Threats (APTs) are stealthy attacks that threaten the security and privacy of sensitive information. Interactions of APTs with victim system introduce information flows that are recorded in the system logs. Dynamic Information Flow Tracking (DIFT) is a promising detection mechanism for detecting APTs. DIFT taints information flows originating at system entities that are suscept… ▽ More Advanced Persistent Threats (APTs) are stealthy attacks that threaten the security and privacy of sensitive information. Interactions of APTs with victim system introduce information flows that are recorded in the system logs. Dynamic Information Flow Tracking (DIFT) is a promising detection mechanism for detecting APTs. DIFT taints information flows originating at system entities that are susceptible to an attack, tracks the propagation of the tainted flows, and authenticates the tainted flows at certain system components according to a pre-defined security policy. Deployment of DIFT to defend against APTs in cyber systems is limited by the heavy resource and performance overhead associated with DIFT. In this paper, we propose a resource-efficient model for DIFT by incorporating the security costs, false-positives, and false-negatives associated with DIFT. Specifically, we develop a game-theoretic framework and provide an analytical model of DIFT that enables the study of trade-off between resource efficiency and the effectiveness of detection. Our game model is a nonzero-sum, infinite-horizon, average reward stochastic game. Our model incorporates the information asymmetry between players that arises from DIFT's inability to distinguish malicious flows from benign flows and APT's inability to know the locations where DIFT performs a security analysis. Additionally, the game has incomplete information as the transition probabilities (false-positive and false-negative rates) are unknown. We propose a multiple-time scale stochastic approximation algorithm to learn an equilibrium solution of the game. We prove that our algorithm converges to an average reward Nash equilibrium. We evaluate our proposed model and algorithm on a real-world ransomware dataset and validate the effectiveness of the proposed approach. △ Less

Submitted 28 June, 2021; v1 submitted 30 June, 2020; originally announced July 2020.

Comments: 15

arXiv:2006.12327 [pdf, other]

Dynamic Information Flow Tracking for Detection of Advanced Persistent Threats: A Stochastic Game Approach

Authors: Shana Moothedath, Dinuka Sahabandu, Joey Allen, Andrew Clark, Linda Bushnell, Wenke Lee, Radha Poovendran

Abstract: Advanced Persistent Threats (APTs) are stealthy customized attacks by intelligent adversaries. This paper deals with the detection of APTs that infiltrate cyber systems and compromise specifically targeted data and/or infrastructures. Dynamic information flow tracking is an information trace-based detection mechanism against APTs that taints suspicious information flows in the system and generates… ▽ More Advanced Persistent Threats (APTs) are stealthy customized attacks by intelligent adversaries. This paper deals with the detection of APTs that infiltrate cyber systems and compromise specifically targeted data and/or infrastructures. Dynamic information flow tracking is an information trace-based detection mechanism against APTs that taints suspicious information flows in the system and generates security analysis for unauthorized use of tainted data. In this paper, we develop an analytical model for resource-efficient detection of APTs using an information flow tracking game. The game is a nonzero-sum, turn-based, stochastic game with asymmetric information as the defender cannot distinguish whether an incoming flow is malicious or benign and hence has only partial state observation. We analyze equilibrium of the game and prove that a Nash equilibrium is given by a solution to the minimum capacity cut set problem on a flow-network derived from the system, where the edge capacities are obtained from the cost of performing security analysis. Finally, we implement our algorithm on the real-world dataset for a data exfiltration attack augmented with false-negative and false-positive rates and compute an optimal defender strategy. △ Less

Submitted 25 June, 2021; v1 submitted 22 June, 2020; originally announced June 2020.

arXiv:2003.12733 [pdf, other]

Submodular Input Selection for Synchronization in Kuramoto Networks

Authors: Dinuka Sahabandu, Andrew Clark, Linda Bushnell, Radha Poovendran

Abstract: Synchronization is an essential property of engineered and natural networked dynamical systems. The Kuramoto model of nonlinear synchronization has been widely studied in applications including entrainment of clock cells in brain networks and power system stability. Synchronization of Kuramoto networks has been found to be challenging in the presence of signed couplings between oscillators and whe… ▽ More Synchronization is an essential property of engineered and natural networked dynamical systems. The Kuramoto model of nonlinear synchronization has been widely studied in applications including entrainment of clock cells in brain networks and power system stability. Synchronization of Kuramoto networks has been found to be challenging in the presence of signed couplings between oscillators and when the network includes oscillators with heterogeneous natural frequencies. In this paper, we study the problem of minimum-set control input selection for synchronizing signed Kuramoto networks. We first derive sufficient conditions for synchronization in homogeneous as well as heterogeneous Kuramoto networks using a passivity-based framework. We then develop a submodular algorithm for selecting a minimum set of control inputs for a given Kuramoto network. We evaluate our approach through a numerical study on multiple classes of graphs, including undirected, directed, and cycle graphs. △ Less

Submitted 31 March, 2020; v1 submitted 28 March, 2020; originally announced March 2020.

Comments: 9 pages, 3 figures

arXiv:2001.09978 [pdf, ps, other]

doi 10.1109/ICCPS48487.2020.00023

Control Synthesis for Cyber-Physical Systems to Satisfy Metric Interval Temporal Logic Objectives under Timing and Actuator Attacks

Authors: Luyao Niu, Bhaskar Ramasubramanian, Andrew Clark, Linda Bushnell, Radha Poovendran

Abstract: This paper studies the synthesis of controllers for cyber-physical systems (CPSs) that are required to carry out complex tasks that are time-sensitive, in the presence of an adversary. The task is specified as a formula in metric interval temporal logic (MITL). The adversary is assumed to have the ability to tamper with the control input to the CPS and also manipulate timing information perceived… ▽ More This paper studies the synthesis of controllers for cyber-physical systems (CPSs) that are required to carry out complex tasks that are time-sensitive, in the presence of an adversary. The task is specified as a formula in metric interval temporal logic (MITL). The adversary is assumed to have the ability to tamper with the control input to the CPS and also manipulate timing information perceived by the CPS. In order to model the interaction between the CPS and the adversary, and also the effect of these two classes of attacks, we define an entity called a durational stochastic game (DSG). DSGs probabilistically capture transitions between states in the environment, and also the time taken for these transitions. With the policy of the defender represented as a finite state controller (FSC), we present a value-iteration based algorithm that computes an FSC that maximizes the probability of satisfying the MITL specification under the two classes of attacks. A numerical case-study on a signalized traffic network is presented to illustrate our results. △ Less

Submitted 27 January, 2020; originally announced January 2020.

Journal ref: 2020 ACM/IEEE 11th International Conference on Cyber-Physical Systems (ICCPS), Sydney, Australia, 2020, pp. 162-173

arXiv:2001.06781 [pdf, other]

FRESH: Interactive Reward Shaping in High-Dimensional State Spaces using Human Feedback

Authors: Baicen Xiao, Qifan Lu, Bhaskar Ramasubramanian, Andrew Clark, Linda Bushnell, Radha Poovendran

Abstract: Reinforcement learning has been successful in training autonomous agents to accomplish goals in complex environments. Although this has been adapted to multiple settings, including robotics and computer games, human players often find it easier to obtain higher rewards in some environments than reinforcement learning algorithms. This is especially true of high-dimensional state spaces where the re… ▽ More Reinforcement learning has been successful in training autonomous agents to accomplish goals in complex environments. Although this has been adapted to multiple settings, including robotics and computer games, human players often find it easier to obtain higher rewards in some environments than reinforcement learning algorithms. This is especially true of high-dimensional state spaces where the reward obtained by the agent is sparse or extremely delayed. In this paper, we seek to effectively integrate feedback signals supplied by a human operator with deep reinforcement learning algorithms in high-dimensional state spaces. We call this FRESH (Feedback-based REward SHaping). During training, a human operator is presented with trajectories from a replay buffer and then provides feedback on states and actions in the trajectory. In order to generalize feedback signals provided by the human operator to previously unseen states and actions at test-time, we use a feedback neural network. We use an ensemble of neural networks with a shared network architecture to represent model uncertainty and the confidence of the neural network in its output. The output of the feedback neural network is converted to a shaping reward that is augmented to the reward provided by the environment. We evaluate our approach on the Bowling and Skiing Atari games in the arcade learning environment. Although human experts have been able to achieve high scores in these environments, state-of-the-art deep learning algorithms perform poorly. We observe that FRESH is able to achieve much higher scores than state-of-the-art deep learning algorithms in both environments. FRESH also achieves a 21.4% higher score than a human expert in Bowling and does as well as a human expert in Skiing. △ Less

Submitted 19 January, 2020; originally announced January 2020.

Comments: Accepted as Full Paper to International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS) 2020

arXiv:1912.04735 [pdf, other]

Covert Channel-Based Transmitter Authentication in Controller Area Networks

Authors: Xuhang Ying, Giuseppe Bernieri, Mauro Conti, Linda Bushnell, Radha Poovendran

Abstract: In recent years, the security of automotive Cyber-Physical Systems (CPSs) is facing urgent threats due to the widespread use of legacy in-vehicle communication systems. As a representative legacy bus system, the Controller Area Network (CAN) hosts Electronic Control Units (ECUs) that are crucial vehicle functioning. In this scenario, malicious actors can exploit CAN vulnerabilities, such as the la… ▽ More In recent years, the security of automotive Cyber-Physical Systems (CPSs) is facing urgent threats due to the widespread use of legacy in-vehicle communication systems. As a representative legacy bus system, the Controller Area Network (CAN) hosts Electronic Control Units (ECUs) that are crucial vehicle functioning. In this scenario, malicious actors can exploit CAN vulnerabilities, such as the lack of built-in authentication and encryption schemes, to launch CAN bus attacks with life-threatening consequences (e.g., disabling brakes). In this paper, we present TACAN (Transmitter Authentication in CAN), which provides secure authentication of ECUs on the legacy CAN bus by exploiting the covert channels, without introducing CAN protocol modifications or traffic overheads. TACAN turns upside-down the originally malicious concept of covert channels and exploits it to build an effective defensive technique that facilitates transmitter authentication via a centralized, trusted Monitor Node. TACAN consists of three different covert channels for ECU authentication: 1) the Inter-Arrival Time (IAT)-based; 2) the Least Significant Bit (LSB)-based; and 3) a hybrid covert channel, exploiting the combination of the first two. In order to validate TACAN, we implement the covert channels on the University of Washington (UW) EcoCAR (Chevrolet Camaro 2016) testbed. We further evaluate the bit error, throughput, and detection performance of TACAN through extensive experiments using the EcoCAR testbed and a publicly available dataset collected from Toyota Camry 2010. We demonstrate the feasibility of TACAN and the effectiveness of detecting CAN bus attacks, highlighting no traffic overheads and attesting the regular functionality of ECUs. △ Less

Submitted 7 December, 2019; originally announced December 2019.

Comments: Submitted to TDSC (Transactions on Dependable and Secure Computing). arXiv admin note: text overlap with arXiv:1903.05231

arXiv:1910.12282 [pdf, other]

Linear Temporal Logic Satisfaction in Adversarial Environments using Secure Control Barrier Certificates

Authors: Bhaskar Ramasubramanian, Luyao Niu, Andrew Clark, Linda Bushnell, Radha Poovendran

Abstract: This paper studies the satisfaction of a class of temporal properties for cyber-physical systems (CPSs) over a finite-time horizon in the presence of an adversary, in an environment described by discrete-time dynamics. The temporal logic specification is given in safe-LTL_F, a fragment of linear temporal logic over traces of finite length. The interaction of the CPS with the adversary is modeled a… ▽ More This paper studies the satisfaction of a class of temporal properties for cyber-physical systems (CPSs) over a finite-time horizon in the presence of an adversary, in an environment described by discrete-time dynamics. The temporal logic specification is given in safe-LTL_F, a fragment of linear temporal logic over traces of finite length. The interaction of the CPS with the adversary is modeled as a two-player zero-sum discrete-time dynamic stochastic game with the CPS as defender. We formulate a dynamic programming based approach to determine a stationary defender policy that maximized the probability of satisfaction of a safe-LTL_F formula over a finite time-horizon under any stationary adversary policy. We introduce secure control barrier certificates (S-CBCs), a generalization of barrier certificates and control barrier certificates that accounts for the presence of an adversary, and use S-CBCs to provide a lower bound on the above satisfaction probability. When the dynamics of the evolution of the system state has a specific underlying structure, we present a way to determine an S-CBC as a polynomial in the state variables using sum-of-squares optimization. An illustrative example demonstrates our approach. △ Less

Submitted 27 October, 2019; originally announced October 2019.

Comments: Proc. of GameSec2019 (to appear). This version corrects a typo in the simulation, and clarifies some ambiguous material from the Proceedings version

arXiv:1907.12138 [pdf, other]

Are Odds Really Odd? Bypassing Statistical Detection of Adversarial Examples

Authors: Hossein Hosseini, Sreeram Kannan, Radha Poovendran

Abstract: Deep learning classifiers are known to be vulnerable to adversarial examples. A recent paper presented at ICML 2019 proposed a statistical test detection method based on the observation that logits of noisy adversarial examples are biased toward the true class. The method is evaluated on CIFAR-10 dataset and is shown to achieve 99% true positive rate (TPR) at only 1% false positive rate (FPR). In… ▽ More Deep learning classifiers are known to be vulnerable to adversarial examples. A recent paper presented at ICML 2019 proposed a statistical test detection method based on the observation that logits of noisy adversarial examples are biased toward the true class. The method is evaluated on CIFAR-10 dataset and is shown to achieve 99% true positive rate (TPR) at only 1% false positive rate (FPR). In this paper, we first develop a classifier-based adaptation of the statistical test method and show that it improves the detection performance. We then propose Logit Mimicry Attack method to generate adversarial examples such that their logits mimic those of benign images. We show that our attack bypasses both statistical test and classifier-based methods, reducing their TPR to less than 2:2% and 1:6%, respectively, even at 5% FPR. We finally show that a classifier-based detector that is trained with logits of mimicry adversarial examples can be evaded by an adaptive attacker that specifically targets the detector. Furthermore, even a detector that is iteratively trained to defend against adaptive attacker cannot be made robust, indicating that statistics of logits cannot be used to detect adversarial examples. △ Less

Submitted 28 July, 2019; originally announced July 2019.

arXiv:1907.10783 [pdf, other]

Mitigating Vulnerabilities of Voltage-based Intrusion Detection Systems in Controller Area Networks

Authors: Sang Uk Sagong, Radha Poovendran, Linda Bushnell

Abstract: Data for controlling a vehicle is exchanged among Electronic Control Units (ECUs) via in-vehicle network protocols such as the Controller Area Network (CAN) protocol. Since these protocols are designed for an isolated network, the protocols do not encrypt data nor authenticate messages. Intrusion Detection Systems (IDSs) are developed to secure the CAN protocol by detecting abnormal deviations in… ▽ More Data for controlling a vehicle is exchanged among Electronic Control Units (ECUs) via in-vehicle network protocols such as the Controller Area Network (CAN) protocol. Since these protocols are designed for an isolated network, the protocols do not encrypt data nor authenticate messages. Intrusion Detection Systems (IDSs) are developed to secure the CAN protocol by detecting abnormal deviations in physical properties. For instance, a voltage-based IDS (VIDS) exploits voltage characteristics of each ECU to detect an intrusion. An ECU with VIDS must be connected to the CAN bus using extra wires to measure voltages of the CAN bus lines. These extra wires, however, may introduce new attack surfaces to the CAN bus if the ECU with VIDS is compromised. We investigate new vulnerabilities of VIDS and demonstrate that an adversary may damage an ECU with VIDS, block message transmission, and force an ECU to retransmit messages. In order to defend the CAN bus against these attacks, we propose two hardware-based Intrusion Response Systems (IRSs) that disconnect the compromised ECU from the CAN bus once these attacks are detected. We develop four voltage-based attacks by exploiting vulnerabilities of VIDS and evaluate the effectiveness of the proposed IRSs using a CAN bus testbed. △ Less

Submitted 24 July, 2019; originally announced July 2019.

Comments: 11 pages, 24 figures

MSC Class: 68W01

arXiv:1907.08823 [pdf, other]

Potential-Based Advice for Stochastic Policy Learning

Authors: Baicen Xiao, Bhaskar Ramasubramanian, Andrew Clark, Hannaneh Hajishirzi, Linda Bushnell, Radha Poovendran

Abstract: This paper augments the reward received by a reinforcement learning agent with potential functions in order to help the agent learn (possibly stochastic) optimal policies. We show that a potential-based reward shaping scheme is able to preserve optimality of stochastic policies, and demonstrate that the ability of an agent to learn an optimal policy is not affected when this scheme is augmented to… ▽ More This paper augments the reward received by a reinforcement learning agent with potential functions in order to help the agent learn (possibly stochastic) optimal policies. We show that a potential-based reward shaping scheme is able to preserve optimality of stochastic policies, and demonstrate that the ability of an agent to learn an optimal policy is not affected when this scheme is augmented to soft Q-learning. We propose a method to impart potential based advice schemes to policy gradient algorithms. An algorithm that considers an advantage actor-critic architecture augmented with this scheme is proposed, and we give guarantees on its convergence. Finally, we evaluate our approach on a puddle-jump grid world with indistinguishable states, and the continuous state and action mountain car environment from classical control. Our results indicate that these schemes allow the agent to learn a stochastic optimal policy faster and obtain a higher average reward. △ Less

Submitted 20 July, 2019; originally announced July 2019.

Comments: Accepted to the IEEE Conference on Decision and Control, 2019

arXiv:1905.00180 [pdf, other]

Dropping Pixels for Adversarial Robustness

Authors: Hossein Hosseini, Sreeram Kannan, Radha Poovendran

Abstract: Deep neural networks are vulnerable against adversarial examples. In this paper, we propose to train and test the networks with randomly subsampled images with high drop rates. We show that this approach significantly improves robustness against adversarial examples in all cases of bounded L0, L2 and L_inf perturbations, while reducing the standard accuracy by a small value. We argue that subsampl… ▽ More Deep neural networks are vulnerable against adversarial examples. In this paper, we propose to train and test the networks with randomly subsampled images with high drop rates. We show that this approach significantly improves robustness against adversarial examples in all cases of bounded L0, L2 and L_inf perturbations, while reducing the standard accuracy by a small value. We argue that subsampling pixels can be thought to provide a set of robust features for the input image and, thus, improves robustness without performing adversarial training. △ Less

Submitted 1 May, 2019; originally announced May 2019.

Showing 1–50 of 86 results for author: Poovendran, R