Showing 1–43 of 43 results for author: Fredrikson, M

  1. arXiv:2406.04755  [pdf, other]

    cs.CR cs.AI cs.HC cs.LG

    Sales Whisperer: A Human-Inconspicuous Attack on LLM Brand Recommendations

    Authors: Weiran Lin, Anna Gerchanovsky, Omer Akgul, Lujo Bauer, Matt Fredrikson, Zifan Wang

    Abstract: Large language model (LLM) users might rely on others (e.g., prompting services) to write prompts. However, the risks of trusting prompts written by others remain unstudied. In this paper, we assess the risk of using such prompts on brand recommendation tasks when shopping. First, we found that paraphrasing prompts can result in LLMs mentioning given brands with drastically different probabilitie…

    Submitted 7 June, 2024; originally announced June 2024.

  2. arXiv:2406.04313  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV cs.CY

    Improving Alignment and Robustness with Circuit Breakers

    Authors: Andy Zou, Long Phan, Justin Wang, Derek Duenas, Maxwell Lin, Maksym Andriushchenko, Rowan Wang, Zico Kolter, Matt Fredrikson, Dan Hendrycks

    Abstract: AI systems can take harmful actions and are highly vulnerable to adversarial attacks. We present an approach, inspired by recent advances in representation engineering, that interrupts the models as they respond with harmful outputs with "circuit breakers." Existing techniques aimed at improving alignment, such as refusal training, are often bypassed. Techniques such as adversarial training try to…

    Submitted 12 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: Code and models are available at https://github.com/GraySwanAI/circuit-breakers

  3. arXiv:2406.00586  [pdf, other]

    cs.CR cs.AI

    VeriSplit: Secure and Practical Offloading of Machine Learning Inferences across IoT Devices

    Authors: Han Zhang, Zifan Wang, Mihir Dhamankar, Matt Fredrikson, Yuvraj Agarwal

    Abstract: Many Internet-of-Things (IoT) devices rely on cloud computation resources to perform machine learning inferences. This is expensive and may raise privacy concerns for users. Consumers of these devices often have hardware such as gaming consoles and PCs with graphics accelerators that are capable of performing these computations, which may be left idle for significant periods of time. While this pr…

    Submitted 1 June, 2024; originally announced June 2024.

  4. arXiv:2405.09113  [pdf, ps, other]

    cs.LG

    Efficient LLM Jailbreak via Adaptive Dense-to-sparse Constrained Optimization

    Authors: Kai Hu, Weichen Yu, Tianjun Yao, Xiang Li, Wenhe Liu, Lijun Yu, Yining Li, Kai Chen, Zhiqiang Shen, Matt Fredrikson

    Abstract: Recent research indicates that large language models (LLMs) are susceptible to jailbreaking attacks that can generate harmful content. This paper introduces a novel token-level attack method, Adaptive Dense-to-Sparse Constrained Optimization (ADC), which effectively jailbreaks several open-source LLMs. Our approach relaxes the discrete jailbreak optimization into a continuous optimization and prog…

    Submitted 15 May, 2024; originally announced May 2024.

  5. arXiv:2311.13445  [pdf, other]

    cs.LG cs.CR

    Transfer Attacks and Defenses for Large Language Models on Coding Tasks

    Authors: Chi Zhang, Zifan Wang, Ravi Mangal, Matt Fredrikson, Limin Jia, Corina Pasareanu

    Abstract: Modern large language models (LLMs), such as ChatGPT, have demonstrated impressive capabilities for coding tasks including writing and reasoning about code. They improve upon previous neural network models of code, such as code2seq or seq2seq, that already demonstrated competitive results when performing tasks such as code summarization and identifying code vulnerabilities. However, these previous…

    Submitted 22 November, 2023; originally announced November 2023.

  6. arXiv:2310.09361  [pdf, other]

    cs.LG

    Is Certifying $\ell_p$ Robustness Still Worthwhile?

    Authors: Ravi Mangal, Klas Leino, Zifan Wang, Kai Hu, Weicheng Yu, Corina Pasareanu, Anupam Datta, Matt Fredrikson

    Abstract: Over the years, researchers have developed myriad attacks that exploit the ubiquity of adversarial examples, as well as defenses that aim to guard against the security vulnerabilities posed by such attacks. Of particular interest to this paper are defenses that provide provable guarantees against the class of $\ell_p$-bounded attacks. Certified defenses have made significant progress, taking robus…

    Submitted 13 October, 2023; originally announced October 2023.

  7. arXiv:2310.02513  [pdf, other]

    cs.LG

    A Recipe for Improved Certifiable Robustness

    Authors: Kai Hu, Klas Leino, Zifan Wang, Matt Fredrikson

    Abstract: Recent studies have highlighted the potential of Lipschitz-based methods for training certifiably robust neural networks against adversarial attacks. A key challenge, supported both theoretically and empirically, is that robustness demands greater network capacity and more data than standard training. However, effectively adding capacity under stringent Lipschitz constraints has proven more diffic…

    Submitted 22 June, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

  8. arXiv:2310.01405  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV cs.CY

    Representation Engineering: A Top-Down Approach to AI Transparency

    Authors: Andy Zou, Long Phan, Sarah Chen, James Campbell, Phillip Guo, Richard Ren, Alexander Pan, Xuwang Yin, Mantas Mazeika, Ann-Kathrin Dombrowski, Shashwat Goel, Nathaniel Li, Michael J. Byun, Zifan Wang, Alex Mallen, Steven Basart, Sanmi Koyejo, Dawn Song, Matt Fredrikson, J. Zico Kolter, Dan Hendrycks

    Abstract: In this paper, we identify and characterize the emerging area of representation engineering (RepE), an approach to enhancing the transparency of AI systems that draws on insights from cognitive neuroscience. RepE places population-level representations, rather than neurons or circuits, at the center of analysis, equipping us with novel methods for monitoring and manipulating high-level cognitive p…

    Submitted 10 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: Code is available at https://github.com/andyzoujm/representation-engineering

  9. arXiv:2307.15043  [pdf, other]

    cs.CL cs.AI cs.CR cs.LG

    Universal and Transferable Adversarial Attacks on Aligned Language Models

    Authors: Andy Zou, Zifan Wang, Nicholas Carlini, Milad Nasr, J. Zico Kolter, Matt Fredrikson

    Abstract: Because "out-of-the-box" large language models are capable of generating a great deal of objectionable content, recent work has focused on aligning these models in an attempt to prevent undesirable generation. While there has been some success at circumventing these measures -- so-called "jailbreaks" against LLMs -- these attacks have required significant human ingenuity and are brittle in practic…

    Submitted 20 December, 2023; v1 submitted 27 July, 2023; originally announced July 2023.

    Comments: Website: http://llm-attacks.org/

  10. arXiv:2301.12549  [pdf, other]

    cs.LG cs.CV

    Unlocking Deterministic Robustness Certification on ImageNet

    Authors: Kai Hu, Andy Zou, Zifan Wang, Klas Leino, Matt Fredrikson

    Abstract: Despite the promise of Lipschitz-based methods for provably-robust deep learning with deterministic guarantees, current state-of-the-art results are limited to feed-forward Convolutional Networks (ConvNets) on low-dimensional data, such as CIFAR-10. This paper investigates strategies for expanding certifiably robust training to larger, deeper models. A key challenge in certifying deep networks is…

    Submitted 29 October, 2023; v1 submitted 29 January, 2023; originally announced January 2023.

  11. arXiv:2301.11435  [pdf, other]

    cs.LG cs.SC

    Learning Modulo Theories

    Authors: Matt Fredrikson, Kaiji Lu, Saranya Vijayakumar, Somesh Jha, Vijay Ganesh, Zifan Wang

    Abstract: Recent techniques that integrate \emph{solver layers} into Deep Neural Networks (DNNs) have shown promise in bridging a long-standing gap between inductive learning and symbolic reasoning techniques. In this paper we present a set of techniques for integrating \emph{Satisfiability Modulo Theories} (SMT) solvers into the forward and backward passes of a deep network layer, called SMTLayer. Using th…

    Submitted 26 January, 2023; originally announced January 2023.

  12. arXiv:2209.03620  [pdf, other]

    cs.LG cs.CR cs.CY

    Black-Box Audits for Group Distribution Shifts

    Authors: Marc Juarez, Samuel Yeom, Matt Fredrikson

    Abstract: When a model informs decisions about people, distribution shifts can create undue disparities. However, it is hard for external entities to check for distribution shift, as the model and its training set are often proprietary. In this paper, we introduce and study a black-box auditing method to detect cases of distribution shift that lead to a performance disparity of the model across demographic…

    Submitted 8 September, 2022; originally announced September 2022.

  13. arXiv:2206.00278  [pdf, other]

    cs.LG

    On the Perils of Cascading Robust Classifiers

    Authors: Ravi Mangal, Zifan Wang, Chi Zhang, Klas Leino, Corina Pasareanu, Matt Fredrikson

    Abstract: Ensembling certifiably robust neural networks is a promising approach for improving the \emph{certified robust accuracy} of neural models. Black-box ensembles that assume only query-access to the constituent models (and their robustness certifiers) during prediction are particularly attractive due to their modular structure. Cascading ensembles are a popular instance of black-box ensembles that ap…

    Submitted 19 October, 2022; v1 submitted 1 June, 2022; originally announced June 2022.

  14. arXiv:2205.11850  [pdf, other]

    cs.LG cs.AI

    Faithful Explanations for Deep Graph Models

    Authors: Zifan Wang, Yuhang Yao, Chaoran Zhang, Han Zhang, Youjie Kang, Carlee Joe-Wong, Matt Fredrikson, Anupam Datta

    Abstract: This paper studies faithful explanations for Graph Neural Networks (GNNs). First, we provide a new and general method for formally characterizing the faithfulness of explanations for GNNs. It applies to existing explanation methods, including feature attributions and subgraph explanations. Second, our analytical and empirical results demonstrate that feature attribution methods cannot capture the…

    Submitted 24 May, 2022; originally announced May 2022.

  15. Enhancing the Insertion of NOP Instructions to Obfuscate Malware via Deep Reinforcement Learning

    Authors: Daniel Gibert, Matt Fredrikson, Carles Mateu, Jordi Planes, Quan Le

    Abstract: Current state-of-the-art research for tackling the problem of malware detection and classification is centered on the design, implementation and deployment of systems powered by machine learning because of its ability to generalize to never-before-seen malware families and polymorphic mutations. However, it has been shown that machine learning models, in particular deep neural networks, lack robus…

    Submitted 18 November, 2021; originally announced November 2021.

    Report number: 0167-4048

    Journal ref: Journal Computers & Security, Volume 113, 2022, 102543

  16. arXiv:2111.08230  [pdf, other]

    cs.LG

    Selective Ensembles for Consistent Predictions

    Authors: Emily Black, Klas Leino, Matt Fredrikson

    Abstract: Recent work has shown that models trained to the same objective, and which achieve similar measures of accuracy on consistent test data, may nonetheless behave very differently on individual predictions. This inconsistency is undesirable in high-stakes contexts, such as medical diagnosis and finance. We show that this inconsistent behavior extends beyond predictions to feature attributions, which…

    Submitted 16 November, 2021; originally announced November 2021.

    Comments: Preprint

  17. arXiv:2110.03109  [pdf, other]

    cs.LG

    Consistent Counterfactuals for Deep Models

    Authors: Emily Black, Zifan Wang, Matt Fredrikson, Anupam Datta

    Abstract: Counterfactual examples are one of the most commonly-cited methods for explaining the predictions of machine learning models in key areas such as finance and medical diagnosis. Counterfactuals are often discussed under the assumption that the model on which they will be used is static, but in deployment models may be periodically retrained or fine-tuned. This paper studies the consistency of model…

    Submitted 6 October, 2021; originally announced October 2021.

  18. arXiv:2107.11445  [pdf, other]

    cs.LG cs.NE

    Self-Correcting Neural Networks For Safe Classification

    Authors: Klas Leino, Aymeric Fromherz, Ravi Mangal, Matt Fredrikson, Bryan Parno, Corina Păsăreanu

    Abstract: Classifiers learnt from data are increasingly being used as components in systems where safety is a critical concern. In this work, we present a formal notion of safety for classifiers via constraints called safe-ordering constraints. These constraints relate requirements on the order of the classes output by a classifier to conditions on its input, and are expressive enough to encode various inte…

    Submitted 9 June, 2022; v1 submitted 23 July, 2021; originally announced July 2021.

  19. arXiv:2107.10171  [pdf, other]

    cs.LG cs.CY

    Leave-one-out Unfairness

    Authors: Emily Black, Matt Fredrikson

    Abstract: We introduce leave-one-out unfairness, which characterizes how likely a model's prediction for an individual will change due to the inclusion or removal of a single other person in the model's training data. Leave-one-out unfairness appeals to the idea that fair decisions are not arbitrary: they should not be based on the chance event of any one person's inclusion in the training data. Leave-one-o…

    Submitted 21 July, 2021; originally announced July 2021.

    Comments: FAccT '21

    ACM Class: I.2.0; K.4.0

    Journal ref: FAccT '21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency 2021, Pages 285-295

  20. arXiv:2106.06624  [pdf, other]

    cs.LG

    Relaxing Local Robustness

    Authors: Klas Leino, Matt Fredrikson

    Abstract: Certifiable local robustness, which rigorously precludes small-norm adversarial examples, has received significant attention as a means of addressing security concerns in deep learning. However, for some classification problems, local robustness is not a natural objective, even in the presence of adversaries; for example, if an image contains two classes of subjects, the correct label for the imag…

    Submitted 11 June, 2021; originally announced June 2021.

  21. arXiv:2104.12032  [pdf]

    cs.CR cs.HC

    The Design of the User Interfaces for Privacy Enhancements for Android

    Authors: Jason I. Hong, Yuvraj Agarwal, Matt Fredrikson, Mike Czapik, Shawn Hanna, Swarup Sahoo, Judy Chun, Won-Woo Chung, Aniruddh Iyer, Ally Liu, Shen Lu, Rituparna Roychoudhury, Qian Wang, Shan Wang, Siqi Wang, Vida Zhang, Jessica Zhao, Yuan Jiang, Haojian Jin, Sam Kim, Evelyn Kuo, Tianshi Li, Jinping Liu, Yile Liu, Robert Zhang

    Abstract: We present the design and design rationale for the user interfaces for Privacy Enhancements for Android (PE for Android). These UIs are built around two core ideas, namely that developers should explicitly declare the purpose of why sensitive data is being used, and these permission-purpose pairs should be split by first party and third party uses. We also present a taxonomy of purposes and ways o…

    Submitted 24 April, 2021; originally announced April 2021.

    Comments: 58 pages, 21 figures, 3 tables

  22. arXiv:2103.11257  [pdf, other]

    cs.LG cs.CV

    Robust Models Are More Interpretable Because Attributions Look Normal

    Authors: Zifan Wang, Matt Fredrikson, Anupam Datta

    Abstract: Recent work has found that adversarially-robust deep networks used for image classification are more interpretable: their feature attributions tend to be sharper, and are more concentrated on the objects associated with the image's ground-truth class. We show that smooth decision boundaries play an important role in this enhanced interpretability, as the model's input gradients around data points…

    Submitted 5 October, 2021; v1 submitted 20 March, 2021; originally announced March 2021.

  23. arXiv:2102.08452  [pdf, other]

    cs.LG cs.CR stat.ML

    Globally-Robust Neural Networks

    Authors: Klas Leino, Zifan Wang, Matt Fredrikson

    Abstract: The threat of adversarial examples has motivated work on training certifiably robust neural networks to facilitate efficient verification of local robustness at inference time. We formalize a notion of global robustness, which captures the operational properties of on-line local robustness certification while yielding a natural learning objective for robust training. We show that widely-used archi…

    Submitted 11 June, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: Appearing in ICML 2021

  24. arXiv:2006.06643  [pdf, other]

    cs.LG stat.ML

    Smoothed Geometry for Robust Attribution

    Authors: Zifan Wang, Haofan Wang, Shakul Ramkumar, Matt Fredrikson, Piotr Mardziel, Anupam Datta

    Abstract: Feature attributions are a popular tool for explaining the behavior of Deep Neural Networks (DNNs), but have recently been shown to be vulnerable to attacks that produce divergent explanations for nearby inputs. This lack of robustness is especially problematic in high-stakes applications where adversarially-manipulated explanations could impair safety and trustworthiness. Building on a geometric…

    Submitted 22 October, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

  25. arXiv:2002.07985  [pdf, other]

    cs.AI

    Interpreting Interpretations: Organizing Attribution Methods by Criteria

    Authors: Zifan Wang, Piotr Mardziel, Anupam Datta, Matt Fredrikson

    Abstract: Motivated by distinct, though related, criteria, a growing number of attribution methods have been developed to interpret deep learning. While each relies on the interpretability of the concept of "importance" and our ability to visualize patterns, explanations produced by the methods often differ. As a result, input attribution for vision models fails to provide any level of human understanding of…

    Submitted 4 April, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

  26. arXiv:2002.07738  [pdf, other]

    cs.LG stat.ML

    Individual Fairness Revisited: Transferring Techniques from Adversarial Robustness

    Authors: Samuel Yeom, Matt Fredrikson

    Abstract: We turn the definition of individual fairness on its head---rather than ascertaining the fairness of a model given a predetermined metric, we find a metric for a given model that satisfies individual fairness. This can facilitate the discussion on the fairness of a model, addressing the issue that it may be difficult to specify a priori a suitable metric. Our contributions are twofold: First, we i…

    Submitted 13 October, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

    Comments: Published at IJCAI 2020 (at https://www.ijcai.org/Proceedings/2020/61 ); the conference version has a minor error in the proof of Theorem 3, which is fixed here

  27. arXiv:2002.04742  [pdf, other]

    cs.LG stat.ML

    Fast Geometric Projections for Local Robustness Certification

    Authors: Aymeric Fromherz, Klas Leino, Matt Fredrikson, Bryan Parno, Corina Păsăreanu

    Abstract: Local robustness ensures that a model classifies all inputs within an $\ell_2$-ball consistently, which precludes various forms of adversarial inputs. In this paper, we present a fast procedure for checking local robustness in feed-forward neural networks with piecewise-linear activation functions. Such networks partition the input space into a set of convex polyhedral regions in which the network…

    Submitted 18 February, 2021; v1 submitted 11 February, 2020; originally announced February 2020.

    Comments: Appearing in ICLR 2021

  28. arXiv:1906.11813  [pdf, ps, other]

    cs.LG stat.ML

    Learning Fair Representations for Kernel Models

    Authors: Zilong Tan, Samuel Yeom, Matt Fredrikson, Ameet Talwalkar

    Abstract: Fair representations are a powerful tool for establishing criteria like statistical parity, proxy non-discrimination, and equality of opportunity in learned models. Existing techniques for learning these representations are typically model-agnostic, as they preprocess the original data such that the output satisfies some fairness criterion, and can be used with arbitrary learning methods. In contr…

    Submitted 20 January, 2020; v1 submitted 27 June, 2019; originally announced June 2019.

    Comments: The 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020)

  29. arXiv:1906.11798  [pdf, other]

    cs.LG cs.CR stat.ML

    Stolen Memories: Leveraging Model Memorization for Calibrated White-Box Membership Inference

    Authors: Klas Leino, Matt Fredrikson

    Abstract: Membership inference (MI) attacks exploit the fact that machine learning algorithms sometimes leak information about their training data through the learned model. In this work, we study membership inference in the white-box setting in order to exploit the internals of a model, which have not been effectively utilized by previous work. Leveraging new insights about how overfitting occurs in deep n…

    Submitted 24 June, 2020; v1 submitted 27 June, 2019; originally announced June 2019.

    Comments: appearing in USENIX 2020

  30. FlipTest: Fairness Testing via Optimal Transport

    Authors: Emily Black, Samuel Yeom, Matt Fredrikson

    Abstract: We present FlipTest, a black-box technique for uncovering discrimination in classifiers. FlipTest is motivated by the intuitive question: had an individual been of a different protected status, would the model have treated them differently? Rather than relying on causal information to answer this question, FlipTest leverages optimal transport to match individuals in different protected groups, cre…

    Submitted 6 December, 2019; v1 submitted 21 June, 2019; originally announced June 2019.

    Comments: Accepted to ACM FAT* 2020; The first two authors contributed equally

  31. arXiv:1812.08999  [pdf, other]

    cs.LG stat.ML

    Feature-Wise Bias Amplification

    Authors: Klas Leino, Emily Black, Matt Fredrikson, Shayak Sen, Anupam Datta

    Abstract: We study the phenomenon of bias amplification in classifiers, wherein a machine learning model learns to predict classes with a greater disparity than the underlying ground truth. We demonstrate that bias amplification can arise via an inductive bias in gradient descent methods that results in the overestimation of the importance of moderately-predictive "weak" features if insufficient training da…

    Submitted 21 October, 2019; v1 submitted 21 December, 2018; originally announced December 2018.

    Comments: Published in ICLR 2019

  32. Contextual and Granular Policy Enforcement in Database-backed Applications

    Authors: Abhishek Bichhawat, Matt Fredrikson, Jean Yang, Akash Trehan

    Abstract: Database-backed applications rely on inlined policy checks to process users' private and confidential data in a policy-compliant manner as traditional database access control mechanisms cannot enforce complex policies. However, application bugs due to missed checks are common in such applications, which result in data breaches. While separating policy from code is a natural solution, many data pro…

    Submitted 13 March, 2020; v1 submitted 20 November, 2018; originally announced November 2018.

  33. arXiv:1810.07155  [pdf, other]

    cs.LG math.OC stat.ML

    Hunting for Discriminatory Proxies in Linear Regression Models

    Authors: Samuel Yeom, Anupam Datta, Matt Fredrikson

    Abstract: A machine learning model may exhibit discrimination when used to make decisions involving people. One potential cause for such outcomes is that the model uses a statistical proxy for a protected demographic attribute. In this paper we formulate a definition of proxy use for the setting of linear regression and present algorithms for detecting proxies. Our definition follows recent work on proxies…

    Submitted 27 November, 2018; v1 submitted 16 October, 2018; originally announced October 2018.

  34. arXiv:1803.10815  [pdf, other]

    cs.LG stat.ML

    Supervising Feature Influence

    Authors: Shayak Sen, Piotr Mardziel, Anupam Datta, Matthew Fredrikson

    Abstract: Causal influence measures for machine learnt classifiers shed light on the reasons behind classification, and aid in identifying influential input features and revealing their biases. However, such analyses involve evaluating the classifier using datapoints that may be atypical of its training distribution. Standard methods for training classifiers that minimize empirical risk do not constrain the…

    Submitted 7 April, 2018; v1 submitted 28 March, 2018; originally announced March 2018.

  35. arXiv:1802.03788  [pdf, other]

    cs.LG cs.AI stat.ML

    Influence-Directed Explanations for Deep Convolutional Networks

    Authors: Klas Leino, Shayak Sen, Anupam Datta, Matt Fredrikson, Linyi Li

    Abstract: We study the problem of explaining a rich class of behavioral properties of deep neural networks. Distinctively, our influence-directed explanations approach this problem by peering inside the network to identify neurons with high influence on a quantity and distribution of interest, using an axiomatically-justified influence measure, and then providing an interpretation for the concepts these neu…

    Submitted 13 November, 2018; v1 submitted 11 February, 2018; originally announced February 2018.

    Comments: To appear in International Test Conference 2018

  36. arXiv:1801.01896  [pdf, ps, other]

    cs.PL

    Verifying and Synthesizing Constant-Resource Implementations with Types

    Authors: Van Chan Ngo, Mario Dehesa-Azuara, Matthew Fredrikson, Jan Hoffmann

    Abstract: We propose a novel type system for verifying that programs correctly implement constant-resource behavior. Our type system extends recent work on automatic amortized resource analysis (AARA), a set of techniques that automatically derive provable upper bounds on the resource consumption of programs. We devise new techniques that build on the potential method to achieve compositionality, precision,…

    Submitted 5 January, 2018; originally announced January 2018.

    Comments: 30, IEEE S&P 2017

  37. arXiv:1709.09586   

    cs.AI cs.CR

    Case Study: Explaining Diabetic Retinopathy Detection Deep CNNs via Integrated Gradients

    Authors: Linyi Li, Matt Fredrikson, Shayak Sen, Anupam Datta

    Abstract: In this report, we applied integrated gradients to explaining a neural network for diabetic retinopathy detection. The integrated gradient is an attribution method which measures the contributions of input to the quantity of interest. We explored some new ways for applying this method such as explaining intermediate layers, filtering out unimportant units by their attribution value and generating…

    Submitted 18 October, 2017; v1 submitted 27 September, 2017; originally announced September 2017.

    Comments: This report has been withdrawn as it needs co-authors' permission and further verification of conclusions

  38. arXiv:1709.01604  [pdf, other]

    cs.CR cs.LG stat.ML

    Privacy Risk in Machine Learning: Analyzing the Connection to Overfitting

    Authors: Samuel Yeom, Irene Giacomelli, Matt Fredrikson, Somesh Jha

    Abstract: Machine learning algorithms, when applied to sensitive data, pose a distinct threat to privacy. A growing body of prior work demonstrates that models produced by these algorithms may leak specific private information in the training data to an attacker, either through the models' structure or their observable behavior. However, the underlying cause of this privacy risk is not well understood beyon…

    Submitted 4 May, 2018; v1 submitted 5 September, 2017; originally announced September 2017.

  39. arXiv:1708.06384  [pdf, other]

    cs.CR

    PrivacyProxy: Leveraging Crowdsourcing and In Situ Traffic Analysis to Detect and Mitigate Information Leakage

    Authors: Gaurav Srivastava, Kunal Bhuwalka, Swarup Kumar Sahoo, Saksham Chitkara, Kevin Ku, Matt Fredrikson, Jason Hong, Yuvraj Agarwal

    Abstract: Many smartphone apps transmit personally identifiable information (PII), often without the users' knowledge. To address this issue, we present PrivacyProxy, a system that monitors outbound network traffic and generates app-specific signatures to represent sensitive data being shared. PrivacyProxy uses a crowd-based approach to detect likely PII in an adaptive and scalable manner by anonymously comb…

    Submitted 26 October, 2018; v1 submitted 21 August, 2017; originally announced August 2017.

  40. arXiv:1707.08120  [pdf, other]

    cs.CY cs.LG

    Proxy Non-Discrimination in Data-Driven Systems

    Authors: Anupam Datta, Matt Fredrikson, Gihyuk Ko, Piotr Mardziel, Shayak Sen

    Abstract: Machine learnt systems inherit biases against protected classes, historically disparaged groups, from training data. Usually, these biases are not explicit; they rely on subtle correlations discovered by training algorithms, and are therefore difficult to detect. We formalize proxy discrimination in data-driven systems, a class of properties indicative of bias, as the presence of protected class c…

    Submitted 25 July, 2017; originally announced July 2017.

    Comments: arXiv admin note: substantial text overlap with arXiv:1705.07807

  41. arXiv:1705.07807  [pdf, other]

    cs.CR cs.LG

    Use Privacy in Data-Driven Systems: Theory and Experiments with Machine Learnt Programs

    Authors: Anupam Datta, Matthew Fredrikson, Gihyuk Ko, Piotr Mardziel, Shayak Sen

    Abstract: This paper presents an approach to formalizing and enforcing a class of use privacy properties in data-driven systems. In contrast to prior work, we focus on use restrictions on proxies (i.e. strong predictors) of protected information types. Our definition relates proxy use to intermediate computations that occur in a program, and identifies two essential properties that characterize this behavior:…

    Submitted 7 September, 2017; v1 submitted 22 May, 2017; originally announced May 2017.

    Comments: extended CCS 2017 camera-ready: several new discussions, and complexity results added to appendix

  42. arXiv:1512.06388  [pdf, other]

    cs.CR cs.DB cs.LG

    Revisiting Differentially Private Regression: Lessons From Learning Theory and their Consequences

    Authors: Xi Wu, Matthew Fredrikson, Wentao Wu, Somesh Jha, Jeffrey F. Naughton

    Abstract: Private regression has received attention from both database and security communities. Recent work by Fredrikson et al. (USENIX Security 2014) analyzed the functional mechanism (Zhang et al. VLDB 2012) for training linear regression models over medical data. Unfortunately, they found that model accuracy is already unacceptable with differential privacy when $\varepsilon = 5$. We address this issue…

    Submitted 20 December, 2015; originally announced December 2015.

  43. arXiv:1511.07528  [pdf, other]

    cs.CR cs.LG cs.NE stat.ML

    The Limitations of Deep Learning in Adversarial Settings

    Authors: Nicolas Papernot, Patrick McDaniel, Somesh Jha, Matt Fredrikson, Z. Berkay Celik, Ananthram Swami

    Abstract: Deep learning takes advantage of large datasets and computationally efficient training algorithms to outperform other approaches at various machine learning tasks. However, imperfections in the training phase of deep neural networks make them vulnerable to adversarial samples: inputs crafted by adversaries with the intent of causing deep neural networks to misclassify. In this work, we formalize t…

    Submitted 23 November, 2015; originally announced November 2015.

    Comments: Accepted to the 1st IEEE European Symposium on Security & Privacy, IEEE 2016. Saarbrucken, Germany