Showing 1–21 of 21 results for author: Suri, A

  1. arXiv:2406.11544  [pdf, other]

    cs.LG cs.AI cs.CR

    Do Parameters Reveal More than Loss for Membership Inference?

    Authors: Anshuman Suri, Xiao Zhang, David Evans

    Abstract: Membership inference attacks aim to infer whether an individual record was used to train a model, serving as a key tool for disclosure auditing. While such evaluations are useful to demonstrate risk, they are computationally expensive and often make strong assumptions about potential adversaries' access to models and training environments, and thus do not provide very tight bounds on leakage from…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted at High-dimensional Learning Dynamics (HiLD) Workshop, ICML 2024

  2. arXiv:2405.05944  [pdf, other]

    eess.IV cs.CV

    MRISegmentator-Abdomen: A Fully Automated Multi-Organ and Structure Segmentation Tool for T1-weighted Abdominal MRI

    Authors: Yan Zhuang, Tejas Sudharshan Mathai, Pritam Mukherjee, Brandon Khoury, Boah Kim, Benjamin Hou, Nusrat Rabbee, Abhinav Suri, Ronald M. Summers

    Abstract: Background: Segmentation of organs and structures in abdominal MRI is useful for many clinical applications, such as disease diagnosis and radiotherapy. Current approaches have focused on delineating a limited set of abdominal structures (13 types). To date, there is no publicly available abdominal MRI dataset with voxel-level annotations of multiple organs and structures. Consequently, a segmenta…

    Submitted 24 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

    Comments: We made the segmentation model publicly available

  3. arXiv:2402.07841  [pdf, other]

    cs.CL

    Do Membership Inference Attacks Work on Large Language Models?

    Authors: Michael Duan, Anshuman Suri, Niloofar Mireshghallah, Sewon Min, Weijia Shi, Luke Zettlemoyer, Yulia Tsvetkov, Yejin Choi, David Evans, Hannaneh Hajishirzi

    Abstract: Membership inference attacks (MIAs) attempt to predict whether a particular datapoint is a member of a target model's training data. Despite extensive research on traditional machine learning models, there has been limited work studying MIA on the pre-training data of large language models (LLMs). We perform a large-scale evaluation of MIAs over a suite of language models (LMs) trained on the Pile…

    Submitted 12 February, 2024; originally announced February 2024.

  4. arXiv:2310.18362  [pdf, ps, other]

    cs.CL cs.CR cs.LG

    SoK: Memorization in General-Purpose Large Language Models

    Authors: Valentin Hartmann, Anshuman Suri, Vincent Bindschaedler, David Evans, Shruti Tople, Robert West

    Abstract: Large Language Models (LLMs) are advancing at a remarkable pace, with myriad applications under development. Unlike most earlier machine learning models, they are no longer built for one specific application but are designed to excel in a wide range of tasks. A major part of this success is due to their huge training datasets and the unprecedented number of model parameters, which allow them to me…

    Submitted 24 October, 2023; originally announced October 2023.

  5. arXiv:2310.17534  [pdf, other]

    cs.CR cs.AI cs.CV cs.LG

    SoK: Pitfalls in Evaluating Black-Box Attacks

    Authors: Fnu Suya, Anshuman Suri, Tingwei Zhang, Jingtao Hong, Yuan Tian, David Evans

    Abstract: Numerous works study black-box attacks on image classifiers. However, these works make different assumptions on the adversary's knowledge and current literature lacks a cohesive organization centered around the threat model. To systematize knowledge in this area, we propose a taxonomy over the threat space spanning the axes of feedback granularity, the access of interactive queries, and the qualit…

    Submitted 14 February, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

    Comments: Accepted at SaTML 2024

  6. arXiv:2303.11643  [pdf, other]

    cs.LG cs.AI cs.CR

    Manipulating Transfer Learning for Property Inference

    Authors: Yulong Tian, Fnu Suya, Anshuman Suri, Fengyuan Xu, David Evans

    Abstract: Transfer learning is a popular method for tuning pretrained (upstream) models for different downstream tasks using limited data and computational resources. We study how an adversary with control over an upstream model used in transfer learning can conduct property inference attacks on a victim's tuned downstream model. For example, to infer the presence of images of a specific individual in the d…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: Accepted to CVPR 2023

  7. arXiv:2212.10986  [pdf, other]

    cs.LG cs.CR cs.GT

    SoK: Let the Privacy Games Begin! A Unified Treatment of Data Inference Privacy in Machine Learning

    Authors: Ahmed Salem, Giovanni Cherubin, David Evans, Boris Köpf, Andrew Paverd, Anshuman Suri, Shruti Tople, Santiago Zanella-Béguelin

    Abstract: Deploying machine learning models in production may allow adversaries to infer sensitive information about training data. There is a vast literature analyzing different types of inference risks, ranging from membership inference to reconstruction attacks. Inspired by the success of games (i.e., probabilistic experiments) to study security properties in cryptography, some authors describe privacy i…

    Submitted 20 April, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

    Comments: 20 pages, to appear in 2023 IEEE Symposium on Security and Privacy

  8. arXiv:2212.07591  [pdf, other]

    cs.LG cs.AI cs.CR

    Dissecting Distribution Inference

    Authors: Anshuman Suri, Yifu Lu, Yanjin Chen, David Evans

    Abstract: A distribution inference attack aims to infer statistical properties of data used to train machine learning models. These attacks are sometimes surprisingly potent, but the factors that impact distribution inference risk are not well understood and demonstrated attacks often rely on strong and unrealistic assumptions such as full knowledge of training environments even in supposedly black-box thre…

    Submitted 5 April, 2024; v1 submitted 14 December, 2022; originally announced December 2022.

    Comments: Accepted at SaTML 2023 (updated Yifu's email address)

  9. arXiv:2211.16200  [pdf, other]

    cs.CV cs.AI

    From Forks to Forceps: A New Framework for Instance Segmentation of Surgical Instruments

    Authors: Britty Baby, Daksh Thapar, Mustafa Chasmai, Tamajit Banerjee, Kunal Dargan, Ashish Suri, Subhashis Banerjee, Chetan Arora

    Abstract: Minimally invasive surgeries and related applications demand surgical tool classification and segmentation at the instance level. Surgical tools are similar in appearance and are long, thin, and handled at an angle. The fine-tuning of state-of-the-art (SOTA) instance segmentation models trained on natural images for instrument segmentation has difficulty discriminating instrument classes. Our rese…

    Submitted 11 March, 2023; v1 submitted 26 November, 2022; originally announced November 2022.

    Comments: WACV 2023

  10. arXiv:2206.03317  [pdf, other]

    cs.LG cs.AI cs.CR

    Subject Membership Inference Attacks in Federated Learning

    Authors: Anshuman Suri, Pallika Kanani, Virendra J. Marathe, Daniel W. Peterson

    Abstract: Privacy attacks on Machine Learning (ML) models often focus on inferring the existence of particular data points in the training data. However, what the adversary really wants to know is if a particular individual's (subject's) data was included during training. In such scenarios, the adversary is more likely to have access to the distribution of a particular subject than actual records. Furthermo…

    Submitted 2 June, 2023; v1 submitted 7 June, 2022; originally announced June 2022.

  11. arXiv:2109.06024  [pdf, other]

    cs.LG cs.AI cs.CR

    Formalizing and Estimating Distribution Inference Risks

    Authors: Anshuman Suri, David Evans

    Abstract: Distribution inference, sometimes called property inference, infers statistical properties about a training set from access to a model trained on that data. Distribution inference attacks can pose serious risks when models are trained on private data, but are difficult to distinguish from the intrinsic purpose of statistical machine learning -- namely, to produce models that capture statistical pr…

    Submitted 5 July, 2022; v1 submitted 13 September, 2021; originally announced September 2021.

    Comments: Update: Accepted at PETS 2022

  12. arXiv:2106.03699  [pdf, other]

    cs.LG cs.AI cs.CR

    Formalizing Distribution Inference Risks

    Authors: Anshuman Suri, David Evans

    Abstract: Property inference attacks reveal statistical properties about a training set but are difficult to distinguish from the primary purposes of statistical machine learning, which is to produce models that capture statistical properties about a distribution. Motivated by Yeom et al.'s membership inference framework, we propose a formal and generic definition of property inference attacks. The proposed…

    Submitted 24 September, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: ICML 2021 Workshop on Theory and Practice of Differential Privacy. Longer version of work available at arXiv:2109.06024 Update: Labelling error for Census[Race], where graphs were mirror-images because of 1-ratio being used instead of the ratio. Comparison with SOTA also updated; conclusions remain unchanged

  13. Neuro-Endo-Trainer-Online Assessment System (NET-OAS) for Neuro-Endoscopic Skills Training

    Authors: Vinkle Srivastav, Britty Baby, Ramandeep Singh, Prem Kalra, Ashish Suri

    Abstract: Neuro-endoscopy is a challenging minimally invasive neurosurgery that requires surgical skills to be acquired using training methods different from the existing apprenticeship model. There are various training systems developed for imparting fundamental technical skills in laparoscopy where as limited systems for neuro-endoscopy. Neuro-Endo-Trainer was a box-trainer developed for endo-nasal transs…

    Submitted 16 July, 2020; originally announced July 2020.

    Comments: Published at Federated Conference on Computer Science and Information Systems - FedCSIS 2017

    Journal ref: IEEE (2017)

  14. arXiv:2006.16469  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Model-Targeted Poisoning Attacks with Provable Convergence

    Authors: Fnu Suya, Saeed Mahloujifar, Anshuman Suri, David Evans, Yuan Tian

    Abstract: In a poisoning attack, an adversary with control over a small fraction of the training data attempts to select that data in a way that induces a corrupted model that misbehaves in favor of the adversary. We consider poisoning attacks against convex machine learning models and propose an efficient poisoning attack designed to induce a specified model. Unlike previous model-targeted poisoning attack…

    Submitted 21 April, 2021; v1 submitted 29 June, 2020; originally announced June 2020.

    Comments: 32 pages, code available at: https://github.com/suyeecav/model-targeted-poisoning

  15. arXiv:2005.11151  [pdf]

    cs.HC cs.AI eess.SP

    Attention Patterns Detection using Brain Computer Interfaces

    Authors: Felix G. Hamza-Lup, Adytia Suri, Ionut E. Iacob, Ioana R. Goldbach, Lateef Rasheed, Paul N. Borza

    Abstract: The human brain provides a range of functions such as expressing emotions, controlling the rate of breathing, etc., and its study has attracted the interest of scientists for many years. As machine learning models become more sophisticated, and bio-metric data becomes more readily available through new non-invasive technologies, it becomes increasingly possible to gain access to interesting biomet…

    Submitted 20 May, 2020; originally announced May 2020.

    Journal ref: ACM SE 2020

  16. arXiv:2003.09372  [pdf, other]

    cs.LG stat.ML

    One Neuron to Fool Them All

    Authors: Anshuman Suri, David Evans

    Abstract: Despite vast research in adversarial examples, the root causes of model susceptibility are not well understood. Instead of looking at attack-specific robustness, we propose a notion that evaluates the sensitivity of individual neurons in terms of how robust the model's output is to direct perturbations of that neuron's output. Analyzing models from this perspective reveals distinctive characterist…

    Submitted 9 June, 2020; v1 submitted 20 March, 2020; originally announced March 2020.

    Comments: Updated 'PGD' columns of Table 1: numbers reported earlier for this column were (100 - accuracy) instead of attack success rates. Observations and conclusions remain unchanged

  17. arXiv:2003.08553  [pdf, other]

    cs.IR cs.CL

    QnAMaker: Data to Bot in 2 Minutes

    Authors: Parag Agrawal, Tulasi Menon, Aya Kamel, Michel Naim, Chaikesh Chouragade, Gurvinder Singh, Rohan Kulkarni, Anshuman Suri, Sahithi Katakam, Vineet Pratik, Prakul Bansal, Simerpreet Kaur, Neha Rajput, Anand Duggal, Achraf Chalabi, Prashant Choudhari, Reddy Satti, Niranjan Nayak

    Abstract: Having a bot for seamless conversations is a much-desired feature that products and services today seek for their websites and mobile apps. These bots help reduce traffic received by human support significantly by handling frequent and directly answerable known questions. Many such services have huge reference documents such as FAQ pages, which makes it hard for users to browse through this data.…

    Submitted 18 March, 2020; originally announced March 2020.

    Comments: Published at The Web Conference 2020 in the demo track

  18. arXiv:1904.03223  [pdf, other]

    cs.CL cs.IR

    NELEC at SemEval-2019 Task 3: Think Twice Before Going Deep

    Authors: Parag Agrawal, Anshuman Suri

    Abstract: Existing Machine Learning techniques yield close to human performance on text-based classification tasks. However, the presence of multi-modal noise in chat data such as emoticons, slang, spelling mistakes, code-mixed data, etc. makes existing deep-learning solutions perform poorly. The inability of deep-learning systems to robustly capture these covariates puts a cap on their performance. We prop…

    Submitted 5 April, 2019; originally announced April 2019.

    Comments: International Workshop on Semantic Evaluation (SemEval), NAACL-HLT 2019

  19. arXiv:1811.07600  [pdf, other]

    cs.AI cs.CL cs.LG

    A Trustworthy, Responsible and Interpretable System to Handle Chit Chat in Conversational Bots

    Authors: Parag Agrawal, Anshuman Suri, Tulasi Menon

    Abstract: Most often, chat-bots are built to solve the purpose of a search engine or a human assistant: Their primary goal is to provide information to the user or help them complete a task. However, these chat-bots are incapable of responding to unscripted queries like "Hi, what's up", "What's your favourite food". Human evaluation judgments show that 4 humans come to a consensus on the intent of a given q…

    Submitted 23 November, 2018; v1 submitted 19 November, 2018; originally announced November 2018.

    Comments: 7 pages, 5 figures, The Second AAAI Workshop on Reasoning and Learning for Human-Machine Dialogues (DEEP-DIAL 2019)

  20. arXiv:1802.01448  [pdf, other]

    cs.LG cs.CR stat.ML

    Hardening Deep Neural Networks via Adversarial Model Cascades

    Authors: Deepak Vijaykeerthy, Anshuman Suri, Sameep Mehta, Ponnurangam Kumaraguru

    Abstract: Deep neural networks (DNNs) are vulnerable to malicious inputs crafted by an adversary to produce erroneous outputs. Works on securing neural networks against adversarial examples achieve high empirical robustness on simple datasets such as MNIST. However, these techniques are inadequate when empirically tested on complex data sets such as CIFAR-10 and SVHN. Further, existing techniques are design…

    Submitted 4 November, 2018; v1 submitted 2 February, 2018; originally announced February 2018.

  21. arXiv:1610.07772  [pdf, other]

    cs.SI

    Visual Themes and Sentiment on Social Networks To Aid First Responders During Crisis Events

    Authors: Prateek Dewan, Varun Bharadhwaj, Aditi Mithal, Anshuman Suri, Ponnurangam Kumaraguru

    Abstract: Online Social Networks explode with activity whenever a crisis event takes place. Most content generated as part of this activity is a mixture of text and images, and is particularly useful for first responders to identify popular topics of interest and gauge the pulse and sentiment of citizens. While multiple researchers have used text to identify, analyze and measure themes and public sentiment…

    Submitted 25 October, 2016; originally announced October 2016.

    Comments: 8+1 pages