Showing 1–1 of 1 results for author: Blum, C W

  1. arXiv:2305.16444  [pdf, other]

    cs.CL

    Don't Retrain, Just Rewrite: Countering Adversarial Perturbations by Rewriting Text

    Authors: Ashim Gupta, Carter Wood Blum, Temma Choji, Yingjie Fei, Shalin Shah, Alakananda Vempala, Vivek Srikumar

    Abstract: Can language models transform inputs to protect text classifiers against adversarial attacks? In this work, we present ATINTER, a model that intercepts and learns to rewrite adversarial inputs to make them non-adversarial for a downstream text classifier. Our experiments on four datasets and five attack mechanisms reveal that ATINTER is effective at providing better adversarial robustness than exi…

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023