Counter-Samples: A Stateless Strategy to Neutralize Black Box Adversarial Attacks

arXiv (Cornell University) (2024)

Abstract

Our paper presents a novel defence against black box attacks, where attackers use the victim model as an oracle to craft their adversarial examples. Unlike traditional preprocessing defences that rely on sanitizing input samples, our stateless strategy counters the attack process itself. For every query we evaluate a counter-sample instead, where the counter-sample is the original sample optimized against the attacker's objective. By countering every black box query with a targeted white box optimization, our strategy introduces an asymmetry to the game, to the defender's advantage. This defence not only misleads the attacker's search for an adversarial example, it also preserves the model's accuracy on legitimate inputs and is generic to multiple types of attacks. We demonstrate that our approach is remarkably effective against state-of-the-art black box attacks and outperforms existing defences on both the CIFAR-10 and ImageNet datasets. We also show that the proposed defence is robust against strong adversaries.
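
The query-time behaviour described in the abstract can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: a toy two-class linear softmax classifier with hypothetical weights `W` and `B`, and a reading of "optimized against the attacker's objective" as a few gradient steps that lower the cross-entropy loss on the model's own predicted label. The function names, step count, and learning rate are illustrative only.

```python
import math

# Hypothetical toy model: 2 classes, 2 features (weights are NOT from the paper).
W = [[1.0, -1.0], [-1.0, 1.0]]
B = [0.0, 0.0]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict_proba(x):
    """Undefended model: class probabilities for input x."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W, B)]
    return softmax(logits)

def counter_sample(x, steps=5, lr=0.1):
    """Assumed counter-sample: move x toward its currently predicted class.

    An attacker typically tries to raise the loss on the true label, so this
    sketch lowers the loss on the predicted label instead (white box gradient
    descent on the input).
    """
    probs = predict_proba(x)
    label = max(range(len(probs)), key=probs.__getitem__)
    x_c = list(x)
    for _ in range(steps):
        p = predict_proba(x_c)
        # Gradient of cross-entropy w.r.t. the input for a linear softmax model:
        # sum over classes k of (p_k - y_k) * W[k][j].
        grad = [
            sum((p[k] - (1.0 if k == label else 0.0)) * W[k][j] for k in range(len(W)))
            for j in range(len(x_c))
        ]
        x_c = [xi - lr * g for xi, g in zip(x_c, grad)]
    return x_c

def defended_query(x):
    """Stateless defence: answer each query with the output for its counter-sample."""
    return predict_proba(counter_sample(x))
```

In this sketch `defended_query` keeps no record of earlier queries, which is what makes it stateless: each answer depends only on the current input. For an input such as `[0.6, 0.4]`, the predicted class is unchanged while the returned confidence is higher than the undefended model's, so the feedback an attacker would use to steer small perturbations is distorted.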

Keywords

Adversarial Examples, Defenses, Security Analysis
