AttackBench: Evaluating Gradient-based Attacks for Adversarial Examples
CoRR (2024)
Abstract
Adversarial examples are typically optimized with gradient-based attacks. While novel attacks are continuously proposed, each is shown to outperform its predecessors using different experimental setups, hyperparameter settings, and number of forward and backward calls to the target models. This provides overly-optimistic and even biased evaluations that may unfairly favor one particular attack over the others. In this work, we aim to overcome these limitations by proposing AttackBench, i.e., the first evaluation framework that enables a fair comparison among different attacks. To this end, we first propose a categorization of gradient-based attacks, identifying their main components and differences. We then introduce our framework, which evaluates their effectiveness and efficiency. We measure these characteristics by (i) defining an optimality metric that quantifies how close an attack is to the optimal solution, and (ii) limiting the number of forward and backward queries to the model, such that all attacks are compared within a given maximum query budget. Our extensive experimental analysis compares more than 100 attack implementations with a total of over 800 different configurations against CIFAR-10 and ImageNet models, highlighting that only very few attacks outperform all the competing approaches. Within this analysis, we shed light on several implementation issues that prevent many attacks from finding better solutions or running at all. We release AttackBench as a publicly available benchmark, aiming to continuously update it to include and evaluate novel gradient-based attacks for optimizing adversarial examples.
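The two evaluation ingredients described above can be made concrete with a short sketch: a wrapper that counts forward and backward queries so every attack runs under the same budget, and a simplified per-sample optimality score. This is a minimal illustration only; the class and function names and the exact scoring formula are assumptions made here, not the AttackBench implementation or the paper's precise metric.

```python
import torch

class QueryCounter(torch.nn.Module):
    """Wrap a target model and count forward/backward passes, so that all
    attacks can be compared within the same maximum query budget
    (hypothetical helper, not the AttackBench API)."""

    def __init__(self, model: torch.nn.Module, max_queries: int = 1000):
        super().__init__()
        self.model = model
        self.max_queries = max_queries
        self.forwards = 0
        self.backwards = 0
        # Count backward passes flowing through the wrapped model.
        self.model.register_full_backward_hook(self._count_backward)

    def _count_backward(self, module, grad_input, grad_output):
        self.backwards += 1

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.forwards + self.backwards >= self.max_queries:
            raise RuntimeError("query budget exhausted")
        self.forwards += 1
        return self.model(x)


def optimality(attack_norms: torch.Tensor, best_norms: torch.Tensor,
               eps_max: float) -> float:
    """Simplified per-sample optimality in [0, 1]: 1.0 when the attack finds
    the smallest known perturbation for every sample, lower otherwise.
    Samples the attack fails on should be passed in with norm eps_max."""
    attack_norms = attack_norms.clamp(max=eps_max)
    score = 1.0 - (attack_norms - best_norms) / eps_max
    return float(score.mean())
```

In such a setup, the wrapped model would be handed to each attack in place of the original one, and the resulting per-attack optimality scores aggregated over the test set to rank attacks under a common query budget.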