On the (in)feasibility of ML Backdoor Detection As an Hypothesis Testing Problem
arXiv (Cornell University) (2024)
Abstract
We introduce a formal statistical definition for the problem of backdoor detection in machine learning systems and use it to analyze the feasibility of such problems, providing evidence for the utility and applicability of our definition. The main contributions of this work are an impossibility result and an achievability result for backdoor detection. We show a no-free-lunch theorem, proving that universal (adversary-unaware) backdoor detection is impossible, except for very small alphabet sizes. Thus, we argue that backdoor detection methods need to be either explicitly or implicitly adversary-aware. However, our work does not imply that backdoor detection cannot work in specific scenarios, as evidenced by successful backdoor detection methods in the scientific literature. Furthermore, we connect our definition to the probably approximately correct (PAC) learnability of the out-of-distribution detection problem.
Keywords
Delay Fault Testing