Attack-Agnostic Defenses against Adversarial Inputs in Learning Systems

Researcher(s)

Sponsoring Agency
National Science Foundation

Summary

Deep learning technologies hold great promise to revolutionize the way people live and work. However, deep learning systems are inherently vulnerable to adversarial inputs: samples maliciously crafted to trigger misbehavior in deep neural networks, with potentially disastrous consequences in security-critical applications. The fundamental challenge of defending against such attacks stems from their adaptive and variable nature: adversarial inputs are tailored to the deep neural networks they target, and crafting strategies vary greatly from one attack to another. This project develops EagleEye, a universal, attack-agnostic defense framework that (i) works effectively against unseen attack variants, (ii) preserves the predictive power of deep neural networks, (iii) complements existing defense mechanisms, and (iv) provides comprehensive diagnosis of potential risks in deep learning outputs.

In particular, EagleEye leverages a set of invariant properties underlying most attacks, including the "minimality principle": to maximize attack evasiveness, an adversarial input is generated by applying the minimum distortion necessary to a legitimate input. By exploiting such properties in a principled manner, EagleEye effectively discriminates adversarial inputs from legitimate ones (integrity checking) and even uncovers their correct outputs (truth recovery), as illustrated in the sketch below. The specific research tasks include: (i) identifying inherently distinct properties (differentiators) of legitimate and adversarial inputs, (ii) developing attack-agnostic adversarial input detection methods based on these differentiators, and (iii) analyzing possible countermeasures by adversaries to evade such defenses. This research not only facilitates the adoption of deep learning-powered systems and services, but also informs the design and implementation of robust machine learning systems in general. New theories and systems developed in this project are integrated into undergraduate and graduate education and used to raise public awareness of the importance of machine learning security.
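To make the minimality principle concrete: because an adversarial input carries only the minimum distortion needed to cross the model's decision boundary, it tends to sit very close to that boundary, so small random perturbations often flip its label back, while a legitimate input usually keeps its label. The following sketch illustrates this general idea in PyTorch; it is an illustration under stated assumptions, not EagleEye's actual algorithm, and the names (integrity_check, model) and parameters (sigma, threshold) are hypothetical choices that would need calibration for a given model and dataset.

    # Hypothetical sketch: an integrity check and truth recovery based on
    # the minimality principle. "model" is assumed to be any PyTorch
    # classifier mapping a batch of inputs to class logits; "x" is a
    # single input tensor (e.g., shape [C, H, W]).
    import torch

    def integrity_check(model, x, n_trials=32, sigma=0.05, threshold=0.8):
        model.eval()
        with torch.no_grad():
            # Label of the input as-is.
            base_label = model(x.unsqueeze(0)).argmax(dim=1).item()
            # Sample small random perturbations around x.
            noisy = x.unsqueeze(0) + sigma * torch.randn(n_trials, *x.shape)
            labels = model(noisy).argmax(dim=1)
        # Fraction of perturbed copies that keep the original label:
        # minimally distorted adversarial inputs sit near the decision
        # boundary, so their labels tend to be unstable under noise.
        stability = (labels == base_label).float().mean().item()
        # Majority-vote label among perturbed copies serves as the
        # candidate "true" output (truth recovery).
        recovered_label = labels.mode().values.item()
        is_adversarial = stability < threshold
        return is_adversarial, recovered_label

In this sketch, a low stability score flags the input as likely adversarial (integrity checking), and the majority-vote label over the perturbed copies stands in for the recovered output (truth recovery); a deployed defense would of course need a more principled treatment of both steps.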

Term
 -