TY - GEN
T1 - Bolstering Adversarial Robustness with Latent Disparity Regularization
AU - Schwartz, David
AU - Ditzler, Gregory
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/7/18
Y1 - 2021/7/18
N2 - Recent research has revealed that neural networks and other machine learning models are vulnerable to adversarial attacks that aim to subvert the integrity or privacy of their predictions by adding a small, calculated perturbation to inputs. Further, the adversary can significantly degrade the performance of the model, and the number and severity of attacks continue to grow. However, there is a dearth of techniques that defend machine learning models robustly and at low computational cost. Against this background, we propose an adversarially robust training procedure and objective function for arbitrary neural network architectures. Robustness against adversarial attacks on integrity is achieved by augmenting the training objective with a novel regularization term that penalizes the discrepancy between the representations induced in hidden layers by benign and adversarial data. We benchmark our regularization approach on the Fashion-MNIST and CIFAR-10 datasets against three state-of-the-art defense methods, namely: (i) regularization of the largest eigenvalue of the Fisher information matrix of the terminal layer's activity, (ii) a higher-level representation guided denoising autoencoder (trained with adversarial examples), and (iii) training an otherwise undefended model on data distorted by additive white Gaussian noise. Our experiments show that the proposed regularizer provides significant improvements in adversarial robustness over both an undefended baseline model and the same model defended with the other techniques. This result holds over several adversarial budgets, with only a small (but seemingly unavoidable) decline in benign test accuracy.
AB - Recent research has revealed that neural networks and other machine learning models are vulnerable to adversarial attacks that aim to subvert the integrity or privacy of their predictions by adding a small, calculated perturbation to inputs. Further, the adversary can significantly degrade the performance of the model, and the number and severity of attacks continue to grow. However, there is a dearth of techniques that defend machine learning models robustly and at low computational cost. Against this background, we propose an adversarially robust training procedure and objective function for arbitrary neural network architectures. Robustness against adversarial attacks on integrity is achieved by augmenting the training objective with a novel regularization term that penalizes the discrepancy between the representations induced in hidden layers by benign and adversarial data. We benchmark our regularization approach on the Fashion-MNIST and CIFAR-10 datasets against three state-of-the-art defense methods, namely: (i) regularization of the largest eigenvalue of the Fisher information matrix of the terminal layer's activity, (ii) a higher-level representation guided denoising autoencoder (trained with adversarial examples), and (iii) training an otherwise undefended model on data distorted by additive white Gaussian noise. Our experiments show that the proposed regularizer provides significant improvements in adversarial robustness over both an undefended baseline model and the same model defended with the other techniques. This result holds over several adversarial budgets, with only a small (but seemingly unavoidable) decline in benign test accuracy.
UR - http://www.scopus.com/inward/record.url?scp=85116474954&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85116474954&partnerID=8YFLogxK
U2 - 10.1109/IJCNN52387.2021.9533815
DO - 10.1109/IJCNN52387.2021.9533815
M3 - Conference contribution
AN - SCOPUS:85116474954
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - IJCNN 2021 - International Joint Conference on Neural Networks, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 International Joint Conference on Neural Networks, IJCNN 2021
Y2 - 18 July 2021 through 22 July 2021
ER -