Enhancing Adversarial Robustness Through Model Optimization on Clean Data in Deep Neural Networks
-
- Author / Creator
- Yang, Bokai
-
Adversarial robustness has emerged as a critical area in deep learning due to the increasing deployment of deep neural networks (DNNs) and the consequent demand for their security. Adversarial examples, inputs modified with imperceptible perturbations that deceive DNNs, have garnered significant attention since their introduction in 2014. Despite the importance of this issue, current defenses against adversarial attacks either demand extensive computational resources or offer limited effectiveness, often relying on specific assumptions about the attack. In this thesis, we propose two novel methods that enhance the adversarial robustness of neural networks without involving adversarial examples in the training process.
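To make the threat model concrete, the fast gradient sign method (FGSM) is one standard way such imperceptible perturbations are crafted. The sketch below is illustrative rather than taken from the thesis; it assumes a PyTorch classifier, inputs scaled to [0, 1], and an arbitrary perturbation budget epsilon.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=8 / 255):
    """Craft an untargeted FGSM adversarial example.

    Each pixel moves one step of size epsilon in the direction that
    increases the classification loss, so the perturbation remains
    visually imperceptible while often flipping the prediction.
    """
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()    # signed gradient step
    return x_adv.clamp(0.0, 1.0).detach()  # stay in the valid image range
```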
The first approach uses temperature scaling of the cross-entropy loss to defend against untargeted attacks. Through both theoretical analysis and empirical observation, we demonstrate that raising the temperature during training promotes a more balanced learning process: it prevents the optimization from being biased toward challenging samples, yielding smoother loss surfaces and smaller gradient updates across non-target classes. This implicit debiased optimization strategy significantly improves robustness. Our experiments confirm that training at elevated temperatures effectively defends against untargeted adversarial attacks without requiring additional computational resources.
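The abstract does not reproduce the training code, but temperature scaling of the cross-entropy amounts to dividing the logits by a temperature T > 1 before the softmax. A minimal PyTorch sketch, with an illustrative (not thesis-specified) value of T:

```python
import torch
import torch.nn.functional as F

def temperature_ce_loss(logits, targets, T=10.0):
    """Cross-entropy on temperature-scaled logits.

    Dividing the logits by T > 1 softens the softmax distribution,
    shrinking the gradient contribution of hard samples and of
    non-target classes, which yields the balanced, implicitly
    debiased optimization described above.
    """
    return F.cross_entropy(logits / T, targets)

# Hypothetical training step:
# loss = temperature_ce_loss(model(images), labels, T=10.0)
# loss.backward(); optimizer.step()
```

Because the change touches only the loss function, it adds essentially no overhead compared with standard training.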
The second approach combines implicit dimension reduction of input data/features with online knowledge distillation. Recent research in adversarial defense has shown that training networks on low-dimensional input vectors improves robustness, but typically at the cost of the model's ability to generalize well on clean data. Building on this result, we introduce a teacher-student framework in which a teacher model trained on low-dimensional inputs attains strong adversarial robustness and guides the optimization of a student model trained on higher-dimensional inputs. The framework encourages the student to inherit the teacher's adversarial robustness while retaining strong generalization capabilities. Extensive experiments validate that this approach significantly improves adversarial robustness with only a small impact on generalization performance.
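The abstract does not specify the exact distillation objective; a common instantiation of such a teacher-student loss combines hard-label cross-entropy with a temperature-softened KL term, sketched below. The weights alpha and T, and the reduce_dim helper standing in for the low-dimensional input mapping, are hypothetical.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      alpha=0.5, T=4.0):
    """Weighted hard-label / soft-label knowledge distillation loss.

    The robust teacher (trained on low-dimensional inputs) supplies
    soft targets; the student (trained on full-dimensional inputs)
    matches them while still fitting the ground-truth labels.
    """
    ce = F.cross_entropy(student_logits, targets)
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # standard T^2 factor keeps gradient magnitudes comparable
    return alpha * ce + (1 - alpha) * kd

# Hypothetical step: the teacher sees a reduced view of the same batch.
# x_low = reduce_dim(x)  # e.g., downsampling or a random projection
# loss = distillation_loss(student(x), teacher(x_low).detach(), y)
```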
Overall, our research presents two innovative strategies for improving the adversarial robustness of neural networks. The temperature scaling method provides a straightforward yet effective way to bolster defenses against untargeted attacks, while the teacher-student framework offers a balanced solution that maintains generalization ability alongside enhanced robustness. These contributions advance the field of adversarial machine learning, offering practical solutions for developing more secure and reliable deep learning models.
-
- Subjects / Keywords
-
- Graduation date
- Fall 2024
-
- Type of Item
- Thesis
-
- Degree
- Master of Science
-
- License
- This thesis is made available by the University of Alberta Library with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.