Patch Robustness Certification for Transformer via Black-box Testing Approach

  • Author / Creator
    Huang, Yuheng
  • Abstract
    In recent years, deep neural networks (DNNs) have revolutionized the field of artificial intelligence (AI), leading to breakthroughs in areas such as computer vision, natural language processing, and robotics. Despite their superior performance, studies have demonstrated that DNNs are susceptible to small changes in their input data, making them vulnerable to adversarial attacks, in which malicious inputs are crafted to deceive the network into producing incorrect outputs. This is a serious weakness when deploying DNNs in safety-critical scenarios such as autonomous driving, medical diagnosis, face authentication, intruder detection, and aircraft control.

    More recently, we have witnessed the tremendous success of the Transformer as a representative DNN architecture. The Vision Transformer (ViT), a variant applied to vision tasks, has achieved state-of-the-art performance on various benchmark datasets, demonstrating its effectiveness in learning from large amounts of visual data. However, its robustness remains a major limitation: its record-breaking performance comes at the cost of extreme sensitivity to its inputs. Relevant studies also demonstrate that its ability to defend against a particular type of attack, the physical patch attack, is even weaker than that of classical convolutional neural networks (CNNs). This poses a serious threat to the deployment of ViT in industry, especially in safety-critical domains.

    In this work, we propose PatchCensor, a systematic approach that aims to improve the patch robustness of ViT in a black-box manner. Assuming attackers with maximal capability, we design a warning system that can detect an adversary's presence even under the worst conditions. Our methodology falls into the category of certified defense: unlike empirical defenses, such approaches provide a strong guarantee for the inference result. Existing certified defense methods often require substantial effort in training and usually sacrifice the base model's performance. To bridge this gap, PatchCensor improves the robustness of the whole system by detecting anomalous inputs, rather than relying solely on training a robust model to produce accurate results for all inputs, which could compromise its overall accuracy.
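    The detect-rather-than-correct idea above can be illustrated with a minimal masking-and-voting sketch. This is not the thesis's exact implementation; the `predict_masked` callable, the mask layout, and the unanimity rule are illustrative assumptions about how a certified patch detector of this kind can flag suspicious inputs instead of forcing a prediction.

    ```python
    def certified_predict(predict_masked, num_masks):
        """Query the classifier once per mask position and vote.

        predict_masked: hypothetical callable mapping a mask index to the
        model's predicted label with that image region occluded. If the
        masks are laid out so a bounded patch is fully covered by at least
        one of them, then unanimous agreement across all masked views means
        the reported label cannot have been flipped by the patch; any
        disagreement is treated as a possible attack and triggers a warning.
        """
        labels = [predict_masked(i) for i in range(num_masks)]
        if all(label == labels[0] for label in labels):
            return labels[0], True   # certified: consistent under every mask
        return None, False           # inconsistent: warn instead of predicting

    # Toy usage: a clean input yields the same label under every mask,
    # while a (simulated) patched input flips some views and is flagged.
    clean = certified_predict(lambda i: "cat", num_masks=6)
    patched = certified_predict(lambda i: "cat" if i < 4 else "dog", num_masks=6)
    ```

    The design choice this sketch highlights is that certification costs only extra inference passes, not retraining of the base model, which is why the base model's clean accuracy is left untouched.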

  • Subjects / Keywords
  • Graduation date
    Fall 2023
  • Type of Item
  • Degree
    Master of Science
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.