Common Knowledge Learning for Generating Transferable Adversarial Examples

doi:10.48550/arXiv.2307.00274

Rui Yang, +4 more

- 01 Jul 2023

- arXiv.org

- Vol. abs/2307.00274

TL;DR: This article proposes a common knowledge learning (CKL) framework that learns better network weights for generating adversarial examples with better transferability under fixed network architectures, with knowledge distilled from different teacher architectures into one student network.

Abstract: This paper focuses on an important type of black-box attack, i.e., transfer-based adversarial attacks, in which the adversary generates adversarial examples with a substitute (source) model and uses them to attack an unseen target model without knowing its information. Existing methods tend to give unsatisfactory adversarial transferability when the source and target models come from different types of DNN architectures (e.g., ResNet-18 and Swin Transformer). In this paper, we observe that this phenomenon is induced by the output inconsistency problem. To alleviate this problem while effectively utilizing existing DNN models, we propose a common knowledge learning (CKL) framework that learns better network weights for generating adversarial examples with better transferability under fixed network architectures. Specifically, to reduce model-specific features and obtain better output distributions, we construct a multi-teacher framework in which knowledge is distilled from different teacher architectures into one student network. Because the gradient with respect to the input is usually used to generate adversarial examples, we impose constraints on the gradients between the student and teacher models to further alleviate the output inconsistency problem and enhance adversarial transferability. Extensive experiments demonstrate that our proposed work can significantly improve adversarial transferability.
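
The two ingredients named in the abstract, multi-teacher distillation and a constraint on input gradients, can be illustrated with a minimal sketch. The abstract does not give the exact loss, so everything here is an assumption made for illustration: the function names, the temperature, the KL form, the squared-error gradient penalty, and the weight `grad_weight`. Input gradients are passed in as plain lists; in practice they would come from an autodiff framework.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional distillation temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def multi_teacher_kd_loss(student_logits, teacher_logits_list, temperature=4.0):
    """Average KL(teacher || student) over all teachers (assumed form)."""
    s = softmax(student_logits, temperature)
    losses = [kl_divergence(softmax(t, temperature), s) for t in teacher_logits_list]
    return sum(losses) / len(losses)

def gradient_alignment_loss(student_grad, teacher_grads):
    """Mean squared distance between student and teacher input gradients (assumed form)."""
    total = 0.0
    for g in teacher_grads:
        total += sum((a - b) ** 2 for a, b in zip(student_grad, g)) / len(g)
    return total / len(teacher_grads)

def ckl_style_loss(student_logits, teacher_logits_list,
                   student_grad, teacher_grads, grad_weight=1.0):
    """Hypothetical combined objective: distillation term plus gradient constraint."""
    kd = multi_teacher_kd_loss(student_logits, teacher_logits_list)
    ga = gradient_alignment_loss(student_grad, teacher_grads)
    return kd + grad_weight * ga
```

Under these assumptions the loss is zero when the student reproduces every teacher's outputs and input gradients, and it grows as either the output distributions or the gradients diverge.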

Figures

Figure 1: Output inconsistency among different networks given the same input image, whose ground-truth label is class ‘2’. Although every model gives the correct prediction, the output probabilities are clearly different.

Table 1: Non-targeted attack results on CIFAR10. The first column lists the source models and the first row lists the target models. We report the averaged attack success rate on the entire testing set. ‘*’ denotes the teacher models. △ indicates that the source and target models are identical. MI-FGSM, DI-FGSM and VNI-FGSM are abbreviated as ‘MI’, ‘DI’, ‘VNI’, respectively. ‘+CKL’ indicates that our CKL framework is integrated.

Table 7: Targeted attack results on CIFAR100. We generate examples on the testing set and report the tASR value.

Table 8: Comparison with ensemble attack on CIFAR10. We report the averaged attack success rate on the entire testing set. ‘+CKL’ represents that our CKL framework is integrated.

Table 6: Integrating our CKL into SOTA intermediate level attack method on CIFAR10. The first column introduces the source models and the first row presents the test models. We report the averaged attack success rate on the entire testing set. ‘+CKL’ represents that our CKL framework is integrated.

Figure 2: We calculate the output inconsistency and transferability on the CIFAR10 testing set. The results are obtained by averaging from 10,000 images. (a) The output inconsistency. A higher value denotes a higher inconsistency. (b) We take two models in turn as the source model to generate adversarial examples to attack the other and compute the transferability by averaging the two attack results.
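
The quantities reported in the captions above can be sketched as follows. The captions do not define how output inconsistency is measured, so the symmetric KL divergence used here is an assumption, as are all function names. The non-targeted attack success rate is computed as the fraction of adversarial examples the target model misclassifies, and the pairwise transferability averages the two attack directions as described for Figure 2(b).

```python
import math

def symmetric_kl(p, q, eps=1e-12):
    """Symmetric KL divergence between two discrete distributions."""
    forward = sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
    backward = sum(qi * math.log((qi + eps) / (pi + eps)) for pi, qi in zip(p, q))
    return forward + backward

def output_inconsistency(probs_a, probs_b):
    """Mean symmetric KL between two models' outputs over a set of images
    (assumed metric; higher means more inconsistent)."""
    values = [symmetric_kl(p, q) for p, q in zip(probs_a, probs_b)]
    return sum(values) / len(values)

def attack_success_rate(target_predictions, true_labels):
    """Fraction of adversarial examples the target model misclassifies."""
    fooled = sum(1 for pred, y in zip(target_predictions, true_labels) if pred != y)
    return fooled / len(true_labels)

def pairwise_transferability(asr_a_to_b, asr_b_to_a):
    """Average of the two attack directions between a pair of models."""
    return 0.5 * (asr_a_to_b + asr_b_to_a)
```

With this choice, two models that both predict the correct class can still show a nonzero inconsistency, which matches the situation described for Figure 1.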