Open Access • Journal Article • 10.3390/math11133015

Probability-Distribution-Guided Adversarial Sample Attacks for Boosting Transferability and Interpretability

- 06 Jul 2023

- Vol. 11, Iss: 13, pp 3015-3015

TL;DR: Zhang et al. propose a score-matching-based attack (SMBA) method that performs adversarial sample attacks by manipulating the probability distribution of the samples. The method shows good transferability across different datasets and models and offers reasonable explanations from the perspective of mathematical theory and feature space.

Abstract: In recent years, with the rapid development of technology, artificial intelligence (AI) security issues, exemplified by adversarial sample attacks, have aroused widespread concern in society. Adversarial samples are often generated on surrogate models and then transferred to attack the target model. Because most AI models in real-world scenarios are black boxes, transferability is a key measure of the quality of adversarial samples. Traditional methods rely on the decision boundary of the classifier and take boundary crossing as the only criterion, without considering the probability distribution of the samples themselves. This results in irregular perturbations, an unclear generation path, and a lack of transferability and interpretability. In a probabilistic generative model, once the probability distribution of the samples has been learned, a random term can be added during sampling to gradually transform noise into a new independent and identically distributed sample. Inspired by this idea, we argue that, with the random term removed, adversarial sample generation can be regarded as static sampling from the probabilistic generative model, which guides the adversarial sample out of the original probability distribution and into the target probability distribution and helps to boost transferability and interpretability. We therefore propose a score-matching-based attack (SMBA) method that performs adversarial sample attacks by manipulating the probability distribution of the samples. It shows good transferability across different datasets and models and provides reasonable explanations from the perspective of mathematical theory and feature space. Compared with the current best methods based on the decision boundary of the classifier, our method increases the attack success rate by up to 51.36% and 30.54% in non-targeted and targeted attack scenarios, respectively. In conclusion, our research builds a bridge between probabilistic generative models and adversarial samples, provides a new angle for the study of adversarial samples, and brings new thinking to AI security.
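
The sampling idea in the abstract can be written out as a short sketch. It assumes the standard Langevin-dynamics form used with score-based generative models; the symbols $\alpha$ (step size), $z_t$ (random term), $s_\theta$ (learned score model), and $y_{\mathrm{tgt}}$ (target class) are notation chosen here for illustration and may differ from the paper's. Sampling with the random term reads

$$x_{t+1} = x_t + \frac{\alpha}{2}\,\nabla_x \log p(x_t) + \sqrt{\alpha}\,z_t, \qquad z_t \sim \mathcal{N}(0, I),$$

where the score $\nabla_x \log p(x)$ is approximated by a model $s_\theta(x)$ trained with score matching. Dropping the random term and conditioning on a target class gives a deterministic update that moves the sample toward regions of high target-class density:

$$x_{t+1} = x_t + \frac{\alpha}{2}\,\nabla_x \log p(x_t \mid y_{\mathrm{tgt}}).$$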


Most frequently asked questions

1. What are the limitations of traditional adversarial sample attack methods and how does the proposed probability distribution-guided approach address these limitations?

Traditional adversarial sample attack methods, which are guided by the classifier's decision boundary, are limited in both transferability and interpretability. Because they generate adversarial samples solely with respect to the decision boundary, the resulting perturbations are irregular and the generation path is unclear. In addition, classifiers with different structures have different decision boundaries, which reduces the transferability of adversarial samples and can cause the attack to fail against realistic black-box models.

The proposed probability-distribution-guided approach instead manipulates the probability distribution of the samples to guide the generation of adversarial samples. By moving the adversarial sample from the source class's probability distribution space to the target class's probability distribution space, it avoids the dependence on classifier structure and achieves high transferability. It also provides a clear generation path and a more reasonable explanation, which addresses the insufficient transferability and poor interpretability of traditional methods.

The approach builds on the concept of a probabilistic generative model: the generation of adversarial samples is treated as a specialized model that moves the initial random noise in the direction of the logarithmic gradient of the sample's true conditional probability density. By reducing the randomness and focusing on the true probability density, the proposed method enhances the effectiveness and interpretability of adversarial sample attacks.
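
The movement from the source-class distribution toward the target-class distribution described above can be illustrated with a minimal toy sketch. This is not the authors' implementation. It assumes the target class-conditional density is an isotropic Gaussian, so its score (the gradient of the log density) is known in closed form, whereas in the paper's setting the score would presumably come from a model trained with score matching. The names `gaussian_score` and `score_guided_attack`, the step size, and the optional perturbation budget `eps` are all hypothetical choices made for illustration.

```python
def gaussian_score(x, mean, var):
    """Score of an isotropic Gaussian N(mean, var * I): grad_x log p(x) = (mean - x) / var."""
    return [(m - xi) / var for xi, m in zip(x, mean)]


def score_guided_attack(x0, target_mean, target_var=1.0, step=0.1, n_steps=200, eps=None):
    """Deterministic score ascent toward the (toy) target-class density.

    Each iteration is a Langevin-style step with the random term removed.
    eps is an assumed L-infinity budget around x0; it is not taken from the source.
    """
    x = list(x0)
    for _ in range(n_steps):
        score = gaussian_score(x, target_mean, target_var)
        # Move along the log-density gradient of the target class.
        x = [xi + 0.5 * step * si for xi, si in zip(x, score)]
        if eps is not None:
            # Keep the perturbation inside the assumed budget around x0.
            x = [min(max(xi, oi - eps), oi + eps) for xi, oi in zip(x, x0)]
    return x


# Toy setup: the sample starts at the source-class mean and is pushed
# toward a target class centred elsewhere in a 2-D "feature space".
x0 = [-2.0, 0.0]
target_mean = [2.0, 0.0]

x_adv = score_guided_attack(x0, target_mean)                # unconstrained
x_bounded = score_guided_attack(x0, target_mean, eps=0.5)   # with assumed budget
print(x_adv, x_bounded)
```

Without a budget, the sample converges to the mode of the toy target density. With `eps` set, it stops at the edge of the allowed region, which mirrors the usual trade-off between perturbation size and how far the sample can move into the target distribution.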

