Artificial Interrogation for Attributing Language Models

doi:10.48550/arXiv.2211.10877

Journal Article • 10.48550/arXiv.2211.10877

Farhan Dhanani, +1 more

- 20 Nov 2022

- arXiv.org

- Vol. abs/2211.10877

1

TL;DR: This paper presents solutions to the Machine Learning Model Attribution Challenge (MLMAC), which was organized by MITRE, Microsoft, Schmidt-Futures, Robust-Intelligence, Lincoln-Network, and the Huggingface community.

Abstract: This paper presents solutions to the Machine Learning Model Attribution Challenge (MLMAC), collectively organized by MITRE, Microsoft, Schmidt-Futures, Robust-Intelligence, Lincoln-Network, and the Huggingface community. The challenge provides twelve open-sourced base versions of popular language models developed by well-known organizations and twelve fine-tuned language models for text generation. The names and architecture details of the fine-tuned models were kept hidden, and participants could access these models only through REST APIs developed by the organizers. Given these constraints, the goal of the contest is to identify which fine-tuned models originated from which base model. To solve this challenge, we assume that fine-tuned models and their corresponding base versions share a similar vocabulary and a matching syntactic writing style that is reflected in their generated outputs. Our strategy is to develop a set of queries to interrogate the base and fine-tuned models, and then to perform one-to-many pairing between them based on similarities in their generated responses, where more than one fine-tuned model can pair with a base model but not vice versa. We employ four distinct approaches for measuring the resemblance between the responses generated by the models of both sets. The first approach uses machine translation evaluation metrics, and the second uses a vector space model. The third approach uses state-of-the-art Transformer models for multi-class text classification. Lastly, the fourth approach uses a set of Transformer-based binary text classifiers, one for each provided base model, to perform multi-class text classification in a one-vs-all fashion. This paper reports implementation details, comparisons, and experimental studies of these approaches, along with the final results obtained.
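The one-to-many pairing idea behind the second (vector space) approach can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the TF-IDF weighting, the whitespace tokenizer, the function names, and the toy responses are all assumptions made for the example. Each model's responses are concatenated into one document, and every fine-tuned model is assigned to the base model whose document is most similar by cosine similarity, so several fine-tuned models may map to the same base model.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercased whitespace split; a real system would use a proper tokenizer.
    return text.lower().split()


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (term -> weight dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            # Smoothed IDF so every weight stays strictly positive.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def attribute(base_responses, finetuned_responses):
    """Map each fine-tuned model name to its most similar base model name."""
    base_names = list(base_responses)
    ft_names = list(finetuned_responses)
    docs = [base_responses[n] for n in base_names]
    docs += [finetuned_responses[n] for n in ft_names]
    vecs = tfidf_vectors(docs)
    base_vecs = vecs[:len(base_names)]
    ft_vecs = vecs[len(base_names):]
    pairing = {}
    for name, vec in zip(ft_names, ft_vecs):
        scores = [cosine(vec, b) for b in base_vecs]
        pairing[name] = base_names[scores.index(max(scores))]
    return pairing


# Hypothetical responses collected by sending the same queries to every model.
base = {
    "base_a": "the cat sat on the mat and purred softly",
    "base_b": "stock prices rose sharply after the earnings report",
}
finetuned = {
    "ft_0": "the cat purred on the mat",
    "ft_1": "earnings report sent stock prices higher",
    "ft_2": "a cat sat softly on a mat",
}
pairing = attribute(base, finetuned)
print(pairing)
```

In this toy run two fine-tuned models pair with the same base model, which is the one-to-many behaviour the abstract describes. Real responses would be far longer and noisier, and the abstract's other approaches (translation metrics, Transformer classifiers) replace only the similarity step.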

Citations

Journal Article•10.48550/arXiv.2302.06716

Machine Learning Model Attribution Challenge

Elizabeth Merkhofer, +5 more

- 13 Feb 2023

- arXiv.org

TL;DR: This paper describes the Machine Learning Model Attribution Challenge (MLMAC), the first attempt to identify the publicly available base models that underlie a set of anonymous, fine-tuned large language models using only the textual output of the models.

2

References

•Proceedings Article

Attention is All you Need

Ashish Vaswani, +7 more

- 12 Jun 2017

TL;DR: This paper proposes the Transformer, a simple network architecture based solely on attention mechanisms that dispenses with recurrence and convolutions entirely, and achieves state-of-the-art performance on English-to-French translation.

94.2K

•Proceedings Article•10.3115/1073083.1073135

Bleu: a Method for Automatic Evaluation of Machine Translation

Kishore Papineni, +3 more

- 06 Jul 2002

TL;DR: This paper proposed a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.

28.9K

•Posted Content

RoBERTa: A Robustly Optimized BERT Pretraining Approach

Yinhan Liu, +9 more

- 26 Jul 2019

- arXiv: Computation and Language

TL;DR: It is found that BERT was significantly undertrained, and can match or exceed the performance of every model published after it, and the best model achieves state-of-the-art results on GLUE, RACE and SQuAD.

26.2K

•Proceedings Article•10.18653/V1/W18-5446

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

Alex Wang, +5 more

- 01 Nov 2018

TL;DR: GLUE is a benchmark of nine diverse NLU tasks, accompanied by an auxiliary dataset for probing models' understanding of specific linguistic phenomena and an online platform for evaluating and comparing models.

7.3K

•Posted Content

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Victor Sanh, +3 more

- 02 Oct 2019

- arXiv: Computation and Language

TL;DR: This work proposes a method to pre-train a smaller general-purpose language representation model, called DistilBERT, which can be fine-tuned with good performance on a wide range of tasks like its larger counterparts, and introduces a triple loss combining language modeling, distillation, and cosine-distance losses.

7.3K
