Machine Learning Model Attribution Challenge

doi:10.48550/arXiv.2302.06716

Journal Article•10.48550/arXiv.2302.06716

Elizabeth Merkhofer, +5 more

- 13 Feb 2023

- arXiv.org

- Vol. abs/2302.06716

2

TL;DR: The Machine Learning Model Attribution Challenge (MLMAC) asked participants to identify the publicly available base models that underlie a set of anonymous, fine-tuned large language models, using only the textual output of the models.

Abstract: We present the findings of the Machine Learning Model Attribution Challenge. Fine-tuned machine learning models may derive from other trained models without obvious attribution characteristics. In this challenge, participants identify the publicly-available base models that underlie a set of anonymous, fine-tuned large language models (LLMs) using only textual output of the models. Contestants aim to correctly attribute the most fine-tuned models, with ties broken in the favor of contestants whose solutions use fewer calls to the fine-tuned models' API. The most successful approaches were manual, as participants observed similarities between model outputs and developed attribution heuristics based on public documentation of the base models, though several teams also submitted automated, statistical solutions.

Citations

What can we learn from Data Leakage and Unlearning for Law?

19 Jul 2023

TL;DR: The authors showed that fine-tuned models not only leak their training data but also leak the pre-training data (and PII) memorized during the training phase, which can pose significant privacy and legal concerns for companies that use large language models to offer services.

Proceedings Article•10.1109/icassp48485.2024.10446824

Detection and Attribution of Models Trained on Generated Data

Ahmed Salem, +4 more

- 14 Apr 2024

TL;DR: This work takes the first step in the forensic analysis of models trained on GAN-generated data: it detects whether a model was trained on GAN-generated or real data, and attributes models trained on GAN-generated data to their respective source GANs.

References

•Posted Content

Automatic Detection of Generated Text is Easiest when Humans are Fooled.

Daphne Ippolito, +3 more

- 02 Nov 2019

- arXiv: Computation and Language

TL;DR: The authors benchmarked and analyzed three sampling-based decoding strategies (top-k, nucleus sampling, and untruncated random sampling) and found that improvements in decoding methods have primarily been optimized for fooling humans.

195

•Proceedings Article•10.18653/V1/P17-3017

Generating Steganographic Text with LSTMs

Tina Fang, +2 more

- 30 May 2017

TL;DR: In this paper, a steganographic system based on a Long Short-Term Memory (LSTM) neural network was proposed to enable two users to exchange encrypted messages without an adversary detecting that such an exchange is taking place.

169

Journal Article•10.48550/arXiv.2301.04246

Generative Language Models and Automated Influence Operations: Emerging Threats and Potential Mitigations

Josh A. Goldstein, +5 more

- 10 Jan 2023

- arXiv.org

TL;DR: The authors assess how language models might change influence operations in the future and what steps can be taken to mitigate this threat. They provide a framework for the stages of the language model-to-influence operations pipeline that mitigations could target (model construction, model access, content dissemination, and belief formation).

146

Journal Article•10.48550/arXiv.2211.10877

Artificial Interrogation for Attributing Language Models

Farhan Dhanani, +1 more

- 20 Nov 2022

- arXiv.org

TL;DR: The Machine Learning Model Attribution Challenge (MLMAC) was organized by MITRE, Microsoft, Schmidt-Futures, Robust-Intelligence, Lincoln-Network, and the Huggingface community.

1