Exploring Parameter-Efficient Fine-tuning for Improving Communication Efficiency in Federated Learning

doi:10.48550/arXiv.2210.01708

Journal Article•10.48550/arXiv.2210.01708

- Vol. abs/2210.01708

13

TL;DR: By locally tuning and globally sharing only a small portion of the model weights, significant reductions in total communication overhead can be achieved while maintaining competitive performance in a wide range of federated learning scenarios.

Abstract: Federated learning (FL) has emerged as a promising paradigm for enabling the collaborative training of models without centralized access to the raw data on local devices. In the typical FL paradigm (e.g., FedAvg), model weights are sent between the server and the participating clients each round. However, this can quickly put a massive communication burden on the system, especially if more capable models beyond very small MLPs are employed. Recently, the use of pre-trained models has been shown effective in federated learning optimization and improving convergence. This opens the door for new research questions. Can we adjust the weight-sharing paradigm in federated learning, leveraging strong and readily available pre-trained models, to significantly reduce the communication burden while simultaneously achieving excellent performance? To this end, we investigate the use of parameter-efficient fine-tuning in federated learning. Specifically, we systematically evaluate the performance of several parameter-efficient fine-tuning methods across a variety of client stability, data distribution, and differential privacy settings. By only locally tuning and globally sharing a small portion of the model weights, significant reductions in the total communication overhead can be achieved while maintaining competitive performance in a wide range of federated learning scenarios, providing insight into a new paradigm for practical and effective federated systems.
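
To make the weight-sharing idea in the abstract concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the parameter names (`backbone.w`, `head.w`), the helpers `select_shared`, `fedavg` and `num_values`, and the toy sizes are all hypothetical, chosen only for illustration. Each client keeps a frozen pre-trained backbone and uploads only a small trainable subset, which the server combines with a FedAvg-style weighted average.

```python
from typing import Dict, List, Tuple

Params = Dict[str, List[float]]  # parameter name -> flat list of values


def select_shared(params: Params, shared_prefixes: Tuple[str, ...]) -> Params:
    """Keep only the tensors a client uploads (here: names with a given prefix)."""
    return {k: v for k, v in params.items() if k.startswith(shared_prefixes)}


def fedavg(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Average each shared tensor, weighting clients by local dataset size."""
    total = sum(client_sizes)
    reference = client_params[0]
    return {
        name: [
            sum(p[name][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(len(values))
        ]
        for name, values in reference.items()
    }


def num_values(params: Params) -> int:
    """Number of scalars in a parameter set, a proxy for upload size."""
    return sum(len(v) for v in params.values())


def make_client(head: List[float]) -> Params:
    # Hypothetical toy model: a large frozen backbone and a tiny trainable head.
    return {"backbone.w": [0.5] * 1000, "head.w": head}


clients = [make_client([1.0, 2.0]), make_client([3.0, 4.0])]
sizes = [1, 3]  # assumed local dataset sizes

# Each client uploads only the trainable subset; the backbone never leaves the device.
uploads = [select_shared(c, ("head.",)) for c in clients]
global_head = fedavg(uploads, sizes)

full_cost = num_values(clients[0])  # scalars sent per client when sharing everything
peft_cost = num_values(uploads[0])  # scalars sent per client when sharing the subset
print(global_head, full_cost, peft_cost)
```

In this toy setup the per-round upload shrinks from every scalar in the model to only the trainable subset. The actual savings in practice depend on the fine-tuning method and the backbone size, which this sketch does not model.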

Citations

Journal Article•10.48550/arXiv.2212.10025

When Federated Learning Meets Pre-trained Language Models' Parameter-Efficient Tuning Methods

Zhuo Zhang, +4 more

- 20 Dec 2022

- arXiv.org

TL;DR: In this article, the authors provide a holistic empirical study of representative tuning methods for pre-trained language models in FL and develop a federated tuning framework, FedPETuning, which lets practitioners conveniently exploit different PETuning methods under the FL training paradigm.

32

Journal Article•10.48550/arxiv.2310.15080

Federated Learning of Large Language Models with Parameter-Efficient Prompt Tuning and Adaptive Optimization

Tianshi Che, +7 more

- 23 Oct 2023

- arXiv.org

TL;DR: The authors propose FedPepTAO, a Parameter-efficient prompt Tuning approach with Adaptive Optimization, to enable efficient and effective FL of LLMs, and develop a novel adaptive optimization method that addresses client drift on both the device and server sides to further enhance performance.

28

Journal Article•10.48550/arxiv.2308.12305

FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal Heterogeneous Federated Learning

Haokun Chen, +4 more

- 21 Aug 2023

- arXiv.org

TL;DR: Federated Dual-Adapter Teacher (FedDAT) is the first approach that enables efficient distributed finetuning of foundation models for a variety of heterogeneous Vision-Language tasks, and it substantially outperforms the existing centralized PEFT methods adapted for FL.

17

Journal Article•10.48550/arxiv.2310.01467

FedBPT: Efficient Federated Black-box Prompt Tuning for Large Language Models

Jingwei Sun, +6 more

- 02 Oct 2023

- arXiv.org

TL;DR: Federated Black-box Prompt Tuning (FedBPT) is a framework designed to address the challenges of fine-tuning PLMs in the age of large language models; it reduces the number of exchanged variables, boosts communication efficiency, and minimizes computational and storage costs.

8

Journal Article•10.48550/arXiv.2302.02949

Adaptive Parameterization of Deep Learning Models for Federated Learning

Morten From Elvebakken, +2 more

- 06 Feb 2023

- arXiv.org

TL;DR: In this paper, the authors propose using parallel adapters for federated learning and show that they achieve inference performance similar to training the full model while reducing the communication overhead by roughly 90%.

3

References

•Journal Article

Visualizing Data using t-SNE

Laurens van der Maaten, +1 more

- 01 Jan 2008

- Journal of Machine Learning Research

TL;DR: t-SNE is a new technique that visualizes high-dimensional data by giving each datapoint a location in a two- or three-dimensional map; it is a variation of Stochastic Neighbor Embedding that is much easier to optimize and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.

45.8K

•Posted Content

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Alexey Dosovitskiy, +11 more

- 22 Oct 2020

- arXiv: Computer Vision and Pattern Recog...

TL;DR: Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.

36.9K

•Journal Article

ImageNet Large Scale Visual Recognition Challenge

Olga Russakovsky, +11 more

- 01 Apr 2015

- Springer US

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) has been running annually for five years (since 2010) and has become the standard benchmark for large-scale object recognition.

23.9K

•Posted Content

Communication-Efficient Learning of Deep Networks from Decentralized Data

H. Brendan McMahan, +4 more

- 17 Feb 2016

- arXiv: Learning

TL;DR: This work presents a practical method for the federated learning of deep networks based on iterative model averaging, and conducts an extensive empirical evaluation, considering five different model architectures and four datasets.

11.4K

•Book

The Algorithmic Foundations of Differential Privacy

Cynthia Dwork, +1 more

- 11 Aug 2014

TL;DR: The preponderance of this monograph is devoted to fundamental techniques for achieving differential privacy, and application of these techniques in creative combinations, using the query-release problem as an ongoing example.

7.2K
