Journal Article, DOI: 10.48550/arXiv.2210.01708
Exploring Parameter-Efficient Fine-tuning for Improving Communication Efficiency in Federated Learning
Guangyu Sun, Matias Mendieta, Taojiannan Yang, Chen Chen +3 more
- Vol. abs/2210.01708
TL;DR: By locally tuning and globally sharing only a small portion of the model weights, significant reductions in total communication overhead can be achieved while maintaining competitive performance across a wide range of federated learning scenarios.
Abstract: Federated learning (FL) has emerged as a promising paradigm for enabling the collaborative training of models without centralized access to the raw data on local devices. In the typical FL paradigm (e.g., FedAvg), model weights are exchanged between the server and the participating clients each round. However, this can quickly put a massive communication burden on the system, especially if more capable models beyond very small MLPs are employed. Recently, the use of pre-trained models has been shown to be effective in federated learning optimization and in improving convergence. This opens the door to new research questions. Can we adjust the weight-sharing paradigm in federated learning, leveraging strong and readily available pre-trained models, to significantly reduce the communication burden while simultaneously achieving excellent performance? To this end, we investigate the use of parameter-efficient fine-tuning in federated learning. Specifically, we systematically evaluate the performance of several parameter-efficient fine-tuning methods across a variety of client stability, data distribution, and differential privacy settings. By only locally tuning and globally sharing a small portion of the model weights, significant reductions in the total communication overhead can be achieved while maintaining competitive performance in a wide range of federated learning scenarios, providing insight into a new paradigm for practical and effective federated systems.
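The idea of tuning locally and sharing only a small portion of the weights can be illustrated with a minimal sketch. It assumes a FedAvg-style average weighted by client dataset size and a model stored as a plain dictionary; the names `fedavg_shared`, `backbone`, and `head`, and all numbers, are hypothetical and not taken from the paper, which evaluates several parameter-efficient fine-tuning methods rather than this specific split.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def fedavg_shared(client_updates: List[Weights], client_sizes: List[int]) -> Weights:
    """Average only the shared (tuned) parameters, weighted by client dataset size."""
    total = sum(client_sizes)
    averaged: Weights = {}
    for key in client_updates[0]:
        length = len(client_updates[0][key])
        averaged[key] = [
            sum(update[key][i] * n for update, n in zip(client_updates, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Hypothetical model: a large frozen pre-trained backbone plus a small tuned part.
global_model: Weights = {"backbone": [0.5] * 1000, "head": [0.0, 0.0]}
shared_keys = ["head"]  # assumption: only these parameters are ever communicated

# Each client uploads only its tuned parameters, never the backbone.
client_updates = [{"head": [1.0, 2.0]}, {"head": [3.0, 6.0]}]
client_sizes = [1, 3]  # illustrative number of local samples per client

new_shared = fedavg_shared(client_updates, client_sizes)
global_model.update(new_shared)  # the frozen backbone is left untouched

# Fraction of the model that travels over the network each round.
n_shared = sum(len(global_model[k]) for k in shared_keys)
n_total = sum(len(v) for v in global_model.values())
fraction = n_shared / n_total
print(new_shared, f"{fraction:.4%} of parameters communicated")
```

In this toy setup only 2 of 1002 parameters are exchanged per round. How much is saved in practice depends on the fine-tuning method and the backbone size.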
Citations
When Federated Learning Meets Pre-trained Language Models' Parameter-Efficient Tuning Methods
TL;DR: The authors provide a holistic empirical study of representative tuning methods for pre-trained language models in FL and develop a federated tuning framework, FedPETuning, which lets practitioners conveniently apply different PETuning methods under the FL training paradigm.
Federated Learning of Large Language Models with Parameter-Efficient Prompt Tuning and Adaptive Optimization
Tianshi Che, Ji Liu, Yang Zhou, Jiaxiang Ren, Jiwen Zhou, Victor S. Sheng, Huaiyu Dai, Dejing Dou +7 more
TL;DR: Proposes FedPepTAO, a Parameter-efficient prompt Tuning approach with Adaptive Optimization, to enable efficient and effective FL of LLMs, and develops a novel adaptive optimization method that addresses client drift on both the device and server sides to further enhance performance.
FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal Heterogeneous Federated Learning
Haokun Chen, Yao Zhang, Denis Krompaß, Jindong Gu, Volker Tresp +4 more
TL;DR: Federated Dual-Adapter Teacher (DAT) is the first approach that enables an efficient distributed finetuning of foundation models for a variety of heterogeneous Vision-Language tasks, and substantially outperforms the existing centralized PEFT methods adapted for FL.
FedBPT: Efficient Federated Black-box Prompt Tuning for Large Language Models
Jingwei Sun, Ziyue Xu, Hongxu Yin, Dong Yang, Daguang Xu, Yiran Chen, Holger Roth +6 more
TL;DR: Federated Black-box Prompt Tuning (FedBPT) is a framework designed to address the challenges of fine-tuning pre-trained language models (PLMs) in the age of large language models; it reduces the number of exchanged variables, boosts communication efficiency, and minimizes computational and storage costs.
Adaptive Parameterization of Deep Learning Models for Federated Learning
TL;DR: The authors propose using parallel adapters for federated learning and show that they achieve inference performance similar to training the full model while reducing the communication overhead by roughly 90%.
References
•Journal Article
Visualizing Data using t-SNE
TL;DR: A new technique called t-SNE that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map, a variation of Stochastic Neighbor Embedding that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.
•Posted Content
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby +11 more
TL;DR: Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.
•Journal Article
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Michael S. Bernstein, Li Fei-Fei, Alexander C. Berg, Aditya Khosla +11 more
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) has been running annually for five years (since 2010) and has become the standard benchmark for large-scale object recognition.
•Posted Content
Communication-Efficient Learning of Deep Networks from Decentralized Data
TL;DR: This work presents a practical method for the federated learning of deep networks based on iterative model averaging, and conducts an extensive empirical evaluation, considering five different model architectures and four datasets.
•Book
The Algorithmic Foundations of Differential Privacy
Cynthia Dwork, Aaron Roth +1 more
- 11 Aug 2014
TL;DR: The preponderance of this monograph is devoted to fundamental techniques for achieving differential privacy, and application of these techniques in creative combinations, using the query-release problem as an ongoing example.