Open Access · Posted Content
Distributed Gradient Descent with Coded Partial Gradient Computations.
28
TL;DR: A hybrid approach is introduced, called coded partial gradient computation (CPGC), that benefits from the advantages of both coded and uncoded computation schemes, and reduces both the computation time and decoding complexity.
Abstract: Coded computation techniques provide robustness against straggling servers in distributed computing, but they have the following limitations. First, they increase decoding complexity. Second, they ignore computations carried out by straggling servers. Third, they are typically designed to recover the full gradient and thus cannot provide a balance between the accuracy of the gradient and the per-iteration completion time. Here we introduce a hybrid approach, called coded partial gradient computation (CPGC), that benefits from the advantages of both coded and uncoded computation schemes and reduces both the computation time and the decoding complexity.
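The trade-off between gradient accuracy and per-iteration completion time mentioned in the abstract can be illustrated with a small sketch. This is not the CPGC scheme itself: it only models the uncoded side, where each worker processes its assigned partitions one after another and the master stops once enough distinct partitions have arrived. The function names, the cyclic assignment, the fixed per-task times, and the scalar least-squares loss are all assumptions made for illustration.

```python
# Hypothetical illustration (not the CPGC scheme): uncoded partial gradient
# recovery where every worker computes its partitions sequentially.

def partition_gradient(theta, samples):
    """Gradient of 0.5 * (theta * x - y)**2 summed over one data partition."""
    return sum((theta * x - y) * x for x, y in samples)


def simulate_iteration(assignments, seconds_per_task, needed):
    """Return (completion_time, recovered_partitions).

    assignments[i] is the ordered list of partitions worker i processes;
    seconds_per_task[i] is worker i's time per partition (assumed fixed).
    The master stops as soon as `needed` distinct partitions have arrived.
    """
    events = sorted(
        (seconds_per_task[i] * (j + 1), i, k)
        for i, order in enumerate(assignments)
        for j, k in enumerate(order)
    )
    done = set()
    for finish_time, _worker, k in events:
        done.add(k)
        if len(done) >= needed:
            return finish_time, sorted(done)
    raise ValueError("assignments cannot cover the requested partitions")


# Four partitions of (x, y) samples; worker 3 is a straggler (assumed values).
partitions = [[(1.0, 2.0)], [(2.0, 4.0)], [(3.0, 6.0)], [(4.0, 8.0)]]
assignments = [[0, 1], [1, 2], [2, 3], [3, 0]]  # cyclic, two tasks per worker
seconds_per_task = [1.0, 1.0, 1.0, 10.0]
theta = 1.0

# Partial gradient: accept any three of the four partitions.
t_partial, got_partial = simulate_iteration(assignments, seconds_per_task, 3)
g_partial = sum(partition_gradient(theta, partitions[k]) for k in got_partial)

# Full gradient: wait until all four partitions have been covered.
t_full, got_full = simulate_iteration(assignments, seconds_per_task, 4)
g_full = sum(partition_gradient(theta, partitions[k]) for k in got_full)

print(t_partial, got_partial, g_partial)
print(t_full, got_full, g_full)
```

In this toy setting the master can stop earlier if it accepts a partial gradient, at the cost of a less accurate update. A coded scheme would instead have workers send linear combinations of partition gradients, which adds a decoding step at the master.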
Citations
Machine Learning at the Wireless Edge: Distributed Stochastic Gradient Descent Over-the-Air
TL;DR: This work introduces a novel analog scheme, called A-DSGD, which exploits the additive nature of the wireless MAC for over-the-air gradient computation, and provides convergence analysis for this approach.
752
Stochastic Gradient Coding for Straggler Mitigation in Distributed Learning
Rawad Bitar, Mary Wootters, Salim El Rouayheb
- 29 Apr 2020
TL;DR: Stochastic Gradient Coding (SGC) is an approximate gradient coding scheme for distributed gradient descent in the presence of stragglers; it works when the stragglers are random.
112
Speeding Up Distributed Gradient Descent by Utilizing Non-persistent Stragglers
Emre Ozfatura, Deniz Gunduz, Sennur Ulukus
- 07 Jul 2019
TL;DR: The authors propose a coded distributed gradient descent (DGD) technique that trades off the average computation time against the communication load, and show that the average completion time per iteration can be reduced significantly at a reasonable increase in communication load.
107
A Comprehensive Survey on Coded Distributed Computing: Fundamentals, Challenges, and Networking Applications
Jer Shyuan Ng, Wei Yang Bryan Lim, Nguyen Cong Luong, Zehui Xiong, Alia Asheralieva, Dusit Niyato, Cyril Leung, Chunyan Miao
TL;DR: Coded distributed computing (CDC), a combination of coding-theoretic techniques and distributed computing, has recently been proposed as a promising solution to reduce communication load and straggler effects.
105
Distributed Few-Shot Learning for Intelligent Recognition of Communication Jamming
TL;DR: A novel jamming recognition method based on distributed few-shot learning that employs a distributed recognition architecture to achieve the global optimization of multiple sub-networks by federated learning and introduces a dense block structure in the sub-network structure to improve network information flow.
References
Optimization Methods for Large-Scale Machine Learning
TL;DR: This paper provides a review and commentary on the past, present, and future of numerical optimization algorithms in the context of machine learning applications, and discusses how optimization problems arise in machine learning and what makes them challenging.
3.7K
•Proceedings Article
Gradient Coding: Avoiding Stragglers in Distributed Learning
Rashish Tandon, Qi Lei, Alexandros G. Dimakis, Nikos Karampatziakis
- 17 Jul 2017
TL;DR: This work proposes a novel coding-theoretic framework for mitigating stragglers in distributed learning and shows how carefully replicating data blocks and coding across gradients can provide tolerance to failures and stragglers for synchronous gradient descent.
•Posted Content
Near-Optimal Straggler Mitigation for Distributed Gradient Methods
TL;DR: This work proves that the proposed Batched Coupon's Collector (BCC) scheme is robust to a near optimal number of random stragglers, and reduces the run-time by up to 85.4% over Amazon EC2 clusters when compared with other straggler mitigation strategies.
60
•Posted Content
Slow and Stale Gradients Can Win the Race
TL;DR: This work presents a novel theoretical characterization of the speed-up offered by asynchronous SGD methods by analyzing the trade-off between the error in the trained model and the actual training runtime (wallclock time).
•Posted Content
$C^{3}LES$: Codes for Coded Computation that Leverage Stragglers
TL;DR: A fine-grained model is proposed that quantifies the level of non-trivial coding needed to obtain the benefits of coding in matrix-vector computation and allows us to leverage partial computations performed by the straggler nodes.