Scispace (Formerly Typeset)
Showing papers on "Asynchronous communication" published in 2016
Proceedings Article•
Asynchronous methods for deep reinforcement learning

Volodymyr Mnih1, Adrià Puigdomènech Badia1, Mehdi Mirza2, Alex Graves1, Tim Harley1, Timothy P. Lillicrap1, David Silver1, Koray Kavukcuoglu1 •
Google1, Université de Montréal2
19 Jun 2016
TL;DR: A conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers and shows that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.
Abstract: We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers. We present asynchronous variants of four standard reinforcement learning algorithms and show that parallel actor-learners have a stabilizing effect on training allowing all four methods to successfully train neural network controllers. The best performing method, an asynchronous variant of actor-critic, surpasses the current state-of-the-art on the Atari domain while training for half the time on a single multi-core CPU instead of a GPU. Furthermore, we show that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.

9,208 citations
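
A minimal sketch of the asynchronous gradient descent idea named in the abstract above, not the authors' implementation: several threads update one shared parameter vector without locks, each working from a possibly stale snapshot. The toy quadratic objective, the function name `async_sgd`, and all hyperparameters are assumptions made for illustration; there is no reinforcement learning in this example.

```python
import random
import threading


def async_sgd(n_workers=4, steps_per_worker=500, lr=0.05, seed=0):
    """Lock-free asynchronous SGD on f(w) = 0.5 * ||w - target||^2 (toy objective)."""
    target = [2.0, -3.0]
    params = [0.0, 0.0]  # shared parameters, read and written by every worker

    def worker(worker_id):
        rng = random.Random(seed + worker_id)
        for _ in range(steps_per_worker):
            snapshot = list(params)  # may already be stale when the update lands
            grad = [p - t + rng.gauss(0.0, 0.1) for p, t in zip(snapshot, target)]
            for i, g in enumerate(grad):
                params[i] -= lr * g  # applied to the shared vector without a lock

    threads = [threading.Thread(target=worker, args=(k,)) for k in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return params


if __name__ == "__main__":
    print(async_sgd())
```

Each worker here plays the role of one parallel learner; the only coordination is the shared parameter list itself.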

Posted Content•
Revisiting Distributed Synchronous SGD

Jianmin Chen, Rajat Monga, Samy Bengio, Rafal Jozefowicz
18 Feb 2016-arXiv: Learning
TL;DR: A third approach, synchronous optimization with backup workers, is shown to avoid asynchronous noise while mitigating the worst stragglers; it is empirically validated and converges faster and to better test accuracies.
Abstract: Distributed training of deep learning models on large-scale training data is typically conducted with asynchronous stochastic optimization to maximize the rate of updates, at the cost of additional noise introduced from asynchrony. In contrast, the synchronous approach is often thought to be impractical due to idle time wasted on waiting for straggling workers. We revisit these conventional beliefs in this paper, and examine the weaknesses of both approaches. We demonstrate that a third approach, synchronous optimization with backup workers, can avoid asynchronous noise while mitigating for the worst stragglers. Our approach is empirically validated and shown to converge faster and to better test accuracies.

955 citations
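
The backup-worker idea in the abstract above can be sketched as follows: launch more workers than are strictly needed, aggregate the gradients of the first arrivals, and discard the stragglers. This is a simulation under stated assumptions, not the paper's system: the exponential latency model, the toy quadratic objective, and the names `sync_step_with_backups`, `noisy_quadratic_grad`, and `train` are all hypothetical.

```python
import random


def sync_step_with_backups(params, grad_fn, n_required, n_backup, rng):
    """One synchronous update that waits only for the fastest n_required workers."""
    n_workers = n_required + n_backup
    # Every worker computes a gradient on the same parameters and reports
    # a simulated finish time (assumed exponentially distributed).
    reports = [(rng.expovariate(1.0), grad_fn(params, rng)) for _ in range(n_workers)]
    reports.sort(key=lambda r: r[0])  # arrival order
    used = reports[:n_required]       # gradients from stragglers are dropped
    avg = [sum(g[i] for _, g in used) / n_required for i in range(len(params))]
    step_time = used[-1][0]           # the step ends when the last used gradient arrives
    return avg, step_time


def noisy_quadratic_grad(params, rng, target=(3.0, -1.0), noise=0.1):
    """Stochastic gradient of 0.5 * ||w - target||^2 (toy objective)."""
    return [p - t + rng.gauss(0.0, noise) for p, t in zip(params, target)]


def train(steps=200, lr=0.1, n_required=8, n_backup=2, seed=0):
    rng = random.Random(seed)
    params = [0.0, 0.0]
    for _ in range(steps):
        grad, _ = sync_step_with_backups(
            params, noisy_quadratic_grad, n_required, n_backup, rng
        )
        params = [p - lr * g for p, g in zip(params, grad)]
    return params


if __name__ == "__main__":
    print(train())
```

Because all aggregated gradients are computed on the same parameters, the update carries no staleness; the backups only shorten the wait for the slowest worker.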

Journal Article•10.1145/2827695•
Multiparty Asynchronous Session Types

Kohei Honda1, Nobuko Yoshida2, Marco Carbone3•
Queen Mary University of London1, Imperial College London2, IT University of Copenhagen3
03 Mar 2016-Journal of the ACM
TL;DR: The theory introduces a new notion of types in which interactions involving multiple peers are directly abstracted as a global scenario, and the fundamental properties of the session type discipline, such as communication safety, progress, and session fidelity, are established.
Abstract: Communication is a central element in software development. As a potential typed foundation for structured communication-centered programming, session types have been studied over the past decade for a wide range of process calculi and programming languages, focusing on binary (two-party) sessions. This work extends the foregoing theories of binary session types to multiparty, asynchronous sessions, which often arise in practical communication-centered applications. Presented as a typed calculus for mobile processes, the theory introduces a new notion of types in which interactions involving multiple peers are directly abstracted as a global scenario. Global types retain the friendly type syntax of binary session types while specifying dependencies and capturing complex causal chains of multiparty asynchronous interactions. A global type plays the role of a shared agreement among communication peers and is used as a basis of efficient type-checking through its projection onto individual peers. The fundamental properties of the session type discipline, such as communication safety, progress, and session fidelity, are established for general n-party asynchronous interactions.

811 citations

Journal Article•
An Update on Discourse Functions and Syntactic Complexity in Synchronous and Asynchronous Communication.

Susana M. Sotillo
01 Jun 2016-Language Learning & Technology
TL;DR: The results showed that the quantity and types of discourse functions present in synchronous discussions were similar to the types of interactional modifications found in face-to-face conversations that are deemed necessary for second language acquisition.

492 citations

Proceedings Article•10.1109/CVPR.2016.102•
Simultaneous Optical Flow and Intensity Estimation from an Event Camera

Patrick Bardow1, Andrew J. Davison1, Stefan Leutenegger1•
Imperial College London1
1 Jun 2016
TL;DR: This work proposes, to the best of the authors' knowledge, the first algorithm to simultaneously recover the motion field and brightness image, while the camera undergoes a generic motion through any scene, within a sliding window time interval.
Abstract: Event cameras are bio-inspired vision sensors which mimic retinas to measure per-pixel intensity change rather than outputting an actual intensity image. This proposed paradigm shift away from traditional frame cameras offers significant potential advantages: namely avoiding high data rates, dynamic range limitations and motion blur. Unfortunately, however, established computer vision algorithms cannot be applied directly to event cameras. Methods proposed so far to reconstruct images, estimate optical flow, track a camera and reconstruct a scene come with severe restrictions on the environment or on the motion of the camera, e.g. allowing only rotation. Here, we propose, to the best of our knowledge, the first algorithm to simultaneously recover the motion field and brightness image, while the camera undergoes a generic motion through any scene. Our approach employs minimisation of a cost function that contains the asynchronous event data as well as spatial and temporal regularisation within a sliding window time interval. Our implementation relies on GPU optimisation and runs in near real-time. In a series of examples, we demonstrate the successful operation of our framework, including in situations where conventional cameras suffer from dynamic range limitations and motion blur.

378 citations

Journal Article•10.5944/OPENPRAXIS.8.1.212•
Synchronous and Asynchronous E-Language Learning: A Case Study of Virtual University of Pakistan.

Ayesha Perveen1•
Virtual University of Pakistan1
03 Mar 2016-Open Praxis
TL;DR: The findings revealed that asynchronous e-language learning was quite beneficial for second language (L2) learners, but with some limitations which could be scaffolded by synchronous sessions.
Abstract: This case study evaluated the impact of synchronous and asynchronous E-Language Learning activities (ELL-tivities) in an E-Language Learning Environment (ELLE) at Virtual University of Pakistan. The purpose of the study was to assess e-language learning analytics based on the constructivist approach of collaborative construction of knowledge. The courses selected for random sampling were English Comprehension (Eng101), Business & Technical English (Eng201) and Business Communication (Eng301). Three methods were employed to collect the data: observation of the communication and performance on given channels, students’ opinions on Graded Discussion Board (GDB), and a survey questionnaire. Out of a total population of 9919, 1025 responses were received for the survey questionnaire. The findings revealed that asynchronous e-language learning was quite beneficial for second language (L2) learners, but with some limitations which could be scaffolded by synchronous sessions. Based on the findings, the researcher suggested a blend of both synchronous and asynchronous paradigms to create an ideal environment for e-language learning in Pakistan.

276 citations

Journal Article•
Synchronous and Asynchronous Communication in Distance Learning: A Review of the Literature.

Lynette Watts
01 Jan 2016-The Quarterly Review of Distance Education
TL;DR: This literature review examines synchronous and asynchronous communication in distance learning, highlighting the importance of both in engaging students, while considering time constraints, technological ability, and motivation to inform effective online interaction strategies.
Abstract: Distance learning is commonplace in higher education, with increasing numbers of students enjoying the flexibility e-learning provides. Keeping students connected with peers and instructors has been a challenge with e-learning, but as technology has advanced, the methods by which educators keep students engaged, synchronously and asynchronously, also have improved. This literature review presents support for both types of interaction; however, findings indicate educators must consider time constraints, technological ability, and motivation for students to interact in the online setting. Recommendations for implementing both synchronous and asynchronous interactions are made, including technological considerations. Finally, suggestions for research in distance learning are presented for consideration.

255 citations

Journal Article•10.1109/TSP.2016.2537271•
Asynchronous Distributed ADMM for Large-Scale Optimization—Part I: Algorithm and Convergence Analysis

Tsung-Hui Chang1, Mingyi Hong2, Wei-Cheng Liao3, Xiangfeng Wang4•
The Chinese University of Hong Kong1, Iowa State University2, University of Minnesota3, East China Normal University4
01 Jun 2016-IEEE Transactions on Signal Processing
TL;DR: This paper proposes an asynchronous distributed ADMM (AD-ADMM), which can effectively improve the time efficiency of distributed optimization, and analyzes the convergence conditions of the AD-ADMM, under the popular partially asynchronous model, which is defined based on a maximum tolerable delay of the network.
Abstract: Aiming at solving large-scale optimization problems, this paper studies distributed optimization methods based on the alternating direction method of multipliers (ADMM). By formulating the optimization problem as a consensus problem, the ADMM can be used to solve the consensus problem in a fully parallel fashion over a computer network with a star topology. However, traditional synchronized computation does not scale well with the problem size, as the speed of the algorithm is limited by the slowest workers. This is particularly true in a heterogeneous network where the computing nodes experience different computation and communication delays. In this paper, we propose an asynchronous distributed ADMM (AD-ADMM), which can effectively improve the time efficiency of distributed optimization. Our main interest lies in analyzing the convergence conditions of the AD-ADMM, under the popular partially asynchronous model, which is defined based on a maximum tolerable delay of the network. Specifically, by considering general and possibly non-convex cost functions, we show that the AD-ADMM is guaranteed to converge to the set of Karush–Kuhn–Tucker (KKT) points as long as the algorithm parameters are chosen appropriately according to the network delay. We further illustrate that the asynchrony of the ADMM has to be handled with care, as slightly modifying the implementation of the AD-ADMM can jeopardize the algorithm convergence, even under the standard convex setting.

252 citations
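
For orientation, the consensus formulation mentioned in the abstract above can be written in its textbook synchronous form. This is background under stated assumptions, not the paper's AD-ADMM: the symbols $f_i$, $x_i$, $z$, $\lambda_i$, and $\rho$ are notation chosen here, and the paper's own update order and any extra regularization term may differ.

```latex
\begin{aligned}
&\min_{x_1,\dots,x_N,\,z}\ \sum_{i=1}^{N} f_i(x_i)
  \quad \text{s.t. } x_i = z,\ i = 1,\dots,N, \\
&x_i^{k+1} = \arg\min_{x_i}\ f_i(x_i) + (\lambda_i^{k})^{\top}(x_i - z^{k})
  + \tfrac{\rho}{2}\,\|x_i - z^{k}\|_2^2, \\
&z^{k+1} = \frac{1}{N}\sum_{i=1}^{N}\Big(x_i^{k+1} + \tfrac{1}{\rho}\,\lambda_i^{k}\Big), \\
&\lambda_i^{k+1} = \lambda_i^{k} + \rho\,\big(x_i^{k+1} - z^{k+1}\big).
\end{aligned}
```

In a star topology the $x_i$ and $\lambda_i$ updates run in parallel on the workers, while the $z$ update is the step at which a synchronous master must wait for every worker, which is where the slowest worker limits the speed.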

Posted Content•
Decentralized Collaborative Learning of Personalized Models over Networks

Paul Vanhaesebrouck, Aurélien Bellet, Marc Tommasi1•
Lille University of Science and Technology1
17 Oct 2016-arXiv: Learning
TL;DR: Two asynchronous gossip algorithms running in a fully decentralized manner are introduced: the first smooths pre-trained local models over the network while accounting for the confidence that each agent has in its initial model, and the second, based on ADMM, has agents jointly learn and propagate their models.
Abstract: We consider a set of learning agents in a collaborative peer-to-peer network, where each agent learns a personalized model according to its own learning objective. The question addressed in this paper is: how can agents improve upon their locally trained model by communicating with other agents that have similar objectives? We introduce and analyze two asynchronous gossip algorithms running in a fully decentralized manner. Our first approach, inspired by label propagation, aims to smooth pre-trained local models over the network while accounting for the confidence that each agent has in its initial model. In our second approach, agents jointly learn and propagate their model by making iterative updates based on both their local dataset and the behavior of their neighbors. To optimize this challenging objective, our decentralized algorithm is based on ADMM.

215 citations

Journal Article•10.1137/15M1024950•
ARock: An Algorithmic Framework for Asynchronous Parallel Coordinate Updates

Zhimin Peng1, Yangyang Xu2, Ming Yan, Wotao Yin1•
University of California, Los Angeles1, University of Alabama2
08 Sep 2016-SIAM Journal on Scientific Computing
TL;DR: Finding a fixed point of a nonexpansive operator, i.e., $x^*=Tx^*$, abstracts many problems in numerical linear algebra, optimization, and other areas of data science; ARock is an algorithmic framework for solving such fixed-point problems with asynchronous parallel coordinate updates.
Abstract: Finding a fixed point to a nonexpansive operator, i.e., $x^*=Tx^*$, abstracts many problems in numerical linear algebra, optimization, and other areas of data science. To solve fixed-point problems...

193 citations
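
A hedged illustration of a coordinate update for the fixed-point problem $x^*=Tx^*$ described above. It is a sequential simulation, not the ARock implementation: asynchrony is imitated by reading a randomly chosen recent (possibly stale) copy of the iterate. The operator $T$ is assumed to be the Jacobi map of a small diagonally dominant linear system, and the step size `eta`, the delay bound `max_delay`, and all names are assumptions.

```python
import random
from collections import deque

# Assumed example problem: A x = b with a diagonally dominant A.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]


def T_coord(x, i):
    """i-th coordinate of the Jacobi operator T, whose fixed point solves A x = b."""
    off_diag = sum(A[i][j] * x[j] for j in range(len(x)) if j != i)
    return (b[i] - off_diag) / A[i][i]


def async_coordinate_fixed_point(iters=2000, eta=0.25, max_delay=2, seed=0):
    rng = random.Random(seed)
    x = [0.0, 0.0]
    # The most recent iterates; a read picks one of them to imitate staleness.
    history = deque([list(x)], maxlen=max_delay + 1)
    for _ in range(iters):
        i = rng.randrange(len(x))                      # coordinate chosen at random
        x_hat = history[rng.randrange(len(history))]   # possibly stale read
        # x_i <- x_i - eta * ((I - T) x_hat)_i
        x[i] -= eta * (x_hat[i] - T_coord(x_hat, i))
        history.append(list(x))
    return x


if __name__ == "__main__":
    print(async_coordinate_fixed_point())
```

For this system the fixed point is $x^* = (1/11,\ 7/11)$, which can be checked by substituting into $Ax=b$.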

Proceedings Article•
Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU

Mohammad Babaeizadeh1, Iuri Frosio2, Stephen Tyree2, Jason Clemons2, Jan Kautz2 •
University of Illinois at Urbana–Champaign1, Nvidia2
4 Nov 2016
TL;DR: In this article, a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm is introduced, which achieves a significant speed up compared to a CPU implementation.
Abstract: We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks. We analyze its computational traits and concentrate on aspects critical to leveraging the GPU's computational power. We introduce a system of queues and a dynamic scheduling strategy, potentially helpful for other asynchronous algorithms as well. Our hybrid CPU/GPU version of A3C, based on TensorFlow, achieves a significant speedup compared to a CPU implementation; we make it publicly available to other researchers.
Posted Content•
Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU

Mohammad Babaeizadeh1, Iuri Frosio2, Stephen Tyree2, Jason Clemons2, Jan Kautz2 •
University of Illinois at Urbana–Champaign1, Nvidia2
18 Nov 2016-arXiv: Learning
TL;DR: A hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks, is introduced, achieving a significant speed up compared to a CPU implementation.
Abstract: We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks. We analyze its computational traits and concentrate on aspects critical to leveraging the GPU's computational power. We introduce a system of queues and a dynamic scheduling strategy, potentially helpful for other asynchronous algorithms as well. Our hybrid CPU/GPU version of A3C, based on TensorFlow, achieves a significant speedup compared to a CPU implementation; we make it publicly available to other researchers.
Journal Article•10.1007/S11241-015-9244-X•
Combined task- and network-level scheduling for distributed time-triggered systems

Silviu S. Craciunas, Ramon Serna Oliver
01 Mar 2016-Real-time Systems
TL;DR: This paper presents an incremental scheduling approach, based on the demand bound test for asynchronous tasks, which significantly improves the scalability of the scheduling problem and demonstrates the performance of the approach with an extensive evaluation of industrial-sized synthetic configurations using alternative state-of-the-art SMT and MIP solvers.
Abstract: Ethernet-based time-triggered networks (e.g. TTEthernet) enable the cost-effective integration of safety-critical and real-time distributed applications in domains where determinism is a key requirement, like the aerospace, automotive, and industrial domains. Time-Triggered communication typically follows an offline and statically configured schedule (the synthesis of which is an NP-complete problem) guaranteeing contention-free frame transmissions. Extending the end-to-end determinism towards the application layers requires that software tasks running on end nodes are scheduled in tight relation to the underlying time-triggered network schedule. In this paper we discuss the simultaneous co-generation of static network and task schedules for distributed systems consisting of preemptive time-triggered tasks which communicate over switched multi-speed time-triggered networks. We formulate the schedule problem using first-order logical constraints and present alternative methods to find a solution, with or without optimization objectives, based on satisfiability modulo theories (SMT) and mixed integer programming (MIP) solvers, respectively. Furthermore, we present an incremental scheduling approach, based on the demand bound test for asynchronous tasks, which significantly improves the scalability of the scheduling problem. We demonstrate the performance of the approach with an extensive evaluation of industrial-sized synthetic configurations using alternative state-of-the-art SMT and MIP solvers and show that, even when using optimization, most of the problems are solved within reasonable time using the incremental method.
Posted Content•
Asynchronous Methods for Deep Reinforcement Learning

Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu 
04 Feb 2016-arXiv: Learning
TL;DR: An asynchronous variant of actor-critic, trained with asynchronous gradient descent for optimization of deep neural network controllers, surpasses the state-of-the-art on the Atari domain while training for half the time on a single multi-core CPU instead of a GPU.
Abstract: We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers. We present asynchronous variants of four standard reinforcement learning algorithms and show that parallel actor-learners have a stabilizing effect on training allowing all four methods to successfully train neural network controllers. The best performing method, an asynchronous variant of actor-critic, surpasses the current state-of-the-art on the Atari domain while training for half the time on a single multi-core CPU instead of a GPU. Furthermore, we show that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.
Proceedings Article•10.1109/ECRTS.2016.27•
Urgency-Based Scheduler for Time-Sensitive Switched Ethernet Networks

Johannes Specht1, Soheil Samii2•
University of Duisburg-Essen1, General Motors2
5 Jul 2016
TL;DR: An asynchronous traffic scheduling algorithm is introduced, which gives low delay guarantees in a switched Ethernet network, while maintaining a low implementation complexity.
Abstract: Due to increasing bandwidth requirements, Ethernet technology is emerging in embedded systems application areas such as automotive, avionics, and industrial control. In the automotive domain, Ethernet enables integration of cameras, radars, and fusion to build active safety and automated driving systems. While Ethernet provides the necessary communication bandwidth, solutions are needed to satisfy stringent dependability and temporal requirements of such safety-critical systems. This paper introduces an asynchronous traffic scheduling algorithm, which gives low delay guarantees in a switched Ethernet network, while maintaining a low implementation complexity. We present a timing analysis and demonstrate the tightness of the delay bounds by extensive simulation experiments.
Journal Article•10.1016/J.AUTOMATICA.2016.06.011•
Distributed model based event-triggered control for synchronization of multi-agent systems

Davide Liuzza1, Dimos V. Dimarogonas1, Mario di Bernardo2, Karl Henrik Johansson1•
Royal Institute of Technology1, University of Naples Federico II2
01 Nov 2016-Automatica
TL;DR: An event-triggered strategy able to guarantee the existence of a minimum lower bound between inter-event times for broadcasted information and for control signal updating is proposed, thus allowing applications where both the communication bandwidth and the maximum updating frequency of actuators are critical.
Proceedings Article•10.1145/2993369.2993375•
The asynchronous time warp for virtual reality on consumer hardware

J. M. P. van Waveren1•
Oculus VR1
2 Nov 2016
TL;DR: The various challenges and the different trade-offs that need to be considered when implementing an asynchronous time warp on consumer hardware are discussed.
Abstract: To help create a true sense of presence in a virtual reality experience, a so called "time warp" may be used. This time warp does not only correct for the optical aberration of the lenses used in a virtual reality headset, it also transforms the stereoscopic images based on the very latest head tracking information to significantly reduce the motion-to-photon delay (or end-to-end latency). The time warp operates as close as possible to the display refresh, retrieves updated head tracking information and transforms a stereoscopic pair of images from representing a view at the time it was rendered, to representing the correct view at the time it is displayed. When run asynchronously to the stereoscopic rendering, the time warp can be used to increase the perceived frame rate and to smooth out inconsistent frame rates. Asynchronous operation can also improve the overall graphics hardware utilization by not requiring the stereoscopic rendering to be synchronized with the display refresh cycle. However, on today's consumer hardware it is challenging to implement a high quality time warp that is fast, has predictable latency and throughput, and runs asynchronously. This paper discusses the various challenges and the different trade-offs that need to be considered when implementing an asynchronous time warp on consumer hardware.
Journal Article•10.1109/TCYB.2015.2453346•
Distributed Consensus of Stochastic Delayed Multi-agent Systems Under Asynchronous Switching

Xiaotai Wu1, Yang Tang2, Jinde Cao3, Wenbing Zhang4•
Southeast University1, East China University of Science and Technology2, King Abdulaziz University3, Yangzhou University4
01 Aug 2016-IEEE Transactions on Systems, Man, and Cybernetics
TL;DR: Several easy-to-verify conditions for the existence of an asynchronously switched distributed controller are derived such that stochastic delayed multi-agent systems with asynchronous switching and nonlinear dynamics can achieve global exponential consensus.
Abstract: In this paper, the distributed exponential consensus of stochastic delayed multi-agent systems with nonlinear dynamics is investigated under asynchronous switching. The asynchronous switching considered here accounts for the time needed to identify the active modes of multi-agent systems. After confirmation of a mode switch is received, the matched controller can be applied, which means that the switching time of the matched controller in each node usually lags behind that of system switching. In order to handle the coexistence of switched signals and stochastic disturbances, a comparison principle of stochastic switched delayed systems is first proved. By means of this extended comparison principle, several easy-to-verify conditions for the existence of an asynchronously switched distributed controller are derived such that stochastic delayed multi-agent systems with asynchronous switching and nonlinear dynamics can achieve global exponential consensus. Two examples are given to illustrate the effectiveness of the proposed method.
Posted Content•
Asynchronous Temporal Fields for Action Recognition

Gunnar A. Sigurdsson1, Santosh K. Divvala2, Ali Farhadi2, Abhinav Gupta2•
Carnegie Mellon University1, Allen Institute for Artificial Intelligence2
19 Dec 2016-arXiv: Computer Vision and Pattern Recognition
TL;DR: In this paper, a fully-connected temporal CRF model is proposed for reasoning over various aspects of activities that includes objects, actions, and intentions, where the potentials are predicted by a deep network.
Abstract: Actions are more than just movements and trajectories: we cook to eat and we hold a cup to drink from it. A thorough understanding of videos requires going beyond appearance modeling and necessitates reasoning about the sequence of activities, as well as the higher-level constructs such as intentions. But how do we model and reason about these? We propose a fully-connected temporal CRF model for reasoning over various aspects of activities that includes objects, actions, and intentions, where the potentials are predicted by a deep network. End-to-end training of such structured models is a challenging endeavor: For inference and learning we need to construct mini-batches consisting of whole videos, leading to mini-batches with only a few videos. This causes high-correlation between data points leading to breakdown of the backprop algorithm. To address this challenge, we present an asynchronous variational inference method that allows efficient end-to-end training. Our method achieves a classification mAP of 22.4% on the Charades benchmark, outperforming the state-of-the-art (17.2% mAP), and offers equal gains on the task of temporal localization.
Proceedings Article•10.1109/SP.2016.20•
TaoStore: Overcoming Asynchronicity in Oblivious Data Storage

Cetin Sahin1, Victor Zakhary1, Amr El Abbadi1, Huijia Lin1, Stefano Tessaro1 •
University of California, Santa Barbara1
22 May 2016
TL;DR: This paper develops and evaluates a new oblivious storage system, called Tree-based Asynchronous Oblivious Store, or TaoStore for short, which is built on top of a new tree-based ORAM scheme that processes client requests concurrently and asynchronously in a non-blocking fashion.
Abstract: We consider oblivious storage systems hiding both the contents of the data as well as access patterns from an untrusted cloud provider. We target a scenario where multiple users from a trusted group (e.g., corporate employees) asynchronously access and edit potentially overlapping data sets through a trusted proxy mediating client-cloud communication. The main contribution of our paper is twofold. Foremost, we initiate the first formal study of asynchronicity in oblivious storage systems. We provide security definitions for scenarios where both client requests and network communication are asynchronous (and in fact, even adversarially scheduled). While security issues in ObliviStore (Stefanov and Shi, S&P 2013) have recently surfaced, our treatment shows that CURIOUS (Bindschaedler et al., CCS 2015), proposed with the exact goal of preventing these attacks, is also insecure under asynchronous scheduling of network communication. Second, we develop and evaluate a new oblivious storage system, called Tree-based Asynchronous Oblivious Store, or TaoStore for short, which we prove secure in asynchronous environments. TaoStore is built on top of a new tree-based ORAM scheme that processes client requests concurrently and asynchronously in a non-blocking fashion. This results in a substantial gain in throughput, simplicity, and flexibility over previous systems.
Journal Article•10.1109/TCNS.2015.2428391•
Distributed Source Seeking via a Circular Formation of Agents Under Communication Constraints

[...]

Lara Brinon-Arranz1, Luca Schenato2, Alexandre Seuret1•
Centre national de la recherche scientifique1, University of Padua2
01 Jun 2016-IEEE Transactions on Control of Network Systems
TL;DR: This paper proposes a combination of a cooperative control law to stabilize the agents to a circular formation and a distributed consensus-based source-seeking algorithm, which is guaranteed to steer the circular formation toward the vicinity of the source location.
Abstract: This paper addresses the source-seeking problem in which a group of autonomous vehicles must locate and follow the source of some signal based on measurements of the signal strength at different positions. Based on the observation that the gradient of the signal strength can be approximated by a circular formation of agents via a simple weighted average of the signal measured by each agent, we propose a combination of a cooperative control law to stabilize the agents to a circular formation and a distributed consensus-based source-seeking algorithm, which is guaranteed to steer the circular formation toward the vicinity of the source location. In particular, the proposed algorithm is provided with two tunable parameters that allow for a tradeoff between speed of convergence, noise filtering, and formation stability. The benefit of using consensus-based algorithms resides in a more realistic discrete-time control of the agents and in asynchronous communication resilient to delays, which is particularly relevant for underwater applications. The analytic results are complemented with numerical simulations.
Proceedings Article•10.1109/GLOCOMW.2016.7849087•
WOLA-OFDM: A Potential Candidate for Asynchronous 5G

[...]

Rafik Zayani1, Yahia Medjahdi, Hmaied Shaiek, Daniel Roviras•
Carthage University1
1 Dec 2016
TL;DR: An investigation of the performance, in a relaxed synchronization scenario, of the recently introduced Weighted Overlap and Add based OFDM (WOLA-OFDM) waveform shows that it could be a promising candidate waveform, outperforming both CP-OFDM and UFMC in any asynchronous scenario.
Abstract: This paper investigates the performance, in a relaxed synchronization scenario, of a recently introduced contender waveform named Weighted Overlap and Add based OFDM (WOLA-OFDM). Its performance is studied and compared to the classical CP-OFDM and the well-known UFMC, which offers particular benefits for 5G use cases. Results show that WOLA-OFDM could be a promising candidate waveform, outperforming both CP-OFDM and UFMC in any asynchronous scenario.
Journal Article•10.1109/TVLSI.2015.2405614•
Argo: A Real-Time Network-on-Chip Architecture With an Efficient GALS Implementation

Evangelia Kasapaki1, Martin Schoeberl1, Rasmus Bo Sørensen1, Christoph Thomas Muller1, Kees Goossens2, Jens Sparsø1 •
Technical University of Denmark1, Eindhoven University of Technology2
01 Feb 2016-IEEE Transactions on Very Large Scale Integration Systems
TL;DR: An area-efficient, globally asynchronous, locally synchronous network-on-chip (NoC) architecture for a hard real-time multiprocessor platform that uses statically scheduled time-division multiplexing (TDM) to control the communication over a structure of routers, links, and network interfaces (NIs).
Abstract: In this paper, we present an area-efficient, globally asynchronous, locally synchronous network-on-chip (NoC) architecture for a hard real-time multiprocessor platform. The NoC implements message-passing communication between processor cores. It uses statically scheduled time-division multiplexing (TDM) to control the communication over a structure of routers, links, and network interfaces (NIs) to offer real-time guarantees. The area-efficient design is a result of two contributions: 1) asynchronous routers combined with TDM scheduling and 2) a novel NI microarchitecture. Together they result in a design in which data are transferred in a pipelined fashion, from the local memory of the sending core to the local memory of the receiving core, without any dynamic arbitration, buffering, and clock synchronization. The routers use two-phase bundled-data handshake latches based on the Mousetrap latch controller and are extended with a clock gating mechanism to reduce the energy consumption. The NIs integrate the direct memory access functionality and the TDM schedule, and use dual-ported local memories to avoid buffering, flow-control, and synchronization. To verify the design, we have implemented a $4 \times 4$ bitorus NoC in 65-nm CMOS technology and we present results on area, speed, and energy consumption for the router, NI, NoC, and postlayout.
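To make the idea of statically scheduled TDM concrete, the sketch below checks whether a schedule is conflict-free, which is what lets a network avoid dynamic arbitration. It is an illustrative model only: the one-hop-per-slot timing, the flow names, and the link names are assumptions, not details of the Argo design.

```python
from collections import defaultdict

def find_conflicts(flows, period):
    """Return the (link, slot) pairs claimed by more than one flow.

    flows maps a flow name to (start_slot, [links in route order]). A packet
    injected in start_slot is assumed to cross the k-th link of its route in
    slot (start_slot + k) mod period, i.e. one hop per slot.
    """
    claims = defaultdict(list)
    for name, (start_slot, route) in flows.items():
        for hop, link in enumerate(route):
            claims[(link, (start_slot + hop) % period)].append(name)
    return {key: names for key, names in claims.items() if len(names) > 1}

# Hypothetical 3-slot schedule; all names below are made up.
flows = {
    "core0->core3": (0, ["r0-r1", "r1-r3"]),
    "core2->core3": (0, ["r2-r1", "r1-r3"]),  # clashes on r1-r3 in slot 1
    "core1->core0": (1, ["r1-r0"]),
}
conflicts = find_conflicts(flows, period=3)
print(conflicts)

# Shifting one flow by a slot removes the clash.
flows["core2->core3"] = (1, ["r2-r1", "r1-r3"])
fixed = find_conflicts(flows, period=3)
print(fixed)
```

Because the check runs offline, a schedule that passes it needs no run-time arbitration on any link under the stated timing assumption.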
How to scale distributed deep learning

Peter H. Jin, Qiaochu Yuan, Forrest Iandola, Kurt Keutzer
14 Nov 2016
TL;DR: It is found, perhaps counterintuitively, that asynchronous SGD, including both elastic averaging and gossiping, converges faster at fewer nodes, whereas synchronous SGD scales better to more nodes (up to about 100 nodes).
Abstract: Training time on large datasets for deep neural networks is the principal workflow bottleneck in a number of important applications of deep learning, such as object classification and detection in automatic driver assistance systems (ADAS). To minimize training time, the training of a deep neural network must be scaled beyond a single machine to as many machines as possible by distributing the optimization method used for training. While a number of approaches have been proposed for distributed stochastic gradient descent (SGD), at the current time synchronous approaches to distributed SGD appear to be showing the greatest performance at large scale. Synchronous scaling of SGD suffers from the need to synchronize all processors on each gradient step and is not resilient in the face of failing or lagging processors. In asynchronous approaches using parameter servers, training is slowed by contention to the parameter server. In this paper we compare the convergence of synchronous and asynchronous SGD for training a modern ResNet network architecture on the ImageNet classification problem. We also propose an asynchronous method, gossiping SGD, that aims to retain the positive features of both systems by replacing the all-reduce collective operation of synchronous training with a gossip aggregation algorithm. We find, perhaps counterintuitively, that asynchronous SGD, including both elastic averaging and gossiping, converges faster at fewer nodes (up to about 32 nodes), whereas synchronous SGD scales better to more nodes (up to about 100 nodes).
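The contrast between an all-reduce and a gossip aggregation can be sketched as follows. This toy shows only the aggregation step, with symmetric pairwise averaging of one scalar per worker; that pairing rule is an assumption for illustration and may differ from the paper's gossiping SGD, which also interleaves gradient steps.

```python
import random

def gossip_round(params, rng):
    """One gossip round: each worker averages with one random peer."""
    n = len(params)
    for i in range(n):
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1                     # pick a peer other than i
        avg = 0.5 * (params[i] + params[j])
        params[i] = params[j] = avg    # symmetric pairwise averaging

rng = random.Random(0)
params = [float(w) for w in range(8)]  # one scalar "model" per worker
mean_before = sum(params) / len(params)
spread_before = max(params) - min(params)

for _ in range(20):
    gossip_round(params, rng)

mean_after = sum(params) / len(params)
spread_after = max(params) - min(params)
print(mean_before, mean_after, spread_before, spread_after)
```

Each exchange touches only two workers, so no global barrier is needed, while the network-wide mean is preserved and the disagreement between workers shrinks over rounds. An all-reduce would instead set every worker to the exact mean in one collective step.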
Journal Article•10.1080/10494820.2013.841262•
A blended model: simultaneously teaching a quantitative course traditionally, online, and remotely

Constance A. Lightner1, Carin A. Lightner-Laws•
Fayetteville State University1
02 Jan 2016-Interactive Learning Environments
TL;DR: A blended course model for statistics and quantitative method courses was developed that allowed students to choose between online, remote (via interactive television), and traditional course delivery modes each week and is more flexible and agile than existing blended courses that have more static components.
Abstract: As universities seek to bolster enrollment through distance education, faculty are tasked with maintaining comparable teaching/learning standards in traditional, blended, and online courses. Research has shown that there is an achievement gap between students taking courses exclusively offered online versus those enrolled in face-to-face classes. In an effort to mitigate these observed differences, the School of Business faculty at the research institution investigated various course models to meet the needs of a diverse, non-traditional, and multifaceted student population. Ultimately, a blended course model for statistics and quantitative method courses was developed that allowed students to choose between online, remote (via interactive television), and traditional course delivery modes each week. This model is more flexible and agile than existing blended courses that have more static components. Multiple regression analysis, χ2, and t-tests are used to demonstrate the efficacy of our model in maintai...
Journal Article•10.1109/TSP.2016.2537261•
Asynchronous Distributed ADMM for Large-Scale Optimization—Part II: Linear Convergence Analysis and Numerical Performance

Tsung-Hui Chang1, Wei-Cheng Liao2, Mingyi Hong3, Xiangfeng Wang4•
The Chinese University of Hong Kong1, University of Minnesota2, Iowa State University3, East China Normal University4
01 Jun 2016-IEEE Transactions on Signal Processing
TL;DR: This paper characterizes the conditions under which the AD-ADMM achieves linear convergence and reveals the impact that various algorithm parameters, network delay, and network size have on the algorithm performance.
Abstract: The alternating direction method of multipliers (ADMM) has been recognized as a versatile approach for solving modern large-scale machine learning and signal processing problems efficiently. When the data size and/or the problem dimension is large, a distributed version of ADMM can be used, which is capable of distributing the computation load and the data set to a network of computing nodes. Unfortunately, a direct synchronous implementation of such an algorithm does not scale well with the problem size, as the algorithm speed is limited by the slowest computing nodes. To address this issue, in a companion paper, we have proposed an asynchronous distributed ADMM (AD-ADMM) and studied its worst-case convergence conditions. In this paper, we further the study by characterizing the conditions under which the AD-ADMM achieves linear convergence. Our conditions as well as the resulting linear rates reveal the impact that various algorithm parameters, network delay, and network size have on the algorithm performance. To demonstrate the superior time efficiency of the proposed AD-ADMM, we test the AD-ADMM on a high-performance computer cluster by solving a large-scale logistic regression problem.
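A toy version of asynchronous consensus ADMM helps show why the master need not wait for the slowest node. The sketch below is a simplified illustration under stated assumptions, not the paper's AD-ADMM: it uses scalar quadratic local costs 0.5*(x - a_i)^2, no proximal term in the master step, a penalty rho greater than 1, and a hypothetical bounded-delay rule (`max_delay`) that forces a worker to report after it has missed too many rounds.

```python
import random

def async_consensus_admm(a, rho=2.0, max_delay=3, iters=1000, seed=0):
    """Minimize sum_i 0.5*(x - a[i])**2 with partially asynchronous updates."""
    rng = random.Random(seed)
    n = len(a)
    x = [0.0] * n        # local variables held by the workers
    lam = [0.0] * n      # dual variables
    z_seen = [0.0] * n   # consensus value each worker last received (stale)
    delay = [0] * n      # rounds missed since each worker last reported
    z = 0.0              # consensus variable kept by the master
    for k in range(iters):
        # Round 0 is synchronous; later a worker reports at random, but is
        # forced to report once it has missed max_delay - 1 rounds.
        arrived = [i for i in range(n)
                   if k == 0 or delay[i] >= max_delay - 1 or rng.random() < 0.5]
        for i in range(n):
            delay[i] += 1
        for i in arrived:
            zi = z_seen[i]
            # Closed-form minimizer of 0.5*(x-a_i)^2 + lam_i*x + rho/2*(x-zi)^2.
            x[i] = (a[i] - lam[i] + rho * zi) / (1.0 + rho)
            lam[i] += rho * (x[i] - zi)
            delay[i] = 0
        # Master step: uses the latest available x_i and lam_i of every worker.
        z = sum(x[i] + lam[i] / rho for i in range(n)) / n
        for i in arrived:
            z_seen[i] = z    # only workers that reported get the new value
    return z

a = [1.0, 4.0, -2.0, 7.0]
z = async_consensus_admm(a)
print(z)  # approaches the minimizer, the mean of a
```

The bounded-delay rule is what keeps stale information from accumulating without limit; in this toy, each reporting worker moves toward a consensus value that is at most a few rounds old.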
Journal Article•10.1002/RNC.3537•
Finite‐time asynchronous ℋ∞ filtering for discrete‐time Markov jump systems over a lossy network

Hao Shen1, Feng Li1, Zheng-Guang Wu2, Ju H. Park3•
Anhui University of Technology1, Zhejiang University2, Yeungnam University3
25 Nov 2016-International Journal of Robust and Nonlinear Control
TL;DR: The objective is to design a filter that ensures not only mean-square stochastic finite-time boundedness but also a prescribed level of performance for the underlying error system over a lossy network.
Abstract: This paper is concerned with the problem of finite-time asynchronous filtering for a class of discrete-time Markov jump systems. The communication links between the system and filter are assumed to be unreliable, which leads to the simultaneous occurrence of packet dropouts, time delays, sensor nonlinearity and nonsynchronous modes. The objective is to design a filter that ensures not only mean-square stochastic finite-time boundedness but also a prescribed level of performance for the underlying error system over a lossy network. With the help of the Lyapunov–Krasovskii approach and stochastic analysis theory, sufficient conditions are established for the existence of an admissible filter. By using a novel simple matrix decoupling approach, a desired asynchronous filter can be constructed. Finally, a numerical example is presented and a pulse-width-modulation-driven boost converter model is employed to demonstrate the effectiveness of the proposed approach. Copyright © 2016 John Wiley & Sons, Ltd.
Journal Article•10.1109/TVT.2016.2518185•
Cooperative Joint Localization and Clock Synchronization Based on Gaussian Message Passing in Asynchronous Wireless Networks

Weijie Yuan1, Nan Wu1, Bernhard Etzlinger2, Hua Wang1, Jingming Kuang1 •
Beijing Institute of Technology1, Johannes Kepler University of Linz2
14 Jan 2016-IEEE Transactions on Vehicular Technology
TL;DR: A factor graph representation of the joint localization and time synchronization problem based on TOA measurements, in which the non-line-of-sight (NLOS) measurements are also taken into consideration, and a message passing schedule scheme is proposed to trade off between estimation performance and communication overhead.
Abstract: Localization and synchronization are very important in many wireless applications such as monitoring and vehicle tracking. Utilizing the same time of arrival (TOA) measurements for simultaneous localization and synchronization is challenging. In this paper, we present a factor graph (FG) representation of the joint localization and time synchronization problem based on TOA measurements, in which the non-line-of-sight (NLOS) measurements are also taken into consideration. On this FG, belief propagation (BP) message passing and variational message passing (VMP) are applied to derive two fully distributed cooperative algorithms with low computational requirements. Due to the nonlinearity in the observation function, it is intractable to compute the messages in closed form, and most existing solutions rely on Monte Carlo methods, e.g., particle filtering. We linearize a specific nonlinear term in the expressions of messages, which enables us to use a Gaussian representation for all messages. Accordingly, only the mean and variance have to be updated and transmitted between neighboring nodes, which significantly reduces the communication overhead and computational complexity. A message passing schedule scheme is proposed to trade off between estimation performance and communication overhead. Simulation results show that the proposed algorithms perform very close to particle-based methods with much lower complexity, particularly in densely connected networks.
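The communication saving from Gaussian messages comes from the fact that a product of Gaussian messages is again Gaussian, so a node only needs means and variances. The sketch below shows that fusion step for scalar messages; the function name and the numbers are hypothetical, and the paper's linearization and message schedule are not reproduced.

```python
def fuse_gaussian_messages(messages):
    """Multiply scalar Gaussian messages given as (mean, variance) pairs.

    The product is Gaussian: precisions (1/variance) add, and the mean is
    the precision-weighted average of the incoming means.
    """
    precision = sum(1.0 / var for _, var in messages)
    weighted = sum(mean / var for mean, var in messages)
    return weighted / precision, 1.0 / precision

# Hypothetical messages from three neighbors about one coordinate.
incoming = [(10.0, 4.0), (12.0, 4.0), (11.0, 2.0)]
belief_mean, belief_var = fuse_gaussian_messages(incoming)
print(belief_mean, belief_var)
```

A particle-based scheme would have to exchange many weighted samples per message, whereas here two numbers per scalar variable suffice.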
Proceedings Article•10.1109/IROS.2017.8202141•
Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search

Ali T. Yahya, Adrian Li, Mrinal Kalakrishnan, Yevgen Chebotar1, Sergey Levine2 •
University of Southern California1, Google2
03 Oct 2016-arXiv: Learning
TL;DR: A distributed and asynchronous version of guided policy search is proposed and used to demonstrate collective policy learning on a vision-based door opening task using four robots; it is shown to achieve better generalization, utilization, and training times than the single robot alternative.
Abstract: In principle, reinforcement learning and policy search methods can enable robots to learn highly complex and general skills that may allow them to function amid the complexity and diversity of the real world. However, training a policy that generalizes well across a wide range of real-world conditions requires far greater quantity and diversity of experience than is practical to collect with a single robot. Fortunately, it is possible for multiple robots to share their experience with one another, and thereby, learn a policy collectively. In this work, we explore distributed and asynchronous policy learning as a means to achieve generalization and improved training times on challenging, real-world manipulation tasks. We propose a distributed and asynchronous version of Guided Policy Search and use it to demonstrate collective policy learning on a vision-based door opening task using four robots. We show that it achieves better generalization, utilization, and training times than the single robot alternative.
Proceedings Article•
SparkNet: Training Deep Networks in Spark

Philipp Moritz1, Robert Nishihara1, Ion Stoica1, Michael I. Jordan1•
University of California, Berkeley1
1 Jan 2016
TL;DR: This work introduces SparkNet, a framework for training deep networks in Spark using a simple parallelization scheme for stochastic gradient descent that scales well with the cluster size and tolerates very high-latency communication.
Abstract: Training deep networks is a time-consuming process, with networks for object recognition often requiring multiple days to train. For this reason, leveraging the resources of a cluster to speed up training is an important area of work. However, widely-popular batch-processing computational frameworks like MapReduce and Spark were not designed to support the asynchronous and communication-intensive workloads of existing distributed deep learning systems. We introduce SparkNet, a framework for training deep networks in Spark. Our implementation includes a convenient interface for reading data from Spark RDDs, a Scala interface to the Caffe deep learning framework, and a lightweight multi-dimensional tensor library. Using a simple parallelization scheme for stochastic gradient descent, SparkNet scales well with the cluster size and tolerates very high-latency communication. Furthermore, it is easy to deploy and use with no parameter tuning, and it is compatible with existing Caffe models. We quantify the dependence of the speedup obtained by SparkNet on the number of machines, the communication frequency, and the cluster's communication overhead, and we benchmark our system's performance on the ImageNet dataset.
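The role of communication frequency can be illustrated with a periodic-averaging sketch: each worker takes several local steps, then the models are averaged once. The abstract does not spell out the scheme, so this form is an assumption; the sketch also uses full local gradients on a scalar least-squares problem instead of minibatches so that the result is deterministic, and all names are hypothetical.

```python
def local_steps_with_averaging(shards, rounds=50, local_steps=5, lr=0.1):
    """Fit a scalar w minimizing the mean of 0.5*(w - y)^2 over all shards.

    Assumes equally sized shards, so averaging the local models weights
    every data point equally.
    """
    w = 0.0
    for _ in range(rounds):
        local_models = []
        for shard in shards:              # conceptually runs in parallel
            shard_mean = sum(shard) / len(shard)
            w_local = w                   # start from the broadcast model
            for _ in range(local_steps):
                # Full local gradient of the shard's average loss.
                w_local -= lr * (w_local - shard_mean)
            local_models.append(w_local)
        # One communication step per round: average the local models.
        w = sum(local_models) / len(local_models)
    return w

shards = [[1.0, 2.0, 3.0], [4.0, 5.0, 9.0]]
w = local_steps_with_averaging(shards)
print(w)  # approaches the global mean of the data
```

Raising `local_steps` reduces how often the workers communicate, which is the knob that matters when communication latency is high; on harder, non-quadratic problems, averaging less often can slow or bias convergence.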
...
