Conference
Distributed Event-Based Systems
About: Distributed Event-Based Systems is an academic conference. It publishes mainly in the areas of complex event processing and computer science. Over its lifetime, the conference has published 763 papers, which have received 12,253 citations.
Topics: Complex event processing, Computer science, Event (computing), Stream processing, Scalability
Papers
29 Jun 2013
TL;DR: Two advanced generic schedulers for Storm are proposed that provide improved performance for a wide range of application topologies and can produce schedules that perform significantly better than those produced by Storm's default scheduler.
Abstract: Today we are witnessing a dramatic shift toward a data-driven economy, where the ability to analyze huge amounts of data efficiently and in a timely manner marks the difference between industrial success stories and catastrophic failures. In this scenario Storm, an open source distributed realtime computation system, represents a disruptive technology that is quickly gaining the favor of big players like Twitter and Groupon. A Storm application is modeled as a topology, i.e. a graph where nodes are operators and edges represent data flows among such operators. A key aspect in tuning Storm performance lies in the strategy used to deploy a topology, i.e. how Storm schedules the execution of each topology component on the available computing infrastructure. In this paper we propose two advanced generic schedulers for Storm that provide improved performance for a wide range of application topologies. The first scheduler works offline by analyzing the topology structure and adapting the deployment to it; the second scheduler enhances the previous approach by continuously monitoring system performance and rescheduling the deployment at run-time to improve overall performance. Experimental results show that these algorithms can produce schedules that perform significantly better than those produced by Storm's default scheduler.
274 citations
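The topology model and the offline scheduling idea described in the abstract above can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names, the greedy co-location rule, the `slots_per_node` parameter, and the round-robin baseline are all assumptions made for illustration. The sketch only shows why looking at the topology structure can reduce the number of data flows that cross machines.

```python
from collections import defaultdict

def schedule_round_robin(operators, nodes):
    """Baseline (assumed for comparison): spread operators evenly, ignoring edges."""
    return {op: nodes[i % len(nodes)] for i, op in enumerate(operators)}

def schedule_offline(edges, nodes, slots_per_node):
    """Greedy sketch: try to place each operator on the node of its upstream neighbor."""
    placement = {}
    load = defaultdict(int)  # number of operators already placed on each node

    def place(op, preferred=None):
        if op in placement:
            return
        if preferred is not None and load[preferred] < slots_per_node:
            target = preferred
        else:
            # Fall back to the least-loaded node.
            target = min(nodes, key=lambda n: load[n])
            if load[target] >= slots_per_node:
                raise ValueError("not enough slots for all operators")
        placement[op] = target
        load[target] += 1

    for src, dst in edges:
        place(src)
        place(dst, preferred=placement[src])
    return placement

def cross_node_edges(edges, placement):
    """Count data flows whose endpoints run on different nodes."""
    return sum(1 for src, dst in edges if placement[src] != placement[dst])

# Hypothetical four-operator pipeline on two nodes with two slots each.
operators = ["spout", "split", "count", "report"]
edges = [("spout", "split"), ("split", "count"), ("count", "report")]
nodes = ["n1", "n2"]

baseline = schedule_round_robin(operators, nodes)
greedy = schedule_offline(edges, nodes, slots_per_node=2)
print(cross_node_edges(edges, baseline), cross_node_edges(edges, greedy))
```

In this toy pipeline the structure-aware placement keeps adjacent operators together, so fewer edges cross nodes than in the round-robin placement. An online variant, as the abstract describes, would additionally use measured traffic and load at run-time rather than the static graph alone.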
13 Jun 2016
TL;DR: A general formulation of the optimal DSP placement (for short, ODP) as an Integer Linear Programming problem which takes explicitly into account the heterogeneity of computing and networking resources and which encompasses - as special cases - the different solutions proposed in the literature.
Abstract: Data Stream Processing (DSP) applications are widely used to extract information in a timely manner from distributed data sources, such as sensing devices, monitoring stations, and social networks. To successfully handle this ever increasing amount of data, recent trends investigate the possibility of exploiting decentralized computational resources (e.g., Fog computing) to define the application placement. Several placement policies have been proposed in the literature, but they are based on different assumptions and optimization goals and, as such, they are not completely comparable to each other. In this paper we study the placement problem for distributed DSP applications. Our contributions are twofold. We provide a general formulation of the optimal DSP placement (for short, ODP) as an Integer Linear Programming problem which takes explicitly into account the heterogeneity of computing and networking resources and which encompasses - as special cases - the different solutions proposed in the literature. We present an ODP-based scheduler for the Apache Storm DSP framework. This allows us to compare some well-known centralized and decentralized placement solutions. We also extensively analyze the ODP scalability with respect to various parameter settings.
168 citations
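The placement problem described in the abstract above can be illustrated on a tiny instance. The sketch below is not the paper's ODP formulation and does not use an ILP solver; it enumerates every assignment of operators to resources, which is only feasible for very small inputs. The objective (sum of link delays along data flows), the capacity constraint, and all names (`optimal_placement`, `demand`, `capacity`, `delay`) are assumptions chosen to show how heterogeneous computing and networking resources enter the problem.

```python
from itertools import product

def optimal_placement(operators, edges, resources, capacity, demand, delay):
    """Exhaustive search for the feasible placement with the lowest total link delay."""
    best_cost, best = float("inf"), None
    for assignment in product(resources, repeat=len(operators)):
        placement = dict(zip(operators, assignment))

        # Capacity constraint: demand hosted on a resource must not exceed its capacity.
        used = {r: 0 for r in resources}
        for op, r in placement.items():
            used[r] += demand[op]
        if any(used[r] > capacity[r] for r in resources):
            continue

        # Objective: total network delay over all data flows.
        cost = sum(delay[placement[src]][placement[dst]] for src, dst in edges)
        if cost < best_cost:
            best_cost, best = cost, placement
    return best, best_cost

# Hypothetical three-operator application on one edge and one cloud resource.
operators = ["src", "filter", "sink"]
edges = [("src", "filter"), ("filter", "sink")]
resources = ["edge", "cloud"]
capacity = {"edge": 2, "cloud": 2}
demand = {"src": 1, "filter": 1, "sink": 1}
delay = {"edge": {"edge": 0, "cloud": 10}, "cloud": {"edge": 10, "cloud": 0}}

placement, cost = optimal_placement(operators, edges, resources, capacity, demand, delay)
print(placement, cost)
```

Because no single resource can host all three operators here, any feasible placement pays for at least one cross-resource link. Exhaustive search grows exponentially with the number of operators, which is why a solver-backed formulation and a scalability analysis matter in practice.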
8 Jun 2017
TL;DR: This paper establishes a common comparison framework based on the core functionalities of pub/sub systems and enumerates a set of use cases that are best suited for RabbitMQ or Kafka, guiding the reader through a determination table to choose the best architecture for their particular set of requirements.
Abstract: Publish/subscribe is a distributed interaction paradigm well adapted to the deployment of scalable and loosely coupled systems. Apache Kafka and RabbitMQ are two popular open-source and commercially-supported pub/sub systems that have been around for almost a decade and have seen wide adoption. Given the popularity of these two systems and the fact that both are branded as pub/sub systems, two frequently asked questions in the relevant online forums are: how do they compare against each other and which one to use? In this paper, we frame the arguments in a holistic approach by establishing a common comparison framework based on the core functionalities of pub/sub systems. Using this framework, we then venture into a qualitative and quantitative (i.e. empirical) comparison of the common features of the two systems. Additionally, we highlight the distinct features that each of these systems has. After enumerating a set of use cases that are best suited for RabbitMQ or Kafka, we try to guide the reader through a determination table to choose the best architecture for their particular set of requirements.
168 citations
20 Jun 2007
TL;DR: SpiderCast is designed to strike a balance between average overlay degree and communication cost of event dissemination, and it is shown experimentally that, for many practical workloads, the SpiderCast overlays are both topic-connected and have a low per-topic diameter while requiring each node to maintain a low average number of connections.
Abstract: We introduce SpiderCast, a distributed protocol for constructing scalable churn-resistant overlay topologies for supporting decentralized topic-based pub/sub communication. SpiderCast is designed to strike a balance between average overlay degree and communication cost of event dissemination. It employs a novel coverage-optimizing heuristic in which the nodes utilize partial subscription views (provided by a decentralized membership service) to reduce the average node degree while guaranteeing (with high probability) that the events posted on each topic can be routed solely through the nodes interested in this topic (in other words, the overlay is topic-connected). SpiderCast is unique in maintaining an overlay topology that scales well with the average number of topics a node is subscribed to, assuming the subscriptions are correlated insofar as found in most typical workloads. Furthermore, the degree grows logarithmically in the total number of topics, and slowly decreases as the number of nodes increases. We show experimentally that, for many practical workloads, the SpiderCast overlays are both topic-connected and have a low per-topic diameter while requiring each node to maintain a low average number of connections. These properties are satisfied even in very large settings involving up to 10,000 nodes, 1,000 topics, and 70 subscriptions per node, and under high churn rates. In addition, our results demonstrate that, in a large setting, the average node degree in SpiderCast is at least 45% smaller than in other overlays typically used to support decentralized pub/sub communication (such as similarity-based, rings-based, and random overlays).
163 citations
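The topic-connectivity property named in the abstract above (events on a topic can be routed solely through nodes interested in that topic) can be checked on a given overlay with a breadth-first search per topic. The sketch below only verifies the property; it does not construct an overlay and is not the SpiderCast protocol. The function name and the data layout (`neighbors` as an adjacency map, `subscriptions` as a node-to-topics map) are assumptions.

```python
from collections import deque

def is_topic_connected(neighbors, subscriptions):
    """Return True if, for every topic, its subscribers form a connected sub-overlay."""
    topics = set().union(*subscriptions.values()) if subscriptions else set()
    for topic in topics:
        members = {node for node, ts in subscriptions.items() if topic in ts}

        # Breadth-first search restricted to nodes subscribed to this topic.
        start = next(iter(members))
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for peer in neighbors.get(node, ()):
                if peer in members and peer not in seen:
                    seen.add(peer)
                    queue.append(peer)

        if seen != members:
            return False
    return True

# Hypothetical three-node line overlay: a - b - c.
neighbors = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}

connected = is_topic_connected(
    neighbors, {"a": {"t1"}, "b": {"t1", "t2"}, "c": {"t2"}}
)
# Here a and c share t1 but can only reach each other through b, which is not subscribed.
disconnected = is_topic_connected(
    neighbors, {"a": {"t1"}, "b": {"t2"}, "c": {"t1"}}
)
print(connected, disconnected)
```

The second case shows what the property rules out: an event on `t1` would have to pass through an uninterested node. Keeping every topic connected while keeping node degree low is the trade-off the abstract describes.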
20 Jun 2007
TL;DR: This paper proposes an architecture for implementing the topic-based publish/subscribe paradigm in large scale peer-to-peer systems based on clustering peers subscribed to the same topic, and shows it to be scalable along several fundamental dimensions, such as the number of participants and subscriptions, and to exhibit a fair load distribution.
Abstract: The completely decoupled interaction model offered by the publish/subscribe communication paradigm perfectly suits the interoperability needs of today's large-scale, dynamic, peer-to-peer applications. The unmanaged environments where these applications are expected to work pose a series of problems (a potentially large number of participants, low reliability of nodes, absence of a centralized authority, etc.) that severely limit the scalability of existing approaches, which were originally designed to support distributed applications built on top of static and managed environments. In this paper we propose an architecture for implementing the topic-based publish/subscribe paradigm in large scale peer-to-peer systems. The architecture is based on clustering peers subscribed to the same topic. The major novelty of this architecture lies in the mechanism employed to bring events from the publisher to the cluster (namely outer-cluster routing). The evaluation shows that this mechanism for outer-cluster routing delivers events to the destination cluster with a probability very close to 1 while keeping the number of out-of-cluster peers involved small. Finally, the overall architecture is shown to be scalable along several fundamental dimensions, such as the number of participants and subscriptions, and to exhibit a fair load distribution (load distribution closely follows the distribution of subscriptions on nodes).
158 citations
Performance Metrics
| Year | Papers |
|---|---|
| 2023 | 32 |
| 2022 | 32 |
| 2021 | 27 |
| 2020 | 30 |
| 2019 | 45 |
| 2018 | 44 |