M3: Stream Processing on Main-Memory MapReduce

doi:10.1109/ICDE.2012.120

Proceedings Article•10.1109/ICDE.2012.120

Ahmed M. Aly, +6 more

- 01 Apr 2012

- pp 1253-1256

70

TL;DR: M3 extends Hadoop, the open-source implementation of MapReduce, bypassing the Hadoop Distributed File System (HDFS) to support main-memory-only processing. It also supports continuous execution of the Map and Reduce phases, in which individual Mappers and Reducers never terminate.
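
As a rough illustration of the idea in the TL;DR, the sketch below keeps one long-lived mapper and several long-lived reducers running over in-memory queues, with no disk or file-system step between them. It is not M3's code or Hadoop's API: the names `mapper`, `reducer`, `run_stream`, and `STOP` are hypothetical, Python threads and queues stand in for Hadoop tasks, and the `STOP` sentinel exists only so the demo can exit (in M3 the Mappers and Reducers never terminate).

```python
import queue
import threading
from collections import defaultdict

STOP = object()  # demo-only sentinel; a real continuous job would never stop


def mapper(inbox, outboxes):
    """Long-lived mapper: consumes records as they arrive and emits (word, 1)."""
    while True:
        record = inbox.get()
        if record is STOP:
            for q in outboxes:
                q.put(STOP)  # propagate shutdown to every reducer
            return
        for word in record.split():
            # Hash-partition so a given key always reaches the same reducer.
            outboxes[hash(word) % len(outboxes)].put((word, 1))


def reducer(inbox, state):
    """Long-lived reducer: keeps running counts in main memory only."""
    while True:
        item = inbox.get()
        if item is STOP:
            return
        key, value = item
        state[key] += value


def run_stream(records, num_reducers=2):
    """Feed records through one mapper and several reducers; return merged counts."""
    source = queue.Queue()
    shuffle = [queue.Queue() for _ in range(num_reducers)]
    states = [defaultdict(int) for _ in range(num_reducers)]

    workers = [threading.Thread(target=mapper, args=(source, shuffle))]
    workers += [
        threading.Thread(target=reducer, args=(q, s))
        for q, s in zip(shuffle, states)
    ]
    for w in workers:
        w.start()

    for record in records:
        source.put(record)
    source.put(STOP)

    for w in workers:
        w.join()

    # Each key lives in exactly one reducer's state, so a plain merge is safe.
    merged = {}
    for s in states:
        merged.update(s)
    return merged


if __name__ == "__main__":
    print(run_stream(["a b a", "b c"]))
```

The queues play the role of the in-memory shuffle: records flow from mapper to reducer as they arrive, and each reducer's dictionary holds a running result that could be read at any time rather than only at job completion.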

Citations

Proceedings Article•10.1109/BIGDATA.2015.7364082

Lambda architecture for cost-effective batch and speed big data processing

Mariam Kiran, +4 more

- 29 Oct 2015

TL;DR: An implementation of the lambda architecture design pattern is presented to construct a data-handling backend on Amazon EC2, providing high throughput, dense and intense data demand delivered as services, minimizing the cost of the network maintenance.

243

Proceedings Article•10.1109/ICDCS.2014.61

T-Storm: Traffic-Aware Online Scheduling in Storm

Jielong Xu, +3 more

- 30 Jun 2014

TL;DR: A new stream data processing system based on Storm, namely, T-Storm, which accelerates data processing by leveraging effective traffic-aware scheduling for assigning/re-assigning tasks dynamically, which minimizes inter-node and inter-process traffic.

243

Proceedings Article•10.1145/3240765.3240811

FELIX: fast and energy-efficient logic in memory

Saransh Gupta, +2 more

- 05 Nov 2018

TL;DR: This paper proposes an in-memory implementation of fast and energy-efficient logic (FELIX) which combines the functionality of PIM with memories and is the first PIM logic to enable the single cycle NOR, NOT, NAND, minority, and OR directly in crossbar memory.

213

Journal Article•10.1016/J.FUTURE.2015.10.023

CEPSim: Modelling and Simulation of Complex Event Processing Systems in Cloud Environments

Wilson A. Higashino, +3 more

- 01 Dec 2016

- Future Generation Computer Systems

TL;DR: CEPSim is highly customizable and can be used to analyse the performance and scalability of user-defined queries and to evaluate the effects of various query processing strategies, as well as simulate existing systems in large Big Data scenarios with accuracy and precision.

51

Journal Article•10.14778/3199517.3199521

Model-free control for distributed stream data processing using deep reinforcement learning

Teng Li, +3 more

- 01 Feb 2018

TL;DR: In this paper, the authors focus on general-purpose Distributed Stream Data Processing Systems (DSDPSs), which process unbounded streams of continuous data at scale, in a distributed manner, in real or near-real time.

41

References

Journal Article•10.21276/IJRE.2018.5.5.4

MapReduce: simplified data processing on large clusters

Jeffrey Dean, +1 more

- 06 Dec 2004

TL;DR: This paper presents the implementation of MapReduce, a programming model and an associated implementation for processing and generating large data sets that runs on a large cluster of commodity machines and is highly scalable.

22.7K

Journal Article•10.1145/1327452.1327492

MapReduce: simplified data processing on large clusters

Jeffrey Dean, +1 more

- 01 Jan 2008

- Communications of The ACM

TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.

18.6K

Journal Article•10.14778/1687553.1687609

Hive: a warehousing solution over a map-reduce framework

Ashish Thusoo, +8 more

- 01 Aug 2009

TL;DR: Hadoop is a popular open-source map-reduce implementation which is being used as an alternative to store and process extremely large data sets on commodity hardware.

1.8K

Proceedings Article•10.5555/1855711.1855732

MapReduce online

Tyson Condie, +5 more

- 28 Apr 2010

TL;DR: A modified version of the Hadoop MapReduce framework that supports online aggregation, which allows users to see "early returns" from a job as it is being computed, and can reduce completion times and improve system utilization for batch jobs as well.

930

Proceedings Article•10.1145/1807167.1807273

A comparison of join algorithms for log processing in MapReduce

Spyros Blanas, +5 more

- 06 Jun 2010

TL;DR: Key implementation details of a number of well-known join strategies in MapReduce are described and a comprehensive experimental comparison of these join techniques on a 100-node Hadoop cluster is presented.

490