Apache Kafka

Open Access Book

- 17 Oct 2013

126

TL;DR: This book will follow a step-by-step tutorial approach which will show the readers how to use Apache Kafka for messaging from scratch, and how Kafka works with other tools like Hadoop, Storm, and so on.

Abstract: Set up Apache Kafka clusters and develop custom message producers and consumers using practical, hands-on examples.

Overview

- Write custom producers and consumers with message partition techniques
- Integrate Kafka with Apache Hadoop and Storm for use cases such as processing streaming data
- Provide an overview of Kafka tools and other contributions that work with Kafka in areas such as logging, packaging, and so on

In Detail

Message publishing is a mechanism of connecting heterogeneous applications together with messages that are routed between them, for example by using a message broker like Apache Kafka. Such solutions deal with real-time volumes of information and route it to multiple consumers without letting information producers know who the final consumers are.

Apache Kafka is a practical, hands-on guide providing you with a series of step-by-step practical implementations, which will help you take advantage of the real power behind Kafka, and give you a strong grounding for using it in your publisher-subscriber based architectures.

Apache Kafka takes you through a number of clear, practical implementations that will help you to take advantage of the power of Apache Kafka, quickly and painlessly. You will learn everything you need to know for setting up Kafka clusters. This book explains how Kafka basic blocks like producers, brokers, and consumers actually work and fit together. You will then explore additional settings and configuration changes to achieve ever more complex goals. Finally you will learn how Kafka works with other tools like Hadoop, Storm, and so on.

You will learn everything you need to know to work with Apache Kafka in the right format, as well as how to leverage its power of handling hundreds of megabytes of messages per second from multiple clients.

What you will learn from this book

- Download and build Kafka
- Set up single as well as multi-node Kafka clusters and send messages
- Learn Kafka design internals and message compression
- Understand how replication works in Kafka
- Write Kafka message producers and consumers using the Kafka producer API
- Get an overview of consumer configurations
- Integrate Kafka with Apache Hadoop and Storm
- Use Kafka administration tools

Approach

The book will follow a step-by-step tutorial approach which will show the readers how to use Apache Kafka for messaging from scratch.

Who this book is written for

Apache Kafka is for readers with software development experience, but no prior exposure to Apache Kafka or similar technologies is assumed. This book is also for enterprise application developers and big data enthusiasts who have worked with other publisher-subscriber based systems and now want to explore Apache Kafka as a futuristic scalable solution.

Citations

•Journal Article•10.1016/J.FUTURE.2018.04.032

An experimental survey on big data frameworks

Wissem Inoubli, +4 more

- 01 Sep 2018

- Future Generation Computer Systems

TL;DR: The challenges of Big Data are discussed, existing Big Data frameworks are surveyed, and best practices for using the studied frameworks in application domains such as machine learning, graph processing, and real-world applications are presented.

134

•Proceedings Article•10.1145/2987443.2987482

BGPStream: A Software Framework for Live and Historical BGP Data Analysis

Chiara Orsini, +4 more

- 14 Nov 2016

TL;DR: BGPStream is presented, an open-source software framework for the analysis of both historical and real-time Border Gateway Protocol (BGP) measurement data, enabling efficient investigation of events, rapid prototyping, and building complex tools and large-scale monitoring applications.

112

Proceedings Article•10.1109/ICDCS.2018.00105

LogLens: A Real-Time Log Analysis System

Biplob Debnath, +9 more

- 02 Jul 2018

TL;DR: This paper presents a real-time log analysis system called LogLens that automates the process of anomaly detection from logs with no (or minimal) target system knowledge and user specification, and presents an extensible system for supporting both stateless and stateful log analysis applications.

111

Online Anomaly Detection over Big Data Streams.

Laura Rettig, +3 more

- 01 Jan 2019

TL;DR: A combination of the two metrics put forward can be applied to detect several types of anomalies - like infrastructure failures, hardware misconfiguration or user-driven anomalies - in large-scale telecommunication networks.

89

Proceedings Article•10.1145/3472883.3486981

Atoll: A Scalable Low-Latency Serverless Platform

Arjun Singhvi, +5 more

- 01 Nov 2021

TL;DR: In this article, the authors present Atoll, a serverless platform that overcomes the challenges of serverless platforms via a ground-up redesign of the control and data planes.

87

References

Proceedings Article•10.1145/2523616.2523633

Apache Hadoop YARN: yet another resource negotiator

Vinod Kumar Vavilapalli, +15 more

- 01 Oct 2013

TL;DR: This paper summarizes the design, development, and current state of deployment of YARN, the next generation of Hadoop's compute platform, which decouples the programming model from the resource management infrastructure and delegates many scheduling functions to per-application components.

2.2K