Open Access Book
Apache Kafka
Nishant Garg
- 17 Oct 2013
126
TL;DR: This book follows a step-by-step tutorial approach that shows readers how to use Apache Kafka for messaging from scratch, and how Kafka works with other tools such as Hadoop and Storm.
Abstract: Set up Apache Kafka clusters and develop custom message producers and consumers using practical, hands-on examples.
Overview
- Write custom producers and consumers with message partition techniques
- Integrate Kafka with Apache Hadoop and Storm for use cases such as processing streaming data
- Provide an overview of Kafka tools and other contributions that work with Kafka in areas such as logging, packaging, and so on
In Detail
Message publishing is a mechanism for connecting heterogeneous applications through messages that are routed between them, for example by a message broker such as Apache Kafka. Such solutions deal with real-time volumes of information and route it to multiple consumers without letting information producers know who the final consumers are.
Apache Kafka is a practical, hands-on guide that takes you through a series of clear, step-by-step implementations. These will help you take advantage of the real power behind Kafka, quickly and painlessly, and give you a strong grounding for using it in your publisher-subscriber based architectures.
You will learn everything you need to know for setting up Kafka clusters. This book explains how Kafka's basic building blocks, such as producers, brokers, and consumers, actually work and fit together. You will then explore additional settings and configuration changes to achieve ever more complex goals. Finally, you will learn how Kafka works with other tools such as Hadoop and Storm.
You will learn everything you need to know to work with Apache Kafka, as well as how to leverage its ability to handle hundreds of megabytes of messages per second from multiple clients.
What you will learn from this book
- Download and build Kafka
- Set up single as well as multi-node Kafka clusters and send messages
- Learn Kafka design internals and message compression
- Understand how replication works in Kafka
- Write Kafka message producers and consumers using the Kafka producer API
- Get an overview of consumer configurations
- Integrate Kafka with Apache Hadoop and Storm
- Use Kafka administration tools
Approach
The book follows a step-by-step tutorial approach that shows readers how to use Apache Kafka for messaging from scratch.
Who this book is written for
Apache Kafka is for readers with software development experience; no prior exposure to Apache Kafka or similar technologies is assumed. This book is also for enterprise application developers and big data enthusiasts who have worked with other publisher-subscriber based systems and now want to explore Apache Kafka as a futuristic, scalable solution.
Citations
An experimental survey on big data frameworks
TL;DR: Discusses the challenges of Big Data, surveys existing Big Data frameworks, and presents best practices for using the studied frameworks in application domains such as machine learning, graph processing, and real-world applications.
BGPStream: A Software Framework for Live and Historical BGP Data Analysis
Chiara Orsini,Alistair King,Danilo Giordano,Vasileios Giotsas,Alberto Dainotti +4 more
- 14 Nov 2016
TL;DR: BGPStream is presented, an open-source software framework for the analysis of both historical and real-time Border Gateway Protocol (BGP) measurement data, enabling efficient investigation of events, rapid prototyping, and building complex tools and large-scale monitoring applications.
LogLens: A Real-Time Log Analysis System
Biplob Debnath,Mohiuddin Solaimani,Muhammad Ali Gulzar,Nipun Arora,Cristian Lumezanu,Jian-Wu Xu,Bo Zong,Hui Zhang,Guofei Jiang,Latifur Khan +9 more
- 02 Jul 2018
TL;DR: This paper presents a real-time log analysis system called LogLens that automates the process of anomaly detection from logs with no (or minimal) target system knowledge and user specification, and presents an extensible system for supporting both stateless and stateful log analysis applications.
111
Online Anomaly Detection over Big Data Streams.
Laura Rettig,Mourad Khayati,Philippe Cudré-Mauroux,Michal Piorkowski +3 more
- 01 Jan 2019
TL;DR: A combination of the two metrics put forward can be applied to detect several types of anomalies - like infrastructure failures, hardware misconfiguration or user-driven anomalies - in large-scale telecommunication networks.
89
Atoll: A Scalable Low-Latency Serverless Platform
Arjun Singhvi,Arjun Balasubramanian,Kevin Houck,Mohammed Danish Shaikh,Shivaram Venkataraman,Aditya Akella +5 more
- 01 Nov 2021
TL;DR: In this article, the authors present Atoll, a serverless platform that overcomes the challenges of serverless platforms via a ground-up redesign of the control and data planes.
87
References
Apache Hadoop YARN: yet another resource negotiator
Vinod Kumar Vavilapalli,Arun C. Murthy,Chris Douglas,Sharad Agarwal,Mahadev Konar,Robert Evans,Thomas Graves,Jason Lowe,Hitesh Shah,Siddharth Seth,Bikas Saha,Carlo Curino,Owen O'Malley,Sanjay Radia,Benjamin Reed,Eric Baldeschwieler +15 more
- 01 Oct 2013
TL;DR: Summarizes the design, development, and current state of deployment of YARN, the next generation of Hadoop's compute platform, which decouples the programming model from the resource management infrastructure and delegates many scheduling functions to per-application components.