Apache Kafka
Apache Kafka is an open-source distributed event streaming platform designed to handle high-throughput, real-time data pipelines by allowing applications to publish, subscribe to, store, and process streams of records, also known as events or messages.[1] Originally developed at LinkedIn in 2010 to manage real-time data feeds for user activity tracking, it was open-sourced in 2011 and became a top-level Apache Software Foundation project in 2012.[2] The platform's architecture consists of a cluster of servers called brokers that store data in partitioned topics, with replication for fault tolerance (typically using a replication factor of three), and supports clients for producing and consuming data via a high-performance TCP-based protocol.[1] Key features include durable storage with configurable retention periods, horizontal scalability to handle trillions of messages per day, real-time stream processing through the Kafka Streams API, and integration capabilities via Kafka Connect for connecting to external systems like databases and cloud services.[1]
Kafka is widely adopted across industries for use cases such as operational monitoring, aggregating distributed application statistics, logistics tracking, IoT data processing, and enabling microservices communication, with thousands of companies—including over 80% of the Fortune 100—relying on it for building data-intensive applications.[3][4] Its ecosystem includes APIs for Java and Scala, along with community-supported clients in languages like Python, Go, and C++, making it versatile for developers building fault-tolerant, secure streaming systems on-premises, in the cloud, or in containers.[1]
Overview
Definition and Purpose
Apache Kafka is an open-source, distributed event streaming platform designed for building real-time data pipelines and streaming applications.[1] It enables the continuous import and export of data between systems, capturing event streams from diverse sources such as databases and sensors for storage, processing, and routing.[1] The primary purpose of Kafka is to provide high-throughput, fault-tolerant publishing and subscribing to streams of records, functioning as a distributed commit log for big data applications.[1] This design ensures durable storage of events with configurable retention policies, allowing multiple reads without data deletion upon consumption, and supports replication for reliability even in the event of server failures.[1] By maintaining constant performance regardless of data volume, Kafka facilitates scalable, real-time data handling in environments requiring low-latency processing.[1]
As a core alternative to traditional message brokers, Kafka emphasizes a publish-subscribe model where producers publish records to topics—logical channels that can be partitioned for parallelism—and multiple consumers subscribe independently.[1] This contrasts with queue-based systems like RabbitMQ or ActiveMQ, which typically delete messages after delivery to a single consumer; Kafka's log-based persistence instead retains records for replay and multi-subscriber access, enabling more flexible event-driven architectures.[5][6]
Key Features
Apache Kafka's durability stems from its design as a distributed commit log, where messages are appended to persistent storage on disk and retained for configurable periods, ensuring data is not lost even in the event of failures.[1] This append-only log structure, combined with configurable replication factors across multiple brokers, provides high availability and fault tolerance, allowing messages to survive broker crashes or network partitions.[7]
Scalability in Kafka is achieved through horizontal partitioning of topics across a cluster of brokers, enabling the system to handle growing data volumes by adding more nodes without downtime.[1] Replication of partitions ensures data redundancy and load balancing, supporting seamless distribution of read and write operations for massive-scale deployments.[8]
Kafka delivers high throughput, capable of processing millions of messages per second, thanks to efficient batching of records during production and consumption, along with zero-copy techniques that minimize data copying between kernel and user space.[9] These optimizations maintain consistent performance regardless of data size, making it suitable for real-time applications like activity tracking or log aggregation.[1]
Exactly-once semantics are supported through idempotent producers, which use unique producer IDs and sequence numbers to deduplicate messages, and transactional APIs that enable atomic operations across multiple partitions.[10] This ensures that read-process-write cycles complete without duplicates or losses, a capability integrated into Kafka Streams for reliable stream processing.[11]
Stream processing integration is facilitated by the Kafka Streams API, which allows building real-time applications directly on Kafka topics without relying on external processing frameworks.[12] It supports transformations, aggregations, and joins on event streams, enabling in-place analytics and stateful computations at scale.[1]
Backward compatibility is a core principle in Kafka's evolution, ensuring that clients and brokers from older versions can interoperate with newer releases, facilitating rolling upgrades without service interruptions. Recent advancements, such as the shift to KRaft mode for metadata management, maintain this compatibility while improving operational efficiency.[13]
History and Development
Origins and Early Development
Apache Kafka originated in 2010 at LinkedIn, where engineers Jay Kreps, Neha Narkhede, and Jun Rao developed it to solve pressing data pipeline challenges in tracking user activity across the platform.[14] At the time, LinkedIn generated enormous volumes of event data from user interactions, requiring a unified system for real-time log aggregation, metrics collection, and distribution to various consumers like search indexes and monitoring tools, which existing solutions handled inefficiently.[14] The initial motivation stemmed from the limitations of traditional message queues and batch processing tools, which struggled with the scale and latency demands of LinkedIn's operations.[14]
The project began as an internal tool, with its first release deployed at LinkedIn in January 2011 to manage high-throughput streams of activity data, such as page views and connections.[15] Recognizing its broader potential, the team open-sourced Kafka later in 2011 under the Apache License 2.0, entering it into the Apache Incubator in July of that year.[15] Early internal adoption focused on unifying event data handling, replacing fragmented pipelines for operational metrics and debugging, which allowed LinkedIn to process billions of messages daily with improved reliability.[14]
The name "Kafka" was selected by co-creator Jay Kreps, drawing inspiration from the author Franz Kafka, whose works explored themes of bureaucracy and inescapable systems—resonating with the platform's role in managing complex distributed data flows—and because the system was optimized for efficient writing.[16] This early development laid the foundation for Kafka's evolution into a widely adopted open-source project.
Apache Project Milestones
Apache Kafka entered the Apache Incubator on July 4, 2011, marking its formal transition into the Apache Software Foundation's open-source ecosystem.[17] This step allowed the project to benefit from Apache's governance model, community-driven development, and legal protections while continuing to evolve from its origins at LinkedIn. On October 23, 2012, Kafka graduated from the Incubator to become a top-level Apache project, signifying maturity, active community involvement, and alignment with Apache's meritocratic principles.[17]
Significant milestones in Kafka's release history under Apache include version 0.8.0, released on December 3, 2013, which introduced intra-cluster replication to enhance data durability and availability across brokers.[18] This feature addressed earlier limitations in fault tolerance, enabling Kafka to support production-scale deployments. Subsequent releases built on this foundation: version 0.9.0, released on November 24, 2015, added the Kafka Connect API for integrating with external data systems, simplifying data pipeline construction.[19] Version 0.10.0, released on May 24, 2016, introduced the Kafka Streams API, providing a client-side library for building real-time stream processing applications directly on Kafka topics.[20]
The project's community expanded notably with the founding of Confluent in 2014 by Kafka's original creators, which accelerated contributions, tooling, and enterprise adoption while maintaining open-source commitments. The inaugural Kafka Summit was held in San Francisco on April 26, 2016, fostering global collaboration among developers, operators, and users to discuss advancements in event streaming.[21] These events highlighted Kafka's growing ecosystem, with increasing committers and user contributions driving feature development.
Kafka adopted semantic versioning starting with release 0.10.0, structuring versions as MAJOR.MINOR.PATCH to better signal compatibility and changes, which stabilized the project for broader adoption. By the early 2020s, the stable release series had progressed to the 3.x line, incorporating incremental improvements in performance, security, and scalability while preserving backward compatibility.[22]
Kafka's influence within the Apache ecosystem is evident in its widespread adoption for real-time data systems by major companies, such as Netflix for content recommendation pipelines and Uber for processing ride-sharing events at scale.[23] These implementations underscore Kafka's role as a foundational technology for distributed streaming, powering mission-critical workloads across industries. The project has also paved the way for related advancements, including a transition to ZooKeeper-free operation in later versions.[24]
Recent Advancements
One of the most significant recent advancements in Apache Kafka is the introduction of KRaft (Kafka Raft) mode, proposed in KIP-500 in 2019, which replaces ZooKeeper with a Raft-based consensus protocol for metadata management directly within Kafka brokers.[25] This shift simplifies cluster architecture by consolidating metadata responsibilities into Kafka itself, eliminating the need for a separate ZooKeeper ensemble and reducing operational complexity.[26] KRaft became the only supported mode in Apache Kafka 4.0, released on March 18, 2025, marking the complete removal of the ZooKeeper dependency and enabling more efficient, scalable metadata handling.[27]
The migration from ZooKeeper to KRaft involves a structured process outlined in KIP-866, beginning with the generation of a unique cluster ID and configuration of a KRaft controller quorum on dedicated nodes.[28] Brokers are then restarted in dual-write mode, where metadata updates are simultaneously written to both ZooKeeper and the KRaft quorum's metadata log to ensure consistency during transition.[29] Once synchronization is verified by monitoring metadata log offsets and quorum health, ZooKeeper can be decommissioned, with final steps including broker reconfiguration to KRaft-only operation and validation of cluster stability via tools like the KRaft migration verification scripts.[30]
Apache Kafka 4.0 also introduced early access to Queues for Kafka via KIP-932, enabling traditional queue semantics such as per-message acknowledgments and retries through a new "share group" consumer model that allows multiple consumers to process messages from the same partition without strict ordering guarantees.[31] This feature addresses limitations in Kafka's pub-sub model for work queue patterns, supporting cooperative consumption where messages are load-balanced across consumers in a shared group until explicitly acknowledged.[32]
Building on this, Apache Kafka 4.1.0, released on September 2, 2025, includes performance enhancements such as optimized rebalance protocols and improved cloud-native integrations, including better support for containerized deployments and dynamic scaling in environments like Kubernetes.[22] These updates focus on reducing latency in group coordination and enhancing interoperability with cloud storage services for hybrid deployments.[33] The latest patch release, 4.1.1, was issued on November 12, 2025, providing bug fixes and stability enhancements.[22]
KRaft mode delivers substantial performance gains, with metadata operations achieving up to 8x faster throughput compared to ZooKeeper-based clusters, particularly in high-scale environments where controller elections and metadata propagation were bottlenecks.[34]
Looking ahead, Apache Kafka's development emphasizes tiered storage capabilities to decouple compute from long-term data retention, allowing older log segments to be offloaded to cost-effective remote storage like object stores while maintaining low-latency access to recent data.[35] Additionally, integrations with AI and machine learning workflows are gaining traction, leveraging Kafka Streams for real-time feature serving and model inference directly on event streams to support scalable AI pipelines.[36]
Architecture
Core Components
Apache Kafka's core components form the foundational infrastructure of a distributed cluster, enabling scalable event storage and processing. At the heart of the system are brokers, which are server processes responsible for managing the persistent storage of events and handling incoming requests from producers and consumers. Each broker maintains a subset of the cluster's data and coordinates with others to ensure overall system reliability and performance. Brokers communicate via a high-performance TCP network protocol, forming a distributed storage layer that supports horizontal scaling by distributing workloads across multiple nodes.[7]
The controller is a critical broker elected to oversee cluster-wide operations, including the assignment and reassignment of partition leadership, as well as monitoring broker health and state changes. It acts as the central coordinator, propagating configuration updates and ensuring consistent cluster behavior in response to failures or expansions. In traditional setups, the controller relied on external coordination, but modern implementations integrate this functionality internally for improved efficiency.[7]
Historically, ZooKeeper served as a legacy external service for storing cluster metadata, such as broker registrations, topic configurations, and leader elections, providing a centralized coordination mechanism for the distributed brokers. ZooKeeper ensured atomic updates and high availability through its own quorum-based protocol, but it introduced operational complexity as an additional dependency. With the release of Apache Kafka 4.0 in 2025, ZooKeeper mode was fully removed, marking the end of its role in new deployments.[37][7]
In its place, KRaft (Kafka Raft Metadata mode) implements an internal consensus protocol based on Raft, embedding the control plane directly within the Kafka brokers to manage metadata and elect controllers without external services. This transition simplifies operations and enhances scalability by leveraging Kafka's own log-based storage for quorum decisions. Cluster metadata, including details on brokers, configurations, and assignments, is now durably stored in a replicated metadata log, with periodic snapshots for quick recovery and consistency across the active quorum.[7][38]
Data storage within brokers occurs through log segments written sequentially to disk, optimizing for high-throughput append operations and efficient retrieval. Retention policies, configurable via parameters such as log.retention.ms for time-based deletion or log.retention.bytes for size-based limits, govern how long segments are preserved, allowing clusters to balance storage costs with data availability requirements. Compaction policies can further optimize space by retaining only the latest records for specific keys, ensuring the storage layer remains performant even as data volumes grow.[7]
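The broker-wide defaults named above (log.retention.ms, log.retention.bytes, log.cleanup.policy) can be overridden per topic with the corresponding topic-level settings retention.ms, retention.bytes, and cleanup.policy. The following Java sketch illustrates one way to apply such overrides with the Admin client; the topic name "events", the broker address, and the chosen values are illustrative assumptions, not prescriptions.

    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.AlterConfigOp;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;

    public class RetentionConfigExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address

            try (Admin admin = Admin.create(props)) {
                // Topic-level overrides for the hypothetical "events" topic:
                // keep data for 7 days and enable compaction alongside deletion.
                ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "events");
                List<AlterConfigOp> ops = List.of(
                    new AlterConfigOp(new ConfigEntry("retention.ms", "604800000"), AlterConfigOp.OpType.SET),
                    new AlterConfigOp(new ConfigEntry("cleanup.policy", "compact,delete"), AlterConfigOp.OpType.SET));
                admin.incrementalAlterConfigs(Map.of(topic, ops)).all().get();
            }
        }
    }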
Data Model
Apache Kafka's data model revolves around a publish-subscribe messaging system designed for high-throughput, fault-tolerant streaming of records. At its core, data is organized into topics, which serve as logical channels for categorizing streams of records. Topics act as immutable, append-only logs where producers publish records and consumers subscribe to read them, enabling multi-producer and multi-subscriber semantics. Events in a topic are durably stored with configurable retention policies based on time or size, allowing for replayability and decoupling of producers from consumers.[39]
Within each topic, data is subdivided into partitions, which are ordered, immutable sequences of records that form the fundamental unit of parallelism and scalability in Kafka. Partitions enable horizontal scaling by distributing the load across multiple brokers, with records appended sequentially to maintain total order within a partition but allowing independent processing across partitions. The number of partitions for a topic is specified at creation and can be increased later, though decreasing is not supported to preserve immutability.[39]
Each record in a partition is identified by an offset, a strictly increasing integer that serves as a unique positional identifier within that partition. Offsets allow consumers to track their reading progress and enable precise replay of records from any point, supporting at-least-once, at-most-once, or exactly-once semantics depending on configuration. Consumers commit offsets to track consumption, ensuring fault-tolerant resumption without data loss or duplication beyond configured guarantees.[39]
A record, also referred to as a message or event, is the basic unit of data in Kafka, structured as a key-value pair with optional headers, a timestamp, and metadata for serialization. The key is used for partitioning (e.g., via hashing for consistent distribution), while the value carries the payload, which can be serialized in formats like Avro, JSON, or Protobuf for schema evolution and interoperability. Timestamps can be set by producers (creation time) or brokers (log append time), and headers provide extensible metadata for routing or enrichment without altering the core payload. For example, a record might have a key of "user123" and a value representing a transaction event in JSON format.[39][40]
To facilitate scalable consumption, Kafka introduces consumer groups, an abstraction that coordinates multiple consumers to divide the workload of a topic's partitions dynamically. Each partition is assigned to exactly one consumer in the group, enabling parallel processing while ensuring no overlap or gaps in consumption. This load balancing is managed by the group coordinator in the broker, reassigning partitions upon consumer failures or additions for fault tolerance and elasticity. Running consumers in separate groups yields broadcast semantics (each group independently receives every record), while consumers within a single group share the partitions for load balancing, depending on the use case.[39]
Operation
Message Production and Consumption
In Apache Kafka, producers are client applications responsible for publishing records to specified topics within the cluster. Each record typically consists of a key, value, timestamp, and optional headers, which are serialized before transmission. Producers interact with Kafka brokers to determine the appropriate partition for each record, ensuring efficient distribution across the topic's partitions.[41]
Partitioning strategies in Kafka producers are configurable and play a crucial role in load balancing and ordering guarantees. By default, producers use a key-based hashing mechanism, where records sharing the same key are hashed to the same partition, preserving order for related messages while distributing unrelated ones evenly. If no key is provided, producers employ a round-robin strategy or sticky partitioning to optimize batching and throughput. These approaches allow producers to target specific partitions explicitly if needed, such as through the partition metadata fetched from brokers.[41]
Consumers in Kafka operate on a pull-based model, actively polling brokers for batches of records starting from their current offset positions within assigned partitions. This design enables consumers to control the pace of data ingestion, fetching records in configurable batch sizes to balance latency and throughput. Offsets, which represent the position in the log for each partition, allow consumers to resume processing from the last committed point after interruptions.[42]
For coordinated consumption, consumers join consumer groups where a rebalancing protocol manages partition assignments dynamically. When group membership changes—due to failures, additions, or removals—the protocol triggers a rebalance, revoking and reassigning partitions among members to ensure even distribution and fault tolerance. This process, often using the Kafka Coordinator, relies on heartbeats and session timeouts to detect changes efficiently.[42]
To prevent duplicate messages during production, Kafka supports idempotent producers, which are enabled by default since Kafka 3.0 (enable.idempotence=true). The producer assigns sequence numbers to records within each partition, allowing brokers to detect and discard duplicates based on the producer ID (PID) and epoch values. This eliminates duplicates introduced by retries without the overhead of full transactional exactly-once semantics, while maintaining compatibility with standard producer retry behavior.[43]
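As an illustrative sketch of the keyed-send path described above (not taken from the Kafka documentation), the following Java snippet publishes two records with the same key so that they land in the same partition; the broker address and the "user-activity" topic name are assumptions.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class KeyedProducerExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // assumed broker address
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Records sharing a key are hashed to the same partition, preserving
                // per-key ordering; keyless records are spread across partitions.
                producer.send(new ProducerRecord<>("user-activity", "user123", "{\"event\":\"page_view\"}"));
                producer.send(new ProducerRecord<>("user-activity", "user123", "{\"event\":\"click\"}"));
                producer.flush();
            }
        }
    }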
For scenarios requiring stronger guarantees, Kafka transactions enable atomic operations across multiple partitions and topics. Producers initiate transactions with a unique transactional ID; records written within a transaction become visible to consumers reading with isolation.level=read_committed only after a commit marker is written, and consumer offsets can be committed atomically as part of the same transaction. This mechanism provides exactly-once semantics for both production and downstream consumption, particularly useful in stream processing applications.[11]
Batch processing enhances efficiency in both production and consumption workflows. Producers accumulate multiple records into batches before sending them to brokers, controlled by parameters like batch.size and linger.ms (default: 5 ms since Kafka 4.0), which reduce network overhead and improve throughput. On the consumer side, fetched batches are processed in groups, with offsets committed periodically—either automatically or manually—to mark progress and enable reliable recovery. This batched approach scales Kafka's performance for high-volume event streams.[44][45]
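A minimal consumer sketch of this batched, manually committed consumption loop follows; the group ID "activity-processors", the topic name, and the broker address are illustrative assumptions.

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class BatchConsumerExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "activity-processors");     // hypothetical consumer group
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");         // commit manually after processing

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("user-activity"));
                while (true) {
                    // Pull a batch of records from the assigned partitions.
                    ConsumerRecords<String, String> batch = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : batch) {
                        System.out.printf("partition=%d offset=%d key=%s value=%s%n",
                                record.partition(), record.offset(), record.key(), record.value());
                    }
                    // Mark the batch as processed so the group resumes here after a restart.
                    consumer.commitSync();
                }
            }
        }
    }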
Replication and Fault Tolerance
Since Apache Kafka 4.0 (released in 2025), clusters operate exclusively in KRaft mode, where metadata—including controller election—is managed internally via a Raft consensus protocol among broker and controller nodes, eliminating the dependency on Apache ZooKeeper. Apache Kafka ensures data durability and availability through a robust replication mechanism that distributes partitions across multiple brokers, allowing the system to withstand broker failures without data loss. Each topic partition is replicated to a configurable number of brokers, typically set to three in production environments for a balance between redundancy and resource usage; the replication factor is specified per topic at creation time, with the broker-level default.replication.factor supplying the default. This replication model treats Kafka as a distributed commit log, where data is appended sequentially and synchronized across replicas to prevent single points of failure.[46][47]
Kafka employs a leader-follower replication protocol for each partition, where one broker serves as the leader, handling all client read and write requests, while the remaining brokers act as followers that passively replicate the leader's log. Followers maintain their own copies of the partition log by periodically sending fetch requests to the leader, pulling new data in batches controlled by parameters such as replica.fetch.max.bytes (default: 1 MB) and replica.fetch.min.bytes (default: 1 byte). This asynchronous replication ensures high throughput, as writes are acknowledged by the leader without waiting for all followers, though producers can specify acknowledgment levels (e.g., acks=all) to require confirmation from in-sync replicas for stronger durability guarantees.[46][48][49]
A key aspect of fault tolerance is the in-sync replicas (ISR) mechanism, which dynamically tracks the set of followers that are fully caught up with the leader, defined by being within a lag threshold specified by replica.lag.time.max.ms (default: 30 seconds). With Eligible Leader Replicas (ELR, enabled by default since Kafka 4.1 via KIP-966), min.insync.replicas is configured at the cluster level (default: 1, often set to 2 in production), and only eligible replicas within the ISR can become leaders, preventing data loss scenarios like the "last replica standing." The ISR list is maintained by the leader and shared with the active controller; only writes acknowledged by at least min.insync.replicas members of the ISR are considered committed, preventing data loss if the leader fails while followers lag. This configurable threshold allows tuning for availability versus durability—for instance, setting min.insync.replicas=2 with a replication factor of 3 ensures that writes succeed as long as at least two brokers are in sync.[46][50][51]
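The durability-versus-availability trade-off described above is usually fixed when a topic is created. The sketch below, with a hypothetical "payments" topic and an assumed local broker, creates a topic with a replication factor of three and min.insync.replicas=2 so that acks=all writes require two in-sync copies.

    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;

    public class DurableTopicExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address

            try (Admin admin = Admin.create(props)) {
                // Hypothetical "payments" topic: 6 partitions, 3 replicas each;
                // with producers using acks=all, writes succeed only while at
                // least 2 replicas are in sync.
                NewTopic payments = new NewTopic("payments", 6, (short) 3)
                        .configs(Map.of("min.insync.replicas", "2"));
                admin.createTopics(List.of(payments)).all().get();
            }
        }
    }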
The log replication protocol uses a high-water mark (HW) to denote the offset up to which data is considered replicated and safe for consumers to read, ensuring that committed messages are durable across the ISR. Consumers only see messages up to the HW, which advances as followers confirm replication, providing ordered and consistent reads even during failures. To handle leader failures, the active controller detects the outage via heartbeat mechanisms and triggers leader election, preferentially selecting a new leader from the eligible ISR to avoid data loss; unclean leader elections, where out-of-sync replicas could become leaders, are disabled by default (unclean.leader.election.enable=false) to prioritize consistency over availability.[46][52]
Fault recovery is automated through partition reassignment and preferred replica election, managed by the controller. Upon failure, the controller reassigns partition leadership to an in-sync follower, minimizing downtime, and tools like kafka-leader-election.sh (which superseded the older kafka-preferred-replica-election.sh) can be used to restore leadership to the preferred brokers for balanced load distribution. This process supports seamless failover, with the replication factor determining the system's tolerance—for example, a factor of three allows survival of up to two broker failures per partition without data unavailability. Overall, these mechanisms enable Kafka clusters to maintain high availability, with recovery times typically under a minute in well-configured setups.[46][53]
APIs and Interfaces
Producer and Consumer APIs
The Producer API in Apache Kafka provides a Java and Scala interface for applications to publish streams of records to one or more topics. It operates asynchronously by default, allowing the KafkaProducer class to send records via the send(ProducerRecord, Callback) method, where each record includes a key-value pair, topic, partition (optional), timestamp, and headers.[41] The callback mechanism enables handling of asynchronous acknowledgments, invoking onCompletion(RecordMetadata, Exception) upon success or failure after the broker processes the send request.[41]
Acknowledgment modes control the level of durability for sent records, configurable via the acks parameter. Setting acks=0 implements fire-and-forget semantics with no broker acknowledgment, prioritizing throughput but risking data loss. acks=1 requires acknowledgment only from the partition leader, balancing performance and reliability. acks=all (or -1) demands confirmation from all in-sync replicas, ensuring the highest durability against failures.[44]
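A short sketch combining the callback and acknowledgment settings discussed above follows; the "orders" topic, record contents, and broker address are assumptions used for illustration.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class AckCallbackExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
            props.put(ProducerConfig.ACKS_CONFIG, "all");                         // wait for all in-sync replicas
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("orders", "order-42", "created"),
                    (metadata, exception) -> {
                        if (exception != null) {
                            // Non-retriable failure or retries exhausted.
                            exception.printStackTrace();
                        } else {
                            System.out.printf("acked: partition=%d offset=%d%n",
                                    metadata.partition(), metadata.offset());
                        }
                    });
                producer.flush();
            }
        }
    }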
The Consumer API enables applications to subscribe to one or more topics and process streams of records using a poll-based interface. The KafkaConsumer class fetches batches of records via the poll(Duration) method, which blocks until data is available or the timeout expires, supporting efficient batch consumption.[42] Consumer groups facilitate coordinated consumption, where multiple consumers share partitions via a group coordinator; the group.id configuration identifies the group, and protocols like the consumer rebalance protocol handle partition assignment and offset management. Deserializers convert byte arrays from the broker into application objects, specified via key.deserializer and value.deserializer (e.g., StringDeserializer for string keys and values).[42]
Key configurations for both APIs include bootstrap.servers, a comma-separated list of broker addresses (e.g., localhost:9092) for initial cluster discovery and metadata fetching. Producers require serializers for keys and values (e.g., key.serializer=org.apache.kafka.common.serialization.StringSerializer), while consumers use corresponding deserializers. For producers, linger.ms (default: 5 ms) delays sending to allow batching multiple records into a single request, improving throughput at the cost of added latency; batch.size (default: 16 KB) limits batch payload size. Consumers use enable.auto.commit (default: true) to automatically commit offsets after polling, with auto.commit.interval.ms (default: 5000 ms) controlling commit frequency.[44][45]
Error handling in the Producer and Consumer APIs relies on configurable retries and application-level logic. Producers retry failed sends up to retries attempts (default: Integer.MAX_VALUE since Kafka 2.1.0, previously 0), with delivery.timeout.ms (default: 120000 ms) bounding the total time including retries; retriable exceptions like network timeouts trigger automatic retries, while non-retriable ones invoke the callback's exception handler. Consumers handle errors during polling via try-catch blocks for exceptions like OffsetOutOfRangeException, manually adjusting offsets or reprocessing records as needed. Dead-letter queues, while not built into the core APIs, are a common pattern where failed records are redirected to a dedicated topic for later inspection or reprocessing, implemented by producers sending erroneous records to an error topic upon callback failure.[44][42]
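Because the dead-letter pattern is application-level rather than built into the APIs, the following is only one possible sketch: records that fail processing are forwarded to a hypothetical "orders.dlq" topic, with consumer and producer construction omitted (see the earlier examples).

    import java.time.Duration;
    import java.util.List;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class DeadLetterQueueSketch {
        // Consumer and producer construction omitted; see the earlier examples.
        static void process(KafkaConsumer<String, String> consumer,
                            KafkaProducer<String, String> dlqProducer) {
            consumer.subscribe(List.of("orders"));
            while (true) {
                ConsumerRecords<String, String> batch = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : batch) {
                    try {
                        handle(record);                                  // application-specific logic
                    } catch (Exception e) {
                        // Redirect the failed record to a dedicated topic for later inspection.
                        dlqProducer.send(new ProducerRecord<>("orders.dlq", record.key(), record.value()));
                    }
                }
                consumer.commitSync();
            }
        }

        static void handle(ConsumerRecord<String, String> record) { /* ... */ }
    }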
Since Kafka 0.11.0 (released in 2017), the Producer API has supported idempotence and transactions for exactly-once semantics. Idempotence, enabled by enable.idempotence=true, assigns a unique producer ID and sequence numbers to records, allowing brokers to deduplicate retries without application changes; it requires acks=all and max.in.flight.requests.per.connection of at most 5 (the default). Transactions, configured via a unique transactional.id, enable atomic writes across multiple partitions using methods like initTransactions(), beginTransaction(), send(), commitTransaction(), and abortTransaction(), ensuring all-or-nothing delivery via a two-phase commit protocol and an internal __transaction_state topic. These features build on the new message format introduced in 0.11.0, enhancing reliability for distributed applications. In Kafka 4.0 (released March 2025), transaction support was extended for improved compatibility with KRaft mode.[54][55]
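The sketch below strings these transactional calls together; the transactional ID "payments-tx-1", the topic names, and the broker address are placeholders chosen for illustration. Setting transactional.id implicitly enables idempotence.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class TransactionalProducerExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // assumed broker address
            props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "payments-tx-1");    // hypothetical, unique per producer
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.initTransactions();            // register the transactional ID with the coordinator
                producer.beginTransaction();
                try {
                    // Writes to multiple topics/partitions commit or abort as a unit.
                    producer.send(new ProducerRecord<>("payments", "p-1", "debit"));
                    producer.send(new ProducerRecord<>("audit-log", "p-1", "debit recorded"));
                    producer.commitTransaction();
                } catch (Exception e) {
                    producer.abortTransaction();
                }
            }
        }
    }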
Connect API
The Kafka Connect API, introduced in Apache Kafka version 0.9.0 in November 2015, provides a framework for building scalable and reliable pipelines to import data from external sources into Kafka topics or export data from Kafka to external sinks.[56] This API abstracts the complexity of integrating Kafka with diverse systems, such as databases or file systems, by enabling continuous, at-scale data movement without requiring custom code for each integration.[57] It operates independently of the core Producer and Consumer APIs, focusing instead on declarative configuration for extract-transform-load (ETL) workflows.
At the core of the Connect API are connectors, which are pluggable components that define the logic for data ingestion or egress. Source connectors pull data into Kafka from external systems, such as JDBC databases or local files, while sink connectors push data out to destinations like HDFS or Elasticsearch.[58] These connectors are typically distributed through the Apache Kafka community or third-party repositories, allowing users to configure them via simple properties files that specify connection details, topics, and batching behaviors, as shown in the sketch below.[59]
Connect workers are the runtime processes that execute connectors, supporting both standalone mode for single-node deployments and distributed mode for production-scale operations. In distributed mode, multiple workers form a cluster coordinated through Kafka topics, providing automatic load balancing and failover. A REST interface exposes endpoints for managing the cluster, such as creating, pausing, or deleting connectors, enabling programmatic administration without direct access to worker processes.[60]
Each connector delegates its workload to one or more tasks, which are the atomic units of parallelism responsible for polling sources or flushing sinks; tasks scale horizontally by assigning more instances to available workers. Fault tolerance is ensured by committing task offsets—records of processed data positions—to dedicated internal Kafka topics, allowing seamless recovery from failures without data loss or duplication.[61]
For handling structured data, the Connect API integrates with external schema registries, such as Confluent Schema Registry, to manage schema evolution, supporting formats like Avro and Protobuf through compatibility modes such as backward or forward evolution. This integration ensures that changes to data schemas, such as adding optional fields, are validated and propagated consistently across connectors, preventing runtime errors in evolving pipelines. Because Connect stores its configuration, offsets, and status in Kafka topics, it runs unchanged on the ZooKeeper-free (KRaft) clusters that are standard in Kafka 4.0 and later.[62]
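As a minimal configuration sketch, the standalone-mode properties for the FileStreamSource demo connector that ships with Kafka might look as follows; the file path and destination topic are placeholders.

    # connect-file-source.properties: minimal standalone source connector sketch
    name=local-file-source
    connector.class=FileStreamSource
    tasks.max=1
    # placeholder input file and hypothetical destination topic
    file=/tmp/input.txt
    topic=connect-file-events

A standalone worker can be started with this file alongside a worker configuration (for example via bin/connect-standalone.sh), after which each appended line of the input file is published as a record to the configured topic.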
Streams API
Kafka Streams is a lightweight, embeddable client library for building real-time stream processing applications directly on Apache Kafka, introduced in version 0.10.0 released in May 2016. It supports Java and Scala programming languages and enables stateful, fault-tolerant processing where both input and output data are stored in Kafka topics, eliminating the need for external processing clusters.[12] The library allows applications to act as both consumers and producers, processing unbounded streams of records in parallel across multiple instances for scalability.[63]
The Streams Domain-Specific Language (DSL) offers a high-level, declarative API for common stream processing operations, abstracting complex low-level details. It supports stateful transformations such as joins between streams (KStream-KStream with windowed inner, left, or outer joins requiring co-partitioning), tables (KTable-KTable with non-windowed joins), or streams and tables (KStream-KTable with non-windowed inner or left joins).[64] Aggregations include rolling operations like aggregate, count, and reduce on grouped streams or tables to compute sums or other reductions per key.[65] Windowing enables time-based grouping for aggregations and joins, with types including tumbling windows (fixed-size, non-overlapping intervals, e.g., 5-minute periods), hopping windows (overlapping fixed intervals), and session windows (dynamic, gap-based merging of records within inactivity periods like 5 minutes).[66] These features are materialized as KTables or windowed KTables, leveraging automatic state management.[67]
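A compact DSL sketch of a windowed aggregation follows; the application ID, topic names, and broker address are assumptions, and the example simply counts records per key in five-minute tumbling windows.

    import java.time.Duration;
    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.TimeWindows;

    public class WindowedCountExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "page-view-counter"); // hypothetical application ID
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> views = builder.stream("page-views");        // hypothetical input topic

            // Count records per key in 5-minute tumbling windows and emit the
            // rolling results to an output topic.
            views.groupByKey()
                 .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))
                 .count()
                 .toStream()
                 .map((windowedKey, count) -> KeyValue.pair(
                         windowedKey.key() + "@" + windowedKey.window().start(), count.toString()))
                 .to("page-view-counts");                                        // hypothetical output topic

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            streams.start();
            Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        }
    }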
For more flexible control, the Processor API provides a low-level, imperative interface to build custom processing logic by defining individual processors that handle one record at a time. Developers implement the Processor interface with methods like process() for record handling, init() for setup, and close() for cleanup, using ProcessorContext to forward outputs, schedule punctuators (e.g., every 1000 ms for periodic tasks), and access metadata.[68] Custom topologies connect these processors with source, sink, and state store nodes via the Topology builder, supporting both stateless (e.g., simple transformations) and stateful operations. State stores, essential for aggregations or deduplication, default to RocksDB as an embeddable key-value engine for persistent, local storage, with fault tolerance ensured through compacted changelog topics that log all updates for recovery.[69][70]
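A minimal Processor API sketch, under the assumption of string keys and values, shows the init/process lifecycle and a wall-clock punctuator; the class name and schedule interval are arbitrary choices for illustration.

    import java.time.Duration;
    import org.apache.kafka.streams.processor.PunctuationType;
    import org.apache.kafka.streams.processor.api.Processor;
    import org.apache.kafka.streams.processor.api.ProcessorContext;
    import org.apache.kafka.streams.processor.api.Record;

    // A minimal custom processor: upper-cases values and periodically logs progress.
    public class UpperCaseProcessor implements Processor<String, String, String, String> {
        private ProcessorContext<String, String> context;
        private long seen = 0;

        @Override
        public void init(ProcessorContext<String, String> context) {
            this.context = context;
            // Wall-clock punctuator fires every 30 seconds, independent of record flow.
            context.schedule(Duration.ofSeconds(30), PunctuationType.WALL_CLOCK_TIME,
                    timestamp -> System.out.println("records seen so far: " + seen));
        }

        @Override
        public void process(Record<String, String> record) {
            seen++;
            // Forward a transformed copy downstream; a state store could be used here instead.
            context.forward(record.withValue(record.value().toUpperCase()));
        }

        @Override
        public void close() { }
    }

Such a processor would be wired between source and sink nodes of a Topology built with the addSource, addProcessor, and addSink methods mentioned above.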
A Kafka Streams application defines its data flow as a processor topology, a directed acyclic graph (DAG) of interconnected nodes including sources (input topics), processors (transformations), and sinks (output topics).[71] This structure allows parallel execution across stream tasks partitioned by input topics, with data flowing unidirectionally from sources through processors to sinks. Exactly-once semantics are natively supported via transactional commits that atomically coordinate input record consumption, local state updates, and output production, ensuring no duplicates or losses even during failures—enabled by setting processing.guarantee to exactly_once_v2, which supersedes the older exactly_once setting.[72][73]
Global state stores extend local stores by broadcasting the entire content of an input topic to every application instance, enabling shared, read-only access without partitioning constraints—useful for reference data like dimensions in joins (e.g., via GlobalKTable).[74] Unlike partitioned stores, global stores do not use changelog topics for restoration but replicate the source topic directly, ensuring all instances maintain identical state copies.[75]
Interactive queries allow external clients to access the results of stream processing by querying state stores in running Kafka Streams instances, supporting both local (direct access via KafkaStreams.store()) and remote (via RPC layers like REST) invocations.[76] Queries are read-only on built-in types like key-value or window stores, with metadata APIs (allMetadataForStore(), metadataForKey()) aiding discovery of relevant instances in distributed deployments; custom stores require implementing QueryableStoreType for compatibility.[77] This feature enables real-time querying of aggregates or joins without additional infrastructure.[78]
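As a local-query sketch, the following helper reads a value from a materialized key-value store of a running application; the store name "page-view-counts-store" is a hypothetical example of a store registered by the topology.

    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StoreQueryParameters;
    import org.apache.kafka.streams.state.QueryableStoreTypes;
    import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

    public class InteractiveQueryExample {
        // Assumes a running KafkaStreams instance whose topology materializes a
        // key-value store under the hypothetical name "page-view-counts-store".
        static Long lookup(KafkaStreams streams, String key) {
            ReadOnlyKeyValueStore<String, Long> store = streams.store(
                    StoreQueryParameters.fromNameAndType("page-view-counts-store",
                            QueryableStoreTypes.keyValueStore()));
            return store.get(key);   // read-only access to locally held state
        }
    }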
Admin API
The Admin API in Apache Kafka provides a Java client library for performing administrative tasks on a Kafka cluster, enabling management and inspection of topics, brokers, configurations, access control lists (ACLs), and other resources. Introduced in Kafka version 0.11.0.0, the API supports asynchronous operations that return KafkaFuture objects for handling results, and it requires a minimum broker version of 0.10.0.0.[79] The client is thread-safe and can be created using static factory methods like Admin.create(Properties), allowing administrators to interact with the cluster programmatically without relying solely on command-line tools.[79] In Kafka 4.0 (March 2025), the Admin API fully supports KRaft metadata for ZooKeeper-free clusters.
Key operations include creating and deleting topics via the createTopics and deleteTopics methods, which accept collections of topic specifications and return futures for tracking completion.[79] Configurations can be altered using alterConfigs (deprecated since version 2.3.0) or the preferred incrementalAlterConfigs for dynamic updates to broker, topic, or client settings without restarts.[79] Cluster inspection is facilitated by describeCluster, which retrieves metadata such as the cluster ID and broker nodes, while listConsumerGroups enumerates active consumer groups with details on their states and members.[79]
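A short inspection sketch using two of these operations follows; the broker address is an assumption and the output format is arbitrary.

    import java.util.Properties;
    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.ConsumerGroupListing;
    import org.apache.kafka.clients.admin.DescribeClusterResult;

    public class ClusterInspectionExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address

            try (Admin admin = Admin.create(props)) {
                // Cluster metadata: ID and broker nodes.
                DescribeClusterResult cluster = admin.describeCluster();
                System.out.println("cluster id: " + cluster.clusterId().get());
                cluster.nodes().get().forEach(node -> System.out.println("broker: " + node));

                // Active consumer groups registered with the cluster.
                for (ConsumerGroupListing group : admin.listConsumerGroups().all().get()) {
                    System.out.println("group: " + group.groupId());
                }
            }
        }
    }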
Partition management features allow reassigning partitions across brokers with alterPartitionReassignments to balance load or recover from failures, introduced in version 2.4.0.[79] The number of partitions can be increased using createPartitions (since version 1.0.0), supporting scalable topic growth, and preferred leader election is handled by electLeaders (since version 2.4.0) to optimize partition leadership for performance.[79]
The Admin client exposes operational metrics through its metrics() method, which returns a map of metric names to metric values that can also be exposed via JMX for monitoring client activity, including request latencies, success rates, and resource usage under the kafka.admin.client MBean domain.[80] This integration allows tools like JConsole to track administrative activity in real time, aiding in cluster diagnostics.[80]
Security management is supported through ACL operations, such as createAcls and deleteAcls for adding or removing permissions on resources, and describeAcls for querying existing bindings, all available since version 0.11.0.0 to enforce fine-grained access control.[79] These methods use AclBinding objects to specify principals, operations, and resources, ensuring secure administrative interactions in authorized environments.[79]
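The following sketch grants a hypothetical principal read access to a topic using these ACL operations; the principal name, topic, and broker address are placeholders.

    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.common.acl.AccessControlEntry;
    import org.apache.kafka.common.acl.AclBinding;
    import org.apache.kafka.common.acl.AclOperation;
    import org.apache.kafka.common.acl.AclPermissionType;
    import org.apache.kafka.common.resource.PatternType;
    import org.apache.kafka.common.resource.ResourcePattern;
    import org.apache.kafka.common.resource.ResourceType;

    public class AclExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address

            try (Admin admin = Admin.create(props)) {
                // Allow the hypothetical principal User:analytics to read the
                // "payments" topic from any host.
                AclBinding readPayments = new AclBinding(
                        new ResourcePattern(ResourceType.TOPIC, "payments", PatternType.LITERAL),
                        new AccessControlEntry("User:analytics", "*",
                                AclOperation.READ, AclPermissionType.ALLOW));
                admin.createAcls(List.of(readPayments)).all().get();
            }
        }
    }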
Advanced Topics
Security Features
Apache Kafka provides robust security mechanisms to protect data in transit and at rest, ensuring secure communication between clients, brokers, and controllers while controlling access to cluster resources. These features include authentication to verify identities, authorization to enforce permissions, encryption to safeguard data confidentiality, audit logging to track security-related activities, and specialized protections in its KRaft mode for metadata management. Implemented through configurable protocols and tools, these capabilities allow Kafka to meet enterprise-grade security requirements in distributed environments.[81]
Authentication in Kafka is handled primarily through SASL (Simple Authentication and Security Layer) mechanisms and mutual TLS (mTLS) for client-broker and inter-broker connections. SASL/PLAIN supports simple username/password authentication, configured via the sasl.jaas.config property in client and broker settings; because credentials are otherwise transmitted in cleartext, it should be combined with TLS encryption.[82] For stronger enterprise integration, SASL/GSSAPI enables Kerberos-based authentication, requiring configurations like sasl.kerberos.service.name and a JAAS login module to map Kerberos principals to Kafka users, ensuring secure identity verification in Kerberos-enabled networks.[82] Additionally, mTLS uses client certificates for bidirectional authentication, specified through security.protocol=SSL or SASL_SSL in listener configurations, where brokers and clients exchange X.509 certificates to establish trusted connections without relying on shared secrets.[83]
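A client-side configuration sketch for SASL/PLAIN over TLS follows; the host name, credentials, and trust store paths are placeholders that would be replaced with deployment-specific values.

    import java.util.Properties;
    import org.apache.kafka.clients.CommonClientConfigs;
    import org.apache.kafka.common.config.SaslConfigs;
    import org.apache.kafka.common.config.SslConfigs;

    public class SecureClientConfig {
        // Returns client properties for SASL/PLAIN over TLS; host names, file
        // paths, and credentials below are placeholders.
        static Properties secureProps() {
            Properties props = new Properties();
            props.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG, "broker1.example.com:9093");
            props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
            props.put(SaslConfigs.SASL_MECHANISM, "PLAIN");
            props.put(SaslConfigs.SASL_JAAS_CONFIG,
                    "org.apache.kafka.common.security.plain.PlainLoginModule required "
                    + "username=\"client\" password=\"client-secret\";");
            // Trust store containing the broker's CA certificate.
            props.put(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG, "/etc/kafka/client.truststore.jks");
            props.put(SslConfigs.SSL_TRUSTSTORE_PASSWORD_CONFIG, "truststore-secret");
            return props;
        }
    }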
Authorization relies on Access Control Lists (ACLs) to define fine-grained, resource-level permissions, allowing administrators to specify operations such as read, write, or describe on topics, consumer groups, and cluster-wide resources. The StandardAuthorizer, enabled via authorizer.class.name=org.apache.kafka.metadata.authorizer.StandardAuthorizer (replacing the ZooKeeper-based kafka.security.authorizer.AclAuthorizer), evaluates these ACLs against authenticated principals to enforce access policies, supporting role-based access control (RBAC) patterns by grouping permissions for users or roles.[84] Integration with external authentication systems is facilitated through SASL/JAAS configurations or custom principal.builder.class implementations, which can leverage directory services for principal resolution, though direct LDAP binding requires compatible JAAS modules.[84] ACLs for administrative tasks, such as managing permissions, can be applied using the Admin API.[85]
Encryption secures data both in transit and at rest. For data in transit, SSL/TLS protocols encrypt inter-broker traffic and client communications, configured by setting security.inter.broker.protocol=SSL or SASL_SSL and providing keystore/truststore files for certificate management, ensuring confidentiality across the network.[83] At rest, Kafka does not provide built-in encryption for log segments but supports it through underlying filesystem encryption (e.g., via OS-level tools like LUKS) or tiered storage integrations where remote stores like S3 handle encryption, allowing sensitive data to remain protected on disk without impacting core broker operations.[86]
Audit logging captures security events such as authentication attempts, authorization decisions, and RBAC enforcements, configurable through Log4j appenders in the broker's logging configuration to record details like principal actions and outcomes for compliance and forensics.[81] This enables administrators to monitor and audit access patterns, with logs supporting RBAC by recording permission checks against ACLs.
In KRaft mode—the only metadata management mode as of Apache Kafka 4.0 (2025), which fully replaces the former ZooKeeper dependency with a Raft-based quorum—security extends to the controller quorum, whose listeners (designated via controller.listener.names) can be secured with SASL and SSL protocols for controller-to-broker communication, preventing unauthorized metadata access.[13][87] Metadata replication traffic between controllers and brokers is protected by the same SSL/TLS configurations applied to broker listeners, keeping quorum-voted state confidential in transit, while ACLs protect metadata operations at the cluster level.[13]
Performance and Scalability
Apache Kafka achieves high throughput through several optimization techniques integrated into its core design. Zero-copy I/O minimizes data copying between kernel and user space, allowing direct transfer from the file system page cache to the network socket, which reduces CPU overhead and enables efficient handling of large data volumes. Compression codecs such as Snappy and LZ4 are supported at the producer level, reducing network and disk I/O by compressing batches of messages before transmission; Snappy offers a balance of speed and compression ratio, while LZ4 provides faster decompression for read-heavy workloads.[88] Batch sizing further enhances efficiency by grouping multiple messages into a single request, controlled via parameters like batch.size (default 16KB) and linger.ms (default 5ms), which amortizes overhead and can increase throughput by up to 10x compared to single-message sends.[44]
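A throughput-oriented producer configuration sketch follows; the specific values are illustrative starting points rather than recommendations, and the broker address is an assumption.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class ThroughputTuningConfig {
        // Producer settings biased toward throughput; values are illustrative.
        static Properties throughputProps() {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4"); // compress batches before sending
            props.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024);   // larger batches amortize request overhead
            props.put(ProducerConfig.LINGER_MS_CONFIG, 20);           // wait briefly to fill batches
            return props;
        }
    }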
Scalability in Kafka is primarily enabled by partitioning topics into shards distributed across brokers, allowing horizontal scaling where additional partitions parallelize load and throughput grows linearly with the number of partitions and consumers.[89] Cluster sizing guidelines recommend starting with 3-5 brokers for production workloads to ensure fault tolerance via replication factors of 3, with each broker handling 100-1000 partitions depending on message size and retention; for high-volume setups, workloads are segmented by topic to avoid overloading individual brokers.[90][91]
Effective monitoring is crucial for maintaining performance at scale, focusing on metrics such as under-replicated partitions, which indicate replication lag and potential data loss risks (alert if greater than 0), and consumer lag, measuring the offset difference between producers and consumers to detect processing bottlenecks.[92] Tools like Cluster Manager for Apache Kafka (CMAK, formerly Kafka Manager) provide web-based interfaces for visualizing these metrics, broker health, and partition distribution across clusters.
Benchmarks demonstrate Kafka's capacity for extreme scale; for instance, a three-broker cluster on commodity hardware achieved over 2 million writes per second with sub-millisecond latencies using optimized configurations.[93] Typical per-partition throughput exceeds 1 million messages per second under ideal conditions, though actual figures vary with hardware and tuning. The adoption of KRaft mode—the only metadata management mode since Apache Kafka 4.0 (2025), which replaces ZooKeeper—yields significant improvements, including up to 8x faster metadata operations in large clusters by streamlining quorum-based consensus.[13][34][87]
Tuning Kafka involves JVM heap allocation (recommend 6-8GB per broker with G1GC for low-latency garbage collection), disk I/O optimization via SSDs to handle sequential writes exceeding 500 MB/s per broker, and network bandwidth provisioning of at least 10 Gbps to sustain high-throughput replication without bottlenecks.[94][95]