Wednesday, August 20, 2025

Consumer Failover Across Data Centers

Active-Passive Consumption Across Data Centers
The explanation is as follows:
In Kafka, a common consumption pattern for multi-data center setups involves making data available in multiple locations but running the consuming service in only one data center (the primary) at a time. This approach is often used to ensure stronger consistency for the service.

If the primary data center becomes unavailable, the consuming service is designed to failover to another data center and resume processing data with minimal disruption.
Shown in the figure:

However, there is an offset replication problem here. The explanation is as follows:
For disaster recovery in active-passive consumption, it is not enough to replicate data between Kafka clusters; you also need to handle consumer group offsets. Because the Kafka clusters are separate and distinct, offsets are not shared and must be translated between clusters. Simply starting the consumer from the earliest or latest offset after failover can lead to data duplication or loss. Proper offset synchronization ensures the consumer resumes processing at the correct point, avoiding these issues.
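As a rough sketch of what resuming at the correct point looks like in practice, the translated offsets can be applied to the consumer group on the secondary cluster with kafka-consumer-groups.sh before the service is restarted. The group, topic, broker address, and offset values below are illustrative, and producing the translated offsets is the job of the replication tooling described below.

translated-offsets.csv (one topic,partition,offset line per partition):
my-topic,0,1200
my-topic,1,980
my-topic,2,1105

kafka-consumer-groups.sh --bootstrap-server dc2-kafka:9092 --group my-consumer-group --reset-offsets --from-file translated-offsets.csv --execute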

Cross-DC Failover Options
Some solutions that address this problem are:
1. Stretch Cluster
The explanation is as follows:
A Kafka stretch cluster is a single cluster with brokers distributed across multiple data centers. By assigning a unique rack ID to each data center, Kafka can ensure that partition replicas are spread across different locations. In the event of a data center failure, the consumer can continue processing from the remaining replicas without needing to synchronize offsets between clusters, since all replicas and offsets are managed within the same stretched cluster.

However, this setup uses a quorum of ZooKeeper nodes or KRaft controllers, and at least three data centers are needed to avoid a split brain scenario. Furthermore, high latency between Agoda’s geographically distant data centers makes this approach impractical for our needs.
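For reference, the rack-awareness part of a stretch cluster is plain broker configuration; a minimal sketch, assuming one rack ID per data center (broker IDs and rack names are illustrative):

# server.properties on a broker in data center 1
broker.id=1
broker.rack=dc1

# server.properties on a broker in data center 2
broker.id=4
broker.rack=dc2

With broker.rack set this way, Kafka's rack-aware replica assignment spreads each partition's replicas across the data centers when topics are created.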
Shown in the figure:


2. MirrorMaker 2
The explanation is as follows:
MirrorMaker 2 is a tool built on the Kafka Connect framework, designed to replicate topics and offsets from a source Kafka cluster to a target cluster, and uses a set of specialized connectors:
  • MirrorHeartbeatConnector to monitor the health of the replication process,
  • MirrorSourceConnector to replicate data and topic configurations, and
  • MirrorCheckpointConnector to translate and synchronize consumer group offsets.
The main advancement over the original MirrorMaker 1 is MirrorMaker 2’s ability to synchronize consumer group offsets across clusters. This allows consumers to failover from one data center’s Kafka to another.
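A minimal sketch of the corresponding MirrorMaker 2 configuration, assuming two clusters aliased primary and backup (aliases, addresses, and the topic filter are illustrative):

# mm2.properties
clusters = primary, backup
primary.bootstrap.servers = dc1-kafka:9092
backup.bootstrap.servers = dc2-kafka:9092

# replicate topics and their configurations from primary to backup
primary->backup.enabled = true
primary->backup.topics = .*

# emit checkpoints and translate/sync consumer group offsets to the backup cluster
primary->backup.emit.checkpoints.enabled = true
primary->backup.sync.group.offsets.enabled = true

The replication flow is then started with connect-mirror-maker.sh mm2.properties.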
Shown in the figure:




Monday, August 18, 2025

Object Storage

Introduction
The explanation is as follows:
The year 2023 witnessed the emergence of building Kafka on object storage; at least five vendors have introduced such a solution since then. WarpStream and AutoMQ arrived in 2023, followed by Confluent Freight Clusters, Bufstream, and Redpanda Cloud Topics in 2024.
The explanation is as follows:
Each vendor did this with its own approach. At a high level, these systems speak the Kafka protocol and store the complete data in object storage.

- Bufstream and Warpstream rewrite the Kafka protocol from scratch. 

- AutoMQ takes a very different approach, leveraging Kafka's code for the protocol layer to ensure 100% Kafka compatibility while re-implementing the storage layer, so the broker can write data to object storage without sacrificing latency, thanks to the introduction of a write-ahead log.

Sunday, July 6, 2025

Bufstream - A Kafka Alternative

Introduction
The explanation is as follows:
Bufstream was developed by Buf, a software company founded in 2020 to bring schema-driven development to many companies via Protobuf and gRPC.
The explanation is as follows:
For storage, instead of writing to a local disk, Bufstream writes directly to object storage such as AWS S3, Google Cloud Storage, or Azure Blob Storage, letting these services take charge of data durability and availability.

Wednesday, November 13, 2024

The kafka-consumer-groups.sh Command

Introduction
Shows the consumers listening to a topic. Multiple consumer groups can listen to the same topic. Since each topic can be split into multiple partitions, the output also shows which partition each group is listening to.

The --delete option
Example
We run:
kafka-consumer-groups.sh --bootstrap-server localhost:9092 --delete --group group-name

The --describe option
Example
We run:
kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group ParserKafkaPipeline
Example
We run it like this. Here, two consumer groups listening to my-topic can be seen.
kafka-consumer-groups --bootstrap-server localhost:9092 --describe --all-groups

GROUP TOPIC  PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID  HOST        CLIENT-ID
first my-topic  2          0               0            0    consumer-2   /172.18.0.9 consumer-2
first my-topic  0          0               0            0    consumer-2   /172.18.0.9 consumer-2
first my-topic  1          0               0            0    consumer-2   /172.18.0.9 consumer-2

GROUP TOPIC  PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID  HOST        CLIENT-ID
second my-topic  0          0               0           0    consumer-2   /172.18.0.8 consumer-2
second my-topic  1          0               0           0    consumer-2   /172.18.0.8 consumer-2
second my-topic  2          0               0           0    consumer-2   /172.18.0.8 consumer-2
The explanation is as follows:
Sometimes it's useful to see the position of your consumers. We have a tool that will show the position of all consumers in a consumer group as well as how far behind the end of the log they are. To run this tool on a consumer group named my-group consuming a topic named my-topic
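The command that the quoted sentence refers to would be the following, with my-group taken from the quote:
kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group my-group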
The --list option
Example
To list everything, we run:
kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list
The --reset-offsets option
Example
We run:
kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group group-name --reset-offsets --to-earliest --topic topic-name --execute
Example
We run it like this. Here, --dry-run lets us test what would happen.
kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group group-name --reset-offsets --to-earliest --topic topic-name --dry-run



Wednesday, October 30, 2024

WarpStream - An Object-Store Based Kafka Alternative

Introduction
The explanation is as follows:
WarpStream avoids cross-AZ transfer costs by hacking the service discovery to ensure that the client always communicates with the broker in the same AZ. WarpStream’s rewriting of the Kafka protocol plays a vital role here.

AutoMQ

Introduction
The explanation is as follows:
When bringing Apache Kafka to the cloud, its replication factor causes the leader to send received data to other followers in different Availability Zones (AZs). The data transfer cost may not seem obvious at first compared to compute and storage costs; however, based on observations from Confluent, cross-AZ transfer costs can surprisingly account for more than 50% of the total bill (more on this later).
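As a rough, hypothetical illustration of how this adds up: with a replication factor of 3 spread over three AZs, every byte the leader receives is copied to two followers in other AZs, so 1 TB/day of produced data already creates about 2 TB/day of cross-AZ replication traffic before any producer or consumer traffic is counted. At AWS's typical inter-AZ rate of roughly $0.01/GB charged on each side (about $0.02/GB per transfer), that is about $40/day, or roughly $1,200/month, for replication alone.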
The explanation is as follows:
The AutoMQ solution is designed to run Kafka efficiently on the cloud by leveraging Kafka's codebase for the protocol and rewriting the storage layer so it can effectively offload data to object storage, with the introduction of a WAL.
Shown in the figure:

The AutoMQ Solution - From Kafka to Iceberg
The explanation is as follows:
The user only needs to set automq.table.topic.enable to use the Kafka-Iceberg feature.

After enabling it, the producers still use the Kafka protocol to write data to AutoMQ. The brokers first write the data to the Kafka topic, then convert the data into the Iceberg table after batch accumulation in the background. From then on, the query engine can consume this table to serve analytics demands.

AutoMQ will take care of everything from retrieving the schema to committing the writes to the Iceberg catalog. Users no longer need to maintain complex ETL tasks; they only need to use the Kafka API to produce the data, and AutoMQ will seamlessly convert it into Iceberg tables.

Currently, AutoMQ only supports the Table Topic on AWS with different catalogs such as REST, Glue, Nessie, or Hive Metastore. They’re working to expand the support for this feature to other cloud vendors.
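A sketch of what enabling this per topic might look like through the standard Kafka CLI, assuming automq.table.topic.enable is applied as a topic-level config; the topic name is illustrative, any catalog-related settings are omitted, and AutoMQ's documentation should be checked for the exact invocation:
kafka-topics.sh --bootstrap-server automq-broker:9092 --create --topic orders --partitions 3 --config automq.table.topic.enable=true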


Wednesday, September 25, 2024

Who Developed Kafka?

Development Language
Kafka was developed by LinkedIn using Java and Scala. The explanation is as follows:
Apache Kafka is an open-source stream-processing software platform developed by LinkedIn and donated to Apache Software Foundation. It is written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency streaming platform for handling and processing real-time data feeds.
History
Development of Kafka started in 2010. The explanation is as follows:
In 2010, LinkedIn engineers faced the problem of integrating huge amounts of data from their infrastructure into a Lambda architecture, which also included Hadoop and real-time event processing systems.

Traditional message brokers didn't satisfy LinkedIn's needs; these solutions were too heavy and slow. So the engineering team developed a scalable and fault-tolerant messaging system without lots of bells and whistles. The new queue manager quickly transformed into a full-fledged event streaming platform.
Becoming Open Source
Kafka became open source in 2011 and was later donated to the Apache Foundation.

Relationship with Confluent
In 2014, Kafka's developers left LinkedIn and founded Confluent. Confluent went public in 2021.

Kafka's Pain Points
1. Diverging latency requirements
The explanation is as follows. In other words, Kafka does not meet everyone's latency requirements.
The latency expectations for modern systems have become more polarized. While financial services demand microsecond-level latency for stock trading, other use cases — such as logging or syncing data between operational databases and analytical systems — are fine with second-level latency. A one-size-fits-all solution doesn’t work anymore. Why should a company using Kafka for simple logging pay the same costs as one building mission-critical low-latency applications?
2. Batch systems are building their own ingestion tools
The explanation is as follows. In other words, there are now other options for moving data into analytical systems.
Platforms like Snowflake with Snowpipe, Amazon Redshift with its zero-ETL integrations, and ClickHouse, which recently acquired PeerDB, now offer built-in streaming data ingestion. These developments reduce the need for Kafka as the go-to system for moving data between environments. Kafka is no longer the only option for feeding data into analytical systems, leading to natural fragmentation in its traditional use cases.
3. Cloud infrastructure has made storage cheaper
The explanation is as follows. In other words, cloud object storage is now far cheaper than compute.
Object storage solutions like Amazon S3 have become significantly more affordable than compute nodes such as EC2. This makes it increasingly hard to justify using more expensive storage options, especially in a world where companies are constantly optimizing their cloud costs. As a result, Kafka needs to embrace architectures that take advantage of cheaper storage options or risk becoming an overly expensive component in data pipelines.

