Wednesday, August 20, 2025

Consumer Failover Across Data Centers

Active-Passive Consumption Across Data Centers
The explanation is as follows:
In Kafka, a common consumption pattern for multi-data center setups involves making data available in multiple locations but running the consuming service in only one data center (the primary) at a time. This approach is often used to ensure stronger consistency for the service.

If the primary data center becomes unavailable, the consuming service is designed to fail over to another data center and resume processing data with minimal disruption.
The diagram is as follows:
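A minimal sketch of such a consumer, assuming a hypothetical orders topic, a hypothetical order-processor group, and placeholder broker addresses in the primary data center; the same service would also be deployed in the passive data center, but kept idle until failover:

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PrimaryDcConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // The consuming service runs only against the primary data center's cluster.
        props.put("bootstrap.servers", "kafka-dc1-1:9092,kafka-dc1-2:9092"); // placeholder hosts
        props.put("group.id", "order-processor");                            // placeholder group
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders")); // placeholder topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // Process the record; offsets are committed in the primary cluster only.
                    System.out.printf("%s-%d@%d%n", record.topic(), record.partition(), record.offset());
                }
            }
        }
    }
}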

However, there is an offset replication problem here. The explanation is as follows:
For disaster recovery in active-passive consumption, it is not enough to replicate data between Kafka clusters; you also need to handle consumer group offsets. Because the Kafka clusters are separate and distinct, offsets are not shared and must be translated between clusters. Simply starting the consumer from the earliest or latest offset after failover can lead to data duplication or loss. Proper offset synchronization ensures the consumer resumes processing at the correct point, avoiding these issues.
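To make the failure mode concrete, here is a sketch of the naive approach: after failover, the same consumer group starts against the secondary cluster, where it has never committed any offsets, so auto.offset.reset alone decides where it begins. Host, topic, and group names are placeholders.

import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class NaiveFailoverConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // After failover the consumer points at the secondary cluster,
        // where its group has never committed any offsets.
        props.put("bootstrap.servers", "kafka-dc2-1:9092"); // placeholder secondary DC host
        props.put("group.id", "order-processor");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        // "earliest": re-reads everything already processed in the primary DC -> duplicates.
        // "latest":   skips everything produced between the last commit in the primary DC
        //             and the moment the consumer starts here -> data loss.
        props.put("auto.offset.reset", "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));
            // ... poll loop as before; the starting position is wrong either way.
        }
    }
}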

Cross-DC Failover Options
Some solutions that address this problem are as follows:
1. Stretch Cluster
The explanation is as follows:
A Kafka stretch cluster is a single cluster with brokers distributed across multiple data centers. By assigning a unique rack ID to each data center, Kafka can ensure that partition replicas are spread across different locations. In the event of a data center failure, the consumer can continue processing from the remaining replicas without needing to synchronize offsets between clusters, since all replicas and offsets are managed within the same stretched cluster.

However, this setup relies on a quorum of ZooKeeper nodes or KRaft controllers, and at least three data centers are needed to avoid a split-brain scenario. Furthermore, high latency between Agoda’s geographically distant data centers makes this approach impractical for our needs.
The diagram is as follows:
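A minimal broker-side sketch of the rack idea, assuming a single stretched cluster spanning three data centers named dc1, dc2, and dc3; the property keys are standard Kafka broker settings, the values are placeholders:

# server.properties for a broker located in data center 1
broker.id=1
broker.rack=dc1                  # brokers in the other data centers use dc2 / dc3

# with three replicas and rack-aware placement, each partition keeps a replica
# in every data center, so consumer group offsets survive the loss of one DC
default.replication.factor=3
min.insync.replicas=2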


2. MirrorMaker 2
The explanation is as follows:
MirrorMaker 2 is a tool built on the Kafka Connect framework, designed to replicate topics and offsets from a source Kafka cluster to a target cluster, and uses a set of specialized connectors:
  • MirrorHeartbeatConnector to monitor the health of the replication process,
  • MirrorSourceConnector to replicate data and topic configurations, and
  • MirrorCheckpointConnector to translate and synchronize consumer group offsets.
The main advancement over the original MirrorMaker 1 is MirrorMaker 2’s ability to synchronize consumer group offsets across clusters. This allows consumers to fail over from one data center’s Kafka cluster to another.
The diagram is as follows:
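A minimal MirrorMaker 2 configuration sketch for an active-passive pair, assuming the clusters are aliased primary and backup (aliases and hosts are placeholders). Emitting checkpoints is what makes offset translation possible, and sync.group.offsets additionally writes the translated offsets into the backup cluster's consumer group as long as the group is not active there:

# mm2.properties (sketch)
clusters = primary, backup
primary.bootstrap.servers = kafka-dc1-1:9092
backup.bootstrap.servers = kafka-dc2-1:9092

# replicate data and consumer group offsets from primary to backup
primary->backup.enabled = true
primary->backup.topics = .*
primary->backup.emit.checkpoints.enabled = true
primary->backup.sync.group.offsets.enabled = true
primary->backup.sync.group.offsets.interval.seconds = 60

For an explicit failover step, here is a hedged sketch using RemoteClusterUtils from the connect-mirror-client artifact to translate the group's last committed offsets from the primary cluster into positions valid on the backup cluster, and then resume from there (group, topic, and host names are the same placeholders as above):

import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.connect.mirror.RemoteClusterUtils;

public class MirrorMakerFailover {
    public static void main(String[] args) throws Exception {
        // Connection properties for the *backup* cluster, where MirrorMaker 2
        // stores the checkpoints it emitted for the "primary" source cluster.
        Map<String, Object> backupClusterProps = new HashMap<>();
        backupClusterProps.put("bootstrap.servers", "kafka-dc2-1:9092"); // placeholder host

        // Translate the group's committed offsets from "primary" into offsets
        // that are valid on the backup cluster's replicated (primary.*) topics.
        Map<TopicPartition, OffsetAndMetadata> translated =
                RemoteClusterUtils.translateOffsets(
                        backupClusterProps, "primary", "order-processor", Duration.ofSeconds(30));

        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "kafka-dc2-1:9092");
        consumerProps.put("group.id", "order-processor");
        consumerProps.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.assign(translated.keySet());
            translated.forEach((tp, offset) -> consumer.seek(tp, offset.offset()));
            // ... normal poll loop continues from the translated positions.
        }
    }
}

With sync.group.offsets.enabled, the translation happens continuously on the MirrorMaker 2 side, so the failover consumer can often simply start with its usual group.id on the backup cluster; the programmatic translation above is the alternative when you want to control the seek yourself.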



