Wednesday, May 31, 2023

KIP-932 - Queues for Kafka

Introduction
Schematically, it looks like this. It arrives with Apache Kafka 4.0.

In effect, it works like JMS. The explanation is as follows:
The new Kafka share group feature enables multiple consumers to read from a single partition simultaneously.

Horizontally scaling and adding more consumers can improve the overall system throughput.

Here’s how Kafka worked before:
- It divided a single topic into multiple partitions.
- Consumer groups with several consumers read messages from partitions.
- However, within a group each partition could be read by only one consumer, which limited throughput.
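To make the one-consumer-per-partition limit concrete, here is a minimal sketch in plain Java. It is a toy round-robin model, not one of Kafka's real assignors, and the class and method names are made up for illustration. It only shows that once there are more consumers than partitions, the extra consumers get nothing to read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: real Kafka assignors are more involved, but in a classic
// consumer group each partition still goes to at most one group member.
class ClassicAssignmentSketch {

    // Returns consumer index -> partitions owned, using simple round-robin.
    static Map<Integer, List<Integer>> assign(int partitions, int consumers) {
        Map<Integer, List<Integer>> owned = new TreeMap<>();
        for (int c = 0; c < consumers; c++) {
            owned.put(c, new ArrayList<>());
        }
        for (int p = 0; p < partitions; p++) {
            owned.get(p % consumers).add(p);
        }
        return owned;
    }

    // Counts consumers that received no partition and therefore sit idle.
    static long idleConsumers(Map<Integer, List<Integer>> owned) {
        return owned.values().stream().filter(List::isEmpty).count();
    }

    public static void main(String[] args) {
        // 3 partitions shared among 5 consumers
        Map<Integer, List<Integer>> owned = assign(3, 5);
        System.out.println(owned);
        System.out.println("idle consumers: " + idleConsumers(owned));
    }
}
```

With 3 partitions and 5 consumers, two consumers end up with an empty assignment, which is why adding consumers beyond the partition count did not help in the old model.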

The old Kafka model had no support for automatic message retries, visibility timeouts, or dead-letter queues.

To tackle these challenges, developers had to:
- Create more partitions to handle the throughput or hot partitions.
- Manually handle the retries and create new Dead-Letter Queue (DLQ) topics.
- Manage a separate Queue infrastructure like RabbitMQ/SQS for async processing.
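The manual retry and DLQ handling mentioned above can be sketched roughly as follows. This is plain Java with an in-memory list standing in for the DLQ topic. `ManualRetrySketch`, `process`, and `deadLetters` are hypothetical names, and a real application would publish failed messages to a separate topic with a producer instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: deadLetters stands in for a separate DLQ topic.
class ManualRetrySketch {

    final List<String> deadLetters = new ArrayList<>();
    final int maxAttempts;

    ManualRetrySketch(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    // Tries the handler up to maxAttempts times.
    // If every attempt fails, the message is parked in the dead-letter list.
    boolean process(String message, Consumer<String> handler) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                handler.accept(message);
                return true;
            } catch (RuntimeException e) {
                // Swallow the failure and fall through to the next attempt
            }
        }
        deadLetters.add(message);
        return false;
    }

    public static void main(String[] args) {
        ManualRetrySketch sketch = new ManualRetrySketch(3);
        sketch.process("poison", m -> {
            throw new IllegalStateException("cannot process " + m);
        });
        System.out.println("dead letters: " + sketch.deadLetters);
    }
}
```

Every team had to write and maintain some variant of this loop by hand, which is the boilerplate that share groups aim to take over on the broker side.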

The share group feature is enabled at the consumer group level. The broker then tracks the state of each message and lets consumers process and acknowledge messages independently.

Here’s how it solves the previous pain-points:
- Hot partitions can be processed by adding more consumers.
- Out-of-the-box support for retries and message archival.
- Streaming and message queuing are both supported. This avoids the need for a separate message queue.

However, share groups don’t provide ordering guarantees, since two or more consumers can process messages from the same partition simultaneously.

Applications such as financial ledger systems and e-commerce order management require strict ordering guarantees.

Such applications wouldn’t benefit from share groups, because out-of-order processing of messages would violate their business semantics.
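The loss of ordering can be illustrated with a small deterministic simulation. It is only a sketch under simple assumptions: records are handed out in offset order to whichever consumer becomes free first, and each consumer has a fixed processing time per record. `OrderingSketch` and `completionOrder` are illustrative names, not part of any Kafka API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Simulates several consumers draining one partition using virtual time.
class OrderingSketch {

    // Returns the offsets in the order in which their processing completes.
    static List<Integer> completionOrder(int records, long[] processingMsPerConsumer) {
        long[] freeAt = new long[processingMsPerConsumer.length];
        List<long[]> finished = new ArrayList<>(); // each entry is {finishTime, offset}
        for (int offset = 0; offset < records; offset++) {
            // Pick the consumer that becomes free earliest (lowest index wins ties)
            int next = 0;
            for (int c = 1; c < freeAt.length; c++) {
                if (freeAt[c] < freeAt[next]) {
                    next = c;
                }
            }
            freeAt[next] += processingMsPerConsumer[next];
            finished.add(new long[] {freeAt[next], offset});
        }
        // Sort by finish time, then by offset to keep the result deterministic
        finished.sort(Comparator.comparingLong((long[] f) -> f[0])
                                .thenComparingLong(f -> f[1]));
        List<Integer> order = new ArrayList<>();
        for (long[] f : finished) {
            order.add((int) f[1]);
        }
        return order;
    }

    public static void main(String[] args) {
        // One slow consumer (35 ms per record) and one fast consumer (10 ms per record)
        System.out.println(completionOrder(4, new long[] {35, 10}));
        // A single consumer keeps offset order
        System.out.println(completionOrder(4, new long[] {10}));
    }
}
```

With a slow and a fast consumer, offset 0 finishes last even though it was handed out first; with a single consumer the offsets complete in order.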
How It Works
The explanation is as follows:
How It Works (Without the Duplicates!)

You might be thinking: "If two people read the same partition, won't they process the same message?" The answer is NO. 

▪️ Individual Acknowledgments: Unlike the old "high-water mark" offset system, Share Groups track state at the individual message level. 
▪️ In-Flight Locking: When a consumer grabs a message, the broker "locks" it (Acquisition Lock). Other consumers in the group simply skip over it and grab the next available message. 
▪️ Throughput Boost: You can now scale your processing power (adding consumers) completely independently of your storage structure (partitions). 
▪️ Automatic Retries: If a consumer crashes while holding a message, the "lock" expires, and the message automatically becomes available for another consumer to try. 
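The per-message state, acquisition lock, and lock expiry described above can be modelled with a small single-threaded sketch. This is an illustration under stated assumptions, not the broker's actual implementation: the class, state names, and methods are invented here, time is passed in explicitly so the behaviour is deterministic, and delivery-count limits and archival are left out.

```java
import java.util.Arrays;

// Toy model of per-record state in a share group (illustrative names only).
class SharePartitionSketch {

    enum State { AVAILABLE, ACQUIRED, ACKNOWLEDGED }

    private final State[] states;
    private final String[] owners;
    private final long[] lockExpiresAtMs;
    private final long lockDurationMs;

    SharePartitionSketch(int recordCount, long lockDurationMs) {
        this.states = new State[recordCount];
        Arrays.fill(states, State.AVAILABLE);
        this.owners = new String[recordCount];
        this.lockExpiresAtMs = new long[recordCount];
        this.lockDurationMs = lockDurationMs;
    }

    // Hands the caller the first available offset and locks it, or returns -1.
    int acquire(String consumer, long nowMs) {
        for (int offset = 0; offset < states.length; offset++) {
            releaseIfExpired(offset, nowMs);
            if (states[offset] == State.AVAILABLE) {
                states[offset] = State.ACQUIRED;
                owners[offset] = consumer;
                lockExpiresAtMs[offset] = nowMs + lockDurationMs;
                return offset;
            }
        }
        return -1;
    }

    // Marks a record as done; only the current lock holder may acknowledge it.
    boolean acknowledge(String consumer, int offset, long nowMs) {
        releaseIfExpired(offset, nowMs);
        if (states[offset] == State.ACQUIRED && consumer.equals(owners[offset])) {
            states[offset] = State.ACKNOWLEDGED;
            return true;
        }
        return false;
    }

    // An expired lock makes the record available to other consumers again.
    private void releaseIfExpired(int offset, long nowMs) {
        if (states[offset] == State.ACQUIRED && nowMs >= lockExpiresAtMs[offset]) {
            states[offset] = State.AVAILABLE;
            owners[offset] = null;
        }
    }

    State stateOf(int offset) {
        return states[offset];
    }

    public static void main(String[] args) {
        SharePartitionSketch partition = new SharePartitionSketch(3, 1000);
        System.out.println("A acquires offset " + partition.acquire("A", 0));
        System.out.println("B acquires offset " + partition.acquire("B", 0));
        // A never acknowledges; after the lock expires B can pick the record up
        System.out.println("B acquires offset " + partition.acquire("B", 1000));
    }
}
```

In this model consumer B skips the record locked by A, and only gets it once A's lock has expired, which mirrors the skip-and-retry behaviour in the bullets above.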
Example
We do it like this:
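import java.time.Duration;
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaShareConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

Properties props = new Properties();
props.setProperty("bootstrap.servers", "localhost:9092");
// group.id names the share group this consumer joins
props.setProperty("group.id", "myshare");

// KafkaShareConsumer is the client class for share groups (KIP-932)
KafkaShareConsumer<String, String> consumer =
  new KafkaShareConsumer<>(props,
                           new StringDeserializer(),
                           new StringDeserializer());

consumer.subscribe(Arrays.asList("foo"));
while (true) {
  // Fetch a batch of records acquired for this consumer
  ConsumerRecords<String, String> records =
    consumer.poll(Duration.ofMillis(100));

  for (ConsumerRecord<String, String> record : records) {
    // doProcessing is the application's own handler (not shown here)
    doProcessing(record);
  }

  // Commit the acknowledgement of all the records in the batch
  consumer.commitSync();
}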



