Development Language
Kafka was developed at LinkedIn using Java and Scala. The explanation is as follows:
Apache Kafka is an open-source stream-processing software platform developed by LinkedIn and donated to Apache Software Foundation. It is written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency streaming platform for handling and processing real-time data feeds.
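The client libraries that ship with Kafka are Java APIs. As a rough sketch of what "handling real-time data feeds" looks like in practice, the snippet below sends a single event with the Java producer client; the broker address (localhost:9092) and the topic name (events) are placeholder assumptions, not values from the quoted text.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class HelloProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder broker address; adjust for your cluster.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Send one event to the hypothetical "events" topic.
            producer.send(new ProducerRecord<>("events", "user-1", "page_view"));
            producer.flush();
        }
    }
}
```

A matching KafkaConsumer would subscribe to the same topic and poll records in a loop; that producer/consumer pair is the core of the streaming model described above.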
History
Development of Kafka started in 2010. The explanation is as follows:
In 2010, LinkedIn engineers faced the problem of integrating huge amounts of data from their infrastructure into a Lambda architecture. It also included Hadoop and real-time event processing systems. As for traditional message brokers, they didn't satisfy LinkedIn's needs. These solutions were too heavy and slow. So, the engineering team developed a scalable and fault-tolerant messaging system without lots of bells and whistles. The new queue manager has quickly transformed into a full-fledged event streaming platform.
Becoming Open Source
Kafka became open source in 2011 and was later donated to the Apache Software Foundation.
Relationship With Confluent
In 2014, Kafka's creators left LinkedIn and founded Confluent. Confluent went public in 2021.
Kafka's Pain Points
1. Diverging Latency Requirements
The explanation is as follows. In other words, a single Kafka setup does not serve every latency requirement equally well, or at the same cost.
The latency expectations for modern systems have become more polarized. While financial services demand microsecond-level latency for stock trading, other use cases — such as logging or syncing data between operational databases and analytical systems — are fine with second-level latency. A one-size-fits-all solution doesn’t work anymore. Why should a company using Kafka for simple logging pay the same costs as one building mission-critical low-latency applications?
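One concrete way this tension shows up is in producer tuning: the same client is configured very differently for low latency than for high throughput. The sketch below contrasts the two using standard producer settings (linger.ms, batch.size, acks, compression.type); the specific numbers are illustrative assumptions, not recommendations from the quoted text.

```java
import java.util.Properties;

public class LatencyVsThroughputConfigs {
    // Tuned for low latency: push each record out as soon as possible.
    static Properties lowLatency() {
        Properties p = new Properties();
        p.put("linger.ms", "0");      // do not wait to fill a batch
        p.put("batch.size", "16384"); // default-sized batches
        p.put("acks", "1");           // leader-only ack keeps round trips short
        return p;
    }

    // Tuned for throughput and cost: batch and compress aggressively, accept higher latency.
    static Properties highThroughput() {
        Properties p = new Properties();
        p.put("linger.ms", "100");                   // wait up to 100 ms to build larger batches
        p.put("batch.size", String.valueOf(512 * 1024));
        p.put("compression.type", "lz4");            // fewer, smaller requests on the wire
        p.put("acks", "all");                        // durability over latency
        return p;
    }
}
```

On a shared cluster, both kinds of workloads run on the same brokers and pay for the same infrastructure, which is the cost mismatch the quote points at.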
2. Batch systems are building their own ingestion tools
The explanation is as follows. In other words, there are now other options for moving data into analytical systems.
Platforms like Snowflake with Snowpipe, Amazon Redshift with its zero-ETL integrations and ClickHouse, which recently acquired PeerDB, now offer built-in streaming data ingestion. These developments reduce the need for Kafka as the go-to system for moving data between environments. Kafka is no longer the only option for feeding data into analytical systems, leading to natural fragmentation in its traditional use cases.
3. Cloud infrastructure has made storage cheaper
The explanation is as follows. In other words, object storage has become far cheaper than broker disks, and Kafka has to take advantage of that.
Object storage solutions like Amazon S3 have become significantly more affordable than compute nodes such as EC2. This makes it increasingly hard to justify using more expensive storage options, especially in a world where companies are constantly optimizing their cloud costs. As a result, Kafka needs to embrace architectures that take advantage of cheaper storage options or risk becoming an overly expensive component in data pipelines.
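One direction the Kafka community has taken here is tiered storage (KIP-405), which offloads older log segments to object storage such as S3 while keeping only recent data on broker disks. The sketch below creates a topic with tiered storage enabled through the Java AdminClient; it assumes a Kafka 3.6+ cluster whose brokers already have a remote storage plugin configured, and the topic name, partition count, and retention values are illustrative.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class TieredTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address

        try (AdminClient admin = AdminClient.create(props)) {
            Map<String, String> configs = new HashMap<>();
            // Assumes brokers run with remote.log.storage.system.enable=true
            // and an S3-backed RemoteStorageManager plugin already configured.
            configs.put("remote.storage.enable", "true");
            configs.put("local.retention.ms", "3600000"); // keep roughly 1 hour on local disk
            configs.put("retention.ms", "604800000");     // 7 days total, older segments in object storage

            NewTopic topic = new NewTopic("clickstream", 6, (short) 3).configs(configs);
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}
```

With a layout like this, only the most recent data sits on expensive broker disks; older segments are served from cheaper object storage when consumers read far back in the log.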