From ff7c49a5078d6a60ccb4f012ccef351c3ad7a3bf Mon Sep 17 00:00:00 2001
From: Erik van Oosten
Date: Sun, 10 Nov 2024 16:25:03 +0100
Subject: [PATCH 1/2] Document when zio-kafka is faster than raw java kafka

---
 docs/index.md | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/docs/index.md b/docs/index.md
index 1c8e97d3a..d4a06fac1 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -4,7 +4,9 @@ title: "Getting Started with ZIO Kafka"
 sidebar_label: "Getting Started"
 ---
 
-[ZIO Kafka](https://github.com/zio/zio-kafka) is a Kafka client for ZIO. It provides a purely functional, streams-based interface to the Kafka client and integrates effortlessly with ZIO and ZIO Streams.
+[ZIO Kafka](https://github.com/zio/zio-kafka) is a Kafka client for ZIO. It provides a purely functional, streams-based interface to the Kafka
+client and integrates effortlessly with ZIO and ZIO Streams. Often zio-kafka programs have a _higher_ throughput than
+programs that use the Java Kafka client directly (see section [Performance](#performance) below).
 
 @PROJECT_BADGES@ [![Scala Steward badge](https://img.shields.io/badge/Scala_Steward-helping-blue.svg?style=flat&logo=)](https://scala-steward.org)
 
@@ -135,3 +137,15 @@ Want to see your company here? [Submit a PR](https://github.com/zio/zio-kafka/ed
 * [KelkooGroup](https://www.kelkoogroup.com)
 * [Rocker](https://rocker.com)
 
+## Performance
+
+By default, zio-kafka programs process partitions in parallel. The default java Kafka client does not provide parallel
+processing. Of course, there is some overhead in buffering records and distributing them to the fibers that need them.
+On 2024-11-23, we estimated this overhead to be 1.2 ms per 1k records
+(comparing [benchmarks](https://zio.github.io/zio-kafka/dev/bench/) `throughput` and `kafkaClients`, using the standard
+GitHub Action runners (4 cores), and with the Kafka broker in the same JVM). This means that for this particular
+combination, when processing needs more than 1.2 ms per 1k records, a zio-kafka based program will have **higher
+throughput** than a program based on a java Kafka client.
+
+If you do not care for the convenient ZStream based API that zio-kafka brings, and latency is of absolute importance,
+using the java based Kafka client directly is still the better choice.

From 9bb18ed004a32d6d72f0b0c86ed270c1bc2dc173 Mon Sep 17 00:00:00 2001
From: Erik van Oosten
Date: Tue, 3 Dec 2024 10:23:40 +0100
Subject: [PATCH 2/2] Refer to blog article

---
 docs/index.md | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/docs/index.md b/docs/index.md
index d4a06fac1..da7d2dc72 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -139,13 +139,12 @@ Want to see your company here? [Submit a PR](https://github.com/zio/zio-kafka/ed
 
 ## Performance
 
-By default, zio-kafka programs process partitions in parallel. The default java Kafka client does not provide parallel
+By default, zio-kafka programs process partitions in parallel. The default java-kafka client does not provide parallel
 processing. Of course, there is some overhead in buffering records and distributing them to the fibers that need them.
-On 2024-11-23, we estimated this overhead to be 1.2 ms per 1k records
-(comparing [benchmarks](https://zio.github.io/zio-kafka/dev/bench/) `throughput` and `kafkaClients`, using the standard
-GitHub Action runners (4 cores), and with the Kafka broker in the same JVM). This means that for this particular
-combination, when processing needs more than 1.2 ms per 1k records, a zio-kafka based program will have **higher
-throughput** than a program based on a java Kafka client.
+On 2024-11-23, we estimated that zio-kafka consumes faster than the java-kafka client when processing takes more than
+~1.2ms per 1000 records. The precise time depends on many factors. Please
+see [this article](https://day-to-day-stuff.blogspot.com/2024/12/zio-kafka-faster-than-java-kafka.html) for more
+details.
 
 If you do not care for the convenient ZStream based API that zio-kafka brings, and latency is of absolute importance,
 using the java based Kafka client directly is still the better choice.