Kafka: distributing the load

Anton Zelenin edited this page Nov 4, 2021 · 6 revisions

A single pipeline can consume about 1000 events per second (eps). If your Kafka topic carries more than 1000 eps, you need to split the load between pipelines so that data arrives at Anodot in real time. You can do that by partitioning the data inside the topic correctly.

Partitioning the Kafka topic

You can split the data into multiple partitions and use multiple consumer threads to read from the Kafka partitions in parallel.
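A minimal sketch of the idea (the function name and shape are hypothetical, not part of the agent's API): distribute partitions across consumer threads so that every partition is read by exactly one thread, which preserves per-partition order.

```python
def assign_partitions(num_partitions: int, num_threads: int) -> dict[int, list[int]]:
    """Round-robin assignment: thread t owns every partition p where p % num_threads == t.

    Each partition is read by exactly one thread, so records within a
    partition are processed in order.
    """
    assignment = {t: [] for t in range(num_threads)}
    for p in range(num_partitions):
        assignment[p % num_threads].append(p)
    return assignment

# With 6 partitions and 3 threads, each thread owns two whole partitions:
# assign_partitions(6, 3) → {0: [0, 3], 1: [1, 4], 2: [2, 5]}
```

Note that this only works cleanly when the number of partitions is at least the number of threads; otherwise some threads sit idle, and order can only be preserved per partition in any case.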

Managing the order of Kafka data records when streaming to Anodot

Kafka data records are converted to data points in Anodot metrics. Data points are processed in the order they arrive at Anodot - out-of-order data points (in the context of the same metric) are discarded.
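The discard rule can be illustrated with a small sketch (this is not Anodot's actual implementation, just the described behavior): for each metric, a point is kept only if its timestamp is not older than the last accepted point for that metric.

```python
def filter_ordered(points):
    """Keep only points arriving in non-decreasing timestamp order per metric.

    `points` is an iterable of (metric_id, timestamp, value) tuples in
    arrival order; out-of-order points for the same metric are dropped.
    """
    last_seen = {}   # metric_id -> latest accepted timestamp
    accepted = []
    for metric, ts, value in points:
        if ts >= last_seen.get(metric, float("-inf")):
            last_seen[metric] = ts
            accepted.append((metric, ts, value))
        # else: the point is out of order for this metric and is discarded
    return accepted
```

Note that ordering is tracked per metric, so interleaving points of different metrics is harmless; only a step backwards in time within one metric loses data.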

Kafka guarantees the order of records within a partition. To enable ordered processing of the Kafka records you need to make sure that:

  • The number of partitions is greater than or equal to the number of threads, so that each thread handles one or more whole partitions and therefore processes their records in order

  • The producers of a given combination of measurement and dimensions are storing such records to the same partition.

  • You do not use the transformations feature, because changing metrics after fetching them from Kafka may affect the ordering
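The second requirement is usually met by giving every record of a metric the same key, since Kafka's default partitioner sends equal keys to the same partition. The sketch below shows the idea; Kafka's Java client hashes keys with murmur2, and md5 is used here only as an illustrative stand-in — any stable hash gives the same property.

```python
import hashlib

def partition_for(measurement: str, dimensions: dict, num_partitions: int) -> int:
    """Derive a stable record key from the measurement plus sorted dimensions,
    then hash it to a partition number. Identical measurement/dimension
    combinations always map to the same partition."""
    key = measurement + "|" + "|".join(f"{k}={v}" for k, v in sorted(dimensions.items()))
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions
```

In practice you would not compute the partition yourself: pass the same key bytes to your producer (for example, the `key=` argument of `KafkaProducer.send` in kafka-python) and let the client's partitioner do the mapping, so all data points of one metric stay in one partition and keep their order.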

[Figure: Kafka order — the order of message arrival to Anodot from a Kafka topic]

For additional information on consumers and ordering, please refer to the Kafka documentation.
