Kafka: distributing the load
A single pipeline can consume about 1000 events per second (eps). If your Kafka topic carries more than 1000 eps, you need to split the load between several pipelines so that data arrives at Anodot in real time. You can do that by partitioning the data inside the topic correctly: split the data across multiple partitions and use multiple threads to consume it, as shown in the sketch below.
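For illustration, here is a minimal sketch of one way to consume a topic with multiple threads, assuming the kafka-python client; the client choice, topic name, broker address, and group id are assumptions, not part of the agent's actual implementation. Consumers that share a group id divide the topic's partitions among themselves, so each thread reads one or more whole partitions.

```python
# Minimal sketch: parallel consumption via a Kafka consumer group.
# Assumes the kafka-python client; topic, broker, and group names
# are hypothetical.
import threading

from kafka import KafkaConsumer

def consume(thread_id: int) -> None:
    # Consumers with the same group_id split the partitions between them,
    # so each thread reads one or more whole partitions, in order.
    consumer = KafkaConsumer(
        "metrics",                           # hypothetical topic name
        bootstrap_servers="localhost:9092",  # hypothetical broker
        group_id="anodot-agent",             # hypothetical consumer group
    )
    for record in consumer:
        print(f"thread {thread_id}: partition={record.partition} "
              f"offset={record.offset}")

# Four threads need a topic with at least four partitions;
# extra threads beyond the partition count would sit idle.
threads = [threading.Thread(target=consume, args=(i,), daemon=True)
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # runs until interrupted
```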
Kafka data records are converted to data points in Anodot metrics. Data points are processed in the order they arrive at Anodot; out-of-order data points (in the context of the same metric) are discarded.
Kafka guarantees the order of records within a partition. To enable ordered processing of the Kafka records, make sure that:
- The number of partitions is greater than or equal to the number of threads, so that each thread handles one or more whole partitions and the records are processed in order.
- The producers write all records for a given combination of measurement and dimensions to the same partition (see the sketch after this list).
- You do not use the transformations feature, because changing metrics after they are fetched from Kafka may affect the ordering.
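As referenced in the list above, here is a minimal sketch of keyed production, again assuming the kafka-python client with hypothetical topic and field names. Kafka's default partitioner hashes the record key to pick a partition, so building the key from the measurement and its dimensions sends every record of the same series to the same partition, which preserves per-series order.

```python
# Minimal sketch: key records by measurement + dimensions so that all
# records of one series land in the same partition (kafka-python assumed;
# topic and field names are hypothetical).
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="localhost:9092")

def send_record(measurement: str, dimensions: dict, value: float) -> None:
    # Sort the dimensions so the key is identical regardless of dict order.
    key = measurement + "|" + "|".join(
        f"{k}={v}" for k, v in sorted(dimensions.items())
    )
    # The default partitioner hashes the key, so records with the same
    # measurement + dimensions always land in the same partition.
    producer.send(
        "metrics",
        key=key.encode("utf-8"),
        value=str(value).encode("utf-8"),
    )

send_record("cpu_usage", {"host": "web-1", "region": "us-east"}, 42.0)
producer.flush()
```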
[Diagram: the order in which messages arrive at Anodot from a Kafka topic]
For additional information on consumers and ordering, refer to the Kafka documentation.