Configuration
This article explains the configuration parameters involved in deploying AutoMQ, including their definitions, descriptions, valid ranges, and defaults, to help developers make the necessary custom adjustments for a production environment.
AutoMQ is a storage-compute separated version of Apache Kafka®, so it supports all Apache Kafka® parameters except those related to multi-replica storage. Those parameters are not repeated in this document; please refer to the official Apache Kafka configuration documentation.

Item | Description
---|---
Configuration Description | Whether to start AutoMQ; this parameter must be set to true.
Value Type | boolean
Default Value | false
Valid Input Range | N/A
Importance Level | High, configure with caution

Item | Description
---|---
Configuration Description | The endpoint address of the object storage service, for example https://s3.{region}.amazonaws.com.
Value Type | string
Default Value | null
Valid Input Range | N/A
Importance Level | High, configure with caution

Item | Description
---|---
Configuration Description | The region identifier of the object storage service, for example us-east-1; refer to your cloud provider's documentation.
Value Type | string
Default Value | null
Valid Input Range | N/A
Importance Level | High, configure with caution

Item | Description
---|---
Configuration Description | The object storage bucket used to store messages.
Value Type | string
Default Value | null
Valid Input Range | N/A
Importance Level | High, configure with caution

Item | Description
---|---
Configuration Description | Whether to enable path-style access for object storage. This must be set to true when using MinIO as the storage service.
Value Type | boolean
Default Value | false
Valid Input Range | N/A
Importance Level | High, configure with caution

Item | Description
---|---
Configuration Description | The path of the block storage device used to store the local WAL, such as /dev/xxx, or another mount path.
Value Type | string
Default Value | null
Valid Input Range | N/A
Importance Level | High, configure with caution

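As a quick illustration of how the storage settings above fit together, the sketch below assembles them with java.util.Properties. Because the parameter names are not listed on this page, the keys used here (s3.endpoint, s3.region, s3.bucket, s3.path.style, s3.wal.path) and all values are assumptions for illustration only; check the configuration reference of your AutoMQ release for the exact names.

```java
import java.util.Properties;

public class ObjectStorageConfigSketch {
    public static void main(String[] args) {
        Properties broker = new Properties();
        // NOTE: the property keys below are assumed placeholders for the
        // settings described above; verify them against your AutoMQ version.
        broker.setProperty("s3.endpoint", "https://s3.us-east-1.amazonaws.com"); // object storage endpoint
        broker.setProperty("s3.region", "us-east-1");                            // region identifier
        broker.setProperty("s3.bucket", "my-automq-data");                       // bucket that stores messages
        broker.setProperty("s3.path.style", "false");                            // set to "true" when using MinIO
        broker.setProperty("s3.wal.path", "/dev/nvme1n1");                       // block device for the local WAL
        broker.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```
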
Item | Description
---|---
Configuration Description | The capacity of AutoMQ's local WAL (Write-Ahead Log). This determines the maximum amount of data that can be buffered before being uploaded to object storage; a larger capacity tolerates more write jitter from object storage.
Value Type | long, in bytes
Default Value | 2147483648
Valid Input Range | [10485760, ...]
Importance Level | Low, can be set loosely

Item | Description
---|---
Configuration Description | The WAL (Write-Ahead Log) cache is a FIFO (first-in, first-out) queue that holds data not yet uploaded to object storage, as well as data that has been uploaded but not yet evicted from the cache. When data that has not yet been uploaded fills the entire cache, storage applies backpressure to subsequent requests until the upload completes. By default, a reasonable value is chosen based on available memory.
Value Type | long, in bytes
Default Value | -1, automatically set to an appropriate value by the program
Valid Input Range | [1, ...]
Importance Level | Low, can be set loosely

Item | Description
---|---
Configuration Description | Data written to the WAL is batched and periodically persisted to the WAL disk. This configuration determines the frequency of writes to the WAL: a higher value means more writes per second and lower latency. However, because the IOPS capability of block storage devices varies, setting this value above the device's IOPS limit may increase latency due to queuing.
Value Type | int, in IOPS
Default Value | 3000
Valid Input Range | [1, ...]
Importance Level | High; it is recommended to set this to the actual available IOPS of the WAL disk

Item | Description
---|---
Configuration Description | The threshold that triggers the WAL to upload data to object storage. This value must be less than s3.wal.cache.size. The larger the value, the better the data aggregation and the lower the metadata storage cost. By default, a reasonable value is chosen based on available memory.
Value Type | long, in bytes
Default Value | -1, automatically set to an appropriate value by the program
Valid Input Range | [1, ...]
Importance Level | Low, can be set loosely

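The WAL parameters above interact: the upload threshold must stay below the cache size, and the IOPS setting should match the real disk. The sketch below puts example values side by side; only s3.wal.cache.size is named on this page, the other keys are assumed placeholders.

```java
import java.util.Properties;

public class WalTuningSketch {
    public static void main(String[] args) {
        Properties broker = new Properties();
        // Assumed key names (except s3.wal.cache.size, which appears on this page).
        broker.setProperty("s3.wal.capacity", String.valueOf(4L * 1024 * 1024 * 1024));   // 4 GiB WAL tolerates more object-storage jitter
        broker.setProperty("s3.wal.cache.size", String.valueOf(1L * 1024 * 1024 * 1024)); // 1 GiB cache; -1 lets AutoMQ size it from memory
        broker.setProperty("s3.wal.iops", "3000");                                        // keep at or below the disk's real IOPS limit
        broker.setProperty("s3.wal.upload.threshold", String.valueOf(512L * 1024 * 1024)); // must stay below the WAL cache size
        broker.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```
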
Item | Description
---|---
Configuration Description | s3.block.cache.size specifies the size of the block cache, which caches cold data read from object storage. For better cold-read performance, it is recommended to set this to more than 4 MB multiplied by the number of concurrent cold reads per partition. By default, a reasonable value is chosen based on available memory.
Value Type | long, in bytes
Default Value | -1, automatically set to an appropriate value by the program
Valid Input Range | [1, ...]
Importance Level | Low, can be set loosely

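The sizing rule above translates into a quick calculation. The sketch below assumes, purely for illustration, 64 concurrent cold reads and treats 4 MB as 4 * 1024 * 1024 bytes.

```java
public class BlockCacheSizing {
    public static void main(String[] args) {
        // Rule of thumb from above: block cache > 4 MB * number of concurrent cold reads.
        long perReadBytes = 4L * 1024 * 1024; // 4 MiB per concurrent cold read
        int concurrentColdReads = 64;         // assumed workload; adjust to yours
        long recommendedMin = perReadBytes * concurrentColdReads;
        // Prints 268435456 (256 MiB) for the assumed workload.
        System.out.println("Recommended minimum s3.block.cache.size: " + recommendedMin + " bytes");
    }
}
```
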
Item | Description
---|---
Configuration Description | The interval between stream object compactions. The larger the interval, the lower the API call cost, but the larger the metadata storage scale.
Value Type | int, in minutes
Default Value | 30
Valid Input Range | [1, ...]
Importance Level | Low, can be set loosely

Item | Description
---|---
Configuration Description | The maximum size of objects that stream object compaction is allowed to synthesize. The larger this value, the higher the API call cost, but the smaller the metadata storage scale.
Value Type | long, in bytes
Default Value | 1073741824
Valid Input Range | [1, ...]
Importance Level | Low, can be set loosely

Item | Description
---|---
Configuration Description | The interval between stream set object compactions. The smaller this value, the smaller the metadata storage scale and the sooner the data becomes compact, but the more compactions the resulting stream objects will undergo.
Value Type | int, in minutes
Default Value | 20
Valid Input Range | [1, ...]
Importance Level | Low, can be set loosely

Item | Description
---|---
Configuration Description | The amount of memory available during stream set object compaction. The larger this value, the lower the API call cost.
Value Type | long, in bytes
Default Value | 209715200
Valid Input Range | [1048576, ...]
Importance Level | Low, can be set loosely

Item | Description
---|---
Configuration Description | During stream set object compaction, if the amount of data belonging to a single stream exceeds this threshold, that stream's data is split out and written directly into a separate stream object. The smaller this value, the earlier data is split out of stream set objects, which lowers the API call cost of subsequent stream object compactions but raises the API call cost of splitting.
Value Type | long, in bytes
Default Value | 8388608
Valid Input Range | [1, ...]
Importance Level | Low, can be set loosely

Item | Description
---|---
Configuration Description | The maximum number of stream set objects that can be compacted in a single compaction.
Value Type | int
Default Value | 500
Valid Input Range | [1, ...]
Importance Level | Low, can be set loosely

Item | Description
---|---
Configuration Description | The total network bandwidth available for object storage requests. It prevents stream set object compaction and catch-up reads from crowding out normal read and write traffic. Production and consumption also consume ingress and egress bandwidth respectively. For example, if this value is set to 100 MB/s and normal read/write traffic uses 80 MB/s, then 20 MB/s remains available for stream set object compaction.
Value Type | long, in bytes per second (byte/s)
Default Value | 104857600
Valid Input Range | [1, ...]
Importance Level | Low, can be set loosely

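For the compaction and bandwidth knobs above, the sketch below shows example values side by side. All of the keys are assumed placeholders, since the parameter names are not listed on this page; confirm them in your AutoMQ release before relying on them.

```java
import java.util.Properties;

public class CompactionTuningSketch {
    public static void main(String[] args) {
        Properties broker = new Properties();
        // Assumed key names for the compaction and bandwidth settings described above.
        broker.setProperty("s3.stream.object.compaction.interval.minutes", "30");        // longer interval: fewer API calls, more metadata
        broker.setProperty("s3.stream.object.compaction.max.size.bytes", "1073741824");  // 1 GiB maximum synthesized object
        broker.setProperty("s3.stream.set.object.compaction.max.num", "500");            // stream set objects compacted per round
        broker.setProperty("s3.network.baseline.bandwidth", String.valueOf(100L * 1024 * 1024)); // 100 MiB/s total object storage budget
        broker.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```
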
Item | Description
---|---
Configuration Description | The S3Stream memory allocator policy. Note that when DIRECT memory is used, the heap size (e.g., -Xmx) and direct memory size (e.g., -XX:MaxDirectMemorySize) need to be adjusted in the JVM options, which can be set through the KAFKA_HEAP_OPTS environment variable.
Value Type | string
Default Value | POOLED_HEAP
Valid Input Range | POOLED_HEAP, POOLED_DIRECT
Importance Level | Low, can be set loosely

Item | Description
---|---
Configuration Description | Whether to enable the OTel metrics exporter.
Value Type | boolean
Default Value | true
Valid Input Range | true, false
Importance Level | Low, can be set loosely

Item | Description
---|---
Configuration Description | Sets the metrics recording level. The INFO level includes metrics most users care about, such as the throughput and latency of common stream operations. The DEBUG level includes detailed metrics helpful for diagnostics, such as latencies at different stages of writing to the underlying block device.
Value Type | string
Default Value | INFO
Valid Input Range | INFO, DEBUG
Importance Level | Low, can be set loosely

Item | Description
---|---
Configuration Description | The list of metrics exporter types to enable.
Value Type | list
Default Value | null
Valid Input Range | otlp, prometheus
Importance Level | Low, can be set loosely

Item | Description
---|---
Configuration Description | The time interval for exporting metrics.
Value Type | int, in milliseconds
Default Value | 30000
Valid Input Range | N/A
Importance Level | Low, can be set loosely

Item | Description
---|---
Configuration Description | The transport protocol used by the OTLP exporter.
Value Type | string
Default Value | grpc
Valid Input Range | grpc, http
Importance Level | Low, can be set loosely

Item | Description
---|---
Configuration Description | The address exposed by the backend service when the OTLP exporter is used.
Value Type | string
Default Value | null
Valid Input Range | N/A
Importance Level | Low, can be set loosely

Item | Description
---|---
Configuration Description | Whether the OTLP exporter compresses data. If enabled, OTLP compresses metrics with the gzip algorithm.
Value Type | boolean
Default Value | false
Valid Input Range | true, false
Importance Level | Low, can be set loosely

Item | Description
---|---
Configuration Description | The address of the built-in Prometheus HTTP server that exposes OTel metrics.
Value Type | string
Default Value | null
Valid Input Range | N/A
Importance Level | Low, can be set loosely

Item | Description
---|---
Configuration Description | The port of the built-in Prometheus HTTP server that exposes OTel metrics.
Value Type | int
Default Value | 0
Valid Input Range | N/A
Importance Level | Low, can be set loosely

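To illustrate how the exporter settings above combine, the sketch below wires an OTLP exporter to an assumed collector address. All keys and the endpoint value are assumptions; verify them against your AutoMQ release and your observability backend.

```java
import java.util.Properties;

public class TelemetrySketch {
    public static void main(String[] args) {
        Properties broker = new Properties();
        // Assumed placeholder keys for the telemetry settings documented above.
        broker.setProperty("s3.telemetry.metrics.level", "INFO");                 // INFO or DEBUG
        broker.setProperty("s3.telemetry.metrics.exporter.type", "otlp");         // otlp or prometheus
        broker.setProperty("s3.telemetry.exporter.otlp.protocol", "grpc");        // grpc or http
        broker.setProperty("s3.telemetry.exporter.otlp.endpoint", "http://otel-collector:4317"); // assumed collector address
        broker.setProperty("s3.telemetry.exporter.report.interval.ms", "30000");  // export every 30 s
        broker.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```
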
Item | Description
---|---
Configuration Description | A list of classes to use as metrics reporters. Implementing the org.apache.kafka.common.metrics.MetricsReporter interface allows new metrics to be loaded dynamically. JmxReporter is always included to register JMX statistics. To enable self-balancing, metric.reporters must include kafka.autobalancer.metricsreporter.AutoBalancerMetricsReporter.
Value Type | list
Default Value | ""
Valid Input Range | N/A
Importance Level | Low, can be set loosely

Item | Description
---|---
Configuration Description | The interval at which the metrics reporter reports data.
Value Type | long, in milliseconds
Default Value | 10000
Valid Input Range | [1000, ...]
Importance Level | High, configure with caution

Item | Description
---|---
Configuration Description | Whether to enable self-balancing.
Value Type | boolean
Default Value | false
Valid Input Range | N/A
Importance Level | High, configure with caution

Item | Description
---|---
Configuration Description | The minimum interval at which the controller checks whether data self-balancing is needed. The actual time of the next self-balancing also depends on the number of partitions being reassigned. Reducing the minimum check interval increases the sensitivity of data reassignment. This value should be greater than the broker metrics reporting interval so that the controller does not miss the results of recent reassignments.
Value Type | long, in milliseconds
Default Value | 60000
Valid Input Range | [1, ...]
Importance Level | High, configure with caution

Item | Description
---|---
Configuration Description | When the controller performs load balancing, a broker whose latest metrics are delayed by more than this value is excluded because its reported state lags behind. This value should not be less than the broker metrics reporting interval.
Value Type | long, in milliseconds
Default Value | 60000
Valid Input Range | [1, ...]
Importance Level | High, configure with caution

Item | Description
---|---
Configuration Description | The optimization goals for self-balancing.
Value Type | list
Default Value | kafka.autobalancer.goals.NetworkInUsageDistributionGoal,kafka.autobalancer.goals.NetworkOutUsageDistributionGoal
Valid Input Range | N/A
Importance Level | High, configure with caution

Item | Description
---|---
Configuration Description | The detection threshold for NetworkInUsageDistributionGoal. A broker whose write bandwidth is below this value will not be proactively reassigned during self-balancing.
Value Type | long, in byte/s
Default Value | 1048576
Valid Input Range | [1, ...]
Importance Level | High, configure with caution

Item | Description
---|---
Configuration Description | The detection threshold for NetworkOutUsageDistributionGoal. A broker whose read bandwidth is below this value will not be proactively reassigned during self-balancing.
Value Type | long, in byte/s
Default Value | 1048576
Valid Input Range | [1, ...]
Importance Level | High, configure with caution

Item | Description
---|---
Configuration Description | The acceptable deviation range for average write bandwidth. The default value of 0.2 means the expected network traffic range is [0.8 * loadAvg, 1.2 * loadAvg].
Value Type | double
Default Value | 0.2
Valid Input Range | N/A
Importance Level | High, configure with caution

Item | Description
---|---
Configuration Description | The acceptable deviation range for average read bandwidth. The default value of 0.2 means the expected network traffic range is [0.8 * loadAvg, 1.2 * loadAvg].
Value Type | double
Default Value | 0.2
Valid Input Range | N/A
Importance Level | High, configure with caution

Item | Description
---|---
Configuration Description | The list of topics to exclude from self-balancing.
Value Type | list
Default Value | ""
Valid Input Range | N/A
Importance Level | High, configure with caution

Item | Description
---|---
Configuration Description | The list of broker IDs to exclude from self-balancing.
Value Type | list
Default Value | ""
Valid Input Range | N/A
Importance Level | High, configure with caution

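Tying the self-balancing parameters together, the sketch below combines the reporter class and goal list documented above. The kafka.autobalancer.* class names come from the tables on this page and metric.reporters is a standard Kafka property, while the remaining keys and the excluded topic are assumed placeholders for illustration.

```java
import java.util.Properties;

public class SelfBalancingSketch {
    public static void main(String[] args) {
        Properties broker = new Properties();
        // Reporter and goal class names are taken from the tables above.
        broker.setProperty("metric.reporters",
                "kafka.autobalancer.metricsreporter.AutoBalancerMetricsReporter");
        // The keys below are assumed placeholders; verify them in your AutoMQ release.
        broker.setProperty("autobalancer.controller.enable", "true");
        broker.setProperty("autobalancer.controller.goals",
                "kafka.autobalancer.goals.NetworkInUsageDistributionGoal,"
              + "kafka.autobalancer.goals.NetworkOutUsageDistributionGoal");
        broker.setProperty("autobalancer.controller.exclude.topics", "__consumer_offsets"); // assumed example of an excluded topic
        broker.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```
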