Hi!
We run kage to monitor our Kafka brokers and have run into the following issue:
After a consumer group coordinator changes, the following line is logged a few thousand times each minute:

    kage-host kage-kafka-host[9999]: t=2019-11-21T06:25:27+0000 lvl=eror msg="monitor: cannot get group topic offsets 3: kafka server: Request was for a consumer group that is not coordinated by this broker."

I believe this is due to this line:
https://github.com/msales/kage/blob/master/kafka/monitor.go#L295
According to sarama v1.19.0, which is used here, the coordinator is served from a local cache until it is refreshed:
https://github.com/Shopify/sarama/blob/v1.19.0/client.go#L68

    // Coordinator returns the coordinating broker for a consumer group. It will
    // return a locally cached value if it's available. You can call
    // RefreshCoordinator to update the cached value. This function only works on
    // Kafka 0.8.2 and higher.
    Coordinator(consumerGroup string) (*Broker, error)

    // RefreshCoordinator retrieves the coordinator for a consumer group and stores it
    // in local cache. This function only works on Kafka 0.8.2 and higher.
    RefreshCoordinator(consumerGroup string) error

So once the error above occurs, RefreshCoordinator needs to be called at least once for that group.
The quick-and-dirty fix would be to set a flag the first time this error occurs and, if the flag is set, call RefreshCoordinator once for all groups.
Would you accept a PR that does that?
Additionally, I think kage should detect when a log message repeats and, instead of writing the same message thousands of times, log something like "last message repeated 3123 times".
Would you accept a PR that does that?
Thanks in advance!
Cheers,
Kosta