Consumer doesn't consume after onLost
#1288
Comments
For now, your options are:
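(The list of options is cut off in this capture. Purely for illustration, and not necessarily what was suggested in the thread: a common stop-gap for "the consumer stopped and never recovers" is to restart the whole consumer, i.e. rebuild the consumer layer and rerun the stream whenever it ends or fails. The broker address, group id, topic name and retry schedule below are invented for the example.)

```scala
import zio._
import zio.kafka.consumer._
import zio.kafka.serde.Serde

object RestartingConsumer extends ZIOAppDefault {

  // Hypothetical settings; broker, group id and topic are placeholders.
  private val settings =
    ConsumerSettings(List("localhost:9092")).withGroupId("my-group")

  // One "incarnation" of the consumer: a fresh Consumer layer plus the stream that uses it.
  private val consumeOnce: Task[Unit] =
    Consumer
      .plainStream(Subscription.topics("my-topic"), Serde.string, Serde.string)
      .mapZIO(record => ZIO.logInfo(s"got ${record.value}") *> record.offset.commit)
      .runDrain
      .provideLayer(ZLayer.scoped(Consumer.make(settings)))

  // If the stream fails or ends (e.g. after all partitions are lost), wait a bit,
  // build a completely new consumer, and start over.
  override val run =
    consumeOnce
      .tapErrorCause(c => ZIO.logWarning(s"consumer stopped: ${c.prettyPrint}"))
      .ignore
      .repeat(Schedule.spaced(10.seconds))
}
```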
Correction: #1252 is already part of zio-kafka 2.8.0, so something else is going on.
The newest log line (the first line) indicates that no partitions […]. Can you check the Java consumer configurations?
The settings are almost default. The things we changed are: […]
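(The list of changed settings is cut off above. For orientation only, here is a hedged sketch of how such overrides are typically expressed with zio-kafka's `ConsumerSettings`; the broker addresses, group id and property values are invented for the example and are not the reporter's actual configuration. The properties shown are standard Apache Kafka consumer settings that influence behaviour around lost broker connections.)

```scala
import zio.kafka.consumer.ConsumerSettings

// Illustrative only; not the configuration from this issue.
val settings: ConsumerSettings =
  ConsumerSettings(List("broker-1:9092", "broker-2:9092"))
    .withGroupId("my-group")
    .withProperty("session.timeout.ms", "45000")       // time before the coordinator declares the member dead
    .withProperty("max.poll.interval.ms", "300000")    // max time allowed between poll() calls
    .withProperty("reconnect.backoff.max.ms", "10000") // cap on the client's reconnect backoff
```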
We're seeing the same thing again as well. In the 2.7.5 version where the […]

What we often see (especially in the case of a low number of instances) is that all instances lose the partitions at the same time. Everything then stops processing, and no rebalances are triggered.

We're moving back to the 2.7.5 version, since that has well-defined behaviour.
```scala
def shouldPoll =
  subscriptionState.isSubscribed &&
    (pendingRequests.nonEmpty || pendingCommits.nonEmpty || assignedStreams.isEmpty)
```

What if […]? Are we sure we are clearing the lost partitions from `assignedStreams`? WDYT @erikvanoosten
Yes, that sounds extremely plausible! Good find!
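(To make the suspected failure mode concrete, here is a small self-contained model of that condition. It is not the real `Runloop.State`, just a toy with the same field names: once the pending requests and commits drain, `shouldPoll` can only become true again if `assignedStreams` is empty, so lost partitions that are never removed keep polling disabled forever.)

```scala
// Minimal stand-in for the runloop state; field names mirror the quoted condition,
// but this is not the actual zio-kafka Runloop.State.
final case class State(
    pendingRequests: List[String],
    pendingCommits: List[String],
    assignedStreams: Set[Int] // partition numbers that still have a stream
) {
  def shouldPoll(subscribed: Boolean): Boolean =
    subscribed && (pendingRequests.nonEmpty || pendingCommits.nonEmpty || assignedStreams.isEmpty)
}

object ShouldPollDemo {
  def main(args: Array[String]): Unit = {
    // After onLost: streams are ended, requests and commits drain, but the lost
    // partitions were never removed from assignedStreams.
    val stuck = State(pendingRequests = Nil, pendingCommits = Nil, assignedStreams = Set(0, 1, 2))
    println(stuck.shouldPoll(subscribed = true)) // false -> poll() is never called again

    // With the lost partitions cleared (the idea behind the fix), polling resumes.
    val fixed = stuck.copy(assignedStreams = Set.empty)
    println(fixed.shouldPoll(subscribed = true)) // true
  }
}
```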
Fixes #1288. See also #1233 and #1250.

When all partitions are lost after some connection issue to the broker, the streams for the lost partitions are ended but polling stops, due to the conditions in `Runloop.State#shouldPoll`. This PR fixes this by removing the lost partition streams from the `assignedStreams` in the state, thereby not disabling polling.

Also adds a warning that is logged whenever the assigned partitions (according to the Apache Kafka consumer) are different from the assigned streams, which helps to identify other issues or any future regressions of this issue.

~~Still needs a good test; the `MockConsumer` used in other tests unfortunately does not allow simulating lost partitions, and the exact behavior of the Kafka client in this situation is hard to predict.~~ Includes a test that fails when the change to `Runloop` is undone.
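(The warning mentioned in the description could look roughly like the following. This is a hedged sketch of the idea only, not the code added by the PR; the function name, parameters and log message are made up for illustration.)

```scala
import zio._
import org.apache.kafka.common.TopicPartition

// Sketch: compare the partitions the Java consumer reports as assigned with the
// partitions for which the runloop still holds a stream, and warn on any mismatch.
def checkStreamAssignmentInvariant(
    assignedByConsumer: Set[TopicPartition], // e.g. obtained from consumer.assignment()
    assignedStreams: Set[TopicPartition]
): UIO[Unit] =
  ZIO
    .logWarning(
      s"Assigned partitions $assignedByConsumer do not match the assigned streams $assignedStreams"
    )
    .when(assignedByConsumer != assignedStreams)
    .unit
```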
The issue of no more polling after all partitions were lost is (very likely, assuming our reproduction is fully representative of the issue) fixed in v2.8.3.
When the broker is down, the consumer loses connection to the broker and tries to reconnect; then `onLost` happens, and after that the runloop will never call `poll()`, so there are no new events. Is this the desired behavior? If yes, how do we restart the consumer when `onLost` happens? (The last event was consumed at 20:08.) Also, the application has two consumer groups that read the same topic (in parallel); one fails (`onLost` happens), and one continues to work (`onLost` doesn't happen, since it's connected to a broker that doesn't go down).

Related: #1250. Version: 2.8.0.
Logs (the first message is the newest)