Jetcd watcher is not able to reconnect when etcd leader goes down or when etcd cluster loses its quorum and comes back #1352
Comments
I don't know if this is of any help, but I would recommend verifying whether the same behavior exists with the latest code.
Hey @lburgazzoli, thanks, this solved the issue.
This is a little tricky because, as of today, the underlying implementation creates an individual stream for each watcher, but in the future I would love to be able to use a single stream, so the concept of a connection would not really make much sense. Eventually this is something that can be done in general, but I don't have much time. Maybe it would be useful to know when a subscription actually succeeds; for that, I would really appreciate it if you could do some research and provide a PR.
@deekshith-n Maybe you use option
@giri-vsr Thanks for the suggestion, but I use jetcd version 0.7.5, where the option you mentioned is not available. Anyway, I was able to add a retry mechanism for when the watcher loses its connection: when the listener throws an exception, I close the old watcher and create a new one. But sometimes the watcher cannot reconnect when the etcd leader pod goes down (even if I use round robin as the load balancer policy while creating the client). Is there any solution for this issue?
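For anyone trying the same "close the old watcher and create a new one" approach, here is a minimal sketch of that retry loop with a capped exponential backoff, so a missing leader or lost quorum does not turn into a tight loop of immediate retries. It uses only the JDK and does not call jetcd. `ResubscribingWatch`, `backoffMillis`, the `opener` function, and the 100 ms / 5 s delays are all hypothetical names and values chosen for illustration. This is not a confirmed fix for the leader-down case, only one way to structure the retry.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical helper: reopens a watch after an error, with capped exponential backoff. */
final class ResubscribingWatch implements AutoCloseable {
    static final long BASE_DELAY_MS = 100;   // assumed value
    static final long MAX_DELAY_MS = 5_000;  // assumed value

    /** Delay before retry number attempt (0-based): 100, 200, 400, ... capped at 5000 ms. */
    static long backoffMillis(int attempt) {
        int shift = Math.max(0, Math.min(attempt, 20)); // bound the shift to avoid overflow
        return Math.min(MAX_DELAY_MS, BASE_DELAY_MS << shift);
    }

    // The opener starts one watch, reports failures to the given callback,
    // and returns a handle that stops that watch when closed.
    private final Function<Consumer<Throwable>, AutoCloseable> opener;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    // All fields below are guarded by "this".
    private AutoCloseable current;
    private long generation;
    private int attempt;
    private boolean closed;

    ResubscribingWatch(Function<Consumer<Throwable>, AutoCloseable> opener) {
        this.opener = opener;
    }

    void start() {
        open();
    }

    /** Call once events flow again so the next failure starts from the base delay. */
    synchronized void markHealthy() {
        attempt = 0;
    }

    private synchronized void open() {
        if (closed) return;
        closeQuietly(current);               // never leave an old watcher behind
        current = null;
        final long gen = ++generation;
        try {
            current = opener.apply(t -> onError(gen, t));
        } catch (RuntimeException e) {
            onError(gen, e);
        }
    }

    private synchronized void onError(long gen, Throwable t) {
        if (closed || gen != generation) return; // ignore stale or post-close errors
        generation++;                            // one retry per failed watch
        closeQuietly(current);
        current = null;
        long delay = backoffMillis(attempt++);
        System.err.println("watch failed (" + t + "), retrying in " + delay + " ms");
        scheduler.schedule(this::open, delay, TimeUnit.MILLISECONDS);
    }

    @Override
    public synchronized void close() {
        closed = true;                       // set first, so nothing is scheduled afterwards
        scheduler.shutdownNow();
        closeQuietly(current);
    }

    private static void closeQuietly(AutoCloseable c) {
        if (c == null) return;
        try { c.close(); } catch (Exception ignored) { }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger opens = new AtomicInteger();
        // Simulated opener: the first two attempts fail, the third succeeds.
        ResubscribingWatch watch = new ResubscribingWatch(onError -> {
            int n = opens.incrementAndGet();
            if (n < 3) throw new IllegalStateException("simulated: no leader");
            System.out.println("watch opened on attempt " + n);
            return () -> { };
        });
        watch.start();
        Thread.sleep(1_000);
        watch.close();
    }
}
```

With jetcd, the opener would presumably wrap the `watchClient.watch(storeKey, listen)` call from the issue, pass the supplied callback as the error handler of `Watch.listener(...)`, and return the resulting watcher. That wiring is an assumption and is untested here.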
Hello @lburgazzoli, is there a way to get a periodic notification as a WatchResponse, just to track the etcd revision? I know WatchOption has something called withProgressNotify(), but using that I am not getting any periodic response. Is there a code sample showing how to use it? Please suggest if there is a way.
I don't have time nowadays to dig into the issue, so I would recommend trying to debug the code a little and providing a PR with a reproducer so I can take a look.
@deekshith-n ping
Hi, have you solved this problem now?
I am using the jetcd library to connect to etcd from Java 8. I was trying to implement a reconnect mechanism for when etcd goes down. We have a 3-pod etcd cluster that follows the leader/follower model. When the etcd pod the watcher is connected to goes down, the listener receives the exception asynchronously, and there I call the same function again to retry connecting the watcher. The code works fine when a follower goes down, that is, the watch is able to reconnect to the available etcd pods. But when the leader etcd pod goes down or the etcd cluster loses quorum, the function keeps retrying but is never able to reconnect. Please let me know how to fix this issue. Please find the code below.
public void watchAndListen(HandlerWrapper<JsonObject> handler) {
    Watch.Listener listen = Watch.listener(watchHandler(handler), throwable -> {
        System.out.println("Exception in watch" + throwable.getCause());
        if (throwable instanceof EtcdException) {
            // Retry mechanism
            watchAndListen(handler);
        }
    });
    Watch watchClient = etcdClient.getWatchClient();
    watchClient.watch(storeKey, listen);
}
To Reproduce
Run the etcd cluster.
Delete the leader pod.
Observe that watchAndListen keeps retrying.
Expected behavior
Watcher should be able to reconnect to the etcd pods which are alive in every scenario.
Additional context
I tried a different approach: closing the client and creating a new one. This fixed the issue. However, it threw a RejectedExecutionException when I closed the client.
Error in this case:
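The stack trace itself is not included here, so the cause inside jetcd is not confirmed. As general background, the JDK raises `RejectedExecutionException` when a task is submitted to an executor that has already been shut down. One plausible reading (an assumption) is that a late callback or retry was submitted after the client's executor had been closed. The sketch below reproduces only that JDK behavior; `RejectedAfterShutdownDemo` is a hypothetical name and no jetcd code is involved.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

final class RejectedAfterShutdownDemo {
    /** Returns true if submitting to an already shut down executor is rejected. */
    static boolean submitAfterShutdownIsRejected() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.shutdown();                 // stands in for closing the client
        try {
            executor.submit(() -> { });      // stands in for a late retry or callback
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // prints "rejected after shutdown: true"
        System.out.println("rejected after shutdown: " + submitAfterShutdownIsRejected());
    }
}
```

If that is what happens, stopping the retry path before closing the old client, and only then creating the new client, should avoid the exception. This ordering is a suggestion to try, not something verified against jetcd.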