Buggy cluster initialization #233
Comments
If you want to have multiple members in the cluster at the time each member is initialized, you can let the state manager return a pre-defined cluster config containing the member list you want to have. Please refer to the comment here:
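For reference, here is a minimal sketch of that suggestion, assuming NuRaft's `state_mgr` interface; the class name, constructor, and member-list argument are invented for illustration (compare the in-memory state manager shipped in the repository's examples), and persistence is kept in memory for brevity:

```cpp
#include "libnuraft/nuraft.hxx"

#include <string>
#include <utility>
#include <vector>

using namespace nuraft;

// Hypothetical state manager that seeds the full member list, so that every
// node starts from the same cluster config instead of bootstrapping its own
// single-member cluster.
class seeded_state_mgr : public state_mgr {
public:
    seeded_state_mgr(int32 my_id,
                     ptr<log_store> logs,
                     const std::vector<std::pair<int32, std::string>>& members)
        : my_id_(my_id)
        , log_store_(logs)
        , cur_config_(cs_new<cluster_config>())
    {
        for (auto& m : members) {
            cur_config_->get_servers().push_back(
                cs_new<srv_config>(m.first, m.second));
        }
    }

    // Called on startup: returning a config that already contains every
    // member is what makes all nodes agree on the initial cluster.
    ptr<cluster_config> load_config() override { return cur_config_; }

    void save_config(const cluster_config& config) override {
        // A real implementation must persist this; kept in memory here.
        ptr<buffer> buf = config.serialize();
        cur_config_ = cluster_config::deserialize(*buf);
    }

    void save_state(const srv_state& state) override {
        ptr<buffer> buf = state.serialize();
        cur_state_ = srv_state::deserialize(*buf);
    }

    ptr<srv_state> read_state() override { return cur_state_; }
    ptr<log_store> load_log_store() override { return log_store_; }
    int32 server_id() override { return my_id_; }
    void system_exit(const int exit_code) override {}

private:
    int32 my_id_;
    ptr<log_store> log_store_;
    ptr<cluster_config> cur_config_;
    ptr<srv_state> cur_state_;
};
```

With a state manager like this, each node can be launched independently; as long as every node is seeded with the same member list, a single leader is elected without any `add_srv` calls.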
Oh... thanks for the clarification, @greensky00.

Agreed. Will update the page as well as examples.

Hello, I believe the behavior described in this issue is easily misunderstood and should be addressed. The inconsistency in cluster listings across different nodes can cause confusion. Additionally, in real-world environments, it's useful for the system to tolerate certain potential misoperations. I've submitted pull request #504 with a fix for this issue. I would appreciate it if you could take a look at it when convenient.
Hello. I found two examples of, IMO, buggy behavior.

Run the `echo_server` example in different terminals as separate nodes.

Example 1

Then add node 2 to the first server and to the third:
Ok. Let's see the list of nodes on each server:
How can it be that node 2 is now listed in two clusters with different leaders?

Note that node 2 actually follows node 1; it ignores logs from node 3.
If I shut down node 1, some time later node 2 starts following node 3, although the following command shows neither node 3 nor the current leader:

```
calc 2> list
server id 1: localhost:10001
server id 2: localhost:10002
```
Another funny thing occurs when I restart node 1 and add node 2 to it (because after the restart, node 1 thinks it's the only leader in the cluster). Node 2 still continues receiving logs only from node 3, but it thinks that node 1 is now the leader:

```
calc 2> list
server id 1: localhost:10001 (LEADER)
server id 2: localhost:10002
```
Example 2
Add node 2 to node 1, then add node 1 to node 3.
Now the picture is as follows:
Node 1 accepts logs from node 3 although it doesn't even report that node 3 exists. Node 2 accepts nothing; it's just a dummy.
How I fell into this situation: my task is a distributed application in which each node knows only the list of the other nodes. If a cluster isn't formed yet (e.g., it's the first run, or the previous cluster has decayed), it should be formed automatically once two or more nodes are alive, without a manual "run all applications, then pick a leader and register all other nodes on the leader" step. Therefore I tried the following naive approach: when a node starts, it adds (`add_srv`) the other nodes. But suddenly I found two problems:

1. What's going on in the examples?
2. How can the cluster be assembled automatically based only on the list of nodes (some of which may be offline at the moment)?
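For the second question, a minimal sketch under my own assumptions: combine the pre-seeded state manager shown above with a join loop that only ever runs on the elected leader, so that all membership changes flow through a single node, one at a time. The helper function and retry policy are invented here; `is_leader()`, `get_srv_config()`, `add_srv()`, and `get_accepted()` are NuRaft's actual calls:

```cpp
#include "libnuraft/nuraft.hxx"

#include <vector>

using namespace nuraft;

// Hypothetical helper (not part of NuRaft): called periodically on every
// node, it adds missing peers only when this node is the elected leader,
// and submits only one membership change at a time.
void try_add_peers(ptr<raft_server> server,
                   const std::vector<ptr<srv_config>>& peers) {
    // Followers must not call add_srv; funneling all changes through a
    // single leader is what prevents the "two clusters" picture above.
    if (!server->is_leader()) return;

    for (auto& p : peers) {
        if (server->get_srv_config(p->get_id())) continue;  // already a member
        auto ret = server->add_srv(*p);
        if (!ret->get_accepted()) {
            // Another config change may still be in flight; retry on the
            // next tick instead of piling up concurrent changes.
            break;
        }
    }
}
```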