-
-
Notifications
You must be signed in to change notification settings - Fork 363
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Understanding use of raft+serf in jocko #140
Comments
What it says in that article worries me, because I believe that membership changes of a consensus cluster need to be done in a controlled manner. In other words:
The section of that article headed "How Jocko handles state changing requests" is fine. It shows topic state updates going via Raft, which is the right thing to do. But that process doesn't require any gossip. I suppose the fundamental question is this: in the long run, is it intended that in a Jocko cluster, ALL brokers are Raft servers, or A SUBSET of brokers are Raft servers? If the answer is "ALL brokers": then there is no need for gossip, since every broker is a raft node, and it can learn all it needs to know through raft. But this setup won't scale: a 20-node broker cluster also implies a 20-node Raft cluster. Plus, there will be problems growing and shrinking the cluster, as you have to change the raft membership at the same time as you change the broker topology. If the answer is "A SUBSET of brokers": then the gossip protocol does have value (to locate at least one Raft server node), and it will scale. But then you effectively have a separate pool of brokers and a separate pool of raft servers. You might as well just run a separate consul or etcd, which is much simpler to understand and manage. |
@candlerb so if am understanding you correctly, serf’s role with service discovery and configuration is redundant because raft can store all the config that needed. Therefore we can remove serf as a dependency? |
IFF every broker will also be a raft node/replica, then there's no need for service discovery. Consul separates the concepts for scalability. If you have 1000 nodes in a data centre, you definitely do not want 1000 raft replicas (with a quorum of 501) as it would be terribly inefficient. What you want is 3 or 5 stable raft replicas, plus a way for the other 997 or 995 nodes to find them. |
Not all brokers will be raft node's/replicas cause it's not necessary. The plan is to make only a subset of brokers have to run raft, all run serf for service discovery. |
The point for building in serf/raft is that it isn't that much work to do so, and I don't have to rely on another service. There are also advantages in terms of control and more hooks to tie into. |
IMO that is the right answer; the one which scales anyway. But looking at the command line options, there doesn't seem to be any way at the moment to create a cluster with a mixture of raft and non-raft brokers; add or remove a raft broker; add or remove a non-raft broker; and so on. These are non-trivial operations, especially changing the number of raft nodes. You might also want to have nodes which run raft but are not a broker at all (e.g. a set of reliable, well-connected machine but without much disk space). There is much complexity to add yet before this works. However, if you use a separate service like consul or etcd, all these aspects are taken care of, and documented. To me, that's the right way to do it: follow the Unix model. Each component does one thing and does it well. Connect together the components you need. |
Are there any plans to support etcd in addition to the current raft+serf schema? |
I'd like to better understand how jocko is intended to use raft and serf, inclulding how the number of broker nodes is managed (i.e. how this interacts with performing membership changes in the consensus algorithm)
I'll start by outlining a couple of other systems to compare.
Kafka
In Kafka, you have a completely separate Zookeeper instance for storing state with its consensus algorithm. You could have (for example) a 3-node Zookeeper, but a cluster of 10 Kafka brokers.
AFAICS, Zookeeper is manually configured with its set of nodes in
zookeeper.properties
, and Kafka is configured with the list of Zookeeper nodes to communicate with inzookeeper.connect
There is a very specific process for growing the number of nodes in the Zookeeper cluster without breaking quorum.
Consul
Consul's architecture has:
Configuration of each agent marks whether it is a
-server
or not. It's recommended that no more than 5 agents are running in server mode. The process of adding and removing server nodes without breaking Raft is documented. In particular, to shrink the Raft cluster you have to issue a "leave" command to the running cluster.Adding a new server is straightforward. You can issue a
join
command pointing at any existing node (even a non-server node). It will learn the set of server nodes and configure itself correctly.Jocko
This is where I'm unclear. It seems to me that the design of Jocko is that every broker is also a Raft node. This has a number of implications:
There seems to be little point in having a discovery/gossip protocol (serf) when you already know that the local broker is a member of the raft cluster. Finding the leader must be done via the raft protocol anyway.
Increasing the number of brokers will increase the size of the raft cluster, possibly beyond the optimum of 5, or to an even number.
Adding or removing brokers for data scaling/replication must also change the membership of the Raft cluster - a process which must be done with care. I can't see a way to issue a "leave" command to a running node. This is not something you could do over the Kafka protocol, so does this mean that each Jocko node will need a separate management socket for sending such commands?
Questions
It's not clear to me why Jocko chose to integrate Raft and Serf directly, rather than make use of a separate service (e.g. consul, etcd). Certainly it makes firing up a quick one-node cluster easier if there's only one daemon. But longer term, I think that a separate consensus cluster would be better tested, have well-documented bootstrap / scale up / scale down semantics, and allow you to scale the number of broker nodes separately from the number of consensus nodes.
It might be that Jocko eventually plans to allow different types of node (Broker+Raft and Broker Only). If so, I think it will require a bunch more configuration which will make it more complicated to setup - at which point, a separate consensus service would probably be clearer and simpler to operate, especially as you may already have it running for other purposes.
Also, the process for safely removing a Broker+Raft node will have to be carefully designed and tested - whereas with Kafka, the process for removing a broker node (which involves redistributing topics between brokers) is completely independent from removing a zookeeper node (which involves changing the member set of the consensus protocol)
The text was updated successfully, but these errors were encountered: