Aim for minimal use of TCP in the Derecho code base #118
Comments
Of course, RDMA connections need to be set up, and that requires TCP (or some other out-of-band mechanism) to exchange data such as memory addresses and queue pair information. When I say TCP-free, I do not mean that we get rid of that. It is only required at connection setup time and involves only a few bytes of data transfer, so it does not factor into performance at all.
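To illustrate how small that setup exchange is, here is a minimal sketch in C++. It is not Derecho's actual code: the `ConnInfo` struct, its fields, and the `exchange` helper are hypothetical names chosen for illustration, and the sketch assumes both peers share the same byte order and swap one fixed 16-byte record over an already-connected stream socket.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical per-connection metadata two peers swap before RDMA traffic starts.
struct ConnInfo {
    uint32_t qp_num;       // queue pair number
    uint64_t remote_addr;  // address of the registered memory region
    uint32_t rkey;         // remote access key for that region
};

constexpr std::size_t kWireSize = 16;  // 4 + 8 + 4 bytes, no padding on the wire

// Pack the fields into a fixed-size buffer (host byte order, an assumption).
std::array<unsigned char, kWireSize> encode(const ConnInfo& info) {
    std::array<unsigned char, kWireSize> buf{};
    std::memcpy(buf.data(), &info.qp_num, 4);
    std::memcpy(buf.data() + 4, &info.remote_addr, 8);
    std::memcpy(buf.data() + 12, &info.rkey, 4);
    return buf;
}

ConnInfo decode(const std::array<unsigned char, kWireSize>& buf) {
    ConnInfo info{};
    std::memcpy(&info.qp_num, buf.data(), 4);
    std::memcpy(&info.remote_addr, buf.data() + 4, 8);
    std::memcpy(&info.rkey, buf.data() + 12, 4);
    return info;
}

// Loop until every byte is written; a stream socket may accept fewer bytes per call.
bool write_all(int fd, const unsigned char* data, std::size_t len) {
    while (len > 0) {
        ssize_t n = ::write(fd, data, len);
        if (n <= 0) return false;
        data += n;
        len -= static_cast<std::size_t>(n);
    }
    return true;
}

// Loop until exactly len bytes have been read, or the peer closes or errors.
bool read_all(int fd, unsigned char* data, std::size_t len) {
    while (len > 0) {
        ssize_t n = ::read(fd, data, len);
        if (n <= 0) return false;
        data += n;
        len -= static_cast<std::size_t>(n);
    }
    return true;
}

// Send our record, then receive the peer's: 16 bytes in each direction.
bool exchange(int fd, const ConnInfo& local, ConnInfo& remote) {
    auto out = encode(local);
    if (!write_all(fd, out.data(), out.size())) return false;
    std::array<unsigned char, kWireSize> in{};
    if (!read_all(fd, in.data(), in.size())) return false;
    remote = decode(in);
    return true;
}
```

Each side would call `exchange` once per peer on a connected TCP socket and then close it. The whole bootstrap costs 32 bytes per connection, which is why it has no bearing on steady-state performance.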
From my understanding of ViewManager, there are three remaining tasks that use TCP:
I created a separate issue, #157, for the task of converting state transfer to use RDMA, because this seems like it has the greatest potential for performance improvement and is easiest to separate from the other tasks.
Since it's not likely that we will completely eliminate TCP sockets from ViewManager any time soon (see issues #118 and #157), we should at least make our usage of TCP less confusing. The port named "rpc_port" in all of our configuration files is actually not used for RPC operations at all, but for transferring Views and object state between nodes during a view change. Renaming this port will make it clear that there is no RPC activity going over TCP.
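One way such a rename could be rolled out without breaking existing configuration files is to look up the new key first and fall back to the old one. The sketch below is only an illustration under stated assumptions: the key name `state_transfer_port` is a placeholder for whatever name is chosen, the configuration is modeled as a plain string map rather than Derecho's real config layer, and `lookup_transfer_port` is a hypothetical helper.

```cpp
#include <initializer_list>
#include <map>
#include <optional>
#include <string>

// Hypothetical helper: resolve the TCP port used for View and state transfer.
// "state_transfer_port" is an assumed new name; "rpc_port" is the legacy key.
std::optional<int> lookup_transfer_port(const std::map<std::string, std::string>& cfg) {
    for (const char* key : {"state_transfer_port", "rpc_port"}) {
        auto it = cfg.find(key);
        if (it != cfg.end()) {
            return std::stoi(it->second);  // throws if the value is not numeric
        }
    }
    return std::nullopt;  // neither key is present
}
```

With this ordering, a file that still says `rpc_port` keeps working, and a file that defines both keys uses the new one.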
Edward's work seems to have resolved this issue as of v2.0.
Since libfabric can emulate RDMA over TCP, we should just use RDMA operations throughout the code so that we are TCP-free on a cluster that has RDMA. What happens when we add external clients will not be affected by this. We have already migrated our P2P layer to use RDMA. A major (and maybe the only) subsystem that still uses TCP is the view change system, which uses it to send states, logs, views, etc. We should create some temporary RDMA P2P connections to transfer these at high speeds.