When starting a receptor node, the RECEPTOR_NODE_ID environment variable can be set and receptor will use its value as the node ID. The problem is that nothing validates that the specified node ID is not already in use by another node in the mesh.
When more than one node has the same ID, message routing does not work as expected. This can lead to message loss, because a message may be routed to the wrong node.
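To illustrate the collision, here is a minimal toy model of a routing table keyed by node ID. It is only a sketch and not receptor's actual router: `ToyRouter`, `register`, `register_checked`, and `deliver` are hypothetical names invented for this example. It shows how an unchecked registration lets a second node silently take over an ID, and what a uniqueness check could look like.

```python
class ToyRouter:
    """Toy routing table keyed by node ID (hypothetical, not receptor's router)."""

    def __init__(self):
        self.routes = {}  # node ID -> connection label

    def register(self, node_id, connection):
        # No uniqueness check: a later registration replaces the earlier one.
        self.routes[node_id] = connection

    def register_checked(self, node_id, connection):
        # Hypothetical validation: refuse an ID that is already in use.
        if node_id in self.routes:
            raise ValueError(f"node ID {node_id!r} is already in use")
        self.routes[node_id] = connection

    def deliver(self, recipient_id):
        # Return the connection that would receive a message for this ID.
        return self.routes[recipient_id]


router = ToyRouter()
shared_id = "15477521-bcc0-446d-abc3-e3d80d57ec6b"
router.register(shared_id, "ping-a")
router.register(shared_id, "ping-b")  # same ID, different node
print(router.deliver(shared_id))  # ping-b
```

In this model every message addressed to the shared ID reaches `ping-b`, so `ping-a` never sees its own responses. With `register_checked`, the second registration would raise instead.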
This is easy to reproduce. First, run a three-node mesh:
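$ poetry run receptor --debug --node-id=controller -d /tmp/controller controller --listen=receptor://127.0.0.1:9999
$ poetry run receptor --debug --node-id=node-a -d /tmp/node-a node --listen=receptor://127.0.0.1:9998 --peer=receptor://localhost:9999
$ poetry run receptor --debug --node-id=node-b -d /tmp/node-b node --listen=receptor://127.0.0.1:9997 --peer=receptor://localhost:9998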
The above will start a mesh where controller -> node-a -> node-b. Then run two ping commands in parallel, one pinging node-a and the other pinging node-b. Use the controller node as the peer for both ping commands and set the same RECEPTOR_NODE_ID for both:
$ export RECEPTOR_NODE_ID="15477521-bcc0-446d-abc3-e3d80d57ec6b"
$ poetry run receptor -d /tmp/ping-a ping --peer=receptor://127.0.0.1:9999 --delay 0 --count 10 node-a
{"initial_time": "2020-03-06T18:27:28.124886", "response_time": "2020-03-06 18:27:28.158935", "active_work": []}
{"initial_time": "2020-03-06T18:27:28.130734", "response_time": "2020-03-06 18:27:28.164155", "active_work": []}
{"initial_time": "2020-03-06T18:27:28.138053", "response_time": "2020-03-06 18:27:28.179573", "active_work": []}
{"initial_time": "2020-03-06T18:27:28.146100", "response_time": "2020-03-06 18:27:28.192977", "active_work": []}
{"initial_time": "2020-03-06T18:27:28.158007", "response_time": "2020-03-06 18:27:28.206066", "active_work": []}
{"initial_time": "2020-03-06T18:27:28.162428", "response_time": "2020-03-06 18:27:28.211155", "active_work": []}
WARNING 2020-03-06 13:27:28,237 receptor Received response to acccd167-2361-4b85-8d86-655bf1c70489 but no record of sent message.
WARNING 2020-03-06 13:27:28,247 receptor Received response to 2624d5ed-6815-481a-b16a-ef13844087ac but no record of sent message.
WARNING 2020-03-06 13:27:28,252 receptor Received response to 2248820e-0009-42f6-91b6-93cce83ef8df but no record of sent message.
WARNING 2020-03-06 13:27:28,256 receptor Received response to ebd639c8-1d23-4757-8270-247a1b357257 but no record of sent message.
^C
$ export RECEPTOR_NODE_ID="15477521-bcc0-446d-abc3-e3d80d57ec6b"
$ poetry run receptor -d /tmp/ping-b ping --peer=receptor://127.0.0.1:9999 --delay 0 --count 10 node-b
{"initial_time": "2020-03-06T18:27:28.120859", "response_time": "2020-03-06 18:27:28.156398", "active_work": []}
{"initial_time": "2020-03-06T18:27:28.125913", "response_time": "2020-03-06 18:27:28.172401", "active_work": []}
{"initial_time": "2020-03-06T18:27:28.133262", "response_time": "2020-03-06 18:27:28.188726", "active_work": []}
WARNING 2020-03-06 13:27:28,207 receptor Received response to 9762e4e9-3fba-414c-9e4f-496ca0382634 but no record of sent message.
{"initial_time": "2020-03-06T18:27:28.140208", "response_time": "2020-03-06 18:27:28.205756", "active_work": []}
WARNING 2020-03-06 13:27:28,232 receptor Received response to 0e5de5d0-9c3e-40e0-b0db-3672c1b3399b but no record of sent message.
WARNING 2020-03-06 13:27:28,244 receptor Received response to 327ec7dc-d779-451e-988d-ba01e99bb36d but no record of sent message.
WARNING 2020-03-06 13:27:28,250 receptor Received response to c7547678-997e-4c38-8b29-d243d9e0ca4b but no record of sent message.
{"initial_time": "2020-03-06T18:27:28.163389", "response_time": "2020-03-06 18:27:28.237886", "active_work": []}
{"initial_time": "2020-03-06T18:27:28.170925", "response_time": "2020-03-06 18:27:28.246019", "active_work": []}
^C
Observe the WARNING messages in the logs of both ping commands. Because both ping nodes had the same node ID, the router delivered some responses to the node that had not sent the corresponding message.
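The warning itself can be modeled with a small sketch. This is a toy illustration under assumptions, not receptor's implementation: `ToyPingClient`, `send`, and `receive` are hypothetical names. It assumes each client keeps a record of the message IDs it has sent, and that a response arriving for an ID it never sent produces the warning seen in the logs.

```python
import uuid


class ToyPingClient:
    """Toy client that tracks its own sent message IDs (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.pending = set()  # IDs of messages sent and still awaiting a response
        self.warnings = []

    def send(self):
        message_id = str(uuid.uuid4())
        self.pending.add(message_id)
        return message_id

    def receive(self, in_response_to):
        # A response is only expected if this client sent the original message.
        if in_response_to in self.pending:
            self.pending.discard(in_response_to)
            return True
        self.warnings.append(
            f"Received response to {in_response_to} but no record of sent message."
        )
        return False


ping_a = ToyPingClient("ping-a")
ping_b = ToyPingClient("ping-b")

sent_by_a = ping_a.send()
# Both clients share one node ID, so the mesh cannot tell them apart.
# Here the response to ping-a's message is handed to ping-b instead.
ping_b.receive(sent_by_a)

print(ping_b.warnings[0])
print(len(ping_a.pending))  # 1
```

In this model `ping-b` logs the warning while `ping-a` is left with a pending message that is never answered, which matches the missing responses in both logs above.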
All the above is summarized by the following: