Currently we make a simple connection to the node to check whether it's up. This means that even if something is broken, as long as the node accepts incoming connections, its status will be saved as `up`.
The easiest idea I have is to:
1. Create a new monitoring stellar-core instance.
2. Generate the config file from the `nodes.js` file, adding all nodes to the quorum set. Remove history entries (except the local one, which will obviously do nothing).
3. Check the `missing` field of the `/quorum` endpoint. If a node is listed there, it is down.
4. Regenerate the config file with new nodes every X hours. If the list has changed, restart core.
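The steps above could be sketched roughly as follows. This is only an illustration under assumptions, not a tested integration: the helper names (`render_quorum_config`, `config_changed`, `down_nodes`) are hypothetical, the `[QUORUM_SET]` layout with `THRESHOLD_PERCENT` and `VALIDATORS` follows the stellar-core example config as I understand it, and the exact shape of the `/quorum` response is not assumed beyond it containing one or more `missing` arrays of node identifiers that match the identifiers we monitor.

```python
def render_quorum_config(validators, threshold_percent=66):
    """Render a [QUORUM_SET] fragment listing every validator.

    Assumption: the monitoring instance accepts a flat quorum set with
    THRESHOLD_PERCENT and VALIDATORS keys. The default of 66 is arbitrary.
    """
    keys = sorted(set(validators))  # dedupe and sort so output is stable
    body = ",\n".join(f'    "{key}"' for key in keys)
    lines = [
        "[QUORUM_SET]",
        f"THRESHOLD_PERCENT={threshold_percent}",
        "VALIDATORS=[",
        body,
        "]",
    ]
    return "\n".join(lines) + "\n"


def config_changed(old_config, new_config):
    """True when the regenerated config differs, i.e. core needs a restart."""
    return old_config != new_config


def down_nodes(quorum_response, monitored):
    """Return the monitored nodes that appear in any `missing` list.

    The response is walked recursively, so the sketch does not depend on
    how the parsed JSON nests its per-slot information.
    """
    missing = set()

    def walk(value):
        if isinstance(value, dict):
            for key, child in value.items():
                if key == "missing" and isinstance(child, list):
                    missing.update(n for n in child if isinstance(n, str))
                else:
                    walk(child)
        elif isinstance(value, list):
            for item in value:
                walk(item)

    walk(quorum_response)
    return sorted(missing & set(monitored))
```

A periodic job would render the config from the current node list, compare it with the previous one via `config_changed`, restart core only on a difference, and otherwise pass the parsed `/quorum` JSON to `down_nodes` to decide which statuses to save as down.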
Questions:
1. What will happen when `len(missing) > fail_at`? Does the node continue updating quorum information?
2. Will it continue to work with, say, 200 nodes in the quorum set? I tested this with the 39 nodes we currently have in the Dashboard and it has been working fine (for a couple of minutes so far).
CC: @MonsieurNicolas @vogel