[5/15 Bodies] No block bodies to write in this log period block number=27910849 #51
Comments
The workaround has been to detect when the node is stuck and restart the service, with something like this:
#!/bin/bash
# Variables
LOG_FILE="bsc.log"
PATTERN="No block bodies to write"
MIN_OCCURRENCES=5
NODE_RESTART_CMD="systemctl restart your-node.service"
# Return 0 if the pattern occurs MIN_OCCURRENCES times in a row with the same block number
count_occurrences() {
    local log_file="$1"
    local pattern="$2"
    local count=0
    local prev_block_number=""
    local line block_number
    # Feed the loop through process substitution instead of a pipe, so it runs
    # in the current shell and "return 0" actually leaves the function.
    while read -r line; do
        block_number=$(grep -oP 'block number=\K\d+' <<< "$line")
        if [[ "$block_number" == "$prev_block_number" ]]; then
            count=$((count + 1))
        else
            count=1
        fi
        if (( count >= MIN_OCCURRENCES )); then
            return 0
        fi
        prev_block_number="$block_number"
    done < <(grep -F "$pattern" "$log_file")
    return 1
}
# Main loop
while true; do
    if count_occurrences "$LOG_FILE" "$PATTERN"; then
        echo "Restarting node due to consecutive occurrences of '$PATTERN' with the same block number in $LOG_FILE"
        $NODE_RESTART_CMD
    fi
    sleep 60
done |
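One thing to watch with the script above: it rescans the whole log every minute, so matches written before a restart are counted again on the next pass. A minimal sketch of a check that only looks at the most recent lines, assuming the log format shown in the issue title; `is_stuck`, the default threshold of 5 and the 200-line window are hypothetical choices, not something from this thread:

```shell
# Hypothetical helper: succeeds when the last WINDOW lines of LOG_FILE contain
# at least MIN consecutive PATTERN matches that carry the same block number.
# Usage: is_stuck LOG_FILE PATTERN [MIN] [WINDOW]
is_stuck() {
    tail -n "${4:-200}" "$1" | grep -F -- "$2" | awk -v min="${3:-5}" '
        match($0, /block number=[0-9]+/) {
            # "block number=" is 13 characters; keep only the digits after it.
            block = substr($0, RSTART + 13, RLENGTH - 13)
            count = (block == prev) ? count + 1 : 1
            prev = block
            if (count >= min) { found = 1; exit }
        }
        END { exit (found ? 0 : 1) }
    '
}
```

It could replace `count_occurrences` in the main loop, for example `if is_stuck "$LOG_FILE" "$PATTERN" "$MIN_OCCURRENCES"; then ...`. The exit status comes from `awk`, the last command in the pipeline, so no subshell `return` is involved.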
We suggest using the latest mdbx snapshot for archive nodes: https://github.com/bnb-chain/bsc-snapshots |
Following our call today, here are the 'ETL' logs we see in the client container
We are not sure that these logs are relevant to the issue, but we also run other networks on Erigon without issue, and we don't see these 'ETL' logs there. Also, could you provide the configuration you use for your own archive and full nodes? |
got it, we are checking on it. |
@broyeztony For the very short term, you could use something like the following script to detect "No block write.." and restart automatically. For the long term, we're carefully analyzing the p2p sentry code and will optimize it soon.
|
Also, you can refer to this script |
A restart does not work at all for us. This error continues to come without the node syncing. We used the latest snapshot on the official page. Is there any workaround? |
After every restart:
And it continues to never sync, only |
@broyeztony @fylsan3 After careful debugging, we found two root causes:
Please let us know if you run into any issues |
@calmbeing thank you for the update. Despite running with the suggested parameters on Also, these are the startup command parameters for reference
|
I assume you should use v1.0.8 with the latest fixes. I deployed it yesterday with the flag as was recommended by calmbeing and so far erigon has been working for 24+ hours flawlessly and without |
@yusben Thank you for your inputs.
Using On average, our best nodes take ~20m to sync ~400 blocks. Pasting an example below:
As a consequence, our nodes always fall behind head by > 400 blocks. |
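A rough way to see why the gap does not close at that rate: assuming BSC produces about one block every 3 seconds (an assumption, not stated in this thread), the chain itself advances roughly 400 blocks in 20 minutes, so a node syncing ~400 blocks in the same window only keeps pace with the head. `net_gain` is a hypothetical helper name used for illustration:

```shell
# Hypothetical helper: blocks gained on the chain head over a time window.
# Usage: net_gain BLOCKS_SYNCED WINDOW_SECONDS BLOCK_TIME_SECONDS
net_gain() {
    # Blocks the chain produced during the window (integer division).
    chain_blocks=$(( $2 / $3 ))
    echo $(( $1 - chain_blocks ))
}

# ~400 blocks synced in ~20 minutes, assuming a 3-second block time
net_gain 400 $((20 * 60)) 3   # prints 0
```

A result of zero or less means the node is not catching up, whatever its starting lag was.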
@broyeztony, good to hear that. Your slow sync problem should be resolved by the latest commit cc23b96; we'll draft a release later. |
Closed, since sync seems much smoother now. We can reopen if it recurs. |
I am running a pruned node on a bare-metal server with 3 NVMe SSDs in RAID 0 (ext4). It has now been working for 10 days without
|
System information
Erigon version:
2.40.0-dev-3da15fcb
Erigon Command (with flags/config):
Consensus Layer:
Consensus Layer Command (with flags/config):
Chain/Network: bsc / 56
Expected behaviour
node sync to head and remain synced consistently
Actual behaviour
The node stopped syncing (here, stuck at the Bodies stage) and drifted away from the chain's head without any change of configuration.
Also, please note that we already use the config flags mentioned in #41.
The solutions proposed there using the integration command do not work either.
Steps to reproduce the behaviour
Launch the client; sync gets stuck at a fixed block number.
Backtrace