Crashing with DaemonError: {'code': -32700, 'message': 'Parse error'} (RPC_PARSE_ERROR) #238
It is the first trace that is more interesting. Looks like e-x sent a

What version of bitcoind do you use? How are the two processes configured, e.g. is bitcoind on the same machine as electrumx? |
Can you check the bitcoind logs around the same time the crash happened? Perhaps it logged something more about the parse error. |
It's bitcoind 25.1.0 running locally on the same machine, configured with rpcuser/rpcpassword and an rpcworkqueue of 250. Nothing special in the bitcoind logs during that time, I guess:
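For reference, a setup like the one described would roughly correspond to a bitcoin.conf along these lines (a sketch; the credential values are placeholders, only rpcworkqueue=250 is taken from this report):

```ini
# bitcoin.conf (sketch, not the reporter's actual file)
server=1                 # accept JSON-RPC commands
rpcuser=electrumx        # placeholder credential
rpcpassword=CHANGE_ME    # placeholder credential
rpcworkqueue=250         # depth of the RPC work queue, per the report
```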
Something else seems broken: with "electrumx_rpc getinfo" I get the current latest blockchain height for both daemon height and db height, but when I connect to my server with Electrum it tells me the server is lagging 30 blocks, and https://1209k.com/bitcoin-eye/ele.php reports it as BEHIND as well. |
Did this error only happen once? Or does it reproduce somehow?
Is this still the case or did the server recover? Is the process at 100% cpu usage?
How beefy is the hardware? Is there anything else connected to bitcoind besides electrumx? |
I have also just started seeing the height mismatch on my server, cpu usage is fine and I don't see any other errors though. |
The parse error happened again afterwards, and before as well; I have five occurrences in my syslog (it doesn't appear in the older rotated .gz files):
The one on Nov 27 16:48:00 was just that single line, nothing else, no crash. On one of the other parse errors the server did crash again (Nov 24 04:59), but on Nov 21 14:58 and Nov 24 19:53 it didn't crash. I've looked into Bitcoin's debug.log at the times of all five parse errors but never saw anything special.
On Nov 21 I still had Bitcoin version 0.24 running.
On Nov 24 I already had Bitcoin version 0.25. Next one:
The other issue: electrumx apparently didn't process any new blocks at all, and after a day or so I restarted it; now they are up to date again. During the lag I didn't notice any high CPU usage, nor RAM or I/O. Edit: electrumx stopped processing blocks again without anything interesting in the logfile during that time. Block processing usually takes between 1s and 11s on this system, and there is no high CPU nor I/O load. In the electrumx log I see about 10 "timed out after 30 secs" messages for transaction.get or transaction.broadcast per minute; it's about the same amount at the beginning, when blocks were still being processed, and afterwards, when they weren't. Currently "electrumx_rpc getinfo" gives me:
Where 818735 is also the height reported when connecting via Electrum. At the same time, "bitcoind getblockcount" on that system reports the correct current block, 818756. With "bitcoin-cli getrpcinfo" I see only the active getrpcinfo call, nothing else. |
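The mismatch described here can be checked mechanically by comparing the two heights; a minimal sketch (the function name and tolerance are illustrative, not electrumx internals):

```python
def is_lagging(db_height: int, daemon_height: int, max_lag: int = 2) -> bool:
    """Return True when the indexer's DB height has fallen behind bitcoind
    by more than max_lag blocks (a gap of a block or two right after a new
    block is normal; a persistent large gap is not)."""
    return daemon_height - db_height > max_lag

# With the heights from this report, 818756 - 818735 = 21 blocks behind:
print(is_lagging(818735, 818756))  # True
```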
see #238 So far it is unclear why bitcoind sends RPC_PARSE_ERROR, but instead of doing a full shutdown, let's log the error(+request) and retry.
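The "log the error (+request) and retry" behaviour described here could look roughly like the following sketch; DaemonError, the send callable, and the retry/delay parameters are simplified stand-ins, not electrumx's actual code:

```python
import asyncio
import logging

RPC_PARSE_ERROR = -32700  # JSON-RPC code bitcoind uses for 'Parse error'

class DaemonError(Exception):
    """Raised when the daemon returns a JSON-RPC error object."""
    def __init__(self, error):
        super().__init__(error)
        self.error = error

async def send_with_retry(send, request, retries=3, delay=1.0):
    """On RPC_PARSE_ERROR, log the error together with the request that
    triggered it and retry, instead of letting it crash the process."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return await send(request)
        except DaemonError as e:
            if e.error.get('code') != RPC_PARSE_ERROR:
                raise  # other daemon errors keep their original handling
            logging.error('daemon parse error for request %r (attempt %d/%d)',
                          request, attempt, retries)
            last_error = e
            await asyncio.sleep(delay)
    raise last_error  # give up after exhausting the retries
```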
Ok, so it was not a one-off for you then. In that case, I've pushed a small change, please try with that: 3148936
Perhaps this is not related at all then. Still, I am interested in what will get logged re
DaemonError: {'code': -32700, 'message': 'Parse error'} (RPC_PARSE_ERROR) |
Thanks! I am now running the current main branch. |
I have seen something similar on an LTC electrumx; I tried to capture the full aiohttp error with rich.console, which I've attached. I have also seen the prefetcher stopping without a parse error: it will just stop, and I don't see any error around that time to suggest why. Is there something that could be done to make the prefetcher more resilient when the RPC node isn't located on the same VM or LAN segment? The attachment should be read bottom to top. Edit: attachment inlined and reversed: log
|
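One generic pattern for surviving a flaky link to a remote node is to wrap the fetch call in capped exponential backoff, so transient network failures are waited out rather than fatal; a sketch under that assumption (names are illustrative, not electrumx internals):

```python
import asyncio
import random

async def fetch_with_backoff(fetch, *, base=1.0, max_delay=30.0):
    """Retry a coroutine on transient network errors with capped
    exponential backoff plus jitter, so a remote daemon dropping
    connections doesn't permanently stop the caller."""
    delay = base
    while True:
        try:
            return await fetch()
        except (OSError, asyncio.TimeoutError):
            # Transient network failure: wait, then try again.
            await asyncio.sleep(delay + random.uniform(0, delay / 2))
            delay = min(delay * 2, max_delay)  # cap the growth
```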
@biatwc in your log, there is:
The first two are handled properly, I think; they are just logged and then we retry. I am not sure where the cancellation originates from -- that is what ultimately kills the prefetcher. |
I've looked at the logs around the time of the CancelledError on a number of occasions, and there is nothing other than the usual INFO-level messages. |
How often does this bug happen for you? Is it only the prefetcher stopping, is the process otherwise still running and serving sessions? I guess you don't have a reliable way to reproduce, do you? :P
It can be a lot of things. For example, if you intentionally shut down electrumx in a clean way, that too happens via cancellations (explicit calls to task.cancel, with multiple CancelledErrors propagating out and being caught at some locations), so we can't just catch CancelledError and retry :/ |
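The shutdown concern can be demonstrated in isolation: a task that swallows CancelledError in order to "retry" can never be cleanly stopped by task.cancel(), whereas one that lets the exception propagate cancels immediately (a simplified standalone sketch, not electrumx code):

```python
import asyncio

async def stubborn_worker():
    """Anti-pattern: swallowing CancelledError to keep retrying means
    task.cancel() can never cleanly stop this task -- shutdown hangs."""
    while True:
        try:
            await asyncio.sleep(3600)   # stand-in for prefetch work
        except asyncio.CancelledError:
            continue                    # cancellation is lost here

async def polite_worker():
    """Correct: let CancelledError propagate so shutdown works."""
    while True:
        await asyncio.sleep(3600)       # cancellation raises here and escapes

async def demo():
    # Cancel the well-behaved worker and confirm it actually finishes.
    t = asyncio.create_task(polite_worker())
    await asyncio.sleep(0)              # let the worker reach its first await
    t.cancel()
    try:
        await t
    except asyncio.CancelledError:
        pass
    return t.cancelled()                # True: cancellation completed
```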
Hi, I am running the newest commit on Ubuntu 22.04 LTS, and ElectrumX crashed with this "Fatal error on SSL transport" and "Bad file descriptor" error. Afterwards it was possible to start it again without issues; I hope there are no corrupt files. Maybe issue #22 was similar?