
[BUG] Runtime error when trying to save notification URL #216

Open
Stitch10925 opened this issue Oct 22, 2024 · 12 comments
Labels
troubleshooting Maybe bug, maybe not

Comments

@Stitch10925

Hi,

Every time I try to add a notification URL the Beszel container crashes with the following error:

panic: runtime error: invalid memory address or nil pointer dereference
 [signal SIGSEGV: segmentation violation code=0x1 addr=0x70 pc=0xf53789]
 
 goroutine 103 [running]:
 github.com/pocketbase/pocketbase/models.(*Record).Set(0xc00016d880, {0x162bdbc, 0x6}, {0x14020c0, 0xc0007a8e20})
 	/go/pkg/mod/github.com/pocketbase/[email protected]/models/record.go:305 +0x209
 beszel/internal/hub.(*Hub).updateSystem(0xc00010ae10, 0xc000603960)
 	/app/internal/hub/hub.go:310 +0x83b
 created by beszel/internal/hub.(*Hub).updateSystems in goroutine 13
 	/app/internal/hub/hub.go:252 +0x170

The URL I'm trying to save has the following format: ntfy://user:passw@hostname/topic

Also, sometimes when I refresh the browser it doesn't show me my systems anymore, but I guess that is another problem since no errors are logged for that.


PS: THANK YOU (!) for this project. I have been looking for a long time for a simple monitoring tool with centralized management for alerts. This tool is just perfect.

@henrygd
Owner

henrygd commented Oct 22, 2024

No worries.

Can you please go to the Export Collections page (/_/#/settings/export-collections), copy your collections, and paste them here?

Or download the JSON file and attach it.

@henrygd henrygd added the troubleshooting Maybe bug, maybe not label Oct 22, 2024
@Stitch10925
Author

Here is the export:

collections_export.json

@henrygd
Owner

henrygd commented Oct 23, 2024

Are you sure this is triggered when trying to save notification settings, or could the timing be coincidence?

The trace is not directly related to that. I was able to replicate the error by changing the name of the container_stats collection, which is why I wanted to double check your collection schema. But that looks fine.

This is also the first I've heard of systems sometimes not showing up.

Regardless, I'll refactor that section of code for the next release to prevent the panic from happening.

Can you let me know if you see any errors in the browser devtools console besides "ClientResponseError 0: The request was autocancelled"? That one is expected and harmless.

henrygd added a commit that referenced this issue Oct 23, 2024
* adds error handling for collection lookup (#216)
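The panic above comes from calling Set on a record obtained from a collection lookup that silently returned nil. A minimal sketch of the kind of guard the referenced commit adds, using stand-in types (Record and findRecord here are hypothetical simplifications, not PocketBase's actual API):

```go
package main

import (
	"errors"
	"fmt"
)

// Record is a minimal stand-in for a PocketBase record.
type Record struct{ fields map[string]any }

// Set writes a field value; calling this on a nil *Record panics.
func (r *Record) Set(key string, value any) { r.fields[key] = value }

// findRecord simulates a collection lookup that can fail, e.g. when a
// collection has been renamed. Returning an error instead of a nil
// record lets the caller bail out before dereferencing it.
func findRecord(collections map[string]*Record, name string) (*Record, error) {
	rec, ok := collections[name]
	if !ok || rec == nil {
		return nil, errors.New("collection not found: " + name)
	}
	return rec, nil
}

func main() {
	collections := map[string]*Record{
		"systems": {fields: map[string]any{}},
	}
	// Looking up a renamed collection now yields an error rather than
	// a nil pointer that would panic on rec.Set(...).
	if _, err := findRecord(collections, "container_stats"); err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
}
```

The point is simply that the lookup error is surfaced and handled instead of letting a nil record reach Set, which is what produced the SIGSEGV in the stack trace.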
@Stitch10925
Author

While I was trying to get the collection export for you I had a lot of crashes in Beszel. I also noticed that other services I had running on Docker were unstable. Tonight I had another look at it. I shut down some of the services and everything became stable again.

I assume that my NAS (which hosts the NFS volumes for my Docker services) isn't able to provide fast enough I/O for the Docker services. This is probably also why Beszel is behaving a bit odd. I am going to replace some of the disks in my NAS with SSDs to see if that fixes the issue.

@henrygd
Owner

henrygd commented Oct 23, 2024

Sounds good.

The specific error you ran into should be impossible now in 0.6.2, but let me know if any other weirdness continues.

@Stitch10925
Author

I added the SSDs, but the problem remains. Sometimes when I load Beszel no systems appear and I see the following in the browser's error logs:

[screenshot]

Then after a while the systems finally show up. Not sure if it is a problem with collecting info from the agents or not.

I also see the following errors:

[screenshot]

Not sure if they're relevant or not.

@Stitch10925
Author

Just saw this in the logs of the agents:

[screenshot]

Not sure if that is causing any delays or issues towards the app.

@henrygd
Owner

henrygd commented Oct 25, 2024

Merging your other issue #221 here. I think these are all connected to an underlying issue on your system, probably with Docker.

In regard to "Something went wrong while processing your request," please go to the logs page /_/#/logs and search for "error" -- you may find more information about what's going wrong.

For the concurrent map write - the most likely cause is the agent getting stuck somehow in the Docker related code for so long that the hub re-requests the stats while the previous call is still running. I'll try to add a check for this, but it seems like a symptom of a larger issue.
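The "concurrent map write" crash described above is what Go's runtime raises when two goroutines write to the same map without synchronization. A minimal, hedged sketch of the kind of check that prevents it (the statsCollector type is hypothetical, not Beszel's actual agent code): guard the shared stats map with a mutex so an overlapping request from the hub cannot race a still-running collection pass.

```go
package main

import (
	"fmt"
	"sync"
)

// statsCollector is a sketch of per-container metrics keyed by name.
// The mutex guards against the hub issuing a second stats request
// while a previous (slow) collection pass is still writing the map;
// without it, Go aborts with "fatal error: concurrent map writes".
type statsCollector struct {
	mu    sync.Mutex
	stats map[string]float64
}

// update records a CPU reading for one container under the lock.
func (c *statsCollector) update(name string, cpu float64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.stats[name] = cpu
}

func main() {
	c := &statsCollector{stats: make(map[string]float64)}
	var wg sync.WaitGroup
	// Two simulated "hub requests" writing concurrently.
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				c.update(fmt.Sprintf("container-%d", j%10), float64(id))
			}
		}(i)
	}
	wg.Wait()
	fmt.Println("entries:", len(c.stats))
}
```

An alternative is to serialize the requests themselves (e.g. skip or queue a new stats request while one is in flight), which also addresses the "symptom of a larger issue" point: the lock stops the crash but not the slowness that causes calls to overlap.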

Here are some things that would be helpful to know:

  1. Were the issues introduced after upgrading to a certain version, or have you always experienced this?
  2. Are you running the agent and hub on the same system?
  3. If you have agents on different systems, do they all display similar issues?
  4. What is the OS and architecture of your system(s)?
  5. If you run docker stats on the agent system, does the information look correct with no zero values?

Please put the agent in debug log level and let it run for a few minutes so it fields some requests from the hub. Attach or paste the output here.

Thanks

@Stitch10925
Author

For context:

I have been running Beszel on Docker Swarm. I have replicated the agent to all nodes in the Swarm and was running the app on one of the nodes. The volumes are served over NFS, which SQLite is not very happy about, but I haven't had too many issues with it in the past.

However, PocketBase and/or Beszel seem to be very sensitive to I/O speed. I have changed the Beszel volume to use the node's filesystem instead of my NFS share, and it is much more stable now. But, of course, I lose the advantages of the Swarm doing it this way.
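For anyone hitting the same thing, the node-local workaround can be sketched roughly like this in a Swarm stack file. This is an assumption-laden illustration, not an official configuration: the node hostname is hypothetical, and the data path should match your own setup (SQLite's file locking is known to be unreliable over NFS, which is why a local bind mount helps).

```yaml
services:
  beszel:
    image: henrygd/beszel
    volumes:
      # Local path on the pinned node instead of the NFS share.
      - /var/lib/beszel:/beszel_data
    deploy:
      placement:
        constraints:
          # Hypothetical node name: pin the hub to one node so its
          # data directory always lives on that node's local disk.
          - node.hostname == node1
```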

@henrygd
Owner

henrygd commented Oct 26, 2024

Gotcha. This definitely sounds like a compatibility or configuration issue with swarm. I don't use it myself so I haven't done any testing on it.

There's a related issue you can check out here: #17

From your logs screenshot it looks like your agents may be handling two simultaneous calls from the hub. This would explain your concurrent map write error. Not sure if this is because there are two instances of the hub, or swarm is just grabbing the first node to respond, like the issue above.

I do want to add an option for agent -> hub data flow at some point which should fix this. But for now you probably need to use the same workaround that's explained in the linked issue.

@saket1999

Hi, I am also getting a similar issue. Not sure what is triggering it; I was just going through the data after updating Beszel to 0.8.0. I am attaching both the export and the logs.
beszel.log
pb_schema.json

@henrygd
Owner

henrygd commented Nov 15, 2024

@saket1999 Thanks, I'll fix that in the next release.

3 participants