-
Hi, I've been looking at slurm-web for a couple of weeks now, as I have two production clusters I'd like to deploy it on. I started by deploying a localhost instance on each cluster separately, to make sure everything is stable.
On one cluster it has worked perfectly, with no issues, just by following the Quickstart guide. On the second cluster, which uses the same major version of RHEL and the same installation method, everything starts but slurm-web-gateway has issues. When I move between pages in this gateway, I constantly get the following error:
Server error: Request error: canceled
I've tried running the gateway interactively in debug mode and I see no issues there; similarly, I can't see any error messages coming from the agent on this cluster. I can't find any reference to these errors in the command line and was wondering whether I have encountered a bug.
Could you please let me know what information I need to provide to help diagnose this? Thanks.
-
Additionally, I have been seeing:
Server error: Request error: Network Error
-
Hello @atb1r21, I just converted the issue you opened (#455) into this discussion, as this is more a support request than an actual bug, at least for now.
Did you enable the redis cache on the agent? For reference, see: https://docs.rackslab.io/slurm-web/install/quickstart.html#cache
If not, you should definitely do that first, as it saves many requests to slurmrestd and speeds up page rendering a lot in most cases.
Does this second cluster have more jobs than the first one? When there are thousands of jobs in the slurmctld queue, slurmrestd can be slow to render the list of jobs. You can test it with:
$ time curl --silent --unix-socket /run/slurmrestd/slurmrestd.socket http://slurm/slurm/v0.0.40/jobs
If this command takes many seconds to complete, this is probably the root cause. If not, I would then look in the browser developer console to identify which network requests specifically take a long time to get a response.
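To repeat that timing check a few times, the command above can be wrapped in a small script. This is only a sketch: the function names `time_request` and `is_slow` and the 5-second threshold are assumptions made for illustration, not part of slurm-web or Slurm, and the socket path and API version are simply the ones from the command above. It relies on curl's `--write-out '%{time_total}'` to print the elapsed time instead of the shell `time` builtin.

```shell
#!/bin/sh
# Illustrative helper functions (hypothetical names, not shipped with slurm-web).
# Socket path and URL are taken from the curl command above; override them
# through the environment if your installation differs.
SOCKET="${SOCKET:-/run/slurmrestd/slurmrestd.socket}"
URL="${URL:-http://slurm/slurm/v0.0.40/jobs}"
# Assumed threshold in seconds; adjust to what you consider "slow".
THRESHOLD="${THRESHOLD:-5}"

# Print the total request time in seconds, discarding the response body.
time_request() {
  curl --silent --output /dev/null --write-out '%{time_total}' \
    --unix-socket "$SOCKET" "$URL"
}

# Succeed (exit 0) when the elapsed time $1 exceeds the threshold $2.
# awk is used because POSIX sh has no floating point comparison.
is_slow() {
  awk -v t="$1" -v max="$2" 'BEGIN { exit !(t > max) }'
}
```

After sourcing the file on the cluster, something like `t=$(time_request); is_slow "$t" "$THRESHOLD" && echo "slurmrestd took ${t}s"` flags a slow response; running it several times helps tell a consistently slow slurmrestd from an occasional spike.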