check_ps_service sshd fails on Ubuntu #99
Comments
The only time(s)
Without additional info, I'm not sure I can help much; I do not run any Ubuntu systems. (I got a "bad taste in my mouth" from Debian many, many moons ago, and I've been so deep in the RedHat/RPM world since then that I've never felt like I needed to go back. Not to mention I'm waaaaaaay too old to start over! 😜)
If you run
For example, here's a snippet of the output on my system of
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:412:check_ps_service()]> (( i++ ))
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:412:check_ps_service()]> (( i < 378 ))
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:413:check_ps_service()]> THIS_PID=2174
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:414:check_ps_service()]> [[ 0 == 1 ]]
>>[{L6/S0/D2/R1}@lbnl_ps.nhc:417:check_ps_service()]> ARGS=(${PS_ARGS[$THIS_PID]})
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:418:check_ps_service()]> THIS_SVC=/usr/sbin/sshd
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:420:check_ps_service()]> dbg 'Checking 2174: "*sshd" vs. "/usr/sbin/sshd"'
>>[{L6/S0/D3/R0}@nhc:95:dbg()]> local PREFIX=
>>[{L6/S0/D3/R0}@nhc:97:dbg()]> [[ 0 == \1 ]]
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:421:check_ps_service()]> mcheck /usr/sbin/sshd '*sshd'
>>[{L6/S0/D3/R0}@common.nhc:323:mcheck()]> local STRING=/usr/sbin/sshd
>>[{L6/S0/D3/R0}@common.nhc:324:mcheck()]> local 'MATCH=*sshd'
>>[{L6/S0/D3/R0}@common.nhc:325:mcheck()]> local i NEG=0
>>[{L6/S0/D3/R0}@common.nhc:328:mcheck()]> [[ * == \! ]]
>>[{L6/S0/D3/R0}@common.nhc:334:mcheck()]> [[ * == \/ ]]
>>[{L6/S0/D3/R0}@common.nhc:341:mcheck()]> [[ * == \{ ]]
>>[{L6/S0/D3/R0}@common.nhc:348:mcheck()]> (( i=0 ))
>>[{L6/S0/D3/R0}@common.nhc:348:mcheck()]> (( i < 0 ))
>>[{L6/S0/D3/R0}@common.nhc:356:mcheck()]> mcheck_glob /usr/sbin/sshd '*sshd'
>>[{L6/S0/D4/R0}@common.nhc:286:mcheck_glob()]> [[ /usr/sbin/sshd == *sshd ]]
>>[{L6/S0/D4/R0}@common.nhc:287:mcheck_glob()]> dbg 'Glob match check: /usr/sbin/sshd matches *sshd'
>>[{L6/S0/D5/R0}@nhc:95:dbg()]> local PREFIX=
>>[{L6/S0/D5/R0}@nhc:97:dbg()]> [[ 0 == \1 ]]
>>[{L6/S0/D4/R0}@common.nhc:288:mcheck_glob()]> return 0
>>[{L6/S0/D3/R0}@common.nhc:357:mcheck()]> return 0
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:424:check_ps_service()]> dbg 'Checking 2174: "root" vs. "root"'
>>[{L6/S0/D3/R0}@nhc:95:dbg()]> local PREFIX=
>>[{L6/S0/D3/R0}@nhc:97:dbg()]> [[ 0 == \1 ]]
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:425:check_ps_service()]> [[ -n root ]]
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:426:check_ps_service()]> mcheck root root
>>[{L6/S0/D3/R0}@common.nhc:323:mcheck()]> local STRING=root
>>[{L6/S0/D3/R0}@common.nhc:324:mcheck()]> local MATCH=root
>>[{L6/S0/D3/R0}@common.nhc:325:mcheck()]> local i NEG=0
>>[{L6/S0/D3/R0}@common.nhc:328:mcheck()]> [[ r == \! ]]
>>[{L6/S0/D3/R0}@common.nhc:334:mcheck()]> [[ r == \/ ]]
>>[{L6/S0/D3/R0}@common.nhc:341:mcheck()]> [[ r == \{ ]]
>>[{L6/S0/D3/R0}@common.nhc:348:mcheck()]> (( i=0 ))
>>[{L6/S0/D3/R0}@common.nhc:348:mcheck()]> (( i < 0 ))
>>[{L6/S0/D3/R0}@common.nhc:356:mcheck()]> mcheck_glob root root
>>[{L6/S0/D4/R0}@common.nhc:286:mcheck_glob()]> [[ root == root ]]
>>[{L6/S0/D4/R0}@common.nhc:287:mcheck_glob()]> dbg 'Glob match check: root matches root'
>>[{L6/S0/D5/R0}@nhc:95:dbg()]> local PREFIX=
>>[{L6/S0/D5/R0}@nhc:97:dbg()]> [[ 0 == \1 ]]
>>[{L6/S0/D4/R0}@common.nhc:288:mcheck_glob()]> return 0
>>[{L6/S0/D3/R0}@common.nhc:357:mcheck()]> return 0
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:431:check_ps_service()]> [[ 0 -eq 1 ]]
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:431:check_ps_service()]> [[ 0 -eq 1 ]]
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:431:check_ps_service()]> [[ -n '' ]]
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:528:check_ps_service()]> return 0
>[{L6/S0/D1/R0}@nhc:717:()]> exit 0
As you can see above, the trace output from BASH includes filename, line number, and function information along with some BASH info and the return code of the prior command (e.g.,
I re-ran the same command using
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:412:check_ps_service()]> (( i++ ))
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:412:check_ps_service()]> (( i < 378 ))
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:413:check_ps_service()]> THIS_PID=2174
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:414:check_ps_service()]> [[ 0 == 1 ]]
>>[{L6/S0/D2/R1}@lbnl_ps.nhc:417:check_ps_service()]> ARGS=(${PS_ARGS[$THIS_PID]})
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:418:check_ps_service()]> THIS_SVC=/usr/sbin/sshd
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:420:check_ps_service()]> dbg 'Checking 2174: "*sssshd" vs. "/usr/sbin/sshd"'
>>[{L6/S0/D3/R0}@nhc:95:dbg()]> local PREFIX=
>>[{L6/S0/D3/R0}@nhc:97:dbg()]> [[ 0 == \1 ]]
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:421:check_ps_service()]> mcheck /usr/sbin/sshd '*sssshd'
>>[{L6/S0/D3/R0}@common.nhc:323:mcheck()]> local STRING=/usr/sbin/sshd
>>[{L6/S0/D3/R0}@common.nhc:324:mcheck()]> local 'MATCH=*sssshd'
>>[{L6/S0/D3/R0}@common.nhc:325:mcheck()]> local i NEG=0
>>[{L6/S0/D3/R0}@common.nhc:328:mcheck()]> [[ * == \! ]]
>>[{L6/S0/D3/R0}@common.nhc:334:mcheck()]> [[ * == \/ ]]
>>[{L6/S0/D3/R0}@common.nhc:341:mcheck()]> [[ * == \{ ]]
>>[{L6/S0/D3/R0}@common.nhc:348:mcheck()]> (( i=0 ))
>>[{L6/S0/D3/R0}@common.nhc:348:mcheck()]> (( i < 0 ))
>>[{L6/S0/D3/R0}@common.nhc:356:mcheck()]> mcheck_glob /usr/sbin/sshd '*sssshd'
>>[{L6/S0/D4/R0}@common.nhc:286:mcheck_glob()]> [[ /usr/sbin/sshd == *sssshd ]]
>>[{L6/S0/D4/R1}@common.nhc:290:mcheck_glob()]> dbg 'Glob match check: /usr/sbin/sshd does not match *sssshd'
As you can see above, NHC has turned the service name specified (
Eventually it runs out of processes to check; once that happens, NHC knows that the specified process isn't there, so the check fails. The handling of that case begins here. (The exact line number may vary depending on which version you're running, but it's whatever line sets the initial value for
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:541:check_ps_service()]> MSG='check_ps_service: Service sssshd (process sssshd) owned by root not running; start'
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:553:check_ps_service()]> [[ 0 -eq 1 ]]
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:553:check_ps_service()]> [[ 0 -eq 1 ]]
>>[{L6/S0/D2/R1}@lbnl_ps.nhc:573:check_ps_service()]> MSG='check_ps_service: Service sssshd (process sssshd) owned by root not running; start in progress'
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:552:check_ps_service()]> /bin/bash -c '/sbin/service sssshd start'
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:576:check_ps_service()]> [[ 0 -eq 1 ]]
>>[{L6/S0/D2/R0}@lbnl_ps.nhc:582:check_ps_service()]> die 1 'check_ps_service: Service sssshd (process sssshd) owned by root not running; start in progress'
The trace output you see will continue inside the
Hopefully that will help you troubleshoot! If I had to guess, I'd bet it's something about the
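The loop the traces walk through can be reproduced outside NHC. The following is only a rough sketch of the idea, not NHC's actual code: `find_service_proc` and the sample process list are hypothetical, and the only thing taken from the trace is the `[[ STRING == GLOB ]]` test and the "ran out of processes, so the check fails" behavior.

```shell
#!/usr/bin/env bash
# Sketch of the per-process glob check seen in the trace; not NHC's own code.
# find_service_proc and the sample process list below are hypothetical.
find_service_proc() {
    local pattern="$1"; shift
    local cmd
    for cmd in "$@"; do
        # Unquoted right-hand side: [[ == ]] performs glob matching,
        # as in the traced test [[ /usr/sbin/sshd == *sshd ]].
        if [[ "$cmd" == $pattern ]]; then
            echo "matched: $cmd"
            return 0
        fi
    done
    # Ran out of processes without a match, so the check fails.
    return 1
}

procs=(/sbin/init /usr/sbin/cron /usr/sbin/sshd)
find_service_proc '*sshd' "${procs[@]}"                          # prints "matched: /usr/sbin/sshd"
find_service_proc '*sssshd' "${procs[@]}" || echo "not running"  # prints "not running"
```

The second call mirrors the deliberately misspelled `sssshd` run: no process matches the pattern, so the function returns non-zero.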
This is very insightful! The
it is interesting that
I had a feeling it would turn out to be something like that. 😁 It's actually not uncommon to have a mismatch between the name of the service itself and the name of the process(es) associated with that service. For example, Postfix uses the service
That's why
You can read up on all the available
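To make the service-name vs. process-name mismatch concrete, here is a small sketch. It assumes only what the traces show, namely that the default pattern is the service name with a leading `*` (`sshd` became `*sshd`). The helpers `default_match` and `proc_matches` are hypothetical names, not part of NHC.

```shell
#!/usr/bin/env bash
# Sketch only: mimics the default "*<service>" pattern seen in the trace.
# default_match and proc_matches are hypothetical helper names.
default_match() {
    printf '*%s' "$1"
}

proc_matches() {
    local proc="$1" pattern="$2"
    # Unquoted pattern so that [[ == ]] does glob matching.
    [[ "$proc" == $pattern ]]
}

proc=/usr/sbin/sshd
for svc in sshd ssh; do
    pattern=$(default_match "$svc")
    if proc_matches "$proc" "$pattern"; then
        echo "service '$svc': pattern '$pattern' matches $proc"
    else
        echo "service '$svc': pattern '$pattern' does not match $proc"
    fi
done
```

With a service named `ssh`, the derived pattern `*ssh` cannot match a process called `/usr/sbin/sshd`, which is one way a running daemon can still be reported as missing.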
The check
* || check_ps_service -u root -S sshd
fails on Ubuntu 20.04. I know `sshd` is running on the node because I logged in to it with `ssh`, and `systemctl is-active sshd` confirms it.
Changing `sshd` to `ssh`, testing for another user, or testing with no user also does not work.
I think the reason might be that this function uses `/sbin/service` instead of systemd?
I compiled nhc from the `dev` branch.