sshj-ssh sudo timeout issue #15
Comments
Hi,
This is still an issue even after upgrading from 4.7 to 4.15.
Hi folks, I tried to reproduce this issue without success. Here are the configuration and steps from my last attempt, so you can compare them to your environment:
As you can see, I'm using the private key method to access the remote node. I also tried a filesystem private key, like the original ticket config:
But this doesn't work on SSHJ 0.1.8 (this is a known bug, fixed in version 0.1.9).
<?xml version="1.0" encoding="UTF-8"?>
<project>
  <node name="node00" description="Node 00" tags="" hostname="192.168.56.10" osArch="amd64" osFamily="unix" osName="Linux" osVersion="4.18.0-372.26.1.el8_6.x86_64" username="vagrant" ssh-key-storage-path="keys/rundeck"/>
</project>
As you see, the private key is defined on the
- defaultTab: nodes
  description: Just an example.
  executionEnabled: true
  id: 5e2a904e-1048-4c21-9595-c9a1b03a6999
  loglevel: INFO
  name: HelloWorld
  nodeFilterEditable: false
  nodefilters:
    dispatch:
      excludePrecedence: true
      keepgoing: false
      rankOrder: ascending
      successOnEmptyNodeFilter: false
      threadcount: '1'
    filter: 'name: node00 '
  nodesSelectedByDefault: true
  plugins:
    ExecutionLifecycle: null
  scheduleEnabled: true
  sequence:
    commands:
    - fileExtension: .sh
      interpreterArgsQuoted: false
      script: |-
        #!/bin/bash
        my_start_date=`date`
        my_max=10000
        my_curr_try=0
        my_sleep=480
        while [ $my_curr_try -lt $my_max ]
        do
          my_curr_date=`date`
          (( my_curr_try = my_curr_try + 1 ))
          echo "Start: ${my_start_date} Curr: ${my_curr_date} Try: ${my_curr_try} of ${my_max}"
          echo "Sleeping ${my_sleep}"
          sleep ${my_sleep}
        done
      scriptInterpreter: sudo su - root
    keepgoing: false
    strategy: node-first
  uuid: 5e2a904e-1048-4c21-9595-c9a1b03a6999
Also, I tried a simple job to make sure that sudo is working:
- defaultTab: nodes
  description: ''
  executionEnabled: true
  id: 8384ac4b-1df3-43e2-9e0c-fb68f53b8763
  loglevel: INFO
  name: TestSUDO
  nodeFilterEditable: false
  nodefilters:
    dispatch:
      excludePrecedence: true
      keepgoing: false
      rankOrder: ascending
      successOnEmptyNodeFilter: false
      threadcount: '1'
    filter: 'name: node00 '
  nodesSelectedByDefault: true
  plugins:
    ExecutionLifecycle: null
  scheduleEnabled: true
  sequence:
    commands:
    - fileExtension: .sh
      interpreterArgsQuoted: false
      script: whoami
      scriptInterpreter: sudo su - root
    keepgoing: false
    strategy: node-first
  uuid: 8384ac4b-1df3-43e2-9e0c-fb68f53b8763
OK, this works in my environment, but I'm curious about what is different from yours.
I'm pretty sure that I'm missing something. Thanks!
For our scenario, we are not allowed to use SSH keys or password-less sudo. I don't see the following in your node definition: ssh-password-option: option.sshPassword. Are you using password-less sudo access as well? I don't see any options in your job prompting for a username/password (unless I am overlooking them). I can try again by setting up a new project and job, but we cannot use keys or password-less sudo.
Our /etc/sudoers contains the following:
which is why the tty is so important.
Also, our /etc/ssh/sshd_config has the following settings:
which is why the keepalive is important.
I created a "sshtest" project with:
My test node is basically:
My test job is:
Sadly, even after setting up a new project and job, sudo fails until I add
into SSHJExec.java and build a new plugin file. I have yet to figure out how to get KeepAliveRunners to show up after the "sleep" command with the new version of the plugin.
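For readers unfamiliar with the keepalive idea referred to here, the following is a minimal stdlib-only sketch, not the sshj KeepAliveRunner implementation. It assumes only the general principle: while a command is quiet (such as the long `sleep` in the job), something must periodically send traffic so that an idle timeout does not drop the connection. The names `KeepAliveSketch`, `runWithKeepAlive`, and `sleepQuietly` are hypothetical and exist only for this illustration.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; this is NOT how the sshj plugin is implemented.
class KeepAliveSketch {

    /**
     * Runs quietTask on the calling thread while invoking sendKeepAlive
     * every intervalMillis on a background thread. Returns the number of
     * keepalive ticks that were started.
     */
    static int runWithKeepAlive(Runnable quietTask, Runnable sendKeepAlive, long intervalMillis) {
        AtomicInteger sent = new AtomicInteger();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            sent.incrementAndGet();
            sendKeepAlive.run(); // stand-in for "send a small message to the server"
        }, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
        try {
            quietTask.run(); // stand-in for a remote command that produces no output
        } finally {
            // Stop the keepalive thread once the command has finished.
            scheduler.shutdownNow();
            try {
                scheduler.awaitTermination(1, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return sent.get();
    }

    /** Sleeps without forcing callers to handle InterruptedException. */
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        int sent = runWithKeepAlive(
                () -> sleepQuietly(500),
                () -> System.out.println("keepalive"),
                100);
        System.out.println("keepalives sent: " + sent);
    }
}
```

If no such periodic sender is started for a connection, a quiet command that runs longer than the server's idle limit would be expected to lose its session, which matches the symptom described in this thread.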
Thanks for the complete information @eagle-rr! I reproduced the issue partially:
node00:
  ssh-password-option: option.sshPassword
  nodename: node00
  hostname: 192.168.56.10
  osFamily: unix
  sudo-password-option: option.sudoPassword
  description: Rocky8
  ssh-authentication: password
  sudo-command-enabled: 'true'
  tags: centos
  username: vagrant
And this simple job definition "just for testing":
- defaultTab: nodes
  description: ''
  executionEnabled: true
  id: 9534837e-353a-4280-b5b9-22992a705933
  loglevel: INFO
  name: SingleCommand
  nodeFilterEditable: false
  nodefilters:
    dispatch:
      excludePrecedence: true
      keepgoing: false
      rankOrder: ascending
      successOnEmptyNodeFilter: false
      threadcount: '1'
    filter: 'name: node00 '
  nodesSelectedByDefault: true
  options:
  - name: sshPassword
    secure: true
    storagePath: keys/passwd
  - name: sudoPassword
    secure: true
    storagePath: keys/passwd
  plugins:
    ExecutionLifecycle: null
  scheduleEnabled: true
  sequence:
    commands:
    - fileExtension: .sh
      interpreterArgsQuoted: false
      script: whoami
      scriptInterpreter: sudo su - root
    keepgoing: false
    strategy: node-first
  uuid: 9534837e-353a-4280-b5b9-22992a705933
I ran the job successfully using the JSch node executor ("SSH"):
But on SSHJ, the job execution keeps running "forever". I'm still looking into it.
Yes - the running job should eventually time out. Basically, the sudo prompt's expect is waiting forever because no tty is present.
Exactly, thanks for providing the context. I've reproduced it.
Glad you could reproduce it. My only workaround for the tty part was to add the following "allocateDefaultPTY()" entry to SSHJExec.java:
Granted, it should probably check whether "EnablePTY" is enabled in the project settings before allocating the pty. However, once allocateDefaultPTY() runs successfully and sudo works, the sleep in the command eventually times out. Comparing the same job on 4.7 and on the newer 4.16, I see the following debug-mode entries after the sleep command with v0.1.2 of the sshj plugin:
Comparing versions 0.1.2 and 0.1.9, it looks like the following lines in SSHJBase.java's connect were changed from:
to just:
Wouldn't it be the following line that is preventing the KeepAliveRunner from kicking off?
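As a side note on the "EnablePTY" check mentioned earlier in this comment (this is not the elided line the question above refers to), the idea of allocating a PTY only when a setting allows it can be sketched as follows. This is a self-contained illustration under stated assumptions: `PtySession`, `RecordingSession`, `run`, and the `enablePty` flag are hypothetical names for this example. Only the method name `allocateDefaultPTY()` comes from this thread, and how the plugin actually reads its project settings is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the plugin's SSHJExec code.
class PtyGateSketch {

    /** Stand-in for the small part of an SSH session this sketch needs. */
    interface PtySession {
        void allocateDefaultPTY();
        void exec(String command);
    }

    /**
     * Allocates a PTY only when enablePty is true, then runs the command.
     * The PTY request has to come before exec so that sudo sees a tty.
     */
    static void run(PtySession session, boolean enablePty, String command) {
        if (enablePty) {
            session.allocateDefaultPTY();
        }
        session.exec(command);
    }

    /** Fake session that records calls, so ordering can be checked without a server. */
    static class RecordingSession implements PtySession {
        final List<String> calls = new ArrayList<>();

        @Override
        public void allocateDefaultPTY() {
            calls.add("pty");
        }

        @Override
        public void exec(String command) {
            calls.add("exec:" + command);
        }
    }

    public static void main(String[] args) {
        RecordingSession withPty = new RecordingSession();
        run(withPty, true, "whoami");
        System.out.println("PTY enabled:  " + withPty.calls);

        RecordingSession withoutPty = new RecordingSession();
        run(withoutPty, false, "whoami");
        System.out.println("PTY disabled: " + withoutPty.calls);
    }
}
```

Gating the call on a flag keeps the current behavior for projects that do not want a PTY, while letting setups where sudo requires a tty opt in.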
Any further questions or thoughts on this?
As we are at that time again (quarterly patching), are there any further questions or thoughts on this?
After trying to upgrade from 4.7 to 4.10, the Linux nodes hit the following error after running for 5 minutes:
My related project settings are:
The commands are being run via sudo.
I can reliably reproduce this error by setting up a job to run the following inline script on remote servers:
Using invocation string: sudo su - root
File Extension: .sh
Rundeck release 4.10.2 (sshj plugin version 0.1.4)