-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix infinite loop caused by edge case during terminus_job_poll() #188
Conversation
Adding a bit more detail, here... This is the infinite loop of errors that show up: Cloning database from test.................CONNECTION ERROR: Operation timed out after 30001 milliseconds with 0[error]
bytes received
No response for site jobs! [error]
Invalid argument supplied for foreach() terminus.drush.inc:1754 [warning]
.curl_setopt_array(): 85 is not a valid cURL handle resource [warning]
terminus.drush.inc:3802
curl_exec(): 85 is not a valid cURL handle resource [warning]
terminus.drush.inc:3808
curl_errno(): 85 is not a valid cURL handle resource [warning]
terminus.drush.inc:3815
curl_getinfo(): 85 is not a valid cURL handle resource [warning]
terminus.drush.inc:3821 |
I think this sounds like a reasonable. Let me test and get back to you. |
Not sure I understand the reasoning behind the timeout though. |
I was having issues in situations where no data was being received at all (e.g. the terminus API endpoint had not yet responded within 30s). If I bumped it up, I'd hit the server-side timeout at 45 seconds, but with data coming in mid-stream. For whatever reason, Terminus was better able to recover in situations where some data was being returned, rather than 0 bytes. Widening that timeout window to 45 seconds helped. Though I'm unable to verify anymore because I think you guys cleared our job queue history. |
So I'm a little hesitant here on the timeout. The reason is that from a ops POV there's some risk that if we suddenly start holding connections open longer I worry there could be residual unintended impacts on the API server. So I'd like to get @nstielau @joemiller or @joshkoenig thoughts here. |
Sure. If it's a no-go, I'm happy to back out that timeout change. |
Thanks for submitting, Eric. I'll take a closer look at this shortly. |
I don't think there's anything wrong with this change right now, but I do believe we are going to be shortening the timeout in the future. Internally it's set for 10 seconds. Generally speaking we want the parameters of the system to be such that we can always deliver a response in that amount of time. In cases like yours where there's a prodigious amount of data, we will be implementing paging. However, for right now having the client timeout match the current backend timeout is fine. @mikevanwinkle - feel free to merge this. 👍 |
Any word on this @mikevanwinkle and/or @joshkoenig ? We're pretty consistently hitting these again; would be great to get this merged in if possible. |
Yeah lets merge this right now. We'll sort out a holistic solution in the forthcoming TERMINUS 2.0 ⚡ ⚡ ⚡ |
Fix infinite loop caused by edge case during terminus_job_poll()
done |
Problem / motivation
In cases where
terminus_request()
errors (e.g. a timeout), it will returnFALSE
and close the curl handle / resource. Normally, this is fine and doesn't matter... But whenterminus_request()
is called byterminus_job_poll()
, it's possible to get stuck in an infinite loop.Underlying cause is that the curl handle is statically cached in terminus_request... Although there's a check to re-initialize it, the check isn't explicit enough (specifically,
$ch
will be an integer when closed, which is truthy).Proposed resolution
Explicitly check that the curl handle is a resource. Also, increase the timeout to match the server-side timeout (which appears to be 45 seconds).