Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stream_select() timeout useless for pipes on Windows #16889

Open
dktapps opened this issue Nov 21, 2024 · 7 comments · May be fixed by #16917
Open

stream_select() timeout useless for pipes on Windows #16889

dktapps opened this issue Nov 21, 2024 · 7 comments · May be fixed by #16917

Comments

@dktapps
Copy link
Contributor

dktapps commented Nov 21, 2024

Description

$n = @stream_select($r, $w, $e, $timeout);

stream_select() and pipes don't play nice together on Windows. It returns immediately & causes it to get stuck inside fread().

Could use socket instead of pipe for stdout and stderr, (a la #5777), but this might break tests that use stream_select() on STDOUT or STDERR.

This was the real issue causing me trouble with #16849, as run-tests.php did not kill the test after the allotted time & the build hung forever.

(Side note: This code looks dodgy too. What if only stderr has output? it's gonna block in stdout until something happens:

php-src/run-tests.php

Lines 1200 to 1213 in f44eaac

if ($n > 0) {
if ($captureStdOut) {
$line = fread($pipes[1], 8192);
} elseif ($captureStdErr) {
$line = fread($pipes[2], 8192);
} else {
$line = '';
}
if (strlen($line) == 0) {
/* EOF */
break;
}
$data .= $line;
}
)

PHP Version

8.1/8.2/8.3/8.4

Operating System

Windows

@cmb69
Copy link
Member

cmb69 commented Nov 21, 2024

stream_select() and pipes don't play nice together on Windows. It returns immediately & causes it to get stuck inside fread().

Sad story:

php-src/win32/select.c

Lines 20 to 30 in f44eaac

/* Win32 select() will only work with sockets, so we roll our own implementation here.
* - If you supply only sockets, this simply passes through to winsock select().
* - If you supply file handles, there is no way to distinguish between
* ready for read/write or OOB, so any set in which the handle is found will
* be marked as ready.
* - If you supply a mixture of handles and sockets, the system will interleave
* calls between select() and WaitForMultipleObjects(). The time slicing may
* cause this function call to take up to 100 ms longer than you specified.
* - Calling this with NULL sets as a portable way to sleep with sub-second
* accuracy is not supported.
* */

and then:

if ((self->is_pipe || self->is_process_pipe) && !self->is_pipe_blocking) {
HANDLE ph = (HANDLE)_get_osfhandle(data->fd);
int retry = 0;
DWORD avail_read = 0;
do {
/* Look ahead to get the available data amount to read. Do the same
as read() does, however not blocking forever. In case it failed,
no data will be read (better than block). */
if (!PeekNamedPipe(ph, NULL, 0, NULL, &avail_read, NULL)) {
break;
}
/* If there's nothing to read, wait in 10us periods. */
if (0 == avail_read) {
usleep(10);
}
} while (0 == avail_read && retry++ < 3200000);
/* Reduce the required data amount to what is available, otherwise read()
will block.*/
if (avail_read < count) {
count = avail_read;
}
}

So if there is nothing to read, we sleep for 32 seconds (in theory; likely way more in practice).


Something enlightening for German speakers: https://learn.microsoft.com/de-de/windows/win32/api/winsock2/nf-winsock2-select

@dktapps
Copy link
Contributor Author

dktapps commented Nov 21, 2024

I may be wrong on the specifics but I do know those tests got stuck for way longer than 32 seconds. I ended up having to abort them after about half an hour or so. Very suboptimal either way lol

@dktapps
Copy link
Contributor Author

dktapps commented Nov 21, 2024

With the default scheduler frequency on Windows (assuming usleep doesn't spin-wait; I didn't look at the impl) this could sleep for as long as 5000 seconds, or well over an hour.

@cmb69
Copy link
Member

cmb69 commented Nov 21, 2024

Yeah, that looks like what happens if no data is coming in. The code has been introduced by 0c98279, with some good reasoning (although still an excessive 18 sec wait in theory), and later got changed to what we have since 6f3dd4d. However, even drastically reducing this wait won't help in this case, since the timeout is on select() which won't timeout for pipes on Windows.

@cmb69
Copy link
Member

cmb69 commented Nov 22, 2024

However, even drastically reducing this wait won't help in this case, since the timeout is on select() which won't timeout for pipes on Windows.

So I modified php_select() to not select read pipes without data:

 win32/select.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/win32/select.c b/win32/select.c
index f443325cf0..f404ae3982 100644
--- a/win32/select.c
+++ b/win32/select.c
@@ -135,7 +135,12 @@ PHPAPI int php_select(php_socket_t max_fd, fd_set *rfds, fd_set *wfds, fd_set *e
 				for (i = 0; i < n_handles; i++) {
 					if (WAIT_OBJECT_0 == WaitForSingleObject(handles[i], 0)) {
 						if (SAFE_FD_ISSET(handle_slot_to_fd[i], rfds)) {
-							FD_SET((uint32_t)handle_slot_to_fd[i], &aread);
+							DWORD avail_read = 0;
+							if (!PeekNamedPipe(handles[i], NULL, 0, NULL, &avail_read, NULL) || avail_read > 0) {
+								FD_SET((uint32_t)handle_slot_to_fd[i], &aread);
+							} else {
+								retcode--;
+							}
 						}
 						if (SAFE_FD_ISSET(handle_slot_to_fd[i], wfds)) {
 							FD_SET((uint32_t)handle_slot_to_fd[i], &awrite);

The timeout works fine this way, but terminating the process won't work (so there are errors reported in the following), because we're going through the shell (cmd.exe) and killing this process, won't kill php.exe. I don't think there is any way to kill grandchild processes from PHP, and even doing this in C would be tedious (would need to enumerate all processes, and kill those with the given parent procid; not even sure what to do with grandchildren). The alternative would be to bypass the shell, but that will not work if there are redirects involved.

@dktapps
Copy link
Contributor Author

dktapps commented Nov 22, 2024

However, even drastically reducing this wait won't help in this case, since the timeout is on select() which won't timeout for pipes on Windows.

So I modified php_select() to not select read pipes without data:

 win32/select.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/win32/select.c b/win32/select.c
index f443325cf0..f404ae3982 100644
--- a/win32/select.c
+++ b/win32/select.c
@@ -135,7 +135,12 @@ PHPAPI int php_select(php_socket_t max_fd, fd_set *rfds, fd_set *wfds, fd_set *e
 				for (i = 0; i < n_handles; i++) {
 					if (WAIT_OBJECT_0 == WaitForSingleObject(handles[i], 0)) {
 						if (SAFE_FD_ISSET(handle_slot_to_fd[i], rfds)) {
-							FD_SET((uint32_t)handle_slot_to_fd[i], &aread);
+							DWORD avail_read = 0;
+							if (!PeekNamedPipe(handles[i], NULL, 0, NULL, &avail_read, NULL) || avail_read > 0) {
+								FD_SET((uint32_t)handle_slot_to_fd[i], &aread);
+							} else {
+								retcode--;
+							}
 						}
 						if (SAFE_FD_ISSET(handle_slot_to_fd[i], wfds)) {
 							FD_SET((uint32_t)handle_slot_to_fd[i], &awrite);

The timeout works fine this way, but terminating the process won't work (so there are errors reported in the following), because we're going through the shell (cmd.exe) and killing this process, won't kill php.exe. I don't think there is any way to kill grandchild processes from PHP, and even doing this in C would be tedious (would need to enumerate all processes, and kill those with the given parent procid; not even sure what to do with grandchildren). The alternative would be to bypass the shell, but that will not work if there are redirects involved.

Do you know why select wasn't implemented this way to begin with? Seems like a simple enough fix considering the impact of this issue & how often it's been reported.

@cmb69
Copy link
Member

cmb69 commented Nov 24, 2024

No, I have no idea. Maybe that has just been overlooked, or not deemed a holistic approach (after all, only read pipes would be catered to; and the proper solution for having non blocking pipes on Windows would be overlapped IO). Or maybe few use select() with pipes on Windows (maybe because it is useless). See e.g. https://bugs.php.net/51800 (which triggered the PeekNamedPipe() hack in php_stdiop_read()); no select(), just trying to read. Or maybe it is because select(2) is considered obsoleted by poll(2) anyway.

Anyhow, I'm going to re-purpose this ticket for the more general stream_select() issue. We can tackle other issues with run-tests.php via different tickets.

@cmb69 cmb69 changed the title run-tests.php timeout doesn't work on Windows stream_select() timeout useless for pipes on Windows Nov 24, 2024
cmb69 added a commit to cmb69/php-src that referenced this issue Nov 24, 2024
Pipes are blocking on Windows, but `php_select()` always returns them
as ready for read/write.  This renders the `stream_select()` timeout
useless, what can cause a following read to block for a very long time.

While there is no general fix (and least not within reach for a stable
version), we can at least cater to the important case of read pipes by
peeking the pipe to check whether data is available.  If there is none,
we do not add the handle to the read set.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants