Skip to content

Add PTY support to proc_open (again after 16 long years) #5525

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

alexdowad
Copy link
Contributor

@alexdowad alexdowad commented May 4, 2020

Back in 2004, a feature was added to proc_open which allowed it to open a PTY, connecting specific FDs in the child process to the slave end of the PTY and returning the master end of the PTY (wrapped as a PHP stream) in the $pipes array. However, this feature was disabled just about a month later. Little information is available about why this was done, but from talking to the original implementer, it seems there were portability problems with some rare flavors of Unix.

Re-enable this feature with a simplified implementation which uses openpty(). No attempt is made to support PTYs if the platform does not have openpty(). The configure script checks if linking with -lutil is necessary to use openpty(), but if anything else is required, like including some special header or linking with some other library, PTY support will be disabled.

The original PTY support for proc_open automatically daemonized the child process (disassociating it from the TTY session and process group of the parent). However, I don't think this is a good idea. Just because a user opens a child process in a PTY, it doesn't mean they want it to continue running even when the parent process is killed. Of course, if the child process is some kind of server, it will likely daemonize itself; but we have no reason to preempt that decision.

It turns out that since 2015, there has been one test case for PTY support in proc_open() in the test suite. This test was added in GitHub PR #1588. That PR mentioned that the PHP binary in the Debian/Ubuntu repositories is patched to enable PTY support. Checking the Debian PHP repository (https://salsa.debian.org/php-team/php.git) shows that this is still true. Debian's patch does not modify the implementation from 2004 in any way; it just removes the #if 0 line which disables it.

Naturally, the test case is skipped if PTY support is not enabled. This means that ever since it was added, every test run against the 'vanilla' PHP codebase has skipped it.

Interestingly, the test case which was added in 2015 fails on my Linux Mint PC... both with this simplified implementation and when enabling the original implementation. Investigation reveals the reason: when the child process using the slave end of the PTY exits and its FDs are all closed, and all buffered data is read from the master end of the PTY, any further attempt to read from the master end fails with EIO. The test case seems to expect that reading from the master end will always return an empty string if no data is available.

Could this mean that on the platform where this test case was originally developed, the PHP parent process ran faster than the child process, so that all the calls to fread() complete before the child process exits and closes its FDs? Or could something else about the platform have been different, such that read() in the parent does not fail with EIO even after the child process exits? Another question: Are the Debian packagers even running this test case every time they build PHP binaries? Intriguing questions, these!

There may be another way out of this sticky dilemma: IF at least one FD referring to the slave end of the PTY is kept open in the parent process, the failure with EIO will not occur even after the child process exits. This might be a good approach, though we would need a way to ensure the FD will be closed eventually in long-running programs. Perhaps associate it with the proc resource and close it when the proc is freed or explicitly closed with proc_close?

And the rabbit hole goes deeper. Although (for now) the test case from 2015 has been updated so it does not attempt to fread() multiple times from the PTY, it still fails intermittently. The reason? When the child process writes "foo\n" to the PTY, the parent sometimes receives "foo" (3 bytes) and sometimes "foo\r\n" (5 bytes). The "\r" is obviously from the TTY line discipline converting "\n" to "\r\n", but why on earth would the parent sometimes read the newline and sometimes not? What is more, strace clearly shows that this is happening in the kernel. The child process always executes a single write("foo\n") syscall, but when the blocking read() syscall issued by the parent process returns, sometimes it returns "foo" and sometimes "foo\r\n". This doesn't happen when I extract the test code and run it directly from the PHP CLI, only when the test case is run by run-tests.php.

Thanks to Nikita Popov for suggesting that we should just use openpty() rather than grantpt(), unlockpt(), etc.

Copy link
Member

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this mean that on the platform where this test case was originally developed, the PHP parent process ran faster than the child process, so that all the calls to fread() complete before the child process exits and closes its FDs? Or could something else about the platform have been different, such that read() in the parent does not fail with EIO even after the child process exits?

PHP only reports fread() errors since PHP 7.4, previously I would expect it to return an empty string. As the test is always skipped, we did not adjust it for this change.

@alexdowad
Copy link
Contributor Author

PHP only reports fread() errors since PHP 7.4, previously I would expect it to return an empty string. As the test is always skipped, we did not adjust it for this change.

That makes a lot of sense.

Do you think that fread() failing with EIO if the slave end of the PTY is closed is a problem?

I think that it's good that fread() returns false rather than just an empty string. If you have a long-running child process, and you keep reading from the pipe in a loop, you need some way to know that the child has gone away (maybe crashed, or just terminated normally). What I don't like is the warning message printed at the console.

When you are spawning sub-processes to do work for you, it's expected that they will terminate. And you won't always know whether your child processes are still going or not, so sometimes you might try reading from a pipe, only to discover the child isn't there. I don't like the idea of the terminal being spammed with warning messages in such cases.

Does it mean that when using the PTY support, you always have to use feof() or something to check if it's safe to fread()?

@alexdowad alexdowad force-pushed the enable-pty-in-proc-open branch from 3414251 to 6329dc9 Compare May 4, 2020 21:01
@alexdowad
Copy link
Contributor Author

Regarding the issue of "foo\n", I intend to play a bit with the TTY settings (termios flags) and see if it makes a difference. Then start poking around in the kernel source to see what I can find.

@nikic
Copy link
Member

nikic commented May 6, 2020

I tried to reproduce the newline issue locally, but I'm always getting the "foo\r\n" variant...

Do you think that fread() failing with EIO if the slave end of the PTY is closed is a problem?

I think that it's good that fread() returns false rather than just an empty string. If you have a long-running child process, and you keep reading from the pipe in a loop, you need some way to know that the child has gone away (maybe crashed, or just terminated normally). What I don't like is the warning message printed at the console.

When you are spawning sub-processes to do work for you, it's expected that they will terminate. And you won't always know whether your child processes are still going or not, so sometimes you might try reading from a pipe, only to discover the child isn't there. I don't like the idea of the terminal being spammed with warning messages in such cases.

Does it mean that when using the PTY support, you always have to use feof() or something to check if it's safe to fread()?

I/O handling in PHP tends to be a bit ... messy :) It's one of those areas where liberal application of @fread() etc is considered fine.

One thing worth pointing out is that feof() doesn't actually return true in this case... you only get the error on fread().

@alexdowad
Copy link
Contributor Author

Cloned up the latest kernel source and built me a brand spanking new 5.7.0-rc4+ kernel.

Ran the problematic test in a loop with this lil' shell script:

#!/bin/sh

count=0
while "$@"
do
  count=$((count + 1))
  echo "*** Passed $count times\n"
done

With my new 5.7.0 kernel, it failed on the 16th run. Same thing, "foo" instead of "foo\r\n".

@alexdowad alexdowad force-pushed the enable-pty-in-proc-open branch from 6329dc9 to 7a78548 Compare May 7, 2020 06:46
@alexdowad
Copy link
Contributor Author

I found that when not all the data written to the pipe is returned by the first call to fread, a second call can still be made to get what remains. Updated the test case accordingly.

Please have a look. I ran this test case over 2000 times locally without failure.

@alexdowad alexdowad marked this pull request as ready for review May 7, 2020 07:25
@alexdowad alexdowad force-pushed the enable-pty-in-proc-open branch from 7a78548 to 0433d3c Compare May 7, 2020 07:42
@alexdowad
Copy link
Contributor Author

It seems that Mac OS defines the prototype for openpty() in util.h instead of pty.h. Let's see if this code will compile on Mac OS now...

@alexdowad
Copy link
Contributor Author

And we're green!

@alexdowad
Copy link
Contributor Author

Added another commit to fix a problem with proc_open -- It would leak memory if a wrong resource type was passed in the descriptorspec.

@alexdowad alexdowad force-pushed the enable-pty-in-proc-open branch from 6a5b7b8 to 3f69271 Compare May 12, 2020 08:20
@alexdowad
Copy link
Contributor Author

Did some more refactoring and added comments. I think this is ready for merge. Would love to hear comments on the 3 points flagged above, though.

@alexdowad
Copy link
Contributor Author

Test just added by @cmb69 yesterday is failing... looking at it.

@cmb69
Copy link
Member

cmb69 commented May 12, 2020

@alexdowad, known issue. There appears to be a bug in FFI regarding s390x (unrelated to the fix). I'll try to debug via Travis.

@alexdowad
Copy link
Contributor Author

@cmb69 Look at ap_php_conv_p2 in main/snprintf.c, called from line 1111. I haven't totally figured it out, but it appears that on a big-endian architecture, when you write 42 into the int member of the union, and then reinterpret that integral value as a pointer, and pass that pointer into zend_strpprintf, somewhere along the way it is getting treated as a very large number rather than a small one.

@nikic
Copy link
Member

nikic commented May 12, 2020

Looks like the Windows build is broken.

Copy link
Member

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the Windows build issue, this looks fine to me. Just some style nits.

}
/* }}} */

/* {{{ PHP_MINIT_FUNCTION(proc_open) */
PHP_MINIT_FUNCTION(proc_open)
{
le_proc_open = zend_register_list_destructors_ex(proc_open_rsrc_dtor, NULL, "process", module_number);
le_proc_open = zend_register_list_destructors_ex(proc_open_rsrc_dtor, NULL,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A number of changes have been made to squeeze most of the code into 80 columns. Is this a good idea? Do the maintainers care?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I, personally, don't consider 80 columns to be practical for php-src, but rather try to avoid very long rows (more like > 120 columns).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't really have a standard on this, but I keep lines to 100 column. 80 is definitely on the low side for php-src.

running = wstatus == STILL_ACTIVE;
exitcode = running ? -1 : wstatus;

#elif HAVE_SYS_WAIT_H

errno = 0;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see any reason why errno needs to be zeroed here. Yes, some libc library calls require one to zero errno before calling, so you can check its value afterward, but waitpid is not one of these, and errno is not checked afterward in any event.

Am I wrong?

@@ -913,7 +985,7 @@ PHP_FUNCTION(proc_open)
#ifdef PHP_WIN32
if (other_options) {
suppress_errors = get_option(other_options, "suppress_errors");
/* TODO Deprecate in favor of array command? */
/* TODO: Deprecate in favor of array command? */
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maintainers, what do you think?? Should the bypass_shell option (under Windows) be deprecated?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might better be discussed on internals@. AFAIK, bypass_shell=true is default for Symfony, were the deprecation could be a big issue. @nicolas-grekas?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we would work around the deprecation, no worries on this side

@alexdowad alexdowad force-pushed the enable-pty-in-proc-open branch from 3f69271 to a01202e Compare May 12, 2020 17:48
@alexdowad
Copy link
Contributor Author

Let's see how the build is looking now.

alexdowad added 3 commits May 13, 2020 18:01
Back in 2004, a feature was added to proc_open which allowed it to open a PTY,
connecting specific FDs in the child process to the slave end of the PTY and returning
the master end of the PTY (wrapped as a PHP stream) in the `$pipes` array. However,
this feature was disabled just about a month later. Little information is available
about why this was done, but from talking to the original implementer, it seems there
were portability problems with some rare flavors of Unix.

Re-enable this feature with a simplified implementation which uses openpty(). No
attempt is made to support PTYs if the platform does not have openpty(). The configure
script checks if linking with -lutil is necessary to use openpty(), but if anything
else is required, like including some special header or linking with some other library,
PTY support will be disabled.

The original PTY support for proc_open automatically daemonized the child process
(disassociating it from the TTY session and process group of the parent). However,
I don't think this is a good idea. Just because a user opens a child process in a
PTY, it doesn't mean they want it to continue running even when the parent process
is killed. Of course, if the child process is some kind of server, it will likely
daemonize itself; but we have no reason to preempt that decision.

It turns out that since 2015, there has been one test case for PTY support in
proc_open() in the test suite. This test was added in GitHub PR php#1588
(php#1588). That PR mentioned that the PHP
binary in the Debian/Ubuntu repositories is patched to *enable* PTY support. Checking
the Debian PHP repository (https://salsa.debian.org/php-team/php.git) shows that this
is still true. Debian's patch does not modify the implementation from 2004 in any
way; it just removes the #if 0 line which disables it.

Naturally, the test case is skipped if PTY support is not enabled. This means that ever
since it was added, every test run against the 'vanilla' PHP codebase has skipped it.

Interestingly, the test case which was added in 2015 fails on my Linux Mint PC... both
with this simplified implementation *and* when enabling the original implementation.
Investigation reveals the reason: when the child process using the slave end of the
PTY exits and its FDs are all closed, and all buffered data is read from the master
end of the PTY, any further attempt to read from the master end fails with EIO. The
test case seems to expect that reading from the master end will always return an
empty string if no data is available.

Likely this is because PHP's fread() was updated to report errors from the underlying
system calls only recently.

One way out of this dilemma: IF at least one FD referring to the slave end of the PTY is
kept open *in the parent process*, the failure with EIO will not occur even after the child
process exits. However, that would raise another issue: we would need a way to ensure the FD
will be closed eventually in long-running programs.

Another discovery made while testing this code is that fread() does not always return
all the data written to the slave end of the PTY in a single call, even if the data was
written with a single syscall and it is only a few bytes long.

Specifically, when the child process in the test case writes "foo\n" to the PTY, the parent
sometimes receives "foo" (3 bytes) and sometimes "foo\r\n" (5 bytes). (The "\r" is from the
TTY line discipline converting "\n" to "\r\n".) A second call to fread() does return the
remaining bytes, though sometimes all the data is read in the first call, and by the time
the second call is made, the child process has already exited. It seems that liberal use
of the @ operator is needed when using fread() on pipes.

Thanks to Nikita Popov for suggesting that we should just use openpty() rather than
grantpt(), unlockpt(), etc.
proc_open can accept stream resources in the descriptorspec, like this:

    proc_open("command", array(0 => $resource), $pipes);

Previously, if a resource which was *not* of type "stream" was passed, proc_open would
return without freeing dynamically allocated memory. It's fixed now.
@alexdowad alexdowad force-pushed the enable-pty-in-proc-open branch from a01202e to d6ddb24 Compare May 13, 2020 16:01
This time a number of comments have been added to make it easy for new devs to understand
what is going on. Also adjusted error message to use colons rather than dashes.
@alexdowad alexdowad force-pushed the enable-pty-in-proc-open branch from d6ddb24 to 507cac0 Compare May 13, 2020 16:03
@alexdowad
Copy link
Contributor Author

Updated so that all error messages use colons instead of dashes. Also wrapped to 100 columns rather than 80 (as per comments from @nikic and @cmb69).

@nikic
Copy link
Member

nikic commented May 14, 2020

Merged as a84cd96, b983580, dc1496e.

@nikic nikic closed this May 14, 2020
@nikic
Copy link
Member

nikic commented Nov 2, 2020

Something I'm wondering about is whether we have the right semantics for something like [['pty'], ['pty'], ['pty']]. For everything but ptys, this would mean that three separate pipes/sockets/whatevers are attached to stdin/stdout/stderr. For ptys it will attach the same one instead.

I wonder if it would make more sense to make this conig attach three different ptys and use normal redirection if one pty should be used for multiple stdio streams, i.e. [['pty'], ['redirect', 0], ['redirect', 0]].

Though maybe there's something special about ptys that makes that moot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants