Fix signal leak in max execution timers #13027

Merged
merged 1 commit into php:PHP-8.2 from fix/timers on Jan 5, 2024

Conversation

@withinboredom withinboredom (Member) commented Dec 25, 2023

This is a fix for php/frankenphp#440:

  1. Ensures only a single timer/handler is registered per thread (only applicable to ZTS builds).

(1) is especially important in ZTS builds because ts_resource() and php_request_startup() both eventually call zend_max_execution_timer_init(), which can result in a leak of signals.
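
To make the guard concrete, it amounts to roughly the following in zend_max_execution_timer_init(). This is an illustrative sketch rather than the literal patch; the globals (EG(pid), EG(max_execution_timer_timer)) and the clock choice follow the discussion below and may differ in detail from what was merged:

ZEND_API void zend_max_execution_timer_init(void) /* {{{ */
{
    pid_t pid = getpid();

    /* A timer/handler already exists for this process (e.g. ts_resource()
     * ran before php_request_startup()); creating a second one is what
     * leaked. After fork(), getpid() no longer matches EG(pid), so the
     * child still creates its own timer. */
    if (EG(pid) == pid) {
        return;
    }

    struct sigevent sev = {0};
    sev.sigev_notify = SIGEV_THREAD_ID;
    sev.sigev_signo = SIGRTMIN;
    sev.sigev_notify_thread_id = (pid_t) syscall(SYS_gettid);

    if (timer_create(CLOCK_BOOTTIME, &sev, &EG(max_execution_timer_timer)) != 0) {
        return;
    }

    EG(pid) = pid;
}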

@devnexen devnexen requested a review from arnaud-lb December 26, 2023 11:35
@nielsdos nielsdos (Member) left a comment

This looks mostly good to me; sorry for leaving multiple review comments instead of reviewing in one pass.

The one thing I'm not sure about is whether the signal needs to be cleared before deleting the timer.
It might also warrant a comment in the code explaining that the signal is cleared because pending signals stay pending even after the timer is deleted.
I'll defer the final review to Arnaud though.
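
For reference, one way to clear a pending SIGRTMIN before timer_delete() is to collect it with sigtimedwait() while the signal is blocked. The sketch below is illustrative only; drain_and_delete() is a hypothetical helper, not code from this PR:

#include <signal.h>
#include <time.h>

/* Hypothetical helper showing the "clear before delete" idea under
 * discussion; not the approach taken in this PR. */
static void drain_and_delete(timer_t timer)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);

    /* Block SIGRTMIN so an already-queued instance is collected here
     * instead of being delivered to the handler. */
    pthread_sigmask(SIG_BLOCK, &set, NULL);

    struct timespec zero = {0, 0};
    while (sigtimedwait(&set, NULL, &zero) == SIGRTMIN) {
        /* discard queued instances */
    }

    timer_delete(timer);
    pthread_sigmask(SIG_UNBLOCK, &set, NULL);
}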

@iluuu1994 iluuu1994 (Member) left a comment

I'm not sure I understand why the signals are never delivered. Can you clarify?

@@ -41,6 +43,14 @@ ZEND_API void zend_max_execution_timer_init(void) /* {{{ */
sev.sigev_signo = SIGRTMIN;
sev.sigev_notify_thread_id = (pid_t) syscall(SYS_gettid);
A Member commented:

zend_max_execution_timer_init is called from pcntl_fork. With this change, a new timer will no longer be created. However, it seems the stored syscall(SYS_gettid) result will no longer match the child's thread id. This might be a problem.

@withinboredom (Member, Author) commented:

I'll create a test for this case to ensure we are not breaking it.

@withinboredom (Member, Author) commented:

I couldn't figure out how to write a phpt test for this case. I added some logging and a simple script to see what was going on though:

<?php

echo "about to fork\n";
$pid = pcntl_fork();
if(!$pid) {
    echo sprintf("child process: %s \n", getmypid());
    sleep(1);
} else {
    echo sprintf("parent process: %s\n", getmypid());
    sleep(2);
}

And it produces the following output:

$ php -dmax_execution_time=10 test.php
Timer 0 created on thread 54661
Setting timer 0 on thread 54661 (10 seconds)...
about to fork
Timer 0 created on thread 54662
parent process: 54661
child process: 54662
Setting timer 0 on thread 54662 (0 seconds)...
Deleting timer 0 on thread 54662...
Setting timer 0 on thread 54661 (0 seconds)...
Deleting timer 0 on thread 54661...

So it appears this might just be defensive programming, since the forked child creates its own timers and, per timer_create(2), POSIX timers are not inherited across fork(), so it does not inherit the parent's settings.

I can also see that it handles max_execution_time correctly:

<?php

echo "about to fork\n";
$pid = pcntl_fork();
if(!$pid) {
    ini_set('max_execution_time', 1);
    echo sprintf("child process: %s \n", getmypid());
    sleep(3);
} else {
    ini_set('max_execution_time', 4);
    echo sprintf("parent process: %s\n", getmypid());
    sleep(5);
}

which gives the following output:

$ php -dmax_execution_time=10 test.php
Timer 0 created on thread 54667
Setting timer 0 on thread 54667 (10 seconds)...
about to fork
Setting timer 0 on thread 54667 (0 seconds)...
Setting timer 0 on thread 54667 (4 seconds)...
parent process: 54667
Timer 0 created on thread 54668
Setting timer 0 on thread 54668 (0 seconds)...
Setting timer 0 on thread 54668 (1 seconds)...
child process: 54668
Setting timer 0 on thread 54668 (0 seconds)...
PHP Fatal error:  Maximum execution time of 1 second exceeded in /usr/local/src/php/test.php on line 5

Fatal error: Maximum execution time of 1 second exceeded in /usr/local/src/php/test.php on line 5
Setting timer 0 on thread 54668 (0 seconds)...
Deleting timer 0 on thread 54668...
Setting timer 0 on thread 54667 (0 seconds)...
PHP Fatal error:  Maximum execution time of 4 seconds exceeded in /usr/local/src/php/test.php on line 14

Fatal error: Maximum execution time of 4 seconds exceeded in /usr/local/src/php/test.php on line 14
Setting timer 0 on thread 54667 (0 seconds)...
Deleting timer 0 on thread 54667...

I'd love to write a test for this, but I'll have to dive into how phpt works to understand it. However, it looks like there may not be an issue here for pcntl_fork.

A Member commented:

I think switching to EG(pid) == pid implicitly fixed this. 👍 This could be tested by forking, making the child time out, and monitoring the child from the parent. But I'm not sure a test is worth it.

@arnaud-lb arnaud-lb (Member) left a comment

As I understand the issue, we leak one timer per thread because we create one during ts_resource() (once per thread) and another one during php_request_startup() (once per request), and the former is never deleted.

However, there are two things I don't get: what is generating the signals? I would assume it's the leaked timers, but they are never armed, so they should never generate signals. Also, as Ilija noted, it's not clear why the signals are not handled.

ign_sigrtmin.sa_flags = 0;
sigemptyset(&ign_sigrtmin.sa_mask);

if(sigaction(SIGRTMIN, &ign_sigrtmin, NULL) < 0) {
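
(The hunk above is only a fragment; presumably the full shutdown-time reset looked roughly like the following, with SIG_IGN installed as the handler to discard pending SIGRTMIN. This is a reconstruction for readability, not the exact diff.)

struct sigaction ign_sigrtmin;

ign_sigrtmin.sa_handler = SIG_IGN;   /* discard any pending SIGRTMIN */
ign_sigrtmin.sa_flags = 0;
sigemptyset(&ign_sigrtmin.sa_mask);

if (sigaction(SIGRTMIN, &ign_sigrtmin, NULL) < 0) {
    /* the error is non-fatal here; the timer is being shut down anyway */
}
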
A Member commented:

Unfortunately this will reset the signal handler for the whole process.

@withinboredom (Member, Author) commented:

What do you mean by that? If the shutdown function is called, the process should be done with whatever PHP script it was working on. If the process wishes to start handling another PHP script, it will call zend_max_execution_timer_init again. Either way, the pending signals must be ignored to clear them, otherwise they will sit in the pending signal queue (SigQ) forever, even if another handler is set.

A Member commented:

Currently zend_max_execution_timer_init() and zend_max_execution_timer_shutdown() are called in php_request_startup() and php_request_shutdown() respectively, so they are indeed called once per request. The signal handler is set up in zend_set_timeout_ex(), at least once per request.

However, when zend_max_execution_timer_shutdown() is called, there may be other requests executing in other threads of the same process. Resetting the signal handler here will affect them.
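
To illustrate: signal dispositions installed with sigaction() are shared by the whole process, while only signal masks (pthread_sigmask()) are per-thread, so a reset made from one thread is observed by every other thread. A standalone sketch, not php-src code:

#include <pthread.h>
#include <signal.h>
#include <unistd.h>

static void on_sigrtmin(int sig) { (void)sig; write(1, "handled\n", 8); }

static void *worker(void *arg)
{
    (void)arg;
    raise(SIGRTMIN);   /* delivered to this thread: prints "handled" */
    sleep(1);          /* meanwhile the main thread resets the handler */
    raise(SIGRTMIN);   /* now ignored, although this thread never touched
                          sigaction() itself */
    return NULL;
}

int main(void)
{
    struct sigaction sa = {0};
    sa.sa_handler = on_sigrtmin;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGRTMIN, &sa, NULL);   /* disposition shared by all threads */

    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);

    usleep(100 * 1000);
    sa.sa_handler = SIG_IGN;
    sigaction(SIGRTMIN, &sa, NULL);   /* resets the handler process-wide */

    pthread_join(t, NULL);
    return 0;
}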

@withinboredom (Member, Author) commented:

Ah, I see that now. I don't think it is an issue though because in zend_max_execution_timer_init() we also reset the signal handler if shutdown has been previously called. Now that you've pointed this out, I think that was part of the reasoning for #12539, allowing timers to be reused, which would probably be out of scope here.

@arnaud-lb arnaud-lb (Member) commented Jan 5, 2024

The call to sigaction(sev.sigev_signo, NULL, &EG(oldact)); in zend_max_execution_timer_init() is not resetting the handler, it's just copying the existing handler to &EG(oldact). So by resetting handlers in zend_max_execution_timer_shutdown(), we are introducing a new issue.

We cannot reset signal handlers at the end of each request, since this affects the whole process, including requests running in other threads. So I'm not sure how we can fix this issue if the only solution is to reset the signal handler.

However, I would like to challenge our understanding of the issue. Right now we are assuming that we leak signals when calling timer_delete() while some signals are pending. I would expect this to be a very rare occurrence, and it's unlikely that we are leaking the number of signals shown in php/frankenphp#440, so I don't think we have found the actual cause of the issue.

signal(7) describes when signals are handled:

Whenever there is a transition from kernel-mode to user-mode execution (e.g., on return from a system call or scheduling of a thread onto the CPU), the kernel checks whether there is a pending unblocked signal for which the process has established a signal handler.

For a signal to leak (assuming that timer_delete() causes the leak), we would need a signal to be queued before a timer_delete() call, and that no context switch happened between these two events, which is unlikely.

What seems more likely is that the signal is masked, which is why it may be queued.

Another possibility is that the signal queue is filled by the queued-signal entries that timer_create() pre-allocates: timer_create() appears to allocate a signal queue entry upon invocation. This counts towards RLIMIT_SIGPENDING, and the function will return EAGAIN (Resource temporarily unavailable, the error you are getting) if RLIMIT_SIGPENDING is exceeded. SigQ in /proc/pid/status reports RLIMIT_SIGPENDING usage rather than the actual number of pending signals. So it's possible that the issue is caused solely by the leak of one timer per thread.
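
If that hypothesis holds, it can be reproduced outside of PHP: each timer_create() with signal notification reserves a queued-signal slot that shows up in SigQ even though no signal is ever pending. A standalone sketch under that assumption (the file name sigq_demo.c is made up; build with cc sigq_demo.c -o sigq_demo -lrt):

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

static void print_sigq(void)
{
    char line[256];
    FILE *f = fopen("/proc/self/status", "r");
    while (f && fgets(line, sizeof line, f)) {
        if (strncmp(line, "SigQ:", 5) == 0) {
            fputs(line, stdout);   /* "SigQ: <used>/<RLIMIT_SIGPENDING>" */
        }
    }
    if (f) fclose(f);
}

int main(void)
{
    struct sigevent sev = {0};
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = SIGRTMIN;

    print_sigq();   /* before: usage should be near zero */

    for (int i = 0; i < 1000; i++) {
        timer_t t;
        if (timer_create(CLOCK_MONOTONIC, &sev, &t) != 0) {
            perror("timer_create");   /* EAGAIN once the limit is exceeded */
            break;
        }
        /* intentionally never timer_delete(t): this mimics the leak */
    }

    print_sigq();   /* after: usage should have grown by one per timer */
    return 0;
}

If SigQ grows here, the leaked per-thread timers alone would explain the numbers reported in php/frankenphp#440 without any signal ever being delivered.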

Does this change alone (without the signal handler reset) fix the issue for you?

How many different threads had executed a PHP script when SigQ reached 50k?

@withinboredom (Member, Author) commented:

However, I would like to challenge our understanding of the issue.

Good call here. I did, in fact, fix the leak through SIG_IGN before discovering that we were calling zend_max_execution_timer_init twice. I actually didn't validate whether the SIG_IGN was still required after fixing that. 🤦

Does this change alone (without the signal handler reset) fixes the issue for you?

It does fix the issue. I've updated the PR.

@withinboredom (Member, Author) commented:
@arnaud-lb I've updated the PR description with more details.

@arnaud-lb arnaud-lb (Member) left a comment

Looks good to me, thank you!

There are conflicts with 8.2, could you rebase? I will merge soon after that.

ts_resource() and php_request_startup() both eventually call zend_max_execution_timer_init(), which didn't have a guard to prevent recreating signal handlers, thus resulting in leaking signals. This adds a guard to prevent the leak.
@withinboredom (Member, Author) commented:
Rebased and squashed the branch, @arnaud-lb.

@arnaud-lb arnaud-lb merged commit 6342f73 into php:PHP-8.2 Jan 5, 2024
arnaud-lb added a commit that referenced this pull request Jan 5, 2024
* PHP-8.2:
  [ci skip] NEWS
  Fix timer leak (#13027)
arnaud-lb added a commit that referenced this pull request Jan 5, 2024
* PHP-8.3:
  [ci skip] NEWS
  [ci skip] NEWS
  Fix timer leak (#13027)
@withinboredom withinboredom deleted the fix/timers branch January 5, 2024 18:51
@dunglas dunglas mentioned this pull request Sep 25, 2024