Skip to content

#69724: pm.ondemand forks fewer child workers than it should #1308

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 11 commits into from

Conversation

bjaspan
Copy link

@bjaspan bjaspan commented May 28, 2015

This PR is a proposed solution to https://bugs.php.net/bug.php?id=69724. The changes are subtle and to critical code within FPM, so it needs a careful review. I took the liberty of starting a new file, DESIGN.md, to document how FPM works internally. This allowed me to describe the complete situation and how this change solves it.


### Dynamic

### Ondemand
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section describes the reported problem and proposed solution.

@bjaspan
Copy link
Author

bjaspan commented May 28, 2015

An additional benefit of this change is that it makes pm.ondemand work on any supported OS since it no longer uses edge-triggering.

@smalyshev smalyshev added the Bug label Jun 1, 2015
@peterbowey
Copy link

A cleaned version of the above 'patch' that checks against the PHP 5.6.10 branch:

diff --git a/sapi/fpm/DESIGN.md b/sapi/fpm/DESIGN.md
new file mode 100644
index 0000000..370f0a2
--- /dev/null
+++ b/sapi/fpm/DESIGN.md
@@ -0,0 +1,186 @@
+# Introduction
+
+This document describes the design of PHP-FPM. It is woefully incomplete, but
+at least we are now starting.
+
+## Request stages
+
+FPM defines a set of request stages for tracking a child process' current
+status:
+
+* FPM_REQUEST_ACCEPTING: About to call or blocked in accept() on the pool's
+  listening socket. The parent considers a child in this stage to be idle.
+* FPM_REQUEST_ACCEPTED: Just returned from accept() but not yet in
+  READING_HEADERS. This stage exists so that ondemand workers can enter it
+  before signaling the parent that they returned from accept(). Without it, a
+  race condition exists in which the parent could decide the child was idle
+  just *before* it began processing a request, and thus not start listening on
+  the pool's socket again.
+* FPM_REQUEST_READING_HEADERS:.
+* FPM_REQUEST_INFO:.
+* FPM_REQUEST_EXECUTING:.
+* FPM_REQUEST_END:.
+* FPM_REQUEST_FINISHED:
+
+
+## Scoreboard
+
+### Allocation
+
+FPM maintains a scoreboard data structure that tracks information about the
+current state and accumulated activities of workers. Each fpm_worker_pool_s
+contains one fpm_scoreboard_s structure, and wp->scoreboard->procs is an array
+of wp->config->mp_max_children pointers to fpm_scoreboard_proc_s
+structures. All of these structures are stored in shared memory allocated with
+mmap(MAP_ANONYMOUS|MAP_SHARED).
+
+The scoreboard proc slots are allocated as needed to children by
+fpm_scoreboard_proc_alloc() and _free(), which are non-locked functions that
+can only be called by the parent. Each scoreboard proc has a boolean used
+flag. The scoreboard maintains an index, free_proc, which indicates that a proc
+that might be unused. A scoreboard proc is allocated to a specific child
+process by fpm_scoreboard_proc_alloc(). This function first checks whether
+scoreboard->procs[scoreboard->free_proc] is unused, and if it is scans through
+the array looking for one that is unused. Whichever way it finds one, it marks
+that proc used and provides the now-allocated index to its caller. The
+free_proc hint is then incremented one (circularly) to suggest a new free
+proc. fpm_scoreboard_proc_free() releases a proc slot by memset()ing it to
+zero, and setting scoreboard->free_proc to the just-released index.
+
+Just before the parent forks a new child, it allocated a scoreboard proc for
+the impending child in fpm_resources_prepare(). Once the child is forked,
+fpm_child_resources_use() first frees the scoreboard for all other worker pools
+(to prevent accidentally updating them, presumably), and calls
+fpm_scoreboard_child_use() which sets the global fpm_scoreboard pointer to the
+worker's scoreboard and fpm_scoreboard_i to the child's allocated index in the
+worker's scoreboard->procs array. Finally, the child's scoreboard proc is
+updated with the child's pid and start time.
+
+A scoreboard proc is atomically locked and unlocked with
+fpm_scoreboard_proc_acquire() and _release(). The parent specifies the worker
+pool scoreboard and proc index to lock. A child specifies sentinel values that
+instruct the functions to use the global fpm_scoreboard and fpm_scoreboard_i
+variables.
+
+### Tracking request stages
+
+Each scoreboard proc structure has a field, enum fpm_request_stage_e
+request_stage, to track the current stage. Functions such as
+fpm_request_accepting() are called from a child to lock the child's scoreboard
+proc, update the request stage, set whatever other proc values are appropriate
+for the stage, and release the proc.
+
+The parent can inspect a child's current stage via functions like
+fpm_request_is_idle(), which accepts a specific fpm_child_s structure,
+retrieves its scoreboard proc, and checks the proc's request_stage.
+
+Note that fpm_request_is_idle() explicitly DOES NOT lock the child's proc
+structure before reading the request_stage. In Java, this would mean that the
+parent had no guarantee of reading the current value. In C, with no specific
+memory model, ... all bets are off, I suspect.
+
+## Process management styles
+
+There are three supported process management styles, but they all share a
+common flow. In main(), the parent calls fpm_run(). fpm_run() forks initial
+worker processes for each pool, if any, with fpm_children_create_initial(), and
+calls fpm_event_loop() which runs forever.
+
+The forked children return from fpm_run(), inheriting their listening socket
+from the parent. They do a little more initialization, and enter their main loop
+calling fcgi_accept_request(). Each call to that function calls accept() on
+the socket and then reads and processes a single FastCGI request.
+
+### Static
+
+### Dynamic
+
+### Ondemand
+
+No initial worker processes are started. Instead, fpm_children_create_initial()
+installs an FPM_EV_READ event on the pool's listening socket. When a connection
+arrives, the event calls fpm_pctl_on_socket_accept(). If there is an idle child
+available to run the request, the function does nothing; an idle child is
+already waiting in accept() and will respond. Otherwise, if the current system
+state allows it, the function forks a new child; if not, the request function
+does nothing, and the request stays queued until the next child is
+available. Either way, the parent trusts that a newly launched or existing
+child will eventually handle the request.
+
+The strategy for deciding when to fork is subtle. The quick summary:
+
+* fpm_pctl_on_socket_accept() always removes the pool's listening socket,
+* a child always restores its pool's listening socket after calling accept(),
+* a child's first call to accept() is non-blocking,
+* a pool's listening socket is always restore when a child exits, and
+* a timer restores listening sockets removed due to external system state.
+
+In detail:
+
+Suppose FPM uses a normal level-triggered select()/poll() on the listening
+socket. When a connection arrives, fpm_pctl_on_socket_accept() is triggered by
+the event system. It finds no idle children, so it forks one, and returns. If
+the parent polls again before the child makes it to accept(), the connection is
+still pending so fpm_pctl_on_socket_accept() is called again. The child is
+idle, so the function returns, but until the child calls accept(), the parent
+will spin in this loop. Also, there are a race conditions in which loop occurs
+before the child is marked idle (which happens immediately prior to its calling
+accept()), in which case the parent will rapidly fork extra processes.
+
+To avoid this, the initial version of ondemand used edge-triggering on the
+listening socket (and thus only worked on systems that support epoll() or
+kqueue()). This prevents the spin loop because each pending connection is only
+returned once, and thus only triggers one call to fpm_pctl_on_socket_accept()
+no matter how long the child takes to accept() it. Unfortunately, it causes
+different problems. Edge triggering semantics requires that the polling process
+read the edge-triggered file until it is empty; in this case, that requires
+calling accept() until it returns EAGAIN/EWOULDBLOCK. However, with ondemand,
+the parent process polls the listening socket, while the child processes call
+accept(), so there is no way for the parent to comply with the edge-triggering
+contract. As a result, multiple requests can arrive, generate only a single
+edge-triggered event, and launch insufficient processes to handle the requests.
+
+The solution is to use level-triggered polling in a way that avoids spinning
+the CPU. The trick is for the parent to remove a pool's listening socket from
+its polled set during necessary periods, and to restore it when appropriate. To
+facilitate restoring the listening socket, during
+fpm_children_create_initial(), the parent creates a per-pool child_accept pipe
+and registers an event to call fpm_pctl_on_child_accept() when the pipe is
+readable. When invoked, this function reads sizeof(pid_t) from one of the
+children and adds the listening socket event back to the polling set.
+
+For removing the listening socket event, there are three cases to
+consider. Each time a request arrives and fpm_pctl_on_socket_accept() is
+invoked:
+
+* An idle child is available. The parent removes the listening socket. When the
+idle child returns from accept(), it writes to the child_accept pipe, restoring
+the listening socket.
+
+* No idle child is available, and the parent forks a new child. The parent
+removes the listening socket. The child starts and, after calling accept(), it
+writes to the child_accept pipe. For this to work, the first call to accept()
+must be non-blocking---a sibling process may grab the request first, and the
+new child must inform the parent it called accept() so new requests can be
+processed whether accept() suceeded or failed.
+
+* No idle child is available, but the parent cannot fork a new child due to
+system state (such as already having pm_max_procs children). If there are zero
+children and the parent cannot fork one, the parent exits with an error
+status. Otherwise, the parent again removes the listening socket, because there
+is no point in acting on a pending request when there is no child process to
+run it (since level triggering is in place, any pending requests will again
+generate a poll event when the socket is restored). The listening socket event
+is restored when:
+
+       * A child exits for any reason, since now a new one can possiby be
+      started. The SIGCHLD triggers a call to fpm_children_bury(), which looks
+      up the child's fpm_child_s record, and writes to the child's child_accept
+      pipe regardless of why it exited.
+
+       * A child becomes idle, since it may empty the pending queue. An idle child
+      immediately calls accept(), by definition. If there is still a pending
+      request, accept() returns, and as per above, the child writes to its
+      child_accept pipe. If there is no pending request, it is okay that the
+      parent continues ignoring listening socket; the next request to arrive
+      will go to an idle child which will write to its child_accept pipe.
diff --git a/sapi/fpm/fpm/fastcgi.c b/sapi/fpm/fpm/fastcgi.c
index d77b6f8..8db600f 100644
--- a/sapi/fpm/fpm/fastcgi.c
+++ b/sapi/fpm/fpm/fastcgi.c
@@ -803,8 +803,36 @@ void fcgi_close(fcgi_request *req, int force, int destroy)
        return 0;
 }

-int fcgi_accept_request(fcgi_request *req)
+/**
+ * Set a file descriptor to be blocking or non-blocking.
+ *
+ * @param fd: the file descriptor
+ * @param nonblocking: zero to set blocking, non-zero to set nonblocking
+ * @returns 0 on success, <0 on error
+ */
+int fcgi_set_nonblocking(int fd, int nonblocking)
+{
+       int flags = fcntl(fd, F_GETFL, 0);
+       if (flags < 0) {
+               return flags;
+       }
+       flags = nonblocking ? (flags | O_NONBLOCK) : (flags & ~O_NONBLOCK);
+       return fcntl(fd, F_SETFL, flags);
+}
+
+/**
+ * Accept a connect and parse a FastCGI request.
+ *
+ * @param req: the request structure to fill in
+ * @param nonblock_once: if non-zero, the listening socket is set nonblocking
+ *   for one call to accept(); if accept() fails with EAGAIN/EWOULDBLOCK, it is
+ *   called a second time in blocking mode.
+ * @param accept_pipe: if non-negative, write the pid to this fd every time
+ *   accept() returns, regardless of its return value.
+ * @returns the accepted fd, or -1 on error.
+ */
+int fcgi_accept_request(fcgi_request *req, int nonblock_once, int accept_pipe)
 {
 #ifdef _WIN32
        HANDLE pipe;
        OVERLAPPED ov;
@@ -845,12 +845,37 @@ int fcgi_accept_request(fcgi_request *req)
 #endif
                                        sa_t sa;
                                        socklen_t len = sizeof(sa);
+                                       int accept_errno;
+                                       do {
+                                               fpm_request_accepting();

-                                       fpm_request_accepting();
-
-                                       FCGI_LOCK(req->listen_socket);
-                                       req->fd = accept(listen_socket, (struct sockaddr *)&sa, &len);
-                                       FCGI_UNLOCK(req->listen_socket);
+                                               FCGI_LOCK(req->listen_socket);
+                                               if (nonblock_once) {
+                                                       fcgi_set_nonblocking(req->listen_socket, 1);
+                                               }
+                                               req->fd = accept(listen_socket, (struct sockaddr *)&sa, &len);
+                                               accept_errno = errno;
+                                               if (nonblock_once) {
+                                                       fcgi_set_nonblocking(req->listen_socket, 0);
+                                                       nonblock_once = 0;
+                                               }
+                                               // Mark ourselves as no longer ACCEPTING before telling
+                                               // the parent to restore our listening
+                                               // socket. Otherwise, the parent can decide we are
+                                               // still idle and about to grab a new request when we
+                                               // are not.
+                                               fpm_request_accepted();
+                                               if (accept_pipe >= 0) {
+                                                       pid_t pid = getpid();
+                                                       if (write(accept_pipe, &pid, sizeof(pid)) != sizeof(pid)) {
+                                                               zlog(ZLOG_ERROR, "Failed writing to accept_pipe; something bad is going to happen.");
+                                                       }
+                                               }
+                                               if (req->fd < 0 && (accept_errno == EAGAIN || accept_errno == EWOULDBLOCK)) {
+                                                       zlog(ZLOG_DEBUG, "Child pid %d lost the accept() race, retrying", getpid());
+                                               }
+                                               FCGI_UNLOCK(req->listen_socket);
+                                       } while (req->fd < 0 && (accept_errno == EAGAIN || accept_errno == EWOULDBLOCK));

                                        client_sa = sa;
                                        if (req->fd >= 0 && !fcgi_is_allowed()) {
diff --git a/sapi/fpm/fpm/fastcgi.h b/sapi/fpm/fpm/fastcgi.h
index 34f9eef..f34ede9 100644
--- a/sapi/fpm/fpm/fastcgi.h
+++ b/sapi/fpm/fpm/fastcgi.h
@@ -115,7 +115,7 @@ typedef struct _fcgi_request {
 int fcgi_init(void);
 void fcgi_shutdown(void);
 void fcgi_init_request(fcgi_request *req, int listen_socket);
-int fcgi_accept_request(fcgi_request *req);
+int fcgi_accept_request(fcgi_request *req, int nonblock_once, int accept_pipe);
 int fcgi_finish_request(fcgi_request *req, int force_close);

 void fcgi_set_in_shutdown(int);
diff --git a/sapi/fpm/fpm/fpm_children.c b/sapi/fpm/fpm/fpm_children.c
index 45cc075..81bd21d 100644
--- a/sapi/fpm/fpm/fpm_children.c
+++ b/sapi/fpm/fpm/fpm_children.c
@@ -146,8 +146,9 @@ static struct fpm_child_s *fpm_child_find(pid_t pid) /* {{{ */
 static void fpm_child_init(struct fpm_worker_pool_s *wp) /* {{{ */
 {
        fpm_globals.max_requests = wp->config->pm_max_requests;
-
-       if (0 > fpm_stdio_init_child(wp)  ||
+
+       if (0 > fpm_pctl_init_child(wp)   ||
+           0 > fpm_stdio_init_child(wp)  ||
            0 > fpm_log_init_child(wp)    ||
            0 > fpm_status_init_child(wp) ||
            0 > fpm_unix_init_child(wp)   ||
@@ -187,6 +188,13 @@ void fpm_children_bury() /* {{{ */

                child = fpm_child_find(pid);

+               if (child->wp->config->pm == PM_STYLE_ONDEMAND) {
+                       pid_t pid = getpid();
+                       if (write(child->wp->child_accept_pipe[1], &pid, sizeof(pid)) != sizeof(pid)) {
+                               zlog(ZLOG_ERROR, "Failed writing to accept_pipe; something bad is going to happen.");
+                       }
+               }
+
                if (WIFEXITED(status)) {

                        snprintf(buf, sizeof(buf), "with code %d", WEXITSTATUS(status));
@@ -446,10 +452,26 @@ int fpm_children_create_initial(struct fpm_worker_pool_s *wp) /* {{{ */
                }

                memset(wp->ondemand_event, 0, sizeof(struct fpm_event_s));
-               fpm_event_set(wp->ondemand_event, wp->listening_socket, FPM_EV_READ | FPM_EV_EDGE, fpm_pctl_on_socket_accept, wp);
+               fpm_event_set(wp->ondemand_event, wp->listening_socket, FPM_EV_READ, fpm_pctl_on_socket_accept, wp);
                wp->socket_event_set = 1;
                fpm_event_add(wp->ondemand_event, 0);

+               if (pipe(wp->child_accept_pipe) < 0) {
+                       zlog(ZLOG_ERROR, "[pool %s] unable to create child_accept_pipe", wp->config->name);
+                       // FIXME handle crash
+                       return 1;
+               }
+
+               wp->child_accept_event = (struct fpm_event_s *)malloc(sizeof(struct fpm_event_s));
+               if (!wp->child_accept_event) {
+                       zlog(ZLOG_ERROR, "[pool %s] unable to malloc the child_accept socket event", wp->config->name);
+                       // FIXME handle crash
+                       return 1;
+               }
+               memset(wp->child_accept_event, 0, sizeof(struct fpm_event_s));
+               fpm_event_set(wp->child_accept_event, wp->child_accept_pipe[0], FPM_EV_READ, fpm_pctl_on_child_accept, wp);
+               fpm_event_add(wp->child_accept_event, 0);
+
                return 1;
        }
        return fpm_children_make(wp, 0 /* not in event loop yet */, 0, 1);
diff --git a/sapi/fpm/fpm/fpm_conf.c b/sapi/fpm/fpm/fpm_conf.c
index 688c640..2d9809c 100644
--- a/sapi/fpm/fpm/fpm_conf.c
+++ b/sapi/fpm/fpm/fpm_conf.c
@@ -834,11 +834,6 @@ static int fpm_conf_process_all_pools() /* {{{ */
                } else if (wp->config->pm == PM_STYLE_ONDEMAND) {
                        struct fpm_worker_pool_config_s *config = wp->config;

-                       if (!fpm_event_support_edge_trigger()) {
-                               zlog(ZLOG_ALERT, "[pool %s] ondemand process manager can ONLY be used when events.mechanisme is either epoll (Linux) or kqueue (*BSD).", wp->config->name);
-                               return -1;
-                       }
-
                        if (config->pm_process_idle_timeout < 1) {
                                zlog(ZLOG_ALERT, "[pool %s] pm.process_idle_timeout(%ds) must be greater than 0s", wp->config->name, config->pm_process_idle_timeout);
                                return -1;
diff --git a/sapi/fpm/fpm/fpm_main.c b/sapi/fpm/fpm/fpm_main.c
index e879325..59ccf9d 100644
--- a/sapi/fpm/fpm/fpm_main.c
+++ b/sapi/fpm/fpm/fpm_main.c
@@ -1889,8 +1889,20 @@  /* library is already initialized, now init our request */
        fcgi_init_request(&request, fcgi_fd);

        zend_first_try {
-               while (fcgi_accept_request(&request) >= 0) {
+               // TODO: There has to be a better way to pass this info from the parent
+               // to the child. I should be able to access the fpm_worker_pool_s.
+               int pm, accept_pipe[2];
+               fpm_pctl_child_info(&pm, accept_pipe);
+               int first_accept_nonblocking = 0, child_accept_fd = -1;
+               if (pm == PM_STYLE_ONDEMAND) {
+                       // TODO: move this close to some setup routine somewhere
+                       close(accept_pipe[0]);
+                       first_accept_nonblocking = 1;
+                       child_accept_fd = accept_pipe[1];
+               }
+               while (fcgi_accept_request(&request, first_accept_nonblocking, child_accept_fd) >= 0) {
+                       first_accept_nonblocking = 0;
                        char *primary_script = NULL;
                        request_body_fd = -1;
                        SG(server_context) = (void *) &request;
                        init_request_info(TSRMLS_C);
diff --git a/sapi/fpm/fpm/fpm_process_ctl.c b/sapi/fpm/fpm/fpm_process_ctl.c
index 76ea4d3..4fb23e4 100644
--- a/sapi/fpm/fpm/fpm_process_ctl.c
+++ b/sapi/fpm/fpm/fpm_process_ctl.c
@@ -507,6 +507,10 @@ void fpm_pctl_on_socket_accept(struct fpm_event_s *ev, short which, void *arg) /

        wp->socket_event_set = 0;

+       // Remove the listening socket from the epoll set. We'll add it back
+       // when we read from wp->child_accept_pipe.
+       fpm_event_del(wp->ondemand_event);
+
 /*     zlog(ZLOG_DEBUG, "[pool %s] heartbeat running_children=%d", wp->config->name, wp->running_children);*/

        if (wp->running_children >= wp->config->pm_max_children) {
@@ -520,7 +524,7 @@ void fpm_pctl_on_socket_accept(struct fpm_event_s *ev, short which, void *arg) /
        }

        for (child = wp->children; child; child = child->next) {
-               /* if there is at least on idle child, it will handle the connection, stop here */
+               /* if there is at least one idle child, it will handle the connection, stop here */
                if (fpm_request_is_idle(child)) {
                        return;
                }
@@ -537,3 +541,52 @@ void fpm_pctl_on_socket_accept(struct fpm_event_s *ev, short which, void *arg) /
 }
 /* }}} */

+void fpm_pctl_on_child_accept(struct fpm_event_s *ev, short which, void *arg) /* {{{ */
+{
+       struct fpm_worker_pool_s *wp = (struct fpm_worker_pool_s *)arg;
+       pid_t pid;
+
+       if (fpm_globals.parent_pid != getpid()) {
+               /* prevent a event race condition when child process
+                * have not set up its own event loop */
+               return;
+       }
+
+       // TODO: Isn't this redundant with the check above?
+       if (fpm_globals.is_child) {
+               return;
+       }
+
+       if (read(wp->child_accept_pipe[0], &pid, sizeof(pid)) != sizeof(pid)) {
+               zlog(ZLOG_ERROR, "[pool %s] error %d reading child accept pipe", wp->config->name, errno);
+       }
+
+       // Add the listening socket back to the epoll set.
+       fpm_event_add(wp->ondemand_event, 0);
+
+       zlog(ZLOG_DEBUG, "[pool %s] listening on pool after child %d signaled", wp->config->name, pid);
+}
+/* }}} */
+
+// TODO: There has to be a better way to pass this info from the parent
+// to the child. I should be able to access the fpm_worker_pool_s.
+int fpm_pctl_child_pm, fpm_pctl_child_accept_pipe[2];
+int fpm_pctl_init_child(struct fpm_worker_pool_s *wp)  /* {{{ */
+{
+       fpm_pctl_child_pm = wp->config->pm;
+       memcpy(fpm_pctl_child_accept_pipe, wp->child_accept_pipe, sizeof(fpm_pctl_child_accept_pipe));
+       return 0;
+}
+/* }}} */
+
+void fpm_pctl_child_info(int *pm, int *pipe)  /* {{{ */
+{
+       *pm = fpm_pctl_child_pm;
+       memcpy(pipe, fpm_pctl_child_accept_pipe, sizeof(fpm_pctl_child_accept_pipe));
+}
+/* }}} */
+
+
+
+
+
diff --git a/sapi/fpm/fpm/fpm_process_ctl.h b/sapi/fpm/fpm/fpm_process_ctl.h
index 86a6ef0..0ff9249 100644
--- a/sapi/fpm/fpm/fpm_process_ctl.h
+++ b/sapi/fpm/fpm/fpm_process_ctl.h
@@ -24,9 +24,12 @@ void fpm_pctl_kill_all(int signo);
 void fpm_pctl_heartbeat(struct fpm_event_s *ev, short which, void *arg);
 void fpm_pctl_perform_idle_server_maintenance_heartbeat(struct fpm_event_s *ev, short which, void *arg);
 void fpm_pctl_on_socket_accept(struct fpm_event_s *ev, short which, void *arg);
+void fpm_pctl_on_child_accept(struct fpm_event_s *ev, short which, void *arg);
 int fpm_pctl_child_exited();
 int fpm_pctl_init_main();

+int fpm_pctl_init_child(struct fpm_worker_pool_s *wp);
+void fpm_pctl_child_info(int *pm, int *pipe);

 enum {
        FPM_PCTL_STATE_UNSPECIFIED,
diff --git a/sapi/fpm/fpm/fpm_request.c b/sapi/fpm/fpm/fpm_request.c
index bf431a0..93c7109 100644
--- a/sapi/fpm/fpm/fpm_request.c
+++ b/sapi/fpm/fpm/fpm_request.c
@@ -58,6 +58,24 @@ void fpm_request_accepting() /* {{{ */
 }
 /* }}} */

+void fpm_request_accepted() /* {{{ */
+{
+       struct fpm_scoreboard_proc_s *proc;
+       struct timeval now;
+
+       fpm_clock_get(&now);
+
+       proc = fpm_scoreboard_proc_acquire(NULL, -1, 0);
+       if (proc == NULL) {
+               zlog(ZLOG_WARNING, "failed to acquire proc scoreboard");
+               return;
+       }
+
+       proc->request_stage = FPM_REQUEST_ACCEPTED;
+       fpm_scoreboard_proc_release(proc);
+}
+/* }}} */
+
 void fpm_request_reading_headers() /* {{{ */
 {
        struct fpm_scoreboard_proc_s *proc;
diff --git a/sapi/fpm/fpm/fpm_request.h b/sapi/fpm/fpm/fpm_request.h
index aebd36c..815e21a 100644
--- a/sapi/fpm/fpm/fpm_request.h
+++ b/sapi/fpm/fpm/fpm_request.h
@@ -22,6 +22,7 @@ int fpm_request_last_activity(struct fpm_child_s *child, struct timeval *tv);

 enum fpm_request_stage_e {
        FPM_REQUEST_ACCEPTING = 1,
+       FPM_REQUEST_ACCEPTED,
        FPM_REQUEST_READING_HEADERS,
        FPM_REQUEST_INFO,
        FPM_REQUEST_EXECUTING,
diff --git a/sapi/fpm/fpm/fpm_scoreboard.c b/sapi/fpm/fpm/fpm_scoreboard.c
index 24463a9..4f922ad 100644
--- a/sapi/fpm/fpm/fpm_scoreboard.c
+++ b/sapi/fpm/fpm/fpm_scoreboard.c
@@ -261,6 +261,7 @@ void fpm_scoreboard_child_use(struct fpm_scoreboard_s *scoreboard, int child_ind
        }
        proc->pid = pid;
        proc->start_epoch = time(NULL);
+       zlog(ZLOG_DEBUG, "[pool %s] child %d using scoreboard proc %d", scoreboard->pool, (int) pid, child_index);
 }
 /* }}} */

diff --git a/sapi/fpm/fpm/fpm_worker_pool.h b/sapi/fpm/fpm/fpm_worker_pool.h
index 05c993d..96c7821 100644
--- a/sapi/fpm/fpm/fpm_worker_pool.h
+++ b/sapi/fpm/fpm/fpm_worker_pool.h
@@ -40,9 +40,10 @@ struct fpm_worker_pool_s {
        char **limit_extensions;

        /* for ondemand PM */
-       struct fpm_event_s *ondemand_event;
+        struct fpm_event_s *ondemand_event, *child_accept_event;
        int socket_event_set;
+        int child_accept_pipe[2];

 #ifdef HAVE_FPM_ACL
        void *socket_acl;
 #endif


Patch was tested against PHP 5.6.10 branch.

@bjaspan
Copy link
Author

bjaspan commented Jun 11, 2015

I submitted my PR against 5.5 because the bug submission guidelines specifically said to do so. I'm happy to re-submit it against 5.6 if that is preferred.

@bdraco
Copy link

bdraco commented Aug 24, 2016

I just found this after submitting https://bugs.php.net/bug.php?id=72935

@bdraco
Copy link

bdraco commented Aug 24, 2016

Patch updated for newer 5.6

0001-ONDEMAND-fix-from.patch.txt

@krakjoe
Copy link
Member

krakjoe commented Jan 6, 2017

Since this targets a security fix only branch, and since a patch against a supported branch would have to be different, I'm closing this PR.

Please take this action as encouragement to open a clean PR against a supported branch.

On a side note, if you wish to add a document like the design document included in this PR, can I ask that this is made as a separate pull request from this fix.

@krakjoe krakjoe closed this Jan 6, 2017
@bjaspan
Copy link
Author

bjaspan commented Jan 9, 2017

@krakjoe I opened this PR in May 2015, 14 months before PHP 5.5 was EOL'ed. Please explain why I should feel motivated to update and re-test the patch against a newer version. Will my new PR be acted on in a timely fashion?

@nikic
Copy link
Member

nikic commented Jan 9, 2017

That's a reasonable question. This is a larger patch to FPM, a component most of us are not familiar with... Do we have someone who is able+willing to review a rebased version of this change?

@krakjoe
Copy link
Member

krakjoe commented Jan 9, 2017

@bjaspan sorry about the way this has been handled, that it was ignored for so long ...

We are trying to find someone who can review it ... please do the PR, and I'll make noise until someone ok's it, that's the best I can do ... as Nikita said, most of us are unfamiliar, or have cursory familiarity with fpm.

We are fixing the way all pull requests are dealt with ... again, sorry ...

@bjaspan
Copy link
Author

bjaspan commented Jan 9, 2017

You want it first for PHP 7 now, right?

@nikic
Copy link
Member

nikic commented Jan 9, 2017

I would recommend to target the master branch. (This probably won't go into PHP-7.0 at this point, per this comment.)

@krakjoe
Copy link
Member

krakjoe commented Jan 9, 2017

Yeah, master is best.

@bdraco
Copy link

bdraco commented Jan 9, 2017

@bjaspan Please see the updated patch in https://bugs.php.net/bug.php?id=69724 . The original patch fixed the problem caused a FPM to spawn too many (potentially 100s) of processes when you just connected to the socket.

cPanel tracked this as CPANEL-9788

Replication of that problem below from that case:

  1. Ensure that PHP-FPM is enanbled in on demand mode
  2. Wait for a scheduled service check to happen, or run {{/scripts/restartsrv_cpanel_php_fpm --check}}, OR run {{nc -U /var/cpanel/php-fpm/cpanelroundcube/sock < /dev/null}}.
  3. Run {{ps auxww | grep "php-fpm"}} and see that more than one child process may have been spawned as a result of this. I use the word may because this this is a race-condition or timing related due to its variability.

Example of what I see happen consistently on a test machine:

[root]cPs# echo "Before:"; ps auxww | grep fp[m]; echo "Connecting..."; nc -U /var/cpanel/php-fpm/cpanelphpmyadmin/sock < /dev/null; echo "After:"; ps auxww | grep fp[m]
Before:
root      218186  0.0  0.1 247172 10680 ?        Ss   Nov11   2:05 php-fpm: master process (/usr/local/cpanel/etc/php-fpm.conf)
Connecting...
After:
root      218186  0.0  0.1 247172 10680 ?        Ss   Nov11   2:05 php-fpm: master process (/usr/local/cpanel/etc/php-fpm.conf)
cpanelp+  569549  0.0  0.0 255916  6300 ?        S    03:45   0:00 php-fpm: pool cpanelphpmyadmin
cpanelp+  569550  0.0  0.0 255916  6300 ?        S    03:45   0:00 php-fpm: pool cpanelphpmyadmin
... skip 100+ lines....
cpanelp+  569720  0.0  0.0 255940  6180 ?        S    03:45   0:00 php-fpm: pool cpanelphpmyadmin
cpanelp+  569721  0.0  0.0 255940  6180 ?        S    03:45   0:00 php-fpm: pool cpanelphpmyadmin
cpanelp+  569722  0.0  0.0 255940  6180 ?        S    03:45   0:00 php-fpm: pool cpanelphpmyadmin
root      569723  0.0  0.0 255940  5932 ?        S    03:45   0:00 php-fpm: master process (/usr/local/cpanel/etc/php-fpm.conf)
root      569724  0.0  0.0 255940  5932 ?        S    03:45   0:00 php-fpm: master process (/usr/local/cpanel/etc/php-fpm.conf)
root      569725  0.0  0.0 255940  5932 ?        S    03:45   0:00 php-fpm: master process (/usr/local/cpanel/etc/php-fpm.conf)
root      569726  0.0  0.0 255940  5932 ?        S    03:45   0:00 php-fpm: master process (/usr/local/cpanel/etc/php-fpm.conf)
root      569727  0.0  0.0 255940  5932 ?        S    03:45   0:00 php-fpm: master process (/usr/local/cpanel/etc/php-fpm.conf)
root      569728  0.0  0.0 255940  5932 ?        S    03:45   0:00 php-fpm: master process (/usr/local/cpanel/etc/php-fpm.conf)

I used "nc" in these examples because it's only connecting to the socket and then disconnecting.

On my personal test VM the problem is much less severe (this is a race condition), but php-fpm does spawn multiple processes when it should not, I got 6 of them out of this test, sometimes it is only 3:

root@whm-11-60-c7 [~]# echo "Before:"; ps auxww | grep fp[m]; echo "Connecting..."; nc -U /var/cpanel/php-fpm/cpanelphpmyadmin/sock < /dev/null; echo "After:"; ps auxww | grep fp[m]
Before:
root       7794  0.0  0.4 240432  7680 ?        Ss   10:16   0:00 php-fpm: master process (/usr/local/cpanel/etc/php-fpm.conf)
Connecting...
After:
root       7794  0.0  0.4 240432  7680 ?        Ss   10:16   0:00 php-fpm: master process (/usr/local/cpanel/etc/php-fpm.conf)
cpanelp+  13756  0.0  0.2 238720  4796 ?        S    14:03   0:00 php-fpm: pool cpanelphpmyadmin
cpanelp+  13757  0.0  0.2 238720  4792 ?        S    14:03   0:00 php-fpm: pool cpanelphpmyadmin
cpanelp+  13758  0.0  0.2 238720  4796 ?        S    14:03   0:00 php-fpm: pool cpanelphpmyadmin
cpanelp+  13759  0.0  0.2 238720  4792 ?        S    14:03   0:00 php-fpm: pool cpanelphpmyadmin
cpanelp+  13760  0.0  0.2 238720  4792 ?        S    14:03   0:00 php-fpm: pool cpanelphpmyadmin
cpanelp+  13761  0.0  0.2 238720  4792 ?        S    14:03   0:00 php-fpm: pool cpanelphpmyadmin

Its hard to trigger the race condition so 99/100 times it works as exepected:

[root]cPs# echo "Before:"; ps auxww | grep fp[m]; echo "Connecting..."; nc -U /var/cpanel/php-fpm/cpanelroundcube/sock < /dev/null; echo "After:"; ps auxww | grep fp[m]
Before:
root      572171  0.0  0.1 247040 10512 ?        Ss   04:08   0:00 php-fpm: master process (/usr/local/cpanel/etc/php-fpm.conf)
Connecting...
After:
root      572171  0.0  0.1 247040 10512 ?        Ss   04:08   0:00 php-fpm: master process (/usr/local/cpanel/etc/php-fpm.conf)
cpanelr+  572189  0.0  0.0 255808  6112 ?        S    04:08   0:00 php-fpm: pool cpanelroundcube

@bjaspan
Copy link
Author

bjaspan commented Jan 9, 2017

@bdraco Please clarify: Are you saying that the change proposed in this PR, and in https://bugs.php.net/bug.php?id=69724, has the race condition bug causing many extra child procs to be spawned on socket connect/disconnect?

@bdraco
Copy link

bdraco commented Jan 9, 2017

@bjaspan
Copy link
Author

bjaspan commented Jan 10, 2017

Realistically, I am unlikely to have time to update and test this PR. Since @bdraco seems to have become involved, I recommend going with his work.

I do strongly suggest accepting the design documentation that I wrote. One big problem with FPM is that it is a big pile of completely undocumented code, making it almost impossible for anyone to improve it. I wrote the design docs as I figured out what I needed to figure out to write this PR. It would be a shame to lose that knowledge.

@HoustonDad
Copy link

Hi,

As a note, cPanel has hired RogueWave to dig into this breakage and hopefully come to a proper solution. We are working with Ryan and Maurice at this point on this bug.

@bjaspan
Copy link
Author

bjaspan commented Jan 11, 2017

What issues or PRs should I subscribe to to be notified of further progress? Just https://bugs.php.net/bug.php?id=69724?

@HoustonDad
Copy link

Both places would be good to monitor for movement.

@mpdude
Copy link

mpdude commented May 4, 2019

@bjaspan just found your design document by chance, after a day of trying to understand how FPM works internally.

May I encourage you to submit that as a PR on its own? It’s a valuable addition that you must have put a lot of time into.

@bukka
Copy link
Member

bukka commented Dec 29, 2024

Just a note here that I finally picked this up. I'm not too thrilled about introducing new pipes but can't currently think about better solution. I will try to think more about it but if I don't figure out something better, I will update this to some more acceptable form and play with it. It should hopefully get fixed in 2025 - it's currently my FPM priority.

@bjaspan
Copy link
Author

bjaspan commented Dec 29, 2024

Glad to hear it! I have long since moved on and am no longer involved with PHP, though.

@bukka
Copy link
Member

bukka commented Dec 29, 2024

Sure no worries. Apology for such a long delay. It was created before I started maintaining FPM and way before PHP Foundation was created so there was no real funding to get it handled. Things are better now so this should get finally sorted out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants