Skip to content

Three stream read optimizations #8032

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions ext/standard/file.c
Original file line number Diff line number Diff line change
Expand Up @@ -566,6 +566,11 @@ PHP_FUNCTION(file_get_contents)
RETURN_FALSE;
}

/* disabling the read buffer allows doing the whole transfer
in just one read() system call */
if (php_stream_is(stream, PHP_STREAM_IS_STDIO))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MaxKellermann I don't quite understand why this pull request got closed in it's entirety. I know there is an issue around commits 2+3, but commit 1 (file_get_contents without read buffer) is a clear, simple and non-breaking and uncontroversial improvement IMO. Maybe commit #e45f829 can be extracted into a new pull request to get merged?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK here you are #8547

php_stream_set_option(stream, PHP_STREAM_OPTION_READ_BUFFER, PHP_STREAM_BUFFER_NONE, NULL);

if (offset != 0 && php_stream_seek(stream, offset, ((offset > 0) ? SEEK_SET : SEEK_END)) < 0) {
php_error_docref(NULL, E_WARNING, "Failed to seek to position " ZEND_LONG_FMT " in the stream", offset);
php_stream_close(stream);
Expand Down
5 changes: 0 additions & 5 deletions ext/standard/tests/streams/bug54946.phpt
Original file line number Diff line number Diff line change
Expand Up @@ -31,11 +31,6 @@ fclose($stream);
unlink($filename);
?>
--EXPECTF--
Notice: stream_get_contents(): Read of 8192 bytes failed with errno=9 Bad file descriptor in %s on line %d
string(0) ""

Notice: stream_get_contents(): Read of 8192 bytes failed with errno=9 Bad file descriptor in %s on line %d
string(0) ""

Notice: stream_get_contents(): Read of 8192 bytes failed with errno=9 Bad file descriptor in %s on line %d
string(0) ""
28 changes: 26 additions & 2 deletions main/streams/streams.c
Original file line number Diff line number Diff line change
Expand Up @@ -706,7 +706,11 @@ PHPAPI ssize_t _php_stream_read(php_stream *stream, char *buf, size_t size)
break;
}

if (!stream->readfilters.head && ((stream->flags & PHP_STREAM_FLAG_NO_BUFFER) || stream->chunk_size == 1)) {
/* if the read buffer is currently empty, don't buffer
* large reads (larger than chunk_size), because
* buffering would only hurt performance in such an
* edge case */
if (!stream->readfilters.head && ((stream->flags & PHP_STREAM_FLAG_NO_BUFFER) || stream->chunk_size == 1 || (stream->writepos == stream->readpos && size >= stream->chunk_size))) {
toread = stream->ops->read(stream, buf, size);
if (toread < 0) {
/* Report an error if the read failed and we did not read any data
Expand Down Expand Up @@ -1470,6 +1474,26 @@ PHPAPI zend_string *_php_stream_copy_to_mem(php_stream *src, size_t maxlen, int
return ZSTR_EMPTY_ALLOC();
}

if (php_stream_is(src, PHP_STREAM_IS_STDIO) &&
src->readfilters.head == NULL &&
php_stream_stat(src, &ssbuf) == 0 &&
#ifdef S_ISREG
S_ISREG(ssbuf.sb.st_mode) &&
#endif
ssbuf.sb.st_size >= 0) {
/* optimized code path for unfiltered regular files:
* if we know the exact size, we can allocate the
* right buffer size and issue only one read() system
* call */

if (ssbuf.sb.st_size <= src->position)
return ZSTR_EMPTY_ALLOC();

const size_t remaining = ssbuf.sb.st_size - src->position;
if (remaining < maxlen)
maxlen = remaining;
}

if (maxlen == PHP_STREAM_COPY_ALL) {
maxlen = 0;
}
Expand Down Expand Up @@ -1517,7 +1541,7 @@ PHPAPI zend_string *_php_stream_copy_to_mem(php_stream *src, size_t maxlen, int
ptr = ZSTR_VAL(result);

// TODO: Propagate error?
while ((ret = php_stream_read(src, ptr, max_len - len)) > 0){
while (!php_stream_eof(src) && (ret = php_stream_read(src, ptr, max_len - len)) > 0){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am not mistaken, php_stream_eof() also generates some system calls (poll(), recv() with MSG_PEEK flag).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only if the eof flag has not yet been set:

if (!stream->eof && PHP_STREAM_OPTION_RETURN_ERR ==
php_stream_set_option(stream, PHP_STREAM_OPTION_CHECK_LIVENESS,
0, NULL)) {

But you are right, PHP's "xp_streams" implementation is rather clumsy. The way it is implemented, my change adds one system call for streams (replacing recv() with poll()+recv()); it still eliminates one extra read() system call for regular files!
The xp_streams code could be optimized easily. Anybody interested?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two system calls, when it's not at the EOF, we got poll() + recv() + recv(), and CHECK_LIVENESS call usually returns true.
php_stream is not only designed for files, it also works for sockets.

The xp_streams code could be optimized easily. Anybody interested?

How? I can not understand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two system calls, when it's not at the EOF, we got poll() + recv() + recv()

Oh yes, you are right. I thought about the "eof" case, but forgot about the "not-eof".

How? I can not understand.

By omitting the poll() on operating systems where MSG_DONTWAIT is available.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always assume that doing so might cause a BC break.
In addition, this can also be applied when the stream is in non-blocking mode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always assume that doing so might cause a BC break.

How?

In addition, this can also be applied when the stream is in non-blocking mode.

You mean on operating systems which don't support MSG_DONTWAIT? (i.e. Windows)

len += ret;
if (len + min_room >= max_len) {
result = zend_string_extend(result, max_len + step, persistent);
Expand Down