-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Fix bug #79836: Segfault in concat_function #10049
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
9d84e97
to
09e4545
Compare
I had some trouble because the regression test produces the output |
There's a test failure in the tracing JIT for Also now that I'm looking at this again, I believe there were actually 2 bugs: a type confusion and a UAF. I fixed the type confusion issue, but maybe not the user-after-free: If we are in the case `opX_string = Z_STR_P(opX);", and the string is not interned and dropped by a userland function, then I think we have a UAF. I'll look into this again in the evening. |
aa52ce3
to
f3656c5
Compare
Thank you for the PR! Since this is somewhat related to https://bugs.php.net/81705, I've checked that, and found that we're leaking memory with this patch when running the following script: <?php
$arr = [0];
$my_var = str_repeat("a", 1);
set_error_handler(
function() use(&$my_var) {
echo("error\n");
$my_var = 0x123;
}
);
$my_var .= $GLOBALS["arr"];
var_dump($my_var); Arguably, that's better than a segfault due to the type confusion, but maybe we can improve that; at least we need to make sure that there's no memory leak for working code. |
Hi Christoph, thank you for the review. |
Zend/tests/bug79836.phpt
Outdated
--TEST-- | ||
Bug #79836 (Segfault in concat_function) | ||
--INI-- | ||
error_reporting = E_ALL & ~E_WARNING |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be error_reporting = E_WARNING
. Otherwise no warning is printed, thus the ob closure isn't called and $c
isn't modified. But I think tests generally assume E_WARNING
is enabled.
Zend/zend_operators.c
Outdated
|
||
do { | ||
if (UNEXPECTED(Z_TYPE_P(op1) != IS_STRING)) { | ||
if (Z_ISREF_P(op1)) { | ||
op1 = Z_REFVAL_P(op1); | ||
if (Z_TYPE_P(op1) == IS_STRING) break; | ||
if (Z_TYPE_P(op1) == IS_STRING) { | ||
op1_string = zend_string_copy(Z_STR_P(op1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should avoid copying the string here and then only release the string when op1_string_from_func
is set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that this suggestion would introduce a use-after-free:
Suppose Z_STR_P(op1) is a non-interned string. It'll be assigned to op1_string.
If in the second do-while (for op2) we get to: zval_get_string_func(op2)
, then the userland function can decrease the refcount of op1_string, and if the only reference was from op1, we'd be accessing freed memory later down the line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I think I see a way to avoid the copy, by doing it in the second do-while.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The old implementation essentially uses op1 in the happy path (when op1 is a string) and only uses op1_copy when new memory is allocated. There shouldn't be a use-after-free unless you actually free op1 (or the string is contains). It would be nice if we could avoid the needless copy here.
Zend/zend_operators.c
Outdated
} | ||
} | ||
op1 = &op1_copy; | ||
} else { | ||
op1_string = zend_string_copy(Z_STR_P(op1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, I don't think this should be copied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same reasoning as in the above comment applies here I think.
Edit: I found a way to do it like you suggested.
Zend/zend_operators.c
Outdated
} | ||
} | ||
op1 = &op1_copy; | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I'd flip the condition (and UNEXPECTED)
Zend/zend_operators.c
Outdated
ZVAL_UNDEF(&op1_copy); | ||
ZVAL_UNDEF(&op2_copy); | ||
zend_string *op1_string, *op2_string = NULL; | ||
bool op1_string_from_func = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
op1_is_non_string
?
Zend/zend_operators.c
Outdated
} | ||
} else if (UNEXPECTED(Z_STRLEN_P(op2) == 0)) { | ||
if (EXPECTED(result != op1)) { | ||
zend_string_release(op1_string); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this release should be handled at the end of the function, and only be executed when op1_string_from_func
is set (as that would indicate it was allocated, maybe that variable needs a better name).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not possible because of the use-after-free situation I described earlier I think.
I agree that the variable needs a better name though and I'll use your suggested name.
Edit: I found a way to do it like you suggested.
Zend/zend_operators.c
Outdated
if (result == op1) { | ||
/* As result == op1, op1_string is equal to Z_STR_P(result) and we thus we have a double reference. | ||
* We should drop it before extending such that it can be extended in place in most cases. */ | ||
if (!op1_string_from_func) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think both releases here should go away with proper memory management above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same reasoning applies as above comment I think.
Edit: I found a way to do it like you suggested.
f3656c5
to
d81fc7d
Compare
Thanks for the review, I applied your suggestions and pushed the branch.
This seems to indicate |
I rather guess that this conversion occurs during compilation (optimization). You probably need to enable OPcache ( |
d81fc7d
to
94d84bd
Compare
You are right. The literal compaction optimisation pass caused the array to string conversion warning at compile time. That caused the ob_start callback to also not trigger. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the changes!
There is one failing test in the last CI run: an oci_connect test, it timed out but I think it is unrelated.
Yes, unfortunately we got quite a few flaky tests 🙁
Zend/zend_operators.c
Outdated
if (result == op1) { | ||
if (op1_is_non_string_or_copy) { | ||
i_zval_ptr_dtor(result); | ||
op1_is_non_string_or_copy = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given how these values are used, maybe a better name would be free_op1
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's indeed better. Actually I'd prefer free_op1_string
to make it clear it's about the string, not about the op.
Zend/zend_operators.c
Outdated
/* account for the case where result == op1 == op2 and the realloc is done */ | ||
if (op1_string == op2_string) { | ||
op2_string = result_str; | ||
op2_is_non_string_or_copy = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can never be true
here, maybe we can verify this with an assert instead? If this were true
we'd need to release it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, actually this might be true for constants in CVs.
$l = false;
$r = false;
$x = $l . $r;
Because a static empty string is returned for those. In this case we don't need a release, but I'm not sure if it's not more confusing not to add it. Maybe there's also another case I'm not thinking of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this, and I came to the conclusion that it is possible to get here using the following reasoning and piece of code.
Reasoning
This can only be true if line 1902 is executed.
That line requires op2_string to be NULL in order to be executed. That means that line 1870 did not execute. That means !(result == op1 && op1 == op2) which means result != op1 || op1 != op2
That means if we get to the branch starting at line 1948, result == op1 && op1 != op2 must hold.
Code
$global_non_interned_string = str_repeat("a", 3);
class Test {
public function __toString() {
global $global_non_interned_string;
return $global_non_interned_string;
}
}
$test1 = new Test;
$test2 = new Test;
$test1 .= $test2;
echo $test1 . "\n";
The current code has a memleak now because the reference to op2_string must be released before re-assigning. I'll fix this and add this test code as an additional test case.
Your example doesn't work because result != op1, which is a requirement to get in that branch.
Thanks for the review once again! Just another quick question: this PR will fix https://bugs.php.net/bug.php?id=81705 too (and maybe there are other related issues?). Do I need to change the commit name to also mention that issue, or what is usually done in this case? |
94d84bd
to
ea121a2
Compare
The following sequence of actions was happening which caused a null pointer dereference: 1. debug_backtrace() returns an array 2. The concatenation to $c will transform the array to a string via `zval_get_string_func` for op2 and output a warning. Note that zval op1 is of type string due to the first do-while sequence. 3. The warning of an implicit "array to string conversion" triggers the ob_start callback to run. This code transform $c (==op1) to a long. 4. The code below the 2 do-while sequences assume that both op1 and op2 are strings, but this is no longer the case. A dereference of the string will therefore result in a null pointer dereference. The solution used here is to work with the zend_string directly instead of with the ops.
Co-authored-by: changochen1@gmail.com
Co-authored-by: cmbecker69@gmx.de Co-authored-by: yukik@risec.co.jp
This tests the path op1_string_from_func == true, result == op1, op1 == op2 in the do-while, which wasn't tested before.
This tests for the case where result == op1, op1 != op2 and op1_string and op2_string are non-interned strings that are equal.
I did a revision of this after a long time. I changed the branch to master because:
According to benchmarks there is no regression (there's even a tiny improvement in Zend/bench.php). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Introduced by phpGH-10049
free_op2_string may be set to false when op1 == op2, by re-using the same string for both operands. In that case, the string should still be copied to result because result is not actually a string. Introduced by phpGH-10049
free_op2_string may be set to false when op1 == op2, by re-using the same string for both operands. In that case, the string should still be copied to result because result is not actually a string. Introduced by phpGH-10049
free_op2_string may be set to false when the operands are not strings, and `result == op1 == op2`, by re-using the same string for both operands. In that case, the string should still be copied to result because result is not actually a string. Also change the op1 branch to stay consistent. Introduced by phpGH-10049
free_op2_string may be set to false when the operands are not strings, and `result == op1 == op2`, by re-using the same string for both operands. In that case, the string should still be copied to result because result is not actually a string. Also change the op1 branch to stay consistent. Introduced by phpGH-10049
free_op2_string may be set to false when the operands are not strings, and `result == op1 == op2`, by re-using the same string for both operands. In that case, the string should still be copied to result because result is not actually a string. Also change the op1 branch to stay consistent. Introduced by GH-10049
This fixes bug #79836 (bugs.php.net bugtracker).
Bug analysis
The following sequence of actions was happening in the bug report code which caused a null pointer dereference:
zval_get_string_func
for op2 and output a warning.Note that zval op1 is of type string due to the first do-while
sequence.
the ob_start callback to run. This code transform $c (==op1) to a long.
are strings, but this is no longer the case. A dereference of the
string will therefore result in a null pointer dereference.
Test case reduction
I reduced the test case to the following:
I have also added a second test case for this bug which is about the same except that that one checks op2 instead of op1.
Finally, I also added a third test case to test a path in
concat_function
that wasn't tested before.I gave credit via the Co-authored-by tag to the original bug reporter because I created my test code by reducing theirs. I hope I did this correctly :)
Solution
The solution used here is to work with the zend_string directly instead of with the ops.
The code is slightly more complicated now, and I wanted to make sure I did not introduce any performance regressions.
I wasn't too sure how to test this best, so I wrote a little micro-benchmark:
I checked callgrind and it reports a tiny decrease in I refs:
The wall clock time is the same however (ran it 50 times and averaged it and the difference is negligible.
I also checked with running the
artisan
command from a Laravel project and the I refs also drop a tiny tiny amount, and the wall clock time is the same.I therefore think there is no negative performance impact.