Skip to content

Fix bug #75237 (jsonSerialize() - Returning new instance of self causes segfault) #2763

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

SammyK
Copy link
Contributor

@SammyK SammyK commented Sep 20, 2017

This patch fixes bug 75237. I'm pretty sure the garbage collection is correct, but just need to have a second set of eyes on it to make sure I didn't inadvertently break anything.

Also - this just throws a generic Exception. If you think we need to throw a specific exception, let me know and I can update the PR. :)

@jhdxr
Copy link
Member

jhdxr commented Sep 21, 2017

I didn't think this PR fix the core problem. this test script still crash.

<?php
class Foo implements JsonSerializable {
  public function jsonSerialize() {
    return new Bar;
  }
}

class Bar implements JsonSerializable {
  public function jsonSerialize() {
    return new Foo;
  }
}

try {
  var_dump(json_encode(new Foo));
} catch (Exception $e) {
  echo $e->getMessage();
}

@bukka
Copy link
Member

bukka commented Sep 21, 2017

This is not a but but just an infinite recursion...

This will limit any recursion which can be easily valid. Just imagine code like this:

<?php

class Foo implements JsonSerializable {
    private $i;

    public function __construct($i) {
        $this->i = $i;
    }
    
    public function jsonSerialize() {
        if ($this->i === 0) {
            return [];
        }
        return [new self($this->i - 1)];
    }
}

$foo = new Foo(2);

echo json_encode($foo);

So nothing to fix in here...

@SammyK
Copy link
Contributor Author

SammyK commented Sep 21, 2017

@jhdxr Nice catch! We'll have to think harder about this.

@bukka I see your point, but a segfault in any case should be fixed with a PHP error message. A possible solution would be to honor the $depth argument in this case.

Any more thoughts on this? :)

@jhdxr
Copy link
Member

jhdxr commented Sep 21, 2017

In fact I'm working on a similar issue: https://bugs.php.net/bug.php?id=74977

previously, my idea is to detect if there is a recursion. but as @bukka pointed out, it also limits some good recursion.

So, how about if we throw a stack overflow exception if we detected the level is too deep?

@SammyK
Copy link
Contributor Author

SammyK commented Sep 21, 2017

I like that idea. I'm still quite a newbie to internals dev so I'm not quite sure how we'd check to see how deep in the stack we are, but I'm guessing we might be able to access it via EG() somehow? Or I guess we could just add a new var to track how deep we are? :)

@SammyK
Copy link
Contributor Author

SammyK commented Sep 21, 2017

I'm not sure if we can harness the same type of error checking when something like this happens:

function foo() {
    foo();
}

foo();

On 3v4l it gives a memory error:

Fatal error: Allowed memory size of %d bytes exhausted (tried to allocate %d bytes) in %s on line %d

On my machine it gives a max function nesting error:

Fatal error: Maximum function nesting level of '%d' reached, aborting! in %s on line %d

Just throwing out some ideas. :)

@rmccullagh
Copy link
Contributor

If there is a defined limit on stack depth, there should be an ini setting that controls it.

@SammyK
Copy link
Contributor Author

SammyK commented Sep 21, 2017

@rmccullagh True that. I'm curious if we should be looking at a stack depth setting or look at the $depth argument in json_encode()? Or perhaps look to $depth first and then fall back to a max stack depth setting in the event someone does something like $depth = PHP_INT_MAX or something?

@@ -483,6 +483,14 @@ static int php_json_encode_serializable_object(smart_str *buf, zval *val, int op
}

if ((Z_TYPE(retval) == IS_OBJECT) &&
(Z_OBJ(retval) != Z_OBJ_P(val)) &&
ce->name == Z_OBJCE(retval)->name) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you trying to compare strings here? ce->name is a zend_string, so if you want to compare the char values, == will only compare the addresses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! So if I wanted to compare zend_string's, would I change it to something like this: ZSTR_VAL(ce->name) == ZSTR_VAL(Z_OBJCE(retval)->name)?

Copy link
Contributor

@rmccullagh rmccullagh Sep 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use zend_string_equals, which takes 2 zend_string * arguments. So I believe you could do: zend_string_equals(ce->name, Z_OBJCE(retval)->name) I'm not sure what Z_OBJCE(retval)->name expands to? Is it a zend_string *. or a char *?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh awesome - thanks so much! :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rmccullagh it's a zend_string *, Z_OBJCE returns a _zend_class_entry, and you can find the definition of the structure here. Anyway, I recommend zend_string_equals as well. If there is a wrapper method, then use it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already merged, but I just watched your video so here's some thoughts:

  1. If you're just trying to match the class, you can compare the ce only. This will be more correct than comparing names and (nominally) more performant. if ((Z_TYPE_P(retval) == IS_OBJECT) && (Z_OBJCE_P(retval) == ce)) { ... }

  2. The segfault is still easily triggerable via this script. As you mention in your video, there's a much bigger task involved in fixing this properly.

class A implements \JsonSerializable {
  public function jsonSerialize() {
    return new B;
  }
}

class B implements \JsonSerializable {
  public function jsonSerialize() {
    return new A;
  }
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sgolemon this pr has been closed without merge. and the example is exactly what I gave here. BTW, what do you mean by video? I don't see any video mentioned here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sgolemon! I'll update my blog with your comment. :)

@jhdxr I made a little screencast of how I found & patched this bug. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jhdxr Oh, heh. I should really read the conversation first. I was only looking at the code review comments and interpreted the "Closed" as merged because we don't really merge via github. :p

@jhdxr
Copy link
Member

jhdxr commented Sep 21, 2017

On my machine it gives a max function nesting error:
Fatal error: Maximum function nesting level of '%d' reached, aborting! in %s on line %d

It's an error generated by xdebug.

I'm curious if we should be looking at a stack depth setting or look at the $depth argument in json_encode()?

IMHO, using $depth of json_encode is enough. It's set to 512 by default, and user cannot bypass this limit.

@bukka
Copy link
Member

bukka commented Sep 24, 2017

@SammyK Well depth is just for the depth of the final encoded data but recursive call doesn't have to cause extra depth. Consider following code:

<?php

class Foo implements JsonSerializable {
    private $i;

    public function __construct($i) {
        
        $this->i = $i;
    }
    
    public function jsonSerialize() {
        if ($this->i === 0) {
            return 1;
        }
        echo "do something\n";
        return new self($this->i - 1);
    }
}

$foo = new Foo(2);

echo json_encode($foo);

I'm not saying this is the right thing to do but it's a perfectly valid code. In this case the final depth is 0 and you still have recursion...

I think this is a much wider problem and it needs a more generic solution.

@SammyK
Copy link
Contributor Author

SammyK commented Sep 24, 2017

It's an error generated by xdebug.

Ah, gotcha - thanks. :)

IMHO, using $depth of json_encode is enough. It's set to 512 by default, and user cannot bypass this limit.

The user can change the depth in the third argument of json_encode() by doing something like this:

$json = json_encode($value, 0, PHP_INT_MAX);

So that's the other case I was wanting to protect against. :)

@SammyK
Copy link
Contributor Author

SammyK commented Sep 24, 2017

I think this is a much wider problem and it needs a more generic solution.

Gotcha - so you think something like a generic max stack ini setting or something that could apply to all cases that have this issue?

@bukka
Copy link
Member

bukka commented Sep 24, 2017

I think you might want to read this first: #290 :)

@nikic
Copy link
Member

nikic commented Oct 5, 2017

Closing this per the above discussion.

If someone wants to take a shot at fixing the root problem, my suggestion would be to look into stack guard pages. We may be able to catch segmentation faults caused by stack overflow an display a more user-friendly error message.

@nikic nikic closed this Oct 5, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants