Skip to content

Clone with #9497

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from
Closed

Clone with #9497

wants to merge 5 commits into from

Conversation

kocsismate
Copy link
Member

@kocsismate kocsismate commented Sep 7, 2022

This PR used to provide implementation for https://wiki.php.net/rfc/readonly_amendments, but those changes were split off to #10389.

Now, this PR only implements the https://wiki.php.net/rfc/clone_with RFC.

@kocsismate kocsismate marked this pull request as draft September 7, 2022 15:39
Copy link
Member

@iluuu1994 iluuu1994 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable overall. I'm a bit unsure if unset should be allowed because that also means the property can be uninitialized after clone but I guess that would be consistent with new.

@kocsismate kocsismate changed the title WIP: Add support for cloning readonly properties Add support for cloning readonly properties Oct 6, 2022
@kocsismate kocsismate marked this pull request as ready for review October 6, 2022 11:40
@kocsismate
Copy link
Member Author

My latest commit adds support for "clone with". It seems to be working for me, but I'd appreciate any review and testing. cc @nicolas-grekas

@kocsismate kocsismate force-pushed the readonly-clone branch 2 times, most recently from a65ac9b to 3189938 Compare October 6, 2022 12:15
@nicolas-grekas
Copy link
Contributor

nicolas-grekas commented Oct 7, 2022

Thanks for working on this!

I'm wondering: how is __clone() going to interact with clone-with? Will __clone() be called before clone-with properties are set? I guess it would make sense this way. Can you add a test case for this behavior please?

I've already told so on other channels: I personally think that the proposed "clone-with" syntax should be split from the patch that unlocks reinitializing readonly properties during cloning. The reason is that the new syntax might raise a lengthy discussion, while reinitializing readonly properties during cloning should be consensual since it's fixing a limitation that has not reasons to exist.

Copy link
Contributor

@TysonAndre TysonAndre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test implementations should be run twice in the PHP --TEST-- body itself (expecting the same output twice) to properly test the optimized handlers - the unoptimized implementation of ZEND_INIT_OBJ is what's being tested right now, not the optimizations, which seem to have bugs

Also, there could be more tests where clone with throws

  1. TypeError from incompatible properties (twice)
  2. An exception from clone with overwriting a declared untyped property that has __destruct - i.e. clone function_return_obj_with_myProperty_val_throwing_in_destructor() with {myProperty: new stdClass()}
  3. Same, but for #[AllowDynamicProperties] and a property with no declaration

if (UNEXPECTED(RETURN_VALUE_USED(opline))) {
ZVAL_COPY(EX_VAR(opline->result.var), value);
}
ZEND_VM_C_GOTO(exit_init_obj);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be ZEND_VM_C_GOTO(free_and_exit_init_obj) instead, like the above branch?

Can the label for exit_init_obj be deleted entirely? I'd hope so.

If this opcode were to throw, then php relies on the temporary value being placed in the return value's temporary variable to free it.

I think this isn't reached in any of the unit tests,
and the error handler checking for ZEND_INIT_OBJ is why this passes?

For many of the tests, they should be run in a PHP for loop to run twice, to check that they do the same thing twice in a row with the same run time cache.

					if (UNEXPECTED((Z_PROP_FLAG_P(variable_ptr) & IS_PROP_REINITABLE)) ||
						(UNEXPECTED(EG(current_execute_data) && EG(current_execute_data)->opline &&
						EG(current_execute_data)->opline->opcode == ZEND_INIT_OBJ))

Copy link
Member Author

@kocsismate kocsismate Oct 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same question arises: does OP_DATA have to be freed or not? And why ZEND_ASSIGN_OBJ doesn't do it?

ZEND_VM_C_LABEL(init_object):
zobj = Z_OBJ_P(object);
void **cache_slot = CACHE_ADDR(opline->extended_value);
if (EXPECTED(zobj->ce == CACHED_PTR(opline->extended_value))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (EXPECTED(zobj->ce == CACHED_PTR(opline->extended_value))) { - This code path will only be reached the second time you execute code.

So for many of the tests, they should be run in a for loop to run twice, to check that they do the same thing twice in a row.

The --repeat option won't help, since I'm pretty sure the run time cache is distinct for separate php requests

It still makes sense to have, since the optimized code will be faster in applications that call a given php clone with expression multiple times

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this fast path really needed? That might just might increase cache pressure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, yes, good question! I'm not yet sure... I guess if the opcode is not used often then the gain with the cache is not much. In the same time, if the cache is not used much then it doesn't cause a big pressure (?). Anyway, I'm open to remove this path (given the RFC passes) :)

@kocsismate kocsismate changed the title Allow modifying readonly properties during cloning Clone with Mar 3, 2023
@theodorejb
Copy link
Contributor

I think there needs to be a test to ensure that clone with cannot be used on public readonly properties from outside the class.

@kocsismate
Copy link
Member Author

I think there needs to be a test to ensure that clone with cannot be used on public readonly properties from outside the class.

Makes sense, thanks for the idea! So far only the private scope of non-readonly properties has been tested.

@dstogov
Copy link
Member

dstogov commented Apr 17, 2023

Why can't we simple reuse constructor?

function withB($b) {
    return new self(a: $this->a, b: $b, c: $this->c); 
}

@KapitanOczywisty
Copy link

I might be not popular with this, but did you consider using an array syntax for with ... part? Maybe even accepting a variable? Proposed syntax introduces another way to write array-like data, which might get confusing. Though If currently proposed syntax passes it could be a good candidate for stdClass constructor. Thanks for the RFC

@iluuu1994
Copy link
Member

Wasn't the idea to pass the arguments to __clone? This would be more flexible (allow passing things that aren't properties, e.g. bool $deepCloneFoo), allow for validation of the provided values, and play nicely with readonly properties given that they can now be re-assigned.

@Crell
Copy link
Contributor

Crell commented Apr 18, 2023

@iluuu1994 Passing arguments to __clone() is something I experimented with here, and found it to be absolutely godawful in practice: https://peakd.com/hive-168588/@crell/object-properties-part-2-examples

@dstogov That mandates that the constructor be a 1:1 map of the properties of the object. That's only viable in a small fraction of use cases. It also means that every time you increase the number of properties, you also add another line to every with*() method. See the "Value objects" section of https://peakd.com/hive-168588/@crell/object-properties-and-immutability; you can skip to the "PSR-7" link if you want to jump straight to the argument itself.

@kocsismate
Copy link
Member Author

Thank you @Crell for your reply! I completely agree with these arguments. :)


if (UNEXPECTED(is_readonly)) {
is_clone_with_op = EG(current_execute_data) && EG(current_execute_data)->opline &&
EG(current_execute_data)->opline->opcode == ZEND_CLONE_INIT_PROP;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this re-use the IS_PROP_REINITABLE flag? Otherwise, any path that currently checks for IS_PROP_REINITABLE needs to be adjusted (see JIT).

ZEND_VM_C_LABEL(init_object):
zobj = Z_OBJ_P(object);
void **cache_slot = CACHE_ADDR(opline->extended_value);
if (EXPECTED(zobj->ce == CACHED_PTR(opline->extended_value))) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this fast path really needed? That might just might increase cache pressure.

/* Unlocks readonly properties during cloning for exactly one assignment. It has no impact on regular properties. */
#define IS_PROP_REINITABLE (1<<1)
/* Re-locks readonly properties during "clone with" operations. It has no impact on regular properties. */
#define IS_PROP_REINITED (1<<2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between IS_PROP_REINITED and the absence of the IS_PROP_REINITABLE flag? Do we need distinction?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, the distinction is very subtle, but I didn't manage to come up with an easier to understand solution. Basically, the difference between them is the following:

  • IS_PROP_REINITABLE indicates that a property is modifiable by any operation because __clone() unlocked the possibility. Then - after __clone() finishes, the possibility is gone and the flag is removed.
  • IS_PROP_REINITED indicates that a property is not modifiable anymore in the ZEND_CLONE_INIT_PROP opcode. Here, we cannot rely on the absense of IS_PROP_REINITABLE because we cannot make these properties IS_PROP_REINITABLE in advance. Therefore we can only re-lock them when the assignment happened.

kocsismate and others added 2 commits March 24, 2024 23:09
Co-Authored-By: Arnaud Le Blanc <arnaud.lb@gmail.com>
@rraat
Copy link

rraat commented Jun 26, 2024

Perhaps I'm dumb but wouldn't it be much easier to just redesign "clone $object" to be handled like "new class"?
the __clone method would then basically become an instance constructor similar to __construct

e.g.

class Foo{
   public readonly string $id;
   public readonly string $name;
   
   public function __construct(string $name){
        $this->id = spl_object_hash($this);
        $this->name = $name;
   }
   
   public function __clone(string $name){
        $this->id = spl_object_hash($this);
        $this->name = $name;
   }
}

$bar = new Foo("what's my");
$foo = clone $bar("name");

cloning would then still be allowed anywhere without the need to check if it's inside a method of the class to be cloned
what logic is in __clone and how awful you make it is independent of this implementation.

the godawful ness as per below is a result of it's usage not the implementation of having arguments on ___clone

@iluuu1994 Passing arguments to __clone() is something I experimented with here, and found it to be absolutely godawful in practice: https://peakd.com/hive-168588/@crell/object-properties-part-2-examples

One could put complex the logic that would be required in both __construct and __clone in a separate function as already allowed in __construct()

@Wirone
Copy link

Wirone commented Jun 26, 2024

@rraat the thing is your example is simplified too much. In real world you will have multiple public readonly properties in a single class and you don't want to make __clone() signature with all those fields listed explicitly. Especially that some of them can be promoted, and some created as a result of some arbitrary logic in the constructor.

Cloning is done in different contexts, sometimes you want to override property X, sometimes Y. With explicit signature you would need to pass all the values every time you want to clone an object. The approach with with allows you to cherry-pick properties you want to override (the other ones are simply cloned):

final readonly class Foo {
    public int $x;
    public string $y;
    public array $z;

    public function __construct(
        public int $a,
        public int $b,
        public string $c,
        public string $d,
        public array $e,
        public array $f
    ) {
        $this->resolve();
    }

    // You can't use `= null` for each argument to make it optional
    // since it's ambiguous (you want to assign `null` or skip the overriding?)
    public function __clone(
        public int $a,
        public int $b,
        public string $c,
        public string $d,
        public array $e,
        public array $f
    ) {
        $this->resolve();
    }

    private function resolve() {
        $this->x = $this->a + $this->b;
        $this->y = $this->c . $this->d;
        $this->z = array_merge($this->e, $this->f);
    }
}

$foo = new Foo(1, 2, '3', '4', [5], [6]);

// Superfluous and cumbersome
$bar = clone $foo($foo->a, $foo->b, $foo->c, $foo->d, $foo->e, [7]);

// Short and clear
$baz = clone $foo with ['f' => [7]];

As you can see, even with extracting the logic to a method shared between __construct() and __clone() it's still better to use with as it does not require listing properties that are not touched during cloning.

@rraat
Copy link

rraat commented Jun 27, 2024

@Wirone

"clone ... With []" vs "clone ... ()" doesn't really make much difference it's about the scope where and when things are allowed/unlocked, the point is more about having the changed fields send to __clone or not.

when not sending the fields to update to __clone

  • "clone" would still be allowed outside the class and would still need to "unlock" the readonly properties inside the __clone method (deep cloning)
  • "clone with" would only be allowed inside a class and only on properties that are defined explicitly on that class including properties that, when extending a class exist on the base class but are overridden in the extending class.
    If you do want to change properties of the base class one would need to call a method on the parent:: first and then perform the clone on that (clone twice)

This creates a situation where "clone" and even "clone with" has different behaviour depending on where it is used and on what members.
In addition when __clone is finally being called it would know nothing about any "when" changes that are already applied.

when fields to update ARE send to __clone

  • "clone" can be done from anywhere.
  • "clone with" can be done from anywhere.

__clone would be the only place that readonly properties would need to be unlocked (just like in __construct) and would only be allow to change properties that would normally be allowed to be changed in __construct, to update properties on the parent one could call parent::__clone (just like parent::__construct)

How __clone is defined in a class would make the detection of provided or default possible or not
defining it as __clone(string $foo=null, string $bar = null) is one way, __clone(...$properties) another
how to use it and what is practical in what situation would depend on the implementation of the class.
redefining it signature as __clone(array $properties=[]) like __set_state does would also make sense

It seems just so much simpler to implement, explain and use, to have __clone handle cloning then doing it in both __clone and outside of it (all over the place).

Finally

You can make __clone as complicated/simple as you want.

function __clone($with){
    foreach($with as $key=>$value) $this->{$key} = $value;
}

but then again... that might just be me.

p.s. "readonly clone ... with" is effectively the property equivalent of having interface with "get/with" methods (interfaces with with/get shouldn't need to use readonly properties as we are using the interface to access them)

@Tak-Pesar
Copy link

Isn't something like this added in version 8.4 ? :(

@Tak-Pesar
Copy link

Isn't this feature going to be added ? 😒

@kocsismate
Copy link
Member Author

No, I decided not to pursue this. Properties are already extremely difficult handling after some recent changes, and I don't want to complicate the situation even more - at least this is what I think now. Of course anyway is free to pick up the RFC and the implementation (which became quite much outdated in the meanwhile).

@kocsismate kocsismate closed this Mar 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.