Allow variable strings to be defined for a string offset #5013

Girgias · 2019-12-15T19:24:51Z

This stops truncating strings with multiple bytes to one byte and using the first byte as the char to replace the string offset.

At the same time, allow passing an empty string to remove the byte from the string.

This partially invalidates Bug 71572.
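
For readers who want to see the difference concretely, here is a minimal userland sketch. It assumes the proposed semantics amount to replacing the single byte at the offset with the whole right-hand string; `replace_byte_at()` is a hypothetical helper used only for illustration and is not code from this PR.

```php
<?php
// Hypothetical helper (not part of the PR): replaces the single byte at
// $offset with the whole of $replacement, which may be empty or longer
// than one byte.
function replace_byte_at(string $str, int $offset, string $replacement): string
{
    // substr_replace() with a length of 1 swaps exactly one byte.
    return substr_replace($str, $replacement, $offset, 1);
}

$str = "abc";
// Existing behaviour: only the first byte of "XYZ" is stored.
// The @ hides any diagnostic a given PHP version may emit for this write.
@($str[1] = "XYZ");
echo $str, "\n";                             // aXc

// Semantics sketched from the description above.
echo replace_byte_at("abc", 1, "XYZ"), "\n"; // aXYZc
echo replace_byte_at("abc", 1, ""), "\n";    // ac
```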

nikic · 2019-12-15T19:38:10Z

This is an interesting idea... implicit truncation is definitely not good, but I'm not totally sure if I like these semantics better than making the assignment of multi-char strings an error.

Girgias · 2019-12-15T19:45:36Z

> This is an interesting idea... implicit truncation is definitely not good, but I'm not totally sure if I like these semantics better than making the assignment of multi-char strings an error.

I'll bring it up on internals for discussion; that seems like the best way forward to see what people think.

claudepache · 2019-12-16T10:20:03Z

Problem: What is the expected result of

$str = "あello world"; // utf-8-encoded
$str[0] = "H";

? I guess, a non-surprising result is not really possible, because it would need the knowledge of the expected encoding of the string.

Girgias · 2019-12-16T10:28:11Z

> Problem: What is the expected result of
> $str = "あello world"; // utf-8-encoded
> $str[0] = "H";
> ? I guess, a non-surprising result is not really possible, because it would need the knowledge of the expected encoding of the string.

You'll just get a byte stream which doesn't have a correct encoding. This is the case currently and has been since forever. This doesn't change any of the behaviour on write.

Now, if one day we have Unicode support (or some form of it), then I would expect the code point to be altered and not the byte.
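
The byte-level outcome described in this reply can be checked with a short sketch. The string is written with explicit escape sequences so the result does not depend on the source file's encoding, and the `//u` pattern is used here only as a convenient UTF-8 validity check.

```php
<?php
// "あ" encoded as UTF-8 is the three bytes E3 81 82.
$str = "\xE3\x81\x82ello world";
$str[0] = "H"; // overwrites only the first byte (E3) with 48

echo bin2hex($str), "\n"; // 488182656c6c6f20776f726c64

// preg_match() with the /u modifier returns false for a malformed UTF-8 subject.
var_dump(preg_match('//u', $str) === 1); // bool(false)
```

The two remaining continuation bytes (81 82) no longer follow a lead byte, so the string is not valid UTF-8 any more.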

cmb69 · 2019-12-19T12:39:31Z

> You'll just get a byte stream which doesn't have a correct encoding.

That. And I'd rather not "improve" support for something that can't generally work; instead I'd let multibyte assignments fail.

Girgias · 2019-12-19T17:01:02Z

> > You'll just get a byte stream which doesn't have a correct encoding.
>
> That. And I'd rather not "improve" support for something that can't generally work; instead I'd let multibyte assignments fail.

ACK.

Closing this in favour of an implementation which adds an explicit warning for multi-byte values.
I will try to code this when I've got time.
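
As a rough userland illustration of the stricter direction discussed here, the sketch below rejects anything other than a single byte. `assign_single_byte()`, its message, and the choice of `InvalidArgumentException` are assumptions made for this example; the engine-level change may behave differently.

```php
<?php
// Hypothetical helper illustrating "let multi-byte assignments fail".
// The name, message, and exception type are assumptions, not the PR's code.
function assign_single_byte(string $str, int $offset, string $byte): string
{
    if (strlen($byte) !== 1) {
        throw new InvalidArgumentException(
            'Exactly one byte must be assigned to a string offset'
        );
    }
    $str[$offset] = $byte; // plain single-byte write
    return $str;
}

echo assign_single_byte("abc", 1, "X"), "\n"; // aXc

try {
    assign_single_byte("abc", 1, "XYZ");
} catch (InvalidArgumentException $e) {
    echo "rejected: ", $e->getMessage(), "\n";
}
```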

Girgias force-pushed the variable-str-length-str-offset-replacement branch 3 times, most recently from 855242b to f5efc92, on December 16, 2019 at 05:57

Girgias added 7 commits December 16, 2019 09:08

Possible fix for JIT

f898af9

Revert "Possible fix for JIT"

d1c762b

Because I don't know how to fix the error with the char * definition. This reverts commit 1c9a8aa.

New implementation (without leaks)

236c583

Revert "New implementation (without leaks)"

7ed0a4d

Due to some null bytes disappearing and I have no idea why. This reverts commit 236c583.

Fix leaks

7e61e6c

Add JIT implementation

0b6d52c

Girgias force-pushed the variable-str-length-str-offset-replacement branch from f5efc92 to 0b6d52c on December 16, 2019 at 09:17

Girgias closed this Dec 19, 2019

Girgias deleted the variable-str-length-str-offset-replacement branch January 7, 2020 20:20

Girgias mentioned this pull request Jan 7, 2020

Add warning and convert to exception in string offset assignment: #5063

Merged
