Some iconv implementations allow you to append "//IGNORE" to the
target charset as an indication that it should skip certain
conversions. This is true for the two GNU iconv implementations, and
is apparently based on a Solaris extension. It has also recently been
added to POSIX 2024. Specifically, //IGNORE will cause iconv() to skip
input sequences that are valid but cannot be translated to the target
charset. It does not skip sequences that are invalid inputs; these
still cause an EILSEQ.
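
To make the distinction above concrete, here is a minimal sketch of a caller that requests "//IGNORE" and then sorts the outcome by errno. The function name utf8_to_ascii_ignore() and the conv_result enum are made up for this illustration and are not taken from ext/iconv/iconv.c. The sketch assumes the second argument of iconv() is declared as char **, as on glibc and musl. How a given implementation reports characters it has skipped varies, so no expected output is shown for non-ASCII input.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch; not part of any iconv API. */
enum conv_result {
    CONV_OK,            /* iconv() reported no error */
    CONV_UNSUPPORTED,   /* iconv_open() rejected the charset pair */
    CONV_INVALID_INPUT, /* EILSEQ: per the text above, never to be ignored */
    CONV_INCOMPLETE,    /* EINVAL: input ends in the middle of a sequence */
    CONV_NO_SPACE,      /* E2BIG: output buffer too small */
    CONV_OTHER          /* any other errno value */
};

/*
 * Convert NUL-terminated UTF-8 to ASCII, asking the implementation to
 * skip untranslatable characters. Whatever was written is always
 * NUL-terminated in out.
 */
enum conv_result utf8_to_ascii_ignore(const char *in, char *out, size_t outsize)
{
    if (outsize == 0)
        return CONV_NO_SPACE;
    out[0] = '\0';

    /* An implementation without the extension may fail right here. */
    iconv_t cd = iconv_open("ASCII//IGNORE", "UTF-8");
    if (cd == (iconv_t)-1)
        return CONV_UNSUPPORTED;

    char *inp = (char *)in;        /* iconv() does not modify the input bytes */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outsize - 1;  /* reserve room for the terminator */

    enum conv_result res = CONV_OK;
    if (iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t)-1) {
        switch (errno) {
        case EILSEQ: res = CONV_INVALID_INPUT; break;
        case EINVAL: res = CONV_INCOMPLETE;    break;
        case E2BIG:  res = CONV_NO_SPACE;      break;
        default:     res = CONV_OTHER;         break;
        }
    }
    *outp = '\0';
    iconv_close(cd);
    return res;
}
```

A caller would pass a buffer and its size, for example utf8_to_ascii_ignore(s, buf, sizeof buf), and treat CONV_UNSUPPORTED as "this iconv does not know //IGNORE".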

There is a minimal check for "//IGNORE" in ext/iconv/iconv.c, based on
a config.m4 check for ICONV_BROKEN_IGNORE. If the iconv implementation
does not support "//IGNORE", PHP will check for it in the target
charset string, and will attempt its own workaround. This workaround
is wrong for two reasons:

  1. The string "//IGNORE" is left in the target charset, so if the
     current iconv implementation doesn't understand it, it's just
     going to fail. This is what happens on musl.

  2. The PHP workaround looks for EILSEQ, and then attempts to skip
     the untranslatable sequence. But with non-GNU iconvs (i.e. all
     affected implementations), this will be backwards. In non-GNU
     implementations, EILSEQ is raised only for invalid input
     sequences, and not for untranslatable ones. Even in POSIX 2024,
     EILSEQ is reserved for input sequences that are invalid rather
     than untranslatable. In other words, sequences that cause an
     EILSEQ should never be ignored.
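
To illustrate the first problem: a workaround that wanted to survive an iconv without the extension would at least have to remove the suffix before calling iconv_open(). The sketch below shows that step only. It is not what PHP does, and strip_ignore_suffix() is a hypothetical helper. It assumes an exact, case-sensitive "//IGNORE" at the very end of the string and does not handle combined suffixes.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy charset into out, dropping a trailing "//IGNORE" if present.
 * Returns 1 if the suffix was found and removed, 0 if it was absent,
 * and -1 if out is too small to hold the result.
 */
int strip_ignore_suffix(const char *charset, char *out, size_t outsize)
{
    static const char suffix[] = "//IGNORE";
    size_t len = strlen(charset);
    size_t slen = sizeof suffix - 1;
    int found = 0;

    /* Only an exact match at the end of the string counts. */
    if (len >= slen && strcmp(charset + len - slen, suffix) == 0) {
        len -= slen;
        found = 1;
    }
    if (len + 1 > outsize)
        return -1;

    memcpy(out, charset, len);
    out[len] = '\0';
    return found;
}
```

Stripping the suffix only gets iconv_open() to succeed. It does nothing about the second problem, since the caller still cannot tell from EILSEQ which sequences were merely untranslatable.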

This commit removes the workaround, which is not quite right; removing
it doesn't cause any new test failures on musl. It would be nice if
PHP could abstract the //IGNORE magic away from the user, but since it
has been added to POSIX 2024, I think the simplest thing to do is wait
it out. Eventually the other implementations will add support for it,
and then there will be no need to work around its absence.